Compare commits

...

1355 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
d79a915583 app/vmui: fix last 6 months time range picker
Previously it was incorrectly selecting 6 minutes instead of 6 months.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1960
2022-01-18 23:28:10 +02:00
Aliaksandr Valialkin
8f5902dfcf docs/CHANGELOG.md: cut v1.72.0 2022-01-18 22:43:18 +02:00
Aliaksandr Valialkin
bce7d7ac60 deployment/docker: update Grafana from v8.3.2 to v8.3.4 2022-01-18 22:42:15 +02:00
Aliaksandr Valialkin
919ee73153 docs/vmbackup.md: make docs-sync after ca11def2a5 2022-01-18 22:29:17 +02:00
Aliaksandr Valialkin
f2cc4e0436 app/vmgateway/README.md: sync with enterprise vesion after adding extra_filters handling
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1863
2022-01-18 22:27:45 +02:00
Roman Khavronenko
0b0bf94c96 docs: update retention docs with additional details (#2060)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-01-18 22:25:48 +02:00
Aliaksandr Valialkin
672fcba223 app/vmui: properly calculate graph range for y axis
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2037
2022-01-18 22:22:24 +02:00
Aliaksandr Valialkin
5a77c86e97 app/vmui: reduce the refresh interval during graph scrolling/zooming from 1 second to 300 milliseconds
One second feels too laggy, so let's reduce the refresh interval to 300 milliseconds.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2064
2022-01-18 21:48:12 +02:00
Yury Molodov
8bdc45ba00 fix: remove buffer period (#2078)
* fix: remove buffer period

* app/vmselect/vmui: `make vmui-update`

* docs/CHANGELOG.md: document the implemented feature

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2064

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-18 21:42:56 +02:00
Yury Molodov
70737ea4ac vmui: correct url encoding (#2067)
* fix: correct encode multi-line queries

* fix: change autocomplete for correct arrows work

* app/vmselect/vmui: `make vmui-update`

* docs/CHANGELOG.md: document the bugfix for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2039

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-18 21:31:46 +02:00
Aliaksandr Valialkin
dcadec65b6 docs/CHANGELOG.md: document 8e3f9c1fbb 2022-01-18 21:24:49 +02:00
Yury Molodov
8e3f9c1fbb vmui: correct calc axes limits (#2058)
* fix: correct calc axes limits

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-18 15:43:41 +02:00
Oct
ca11def2a5 docs: Correct config URL (#2077)
docs: update incorrect links
2022-01-18 15:06:00 +02:00
Aliaksandr Valialkin
e933e3150d docs/CHANGELOG.md: document fcd33fc409 2022-01-18 12:46:33 +02:00
Yury Molodov
fcd33fc409 vmui: change layout (#2054)
* fix: change query reset

* feat: replace @codemirror to text field

* feat: switch to Preact from React

* fix: optimize mui imports

* feat: move time selector to Header

* checkout

* fix: remove unused vars

* update package-lock.json

* fix: correct styles

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-18 12:44:22 +02:00
Aliaksandr Valialkin
c2a3911bb5 docs/CHANGELOG.md: document that vmgateway now supports extra_filters option
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1863
2022-01-18 12:39:25 +02:00
Aliaksandr Valialkin
dbfa1421ac docs/vmgateway.md: update docs, since vmgateway now supports extra_filters
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1863
2022-01-18 12:39:25 +02:00
Aziz Köksal
74a4c29729 docs: proofread FAQ.md (#2062)
* Added missing articles.
* Changed minus characters to long dashes.
* Rephrased some words.
* Corrected grammar mistakes.
* Set correct possessive apostrophes.
* "setup" -> "set up"
* "no need in" -> "no need for"
* "comparing" -> "compared"
* "less" -> "fewer"
* "additionally" -> "in addition to"
2022-01-17 21:56:19 +02:00
Aliaksandr Valialkin
44f4c4f9ba go.sum: missing update to go.sum after ce602827e5 2022-01-17 15:44:21 +02:00
Aliaksandr Valialkin
ce602827e5 vendor: make vendor-update 2022-01-17 15:43:08 +02:00
Aliaksandr Valialkin
dc7b63a793 app/vmselect/promql: properly keep metric names when optimized path is used for aggregate function calculations
For example, `sum(rate(...) keep_metric_names) by (__name__)` didn't leave the original metric name because of this issue.

This is a follup-up for 1bdc71d917

Udates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/949
2022-01-17 15:30:30 +02:00
dependabot[bot]
a5265e2a56 build(deps): bump @mui/material in /app/vmui/packages/vmui (#2075)
Bumps [@mui/material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-material) from 5.2.7 to 5.2.8.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.8/packages/mui-material)

---
updated-dependencies:
- dependency-name: "@mui/material"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-17 10:06:06 +03:00
dependabot[bot]
060f17d1d8 build(deps-dev): bump @typescript-eslint/eslint-plugin (#2076)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.9.0 to 5.9.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.9.1/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-17 10:05:47 +03:00
dependabot[bot]
aba94ef4d6 build(deps): bump @mui/lab in /app/vmui/packages/vmui (#2074)
Bumps [@mui/lab](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.63 to 5.0.0-alpha.64.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.old.md)
- [Commits](https://github.com/mui-org/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-17 09:52:54 +03:00
dependabot[bot]
e0314ad8ca build(deps-dev): bump @typescript-eslint/parser (#2073)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.9.0 to 5.9.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.9.1/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-17 09:52:43 +03:00
dependabot[bot]
fc76ecde13 build(deps): bump qs from 6.10.2 to 6.10.3 in /app/vmui/packages/vmui (#2072)
Bumps [qs](https://github.com/ljharb/qs) from 6.10.2 to 6.10.3.
- [Release notes](https://github.com/ljharb/qs/releases)
- [Changelog](https://github.com/ljharb/qs/blob/main/CHANGELOG.md)
- [Commits](https://github.com/ljharb/qs/compare/v6.10.2...v6.10.3)

---
updated-dependencies:
- dependency-name: qs
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-17 09:52:30 +03:00
Aliaksandr Valialkin
1bdc71d917 app/vmselect/promql: implement keep_metric_names modifier for transform and rollup functions
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/949
2022-01-14 04:14:59 +02:00
Aliaksandr Valialkin
f41846d002 app/vmselect/promql: add stale_samples_over_time() function 2022-01-14 01:48:04 +02:00
Aliaksandr Valialkin
96707223db docs/CHANGELOG.md: yet another attempt to fix formatting for yaml snippet 2022-01-14 01:14:55 +02:00
Aliaksandr Valialkin
d7d83d6d93 docs/CHANGELOG.md: fix formatting for scrape_configs example 2022-01-14 01:02:32 +02:00
Aliaksandr Valialkin
1d05444b33 lib/promscrape: expose promscrape_stale_samples_created_total metric for monitoring the number of created stale samples 2022-01-14 01:00:46 +02:00
Aliaksandr Valialkin
4e84c38b70 vendor: update github.com/valyala/gozstd from v1.15.0 to v1.15.1 2022-01-13 23:44:31 +02:00
Aliaksandr Valialkin
831b93a755 app/vmalert: add parseDuration function in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/8817
2022-01-13 23:30:41 +02:00
Aliaksandr Valialkin
80f03177c4 lib/promscrape/discovery/kubernetes: add __meta_kubernetes_node_provider_id label for discovered Kubernetes nodes in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/9603
2022-01-13 23:16:02 +02:00
Aliaksandr Valialkin
80f966b80c app/vmalert: add stripPort template function in the same way as Prometheus does
See https://github.com/prometheus/prometheus/pull/10002
2022-01-13 22:53:42 +02:00
Aliaksandr Valialkin
355a63733d lib/promscrape/discovery/kubernetes: add the ability to limit service discovery to the current namespace
See https://github.com/prometheus/prometheus/issues/9782 and https://github.com/prometheus/prometheus/pull/9881
2022-01-13 22:44:35 +02:00
Aliaksandr Valialkin
c883c15878 app/vmselect/promql: add support for @ modifier
Add support for `@` modifier in MetricsQL according to https://prometheus.io/docs/prometheus/latest/querying/basics/#modifier

Extend the support with the following features:
* Allow using `@` modifier everywhere in the query. For example, `sum(foo) @ end()`
* Allow using arbitrary expression as `@` modifier. For example, `foo @ (end() - 1h)`
  returns `foo` value at `end - 1 hour` timestamp on the selected time range `[start ... end]`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1348
2022-01-13 22:12:06 +02:00
Aliaksandr Valialkin
9469696e46 app/vmselect/promql: fix limit_offset() test
The test has been broken in addae7fc6a
2022-01-13 16:59:28 +02:00
Aliaksandr Valialkin
4e7026320a vendor: update github.com/valyala/gozstd from v1.14.2 to v1.15.0 2022-01-12 13:23:02 +02:00
Yury Molodov
7d5ed49d23 vmui: switching to Preact (#2053)
* feat: replace @codemirror to text field

* feat: switch to Preact from React

* fix: optimize mui imports

* fix: remove unused vars

* update package-lock.json

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-11 10:32:17 +02:00
Aliaksandr Valialkin
5c321c7178 vendor: make vendor-update 2022-01-11 10:15:42 +02:00
Aliaksandr Valialkin
17eb86a689 lib/promscrape/discovery/dockerswarm: follow up after 68a117a25a
- Document the bugfix at docs/CHANGELOG.md
- Set __address__ field after copying commonLabels to the resulting map of discovered labels.
  This makes sure that the correct __address__ label is used.
2022-01-11 09:20:10 +02:00
Alexander Shtuchkin
68a117a25a Fix for #2038: Make correct __address__ value for dockerswarm promscrape (#2041) 2022-01-11 08:59:06 +02:00
Aliaksandr Valialkin
7a2c46d951 docs/CHANGELOG.md: document 77bfa8181d 2022-01-11 08:53:49 +02:00
Dmitry Tolstoy
b434be3d2d docs: Correct config URL (#2051)
Missed /guides folder
2022-01-10 16:24:41 +02:00
dependabot[bot]
bd9e30c054 build(deps-dev): bump @typescript-eslint/eslint-plugin (#2048)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.8.1 to 5.9.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.9.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-10 10:46:45 +03:00
dependabot[bot]
90c844576e build(deps): bump @mui/lab in /app/vmui/packages/vmui (#2044)
Bumps [@mui/lab](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.62 to 5.0.0-alpha.63.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-10 10:46:35 +03:00
dependabot[bot]
ae897372bc build(deps): bump @types/node in /app/vmui/packages/vmui (#2049)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 17.0.6 to 17.0.8.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-10 10:46:21 +03:00
dependabot[bot]
ad2fc75676 build(deps-dev): bump @typescript-eslint/parser (#2043)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.8.0 to 5.9.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.9.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-10 10:42:07 +03:00
dependabot[bot]
c2dbc642e7 build(deps): bump web-vitals in /app/vmui/packages/vmui (#2045)
Bumps [web-vitals](https://github.com/GoogleChrome/web-vitals) from 2.1.2 to 2.1.3.
- [Release notes](https://github.com/GoogleChrome/web-vitals/releases)
- [Changelog](https://github.com/GoogleChrome/web-vitals/blob/main/CHANGELOG.md)
- [Commits](https://github.com/GoogleChrome/web-vitals/compare/v2.1.2...v2.1.3)

---
updated-dependencies:
- dependency-name: web-vitals
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-10 10:41:45 +03:00
dependabot[bot]
2e7b537b68 build(deps): bump @mui/material in /app/vmui/packages/vmui (#2046)
Bumps [@mui/material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-material) from 5.2.4 to 5.2.7.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.7/packages/mui-material)

---
updated-dependencies:
- dependency-name: "@mui/material"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-10 10:41:32 +03:00
dependabot[bot]
f847efe621 build(deps): bump @types/jest in /app/vmui/packages/vmui (#2047)
Bumps [@types/jest](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/jest) from 27.0.3 to 27.4.0.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/jest)

---
updated-dependencies:
- dependency-name: "@types/jest"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-10 10:41:02 +03:00
Andrey Afoninsky
77bfa8181d chore: add vmalert_remotewrite_total metric (#2040)
Co-authored-by: Andrey Afoninsky <andrey.afoninsky@booking.com>
2022-01-07 16:15:34 +02:00
Aliaksandr Valialkin
e47385d34a deployment/docker: update Go builder from v1.17.5 to v1.17.6
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.6+label%3ACherryPickApproved
2022-01-07 13:34:32 +02:00
Aliaksandr Valialkin
71fa1c8baf vendor: make vendor-update 2022-01-07 12:39:20 +02:00
Aliaksandr Valialkin
bdba50432b docs/CHANGELOG.md: add release dates for every release 2022-01-07 12:27:32 +02:00
Aliaksandr Valialkin
e4e36383e2 lib/promscrape: do not send staleness markers on graceful shutdown
This follows Prometheus behavior.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2013#issuecomment-1006994079
2022-01-07 01:17:57 +02:00
Aliaksandr Valialkin
bc03ab6688 docs/vmctl.md: make docs-sync 2022-01-07 01:17:30 +02:00
Aliaksandr Valialkin
46c310f62f docs/FAQ.md: add what is the difference between vmagent and Prometheus agent chapter 2022-01-06 23:13:11 +02:00
hagen1778
b8369e2f3e vmctl: add option to rate limit data transfer speed
The new flag `vm-rate-limit` defines data transfer speed limit
in bytes per second. Rate limiting is not applied if flag is omitted.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1405

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-01-06 12:21:42 +03:00
Denis Golius
dd1b789c15 removed not needed directory 2022-01-06 12:17:53 +03:00
Aliaksandr Valialkin
c70c064752 docs: follow-up after ae89b4e818 2022-01-05 16:31:45 +02:00
Denys Holius
ae89b4e818 Old links replaced for newest (#2033)
* replaced old links to the website

* fixed deletion main README.md file

* fix: added docs files after docs-sync
2022-01-05 16:30:13 +02:00
Aliaksandr Valialkin
dbe592597f docs/CHANGELOG.md: clarify the issue description, which is fixed by 38bf5fc136 2022-01-05 16:19:06 +02:00
Aliaksandr Valialkin
178dd87e26 lib/storage: follow-up for 38bf5fc136 2022-01-05 16:00:11 +02:00
weng zhao
38bf5fc136 vmstorage: fix query like {foo=~"bar|"} return extra timeseries cause by negative filter transformation malfunction (#2032)
1. L2749 make kb.B remain the value of comonPrefix instead of tf.prefix
2. L2762 avoid change tf.value from "bar|" to ".+r|"
2022-01-05 15:59:15 +02:00
Aliaksandr Valialkin
e1d7cbfc77 docs/CHANGELOG.md: document 60266078ca 2022-01-04 11:44:57 +02:00
Aliaksandr Valialkin
ced5f2e5e7 Revert "Add check-rebased Github action (#2002)"
This reverts commit 2104330d4c.

This check doesn't work well for community pull requests, since third-party users
aren't motivated to rebase pull requests to branch head after they are created.

This check is useful for private repositories though.
2022-01-04 11:38:16 +02:00
John Seekins
60266078ca Address some edge cases in OpenTSDB importer and speed it up (#2019)
* Simplify queries to OpenTSDB (and make them properly appear in OpenTSDB query stats) and also tweak defaults a bit

* Convert seconds to milliseconds before writing to VictoriaMetrics and increase subquery size

Signed-off-by: John Seekins <jseekins@datto.com>
2022-01-04 08:51:23 +02:00
Aliaksandr Valialkin
5ce94e1dd3 docs/CHANGELOG.md: document ac47733044 2022-01-03 21:15:44 +02:00
Roman Khavronenko
ac47733044 vmctl: improve logging during import cancels/errors (#2006)
On import process interruption `vmctl` now prints the max and min timestamps of:
* last failed batch if import ended with error;
* last sent batch if import was cancelled by user.

To get more details for each timeseries in batch user needs to specify `--verbose` flag.

The change does not relate to `vm-native` mode, since `vmctl` has no control over
transferred data in this mode.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1236
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2022-01-03 21:12:01 +02:00
Aliaksandr Valialkin
ceade70d4e app/vmselect/vmui: run make vmui-update after 89ff7b2465 2022-01-03 21:03:37 +02:00
Yury Molodov
89ff7b2465 vmui: replace @codemirror to text field (#2003)
* feat: replace @codemirror to text field

* update package-lock.json

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2022-01-03 21:00:54 +02:00
Yury Molodov
042570584f fix: change query reset (#2001) 2022-01-03 20:55:28 +02:00
dependabot[bot]
8262372d72 build(deps-dev): bump @typescript-eslint/eslint-plugin (#2028)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.8.0 to 5.8.1.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.8.1/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-03 21:02:03 +03:00
dependabot[bot]
6e75129c77 build(deps): bump @types/node in /app/vmui/packages/vmui (#2027)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 17.0.1 to 17.0.6.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-03 20:59:05 +03:00
dependabot[bot]
cbeaa000ef build(deps-dev): bump @babel/plugin-proposal-nullish-coalescing-operator (#2026)
Bumps [@babel/plugin-proposal-nullish-coalescing-operator](https://github.com/babel/babel/tree/HEAD/packages/babel-plugin-proposal-nullish-coalescing-operator) from 7.16.5 to 7.16.7.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.16.7/packages/babel-plugin-proposal-nullish-coalescing-operator)

---
updated-dependencies:
- dependency-name: "@babel/plugin-proposal-nullish-coalescing-operator"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-03 20:58:09 +03:00
dependabot[bot]
72d127e187 build(deps-dev): bump react-app-rewired in /app/vmui/packages/vmui (#2025)
Bumps [react-app-rewired](https://github.com/timarney/react-app-rewired) from 2.1.9 to 2.1.11.
- [Release notes](https://github.com/timarney/react-app-rewired/releases)
- [Commits](https://github.com/timarney/react-app-rewired/compare/v2.1.9...v2.1.11)

---
updated-dependencies:
- dependency-name: react-app-rewired
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-03 20:56:54 +03:00
dependabot[bot]
70bd94b50b build(deps): bump @mui/lab in /app/vmui/packages/vmui (#2024)
Bumps [@mui/lab](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.61 to 5.0.0-alpha.62.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-01-03 20:51:21 +03:00
Aliaksandr Valialkin
b1b67169f1 docs/Quick-Start.md: simplify quick start guide 2022-01-03 17:28:38 +02:00
Aliaksandr Valialkin
a8d74e15dd LICENSE: update year from 2021 to 2022 2022-01-03 16:58:33 +02:00
dependabot[bot]
7339645a29 build(deps-dev): bump eslint-plugin-react in /app/vmui/packages/vmui (#2017)
Bumps [eslint-plugin-react](https://github.com/yannickcr/eslint-plugin-react) from 7.27.1 to 7.28.0.
- [Release notes](https://github.com/yannickcr/eslint-plugin-react/releases)
- [Changelog](https://github.com/yannickcr/eslint-plugin-react/blob/master/CHANGELOG.md)
- [Commits](https://github.com/yannickcr/eslint-plugin-react/compare/v7.27.1...v7.28.0)

---
updated-dependencies:
- dependency-name: eslint-plugin-react
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-27 16:47:30 +03:00
dependabot[bot]
12d0a59074 build(deps): bump @mui/icons-material in /app/vmui/packages/vmui (#2014)
Bumps [@mui/icons-material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-icons-material) from 5.2.4 to 5.2.5.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.5/packages/mui-icons-material)

---
updated-dependencies:
- dependency-name: "@mui/icons-material"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-27 16:45:47 +03:00
dependabot[bot]
923bb42cb9 build(deps-dev): bump @typescript-eslint/parser (#2015)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.7.0 to 5.8.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.8.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-27 16:45:39 +03:00
dependabot[bot]
a6a39a2591 build(deps-dev): bump react-app-rewired in /app/vmui/packages/vmui (#2009)
Bumps [react-app-rewired](https://github.com/timarney/react-app-rewired) from 2.1.8 to 2.1.9.
- [Release notes](https://github.com/timarney/react-app-rewired/releases)
- [Commits](https://github.com/timarney/react-app-rewired/compare/2.1.8...v2.1.9)

---
updated-dependencies:
- dependency-name: react-app-rewired
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-27 16:41:16 +03:00
dependabot[bot]
5721804047 build(deps-dev): bump @typescript-eslint/eslint-plugin (#2010)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.7.0 to 5.8.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.8.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-27 16:41:04 +03:00
dependabot[bot]
498b166e5f build(deps): bump @mui/lab in /app/vmui/packages/vmui (#2012)
Bumps [@mui/lab](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.60 to 5.0.0-alpha.61.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-27 16:40:45 +03:00
dependabot[bot]
cc63f80193 build(deps): bump @types/react in /app/vmui/packages/vmui (#2011)
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 17.0.37 to 17.0.38.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)

---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-27 16:38:33 +03:00
Dima Lazerka
2104330d4c Add check-rebased Github action (#2002)
It will prevent merging in a branch that's not based on its base branch HEAD, leading to streamlined history.

Note it will not prevent squash commits, nor commits directly to base branch.
2021-12-24 11:38:06 +03:00
Yury Molodov
681a800086 vmui: legend fixes (#1995)
* feat: add a reset query by clicking the logo

* feat: add sequence number for query fields

* feat: invert behavior on the graph's legend

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-23 12:14:16 +02:00
Yurii Kravets
f0c331c724 Update README.md (#1996)
* Update README.md

go 1.16 -> 1.17

* Update README.md

* Update README.md

* Update Cluster-VictoriaMetrics.md

* Update Single-server-VictoriaMetrics.md

* Update vmauth.md

* Update vmbackup.md

* Update vmrestore.md

* Update vmagent.md

* Update vmctl.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md
2021-12-23 12:09:59 +02:00
Aliaksandr Valialkin
b5ce35dfc8 docs/CHANGELOG.md: document 543bd0ea0c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1999
2021-12-23 12:08:06 +02:00
Roman Khavronenko
543bd0ea0c vmselect: update /query_exemplars placeholder (#2000)
Grafana expects `data` in response to be a slice and logs an err
if it is not:
```
err="[]v1.ExemplarQueryResult: decode slice: expect [ or n, but found , error found in #0 byte of ...||..., bigger context ...||..."
```

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1999
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-23 11:53:50 +02:00
Aliaksandr Valialkin
cbaa2af280 lib/promscrape: scrape replicated targets at different offsets in vmagent replicated clustering mode
This guarantees that the deduplication consistently leaves samples from the same vmagent replica.

See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets
2021-12-23 00:20:39 +02:00
Aliaksandr Valialkin
c7826ab36e docs/FAQ.md: link to managed VictoriaMetrics at AWS
See https://docs.victoriametrics.com/FAQ.html#what-is-the-pricing-for-victoriametrics
2021-12-22 23:14:47 +02:00
Nikolay
8ff7da7202 adds restore.lock (#1988)
* adds restore.lock
it must prevent from running storage after incomplete restore process
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958

* return back flock file deletion

* Apply suggestions from code review

* wip

* docs/CHANGELOG.md: document https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1958

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-22 13:10:15 +02:00
Aliaksandr Valialkin
f40b1e7e9f vendor: make vendor-update 2021-12-22 12:36:27 +02:00
Aliaksandr Valialkin
9e17b51d45 go.mod: update minimum Go version from Go 1.16 to Go 1.17
VictoriaMetrics code uses features from Go 1.17, so the minimum Go version must be increased from Go 1.16 to Go 1.17

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1987
2021-12-22 12:27:02 +02:00
Aliaksandr Valialkin
0f97c34204 Revert "Add .github/workflows/check-based-on-master (#1991)"
This reverts commit 06cf4e0f70.

This break merge requests to non-master branches - see https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1993#issuecomment-999403963
2021-12-22 11:18:11 +02:00
Dima Lazerka
06cf4e0f70 Add .github/workflows/check-based-on-master (#1991) 2021-12-21 20:27:41 +02:00
Roman Khavronenko
9bb7905d26 vmalert: check if remoteWrite is configured for replay mode (#1990)
* vmalert: check if remoteWrite is configured for replay mode

The purpose of `replay` mode is to backfill results of recording
or alerting rules. So `remoteWrite.url` should be required.
Otherwise, process can fail on attempt to send data.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* Update app/vmalert/main.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-21 20:25:47 +02:00
Yury Molodov
4b40acd964 vmui: add custom start range (#1989)
* feat: add custom start range

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-21 20:19:33 +02:00
Aliaksandr Valialkin
ce333f28d8 all: use logger.WithThrottler() where appropriate 2021-12-21 17:03:25 +02:00
Aliaksandr Valialkin
3cfb90b227 docs/CHANGELOG.md: document 34fdc8881b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1911
2021-12-21 16:40:39 +02:00
Roman Khavronenko
34fdc8881b vmagent: add error log for skipped data block when rejected by receiv… (#1956)
* vmagent: add error log for skipped data block when rejected by receiving side

Previously, rejected data blocks were silently dropped - only metrics were update.
From operational perspective, having an additional logging for such cases is preferable.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1911

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmagent: throttle log messages about skipped blocks

The new type of logger was added to logger pacakge.
This new type supposed to control number of logged messages
by time.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* lib/logger: make LogThrottler public, so its methods can be inspected by external packages

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-21 16:36:09 +02:00
Denys Holius
5dc9ab5829 bump revision of dashboards to latest (#1986) 2021-12-21 12:11:15 +02:00
Denys Holius
d44cc14c6b added packer build for DigitalOcean Droplets (#1917)
* added packer build for DigitalOcean Droplets

* fixed typo

* added packer RELEASE_GUIDE.md, Makefile

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* added corrections amd improvements

* added packer link & templating for sed version

* fixed typo

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2021-12-21 12:09:14 +02:00
Aliaksandr Valialkin
ee17516afd docs/Single-server-VictoriaMetrics.md: mention that recording rules in vmalert can be used for reducing the number of time series 2021-12-20 20:03:42 +02:00
Aliaksandr Valialkin
4701c108ff docs/CHANGELOG.md: cut v1.71.0 2021-12-20 19:10:24 +02:00
Aliaksandr Valialkin
b9363d9726 lib/promscrape: take into account the original job_name when creating an unique key per each scrape target
This should handle the case when the original job_name has been changed in -promscrape.config ,
while the resulting job label remains the same because it is overriden via relabeling.
2021-12-20 18:38:05 +02:00
Aliaksandr Valialkin
afafeb379a all: typo fix: unexected -> unexpected 2021-12-20 17:39:52 +02:00
Yury Molodov
718c352946 vmui: graph fixes (#1982)
* fix: remove disabling custom step when zooming

* feat: add a dynamic calc of the width of the graph

* fix: add validate y-axis limits

* fix: correct axis limits for value 0

* fix: change logic create time series

* fix: change types for tooltip

* fix: correct points on the line

* fix: change the logic for set graph width

* fix: stop checking the period when auto-refresh is enabled

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-20 17:37:02 +02:00
Roman Khavronenko
871528fedb dashboards/vmagent: fix cached datasource uid (#1984)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-20 17:32:41 +02:00
Roman Khavronenko
52a3b2d77e Dashboards vmsingle (#1980)
* dashboards/vmsingle: add "Merges deferred" panel

The new panel supposed to show if there were deferred merges
due to insufficient disk space.
It goes within alerting rule which suppose to send a signal
in such cases.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmsingle: add "Cache usage" panel

The new panel supposed to show the % of the used cache
compared to allowed size by type.
It should help to determine underutilized types of caches.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmsingle: bump version requirement

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmsingle: rm alert for `vm_merge_need_free_disk_space`

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-20 17:28:35 +02:00
Aliaksandr Valialkin
5a36e241f4 lib/persistentqueue: check that readerOffset doesnt exceed writerOffset after each readerOffset increase
This should help detecting the source of the panic from https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1981
2021-12-20 17:25:11 +02:00
Aliaksandr Valialkin
ad388ecd78 docs: mention that downsampling can be evaluated for free by running enterprise binaries 2021-12-20 15:57:25 +02:00
Aliaksandr Valialkin
6d77cc9b08 app/vmselect/vmui: make vmui-update 2021-12-20 13:51:02 +02:00
Yurii Kravets
40073bbcb5 Update FAQ.md (#1765)
* Update FAQ.md

Adding explanation "Why do same metrics have differences in VictoriaMetrics and Prometheus dashboards?"

* Update FAQ

* Update FAQ.md

* Apply suggestions from code review

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2021-12-20 13:43:04 +02:00
Aliaksandr Valialkin
974d9c0eee app/vmselect/promql: follow-up after 177e345d8a
* Document changes_prometheus(), increase_prometheus() and delta_prometheus() functions.
* Simplify their implementation
* Mention these functions in docs/CHANGELOG.md
2021-12-20 13:19:44 +02:00
dependabot[bot]
46eee933b7 build(deps): bump react-scripts in /app/vmui/packages/vmui (#1977)
Bumps [react-scripts](https://github.com/facebook/create-react-app/tree/HEAD/packages/react-scripts) from 4.0.3 to 5.0.0.
- [Release notes](https://github.com/facebook/create-react-app/releases)
- [Changelog](https://github.com/facebook/create-react-app/blob/main/CHANGELOG.md)
- [Commits](https://github.com/facebook/create-react-app/commits/react-scripts@5.0.0/packages/react-scripts)

---
updated-dependencies:
- dependency-name: react-scripts
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 14:16:45 +03:00
dependabot[bot]
4ba1f62507 build(deps): bump @mui/icons-material in /app/vmui/packages/vmui (#1978)
Bumps [@mui/icons-material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-icons-material) from 5.2.1 to 5.2.4.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.4/packages/mui-icons-material)

---
updated-dependencies:
- dependency-name: "@mui/icons-material"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 13:59:09 +03:00
dependabot[bot]
e11c09be82 build(deps): bump @types/node in /app/vmui/packages/vmui (#1979)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 16.11.12 to 17.0.1.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 13:54:26 +03:00
dependabot[bot]
d56bd7df19 build(deps): bump @mui/material in /app/vmui/packages/vmui (#1976)
Bumps [@mui/material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-material) from 5.2.3 to 5.2.4.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.4/packages/mui-material)

---
updated-dependencies:
- dependency-name: "@mui/material"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 13:44:44 +03:00
dependabot[bot]
52335bb48e build(deps-dev): bump @babel/plugin-proposal-nullish-coalescing-operator (#1975)
Bumps [@babel/plugin-proposal-nullish-coalescing-operator](https://github.com/babel/babel/tree/HEAD/packages/babel-plugin-proposal-nullish-coalescing-operator) from 7.16.0 to 7.16.5.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.16.5/packages/babel-plugin-proposal-nullish-coalescing-operator)

---
updated-dependencies:
- dependency-name: "@babel/plugin-proposal-nullish-coalescing-operator"
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 13:44:19 +03:00
dependabot[bot]
7171ce767f build(deps): bump @codemirror/basic-setup in /app/vmui/packages/vmui (#1974)
Bumps [@codemirror/basic-setup](https://github.com/codemirror/basic-setup) from 0.19.0 to 0.19.1.
- [Release notes](https://github.com/codemirror/basic-setup/releases)
- [Changelog](https://github.com/codemirror/basic-setup/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/basic-setup/compare/0.19.0...0.19.1)

---
updated-dependencies:
- dependency-name: "@codemirror/basic-setup"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 13:36:12 +03:00
匠心零度
177e345d8a add Prometheus semantics function :changes_prometheus、delta_prometheus、increase_prometheus (#1972)
Co-authored-by: lirenzuo <lirenzuo@shein.com>
2021-12-20 12:32:43 +02:00
Aliaksandr Valialkin
d87414c57c docs/FAQ.md: describe main reasons for high churn rate 2021-12-20 12:30:09 +02:00
Roman Khavronenko
bc79bdf68a Dashboards vmagent updates (#1973)
* dashboards/vmagent: shuffle panels for better visibility

More important error/dropped panels were moved higher on the main row.
Network usage panel moved to Resource usage row.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: add Troubleshooting row to show top 5 instances/jobs by churn rate

New panels are supposed to show top 5 jobs or targets which generate the most
of the churn rate. They were placed into a new row "Troubleshooting".

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: add panels for showing persistent queue saturation

New panels were added to Torubleshooting row to show the persistent queue
saturation. The corresponding alerts were added and linked to these
panels as well.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* dashboards/vmagent: add alert "RejectedRemoteWriteDataBlocksAreDropped"

New alert suppose to send a notification when vmagent starts to drop
data blocks rejected by configured remote write destiantion.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-20 12:16:53 +02:00
dependabot[bot]
36f4130cf1 build(deps-dev): bump @typescript-eslint/eslint-plugin (#1968)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.6.0 to 5.7.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.7.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 13:15:15 +03:00
dependabot[bot]
2fe74069be build(deps): bump uplot from 1.6.17 to 1.6.18 in /app/vmui/packages/vmui (#1967)
Bumps [uplot](https://github.com/leeoniya/uPlot) from 1.6.17 to 1.6.18.
- [Release notes](https://github.com/leeoniya/uPlot/releases)
- [Commits](https://github.com/leeoniya/uPlot/compare/1.6.17...1.6.18)

---
updated-dependencies:
- dependency-name: uplot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 13:15:00 +03:00
Aliaksandr Valialkin
7749b47d6a vendor: make vendor-update 2021-12-20 12:07:22 +02:00
dependabot[bot]
a6b86941a1 build(deps-dev): bump @typescript-eslint/parser (#1965)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.6.0 to 5.7.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.7.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 12:02:54 +03:00
dependabot[bot]
76bb135181 build(deps): bump @mui/lab in /app/vmui/packages/vmui (#1966)
Bumps [@mui/lab](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.59 to 5.0.0-alpha.60.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 12:02:33 +03:00
dependabot[bot]
be17387682 build(deps): bump typescript in /app/vmui/packages/vmui (#1969)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 4.5.3 to 4.5.4.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Commits](https://github.com/Microsoft/TypeScript/compare/v4.5.3...v4.5.4)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-20 12:02:07 +03:00
John Seekins
b9c41ff051 OpenTSDB Migration Fix (#1946)
* Simplify queries to OpenTSDB (and make them properly appear in OpenTSDB query stats) and also tweak defaults a bit

Signed-off-by: John Seekins <jseekins@datto.com>
2021-12-20 09:35:51 +02:00
Aliaksandr Valialkin
16636a458f docs/CHANGELOG.md: document 6814cc6809
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
2021-12-17 20:16:22 +02:00
Aliaksandr Valialkin
8a7f08ded3 lib/storage: properly update per-part min_dedup_interval file contents after merge
Previously 0s was always written even if -dedup.minScrapeInterval was set to non-zero value

This is a follow-up for 4ff647137a
2021-12-17 20:13:24 +02:00
Roman Khavronenko
6814cc6809 vmalert: always convert step value to seconds for better compatibility (#1955)
When using `vmalert` with older Prometheus versions, the passed
`step=2m` may be parsed by Prometheus with an err: "cannot parse \"2m0s\" to a valid duration".
In order to improve compatibility vmalert will always convert step duration to seconds.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-17 15:26:34 +02:00
Aliaksandr Valialkin
e6d4641bf0 app/vmselect/vmui: make vmui-update 2021-12-17 11:01:09 +02:00
Aliaksandr Valialkin
193331d522 app/vmselect: de-duplicate data exported via /api/v1/export/csv by default
Previously the exported data wasn't de-duplicated.
Now it is possible to export the raw data without deduplication
by passing reduce_mem_usage=1 query arg to /api/v1/export/csv

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1837
2021-12-17 10:57:39 +02:00
Roman Khavronenko
f30ed13155 docs: add Benchmarks section (#1950)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-16 19:15:19 +03:00
Roman Khavronenko
8b0d340c18 docs: update MetricsQL.Subquery section description (#1951)
* simplify sentences;
* fix typo.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-16 19:09:25 +03:00
Yury Molodov
eaf82fe411 fix: return query for app mode (#1954) 2021-12-16 11:44:46 +02:00
Aliaksandr Valialkin
a3adf24527 lib/promscrape: allow up to 5 redirects when scraping a target by default
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1945
2021-12-16 00:14:14 +02:00
Aliaksandr Valialkin
5efe377a26 app/vmselect/promql: add timestamp_with_name(m[d]) function
This function works the same as `timestamp()`, but doesn't remove source time series names.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/949#issuecomment-995222388
2021-12-15 23:37:07 +02:00
Aliaksandr Valialkin
27a1ae57e5 docs: mention -storage.minFreeDiskSpaceBytes command-line flag at capacity planning section
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727 and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-12-15 21:41:30 +02:00
Yury Molodov
9baad51004 vmui: introduce application mode (#1949)
* feat: add a label for the Query field

* fix: change zoom position

* fix: add description and error code to alerts

* fix: correct logic query history

* fix: correct update query history

* feat: add custom step

* update package-lock.json

* feat: introduce application mode

* build vmui

* Revert "build vmui"

This reverts commit c0e2415550.

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-15 21:33:25 +02:00
Aliaksandr Valialkin
65bef771f6 docs/Cluster-VictoriaMetrics.md: mention about the added downsampling support 2021-12-15 16:40:15 +02:00
Aliaksandr Valialkin
8e1a87491a docs: document the added dowsnampling support in VictoriaMetrics enterprise
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/36
2021-12-15 16:25:51 +02:00
Aliaksandr Valialkin
4ff647137a lib/storage: deduplicate samples more thoroughly
Previously some duplicate samples may be left on disk for time series with high churn rate.
This may result in higher disk space usage.
2021-12-15 15:59:58 +02:00
Aliaksandr Valialkin
92070cbb67 lib/storage: return dedup interval in milliseconds from GetDedupInterval()
This removes duplicate .Milliseconds() calls after GetDedupInterval() calls.
2021-12-15 13:26:38 +02:00
Aliaksandr Valialkin
acd56603b0 app/victoria-metrics: mention https://docs.victoriametrics.com/#downsampling in the description for -dedup.minScrapeInterval command-line flag 2021-12-15 13:18:04 +02:00
Aliaksandr Valialkin
1d20a19c7d lib/storage: explicitly pass dedupInterval to DeduplicateSamples() and deduplicateSamplesDuringMerge()
This improves the code readability and debuggability, since the output of these functions
stops depending on global state.
2021-12-14 20:49:12 +02:00
Aliaksandr Valialkin
e1a715b0f5 lib/storage: convert alternate regexps into Graphite wildcards inside __graphite__ pseudo-label
For example, `{__graphite__=~"foo.(bar|baz)"}` is automatically converted to `{__graphite__=~"foo.{bar,baz}"}` before execution.
This allows using multi-value Grafana template variables such as `{__graphite__=~"foo.($app)"}`.
2021-12-14 19:51:49 +02:00
Aliaksandr Valialkin
496b6e4d3d deployment/docker/docker-compose.yml: update Grafana version from 8.2.2 to 8.3.2
See https://grafana.com/blog/2021/12/10/grafana-8.3.2-and-7.5.12-released-with-moderate-severity-security-fix/
2021-12-14 15:09:49 +02:00
Aliaksandr Valialkin
d456af7499 docs/CHANGELOG.md: link to the issue about unaligned 64-bit atomic opertion panic on 32-bit architectures 2021-12-14 15:00:10 +02:00
Aliaksandr Valialkin
80996d916b docs/CHANGELOG.md: an attempt to properly show $labels.alertname at https://docs.victoriametrics.com/CHANGELOG.html 2021-12-14 14:56:57 +02:00
Yury Molodov
49e6a921df vmui: custom step (#1942)
* feat: add a label for the Query field

* fix: change zoom position

* fix: add description and error code to alerts

* fix: correct logic query history

* fix: correct update query history

* feat: add custom step

* update package-lock.json

* docs: document that VMUI now supports overriding of `step` query arg, which is passed to `/api/v1/query_range`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-14 14:51:45 +02:00
Yury Molodov
b5b701d590 vmui: minor fixes (#1936)
* feat: add a label for the Query field

* fix: change zoom position

* fix: add description and error code to alerts

* fix: correct logic query history

* fix: correct update query history

* app/vmselect/vmui: `make vmui-update`

* docs/CHANGELOG.md: document bugfixes

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-13 13:42:37 +02:00
Aliaksandr Valialkin
ce80a0ce5e app/vmselect/vmui: make vmui-update 2021-12-13 13:32:54 +02:00
Aliaksandr Valialkin
0a157f65bd docs/Single-server-VictoriaMetrics.md: added a link to Graphite paths and wildcards - https://graphite.readthedocs.io/en/latest/render_api.html#paths-and-wildcards 2021-12-13 12:01:17 +02:00
dependabot[bot]
f6f1e1821e build(deps): bump @mui/lab in /app/vmui/packages/vmui (#1935)
Bumps [@mui/lab](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.58 to 5.0.0-alpha.59.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 12:00:19 +03:00
dependabot[bot]
c522630f72 build(deps): bump @codemirror/commands in /app/vmui/packages/vmui (#1933)
Bumps [@codemirror/commands](https://github.com/codemirror/commands) from 0.19.5 to 0.19.6.
- [Release notes](https://github.com/codemirror/commands/releases)
- [Changelog](https://github.com/codemirror/commands/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/commands/compare/0.19.5...0.19.6)

---
updated-dependencies:
- dependency-name: "@codemirror/commands"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:56:47 +03:00
dependabot[bot]
e98a863f91 build(deps-dev): bump @typescript-eslint/parser (#1934)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.5.0 to 5.6.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.6.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:52:43 +03:00
dependabot[bot]
1d2708b147 build(deps): bump @codemirror/view in /app/vmui/packages/vmui (#1931)
Bumps [@codemirror/view](https://github.com/codemirror/view) from 0.19.26 to 0.19.29.
- [Release notes](https://github.com/codemirror/view/releases)
- [Changelog](https://github.com/codemirror/view/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/view/compare/0.19.26...0.19.29)

---
updated-dependencies:
- dependency-name: "@codemirror/view"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:52:26 +03:00
dependabot[bot]
51e7ba65ff build(deps): bump @emotion/react in /app/vmui/packages/vmui (#1930)
Bumps [@emotion/react](https://github.com/emotion-js/emotion) from 11.7.0 to 11.7.1.
- [Release notes](https://github.com/emotion-js/emotion/releases)
- [Changelog](https://github.com/emotion-js/emotion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/emotion-js/emotion/compare/@emotion/react@11.7.0...@emotion/react@11.7.1)

---
updated-dependencies:
- dependency-name: "@emotion/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:52:13 +03:00
dependabot[bot]
f583b7bdcf build(deps): bump @mui/icons-material in /app/vmui/packages/vmui (#1929)
Bumps [@mui/icons-material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-icons-material) from 5.2.0 to 5.2.1.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.1/packages/mui-icons-material)

---
updated-dependencies:
- dependency-name: "@mui/icons-material"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:51:56 +03:00
dependabot[bot]
9929f08968 build(deps): bump @types/node in /app/vmui/packages/vmui (#1932)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 16.11.11 to 16.11.12.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:50:01 +03:00
dependabot[bot]
60f734ecce build(deps): bump @mui/material in /app/vmui/packages/vmui (#1925)
Bumps [@mui/material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-material) from 5.2.2 to 5.2.3.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.3/packages/mui-material)

---
updated-dependencies:
- dependency-name: "@mui/material"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:49:28 +03:00
dependabot[bot]
325758317f build(deps): bump @testing-library/jest-dom in /app/vmui/packages/vmui (#1924)
Bumps [@testing-library/jest-dom](https://github.com/testing-library/jest-dom) from 5.16.0 to 5.16.1.
- [Release notes](https://github.com/testing-library/jest-dom/releases)
- [Changelog](https://github.com/testing-library/jest-dom/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/jest-dom/compare/v5.16.0...v5.16.1)

---
updated-dependencies:
- dependency-name: "@testing-library/jest-dom"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:49:08 +03:00
dependabot[bot]
b5cec7fdb9 build(deps): bump typescript in /app/vmui/packages/vmui (#1926)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 4.5.2 to 4.5.3.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Commits](https://github.com/Microsoft/TypeScript/compare/v4.5.2...v4.5.3)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:46:05 +03:00
dependabot[bot]
e37152d74e build(deps): bump @mui/styles in /app/vmui/packages/vmui (#1927)
Bumps [@mui/styles](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-styles) from 5.2.2 to 5.2.3.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.3/packages/mui-styles)

---
updated-dependencies:
- dependency-name: "@mui/styles"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:45:54 +03:00
dependabot[bot]
b02d655dcf build(deps-dev): bump @typescript-eslint/eslint-plugin (#1928)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.5.0 to 5.6.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.6.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-13 11:45:42 +03:00
Aliaksandr Valialkin
c82cc9cd11 docs/CHANGELOG.md: document 7c3b6365f0 2021-12-12 19:09:20 +02:00
Yury Molodov
7c3b6365f0 vmui: add a label for the Query field (#1923)
* feat: add a label for the Query field

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-12 19:06:39 +02:00
Aliaksandr Valialkin
a8ad870bd0 deployment/docker: update Go builder from v1.17.4 to v1.17.5
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.5+label%3ACherryPickApproved
2021-12-12 18:17:49 +02:00
Aliaksandr Valialkin
7d58f57a52 vendor: make vendor-update 2021-12-12 18:10:09 +02:00
Aliaksandr Valialkin
d1f8915ed1 app/vmselect/promql: preserve the order of time series passed to limit_offset() function
Previously time series passed to `limit_offset()` were shuffled according to hash for their labels.
This was unexpected behaviour for most users.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1920 and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/951
2021-12-12 18:04:58 +02:00
Aliaksandr Valialkin
3d4349343d docs/CHANGELOG.md: document 2851709745
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1921
2021-12-10 12:13:34 +02:00
Roman Khavronenko
2851709745 vmalert: update the order of service labels attaching (#1922)
Service labels like `alertname` or `alertgroup` were attached
after template expanding for `labels` section. Because of this,
labels `alertname` or `alertgroup` weren't available for templating
in `labels` section of alert's definition.
This commit changes the order of labels attaching and adds a test
for verifying these labels availability.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1921
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-10 12:10:26 +02:00
Aliaksandr Valialkin
b85e88e2db app/vmui/README.md: remove features chapter, since it lists unimportant and/or misleading features
The main user-visible features for vmui are documented at https://docs.victoriametrics.com/#vmui .
2021-12-09 19:50:25 +02:00
Aliaksandr Valialkin
b42981c465 docs/Single-server-VictoriaMetrics.md: add a link to https://github.com/denisgolius/victoriametrics-ru-links 2021-12-09 19:42:27 +02:00
Aliaksandr Valialkin
a2e0275f14 deployment/docker: update Go builder from v1.17.3 to v1.17.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.4+label%3ACherryPickApproved
2021-12-09 18:51:58 +02:00
Aliaksandr Valialkin
52eb9c99e2 docs: document the ability to investigate correlation between two queries at /vmui
This is a follow-up for c1fd93e8a0
2021-12-08 17:24:01 +02:00
Yury Molodov
c1fd93e8a0 vmui: multiple queries (#1916)
* feat: change duration by "enter"

* fix: optimize data processing for chart

* feat: set minimum step to 1ms

* update dependencies

* feat: remove save the last query to local storage

* fix: handle an error in a table with subqueries

* feat: store display type in URL

* Revert "feat: store display type in URL"

This reverts commit ccc242c69a.

* feat: store display type in URL

* refactor: move the time setting to a folder

* refactor: move the query configurator to a folder

* refactor: move the auth settings to a folder

* feat: improve styles

* feat: add multi query

* update package-lock

* feat: add display multiple queries

* feat: add limits for multiple queries

* update dependencies

* feat: add history for multiple queries

* feat: add line type to legend

* feat: change style for switch

* feat: change the logic for axes limits for multiple queries

* update package-lock.json

* update dependencies

* feat: add the filter to legend

* wip

* lib/httpserver: add missing 127.0.0.1 hostname to the logged address for http and pprof server if the address starts with ':'

This allows copy-pasting the url to http server from logs.

* lib/httpserver: add missing 127.0.0.1 hostname to the logged address for http and pprof server if the address starts with ':'

This allows copy-pasting the url to http server from logs.

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-08 16:40:15 +02:00
Aliaksandr Valialkin
0288078cfb docs/Single-server-VictoriaMetrics.md: add LinkedIn public channel 2021-12-08 13:15:24 +02:00
Aliaksandr Valialkin
64da5c2bf6 docs/Release-Guide.md: refresh the list of channels for publishing release notes 2021-12-08 13:15:05 +02:00
Nikolay
a581a93c9b reworks snap build with docker (#1910) 2021-12-08 13:05:35 +02:00
Aliaksandr Valialkin
896fa9bb7c app/vmalert/config: sort extra_filter labels before passing them to query args in order to get consistent order of query args across runs
This fixes TestGroupParams test - see https://github.com/VictoriaMetrics/VictoriaMetrics/runs/4432510244?check_suite_focus=true#step:5:288
2021-12-08 13:02:49 +02:00
Aliaksandr Valialkin
2711d2ea55 docs/Single-server-VictoriaMetrics.md: move features chapter above the case studies chapter 2021-12-08 12:49:09 +02:00
Aliaksandr Valialkin
ff15a752c1 app/vmselect: accept optional extra_filters[] query args for all the supported Prometheus querying APIs
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1863
2021-12-06 17:07:09 +02:00
Aliaksandr Valialkin
45d082bbe2 app/vminsert: add -maxLabelValueLen command-line flag
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1908
2021-12-06 11:40:34 +02:00
Aliaksandr Valialkin
732a0cd3e1 app/vmselect/vmui: make vmui-update 2021-12-06 10:19:09 +02:00
Aliaksandr Valialkin
e2d9bf3b57 vendor: make vendor-update 2021-12-06 10:19:09 +02:00
dependabot[bot]
e06d01f0eb build(deps-dev): bump @typescript-eslint/eslint-plugin (#1902)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.5.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-06 11:01:58 +03:00
dependabot[bot]
0c614f3e9d build(deps): bump @codemirror/view in /app/vmui/packages/vmui (#1901)
Bumps [@codemirror/view](https://github.com/codemirror/view) from 0.19.21 to 0.19.26.
- [Release notes](https://github.com/codemirror/view/releases)
- [Changelog](https://github.com/codemirror/view/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/view/compare/0.19.21...0.19.26)

---
updated-dependencies:
- dependency-name: "@codemirror/view"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-06 10:51:53 +03:00
dependabot[bot]
2f74d17297 build(deps): bump @types/node in /app/vmui/packages/vmui (#1903)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 16.11.10 to 16.11.11.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-06 10:51:26 +03:00
dependabot[bot]
4931da89f0 build(deps): bump @testing-library/jest-dom in /app/vmui/packages/vmui (#1904)
Bumps [@testing-library/jest-dom](https://github.com/testing-library/jest-dom) from 5.15.1 to 5.16.0.
- [Release notes](https://github.com/testing-library/jest-dom/releases)
- [Changelog](https://github.com/testing-library/jest-dom/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/jest-dom/compare/v5.15.1...v5.16.0)

---
updated-dependencies:
- dependency-name: "@testing-library/jest-dom"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-06 10:51:11 +03:00
dependabot[bot]
27faaec2b9 build(deps-dev): bump @typescript-eslint/parser (#1905)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.4.0 to 5.5.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.5.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-12-06 10:51:00 +03:00
Aliaksandr Valialkin
da402fbdfa lib/workingsetcache: fix unaligned 64-bit atomic operation panic on 32-bit architectures
The panic has been introduced in 7275ebf91a
2021-12-03 01:21:51 +02:00
Aliaksandr Valialkin
4888e2c232 docs/Cluster-VictoriaMetrics.md: sync with cluster branch 2021-12-03 00:13:02 +02:00
Aliaksandr Valialkin
06642d97f5 app: allow specifying http and https urls in the following command-line flags
* -promscrape.config
* -relabelConfig
* -remoteWrite.relabelConfig
* -remoteWrite.urlRelabelConfig
2021-12-03 00:10:02 +02:00
Aliaksandr Valialkin
62b4efb3e7 app/vmauth: follow-up for 13368bed18
* Document the ability to specify http or https urls in `-auth.config` at docs/CHANGELOG.md
* Move the ReadFileOrHTTP to lib/fs, so it can be re-used in other places where a file
  should be read from the given path. For example, in `-promscrape.config` at `vmagent`.
2021-12-02 23:32:05 +02:00
Tiago Magalhães
13368bed18 vmauth: support for reading remote auth config file (#1898)
* add support for reading remote auth_config file via http

* fix lint

* fix defer on close body

Co-authored-by: Tiago Magalhães <tmagalhaes@wavecom.pt>
2021-12-02 23:19:05 +02:00
Aliaksandr Valialkin
7f2f26b25f docs/CHANGELOG.md: cut v1.70.0 2021-12-02 15:01:25 +02:00
Aliaksandr Valialkin
ed9ef7733b docs/CHANGELOG.md: document 0afd14a14a 2021-12-02 14:47:20 +02:00
Roman Khavronenko
0afd14a14a vmalert: introduce additional HTTP URL params per-group configuration (#1892)
* vmalert: introduce additional HTTP URL params per-group configuration

The new group field `params` allows to configure custom HTTP URL params
per each group. These params will be applied to every request before
executing rule's expression. Hot config reload is also supported.

Field `extra_filter_labels` was deprecated in favour of `params` field.
vmalert will print deprecation log message if config file contains
the deprecated field.

`params` fields are supported by both Prometheus and Graphite datasource types.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: provide more examples for `params` field

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmalert: set higher priority for `params` setting

If there would be a conflict between URL params set in `datasource.url` flag
and params in group definition the latter will have higher priority.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-02 14:45:08 +02:00
Thomas Danielsson
77e19b3f87 Fix vmsingle dashboard link (#1894) 2021-12-02 14:43:30 +02:00
Roman Khavronenko
e5b451a66a ci: bump go version to 1.17 (#1895)
The bump was required for `vmalert` package.
`vmalert` docs now also contain an updated description.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-02 14:42:25 +02:00
Aliaksandr Valialkin
394a345ae0 lib/httpserver: expose /-/healthy and /-/ready endpoints as Prometheus does
This improves integration with third-party solutions, which rely on these endpoints.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1833
2021-12-02 14:36:58 +02:00
Aliaksandr Valialkin
90c542af12 app: use relative paths instead of absolute paths for the supported http handlers on the main page
This allows hiding VictoriaMetrics components behind proxies, which serve pages at different path prefixes

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1858
2021-12-02 13:52:39 +02:00
Aliaksandr Valialkin
03f5ad3060 lib/protoparser/graphite: allow multiple separators between metric name, value and timestamp 2021-12-02 13:43:49 +02:00
Aliaksandr Valialkin
49a18b8660 lib/protoparser/graphite: properly parse Graphite line with whitespace after the timestamp
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1865
2021-12-02 13:33:26 +02:00
Aliaksandr Valialkin
91d8873d86 docs/FAQ.md: mention that VictoriaMetrics can be queried via vmui 2021-12-02 13:16:30 +02:00
Aliaksandr Valialkin
9c66848c32 vendor: make vendor-update 2021-12-02 12:42:35 +02:00
Aliaksandr Valialkin
c0cbf0de2a app/{vmbackup,vmrestore}: export internal metrics at /metrics http handler 2021-12-02 11:55:58 +02:00
Aliaksandr Valialkin
7275ebf91a app/vmstorage: export vm_cache_size_max_bytes metrics for determining capacity of various caches
The vm_cache_size_max_bytes metric can be used for determining caches which reach their capacity via the following query:

   vm_cache_size_bytes / vm_cache_size_max_bytes > 0.9
2021-12-02 10:30:43 +02:00
Aliaksandr Valialkin
2f63dec2e3 lib/fs: add vm_filestream_read_duration_seconds_total and vm_filestream_write_duration_seconds_total metrics
These metrics help determining persistent disk saturation with `rate(vm_filestream_read_duration_seconds_total) > 0.9`
2021-12-02 10:30:42 +02:00
Roman Khavronenko
d052c8c81e vmalert: adjust topologies docs in README (#1893)
Commit changes images width and order in topologies section
for better readability.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-12-02 11:27:46 +03:00
Denys Holius
23a03d23aa added guide-vmcluster-multiple-retention-setup to guides list (#1891) 2021-12-02 10:10:25 +02:00
Roman Khavronenko
866b6a842b Vmalert docs upd (#1890)
* vmalert: add topology examples in docs

* vmalert: docs typo fix
2021-12-01 18:33:06 +03:00
Aliaksandr Valialkin
2c41f25fb8 README.md: add a link to https://docs.victoriametrics.com/guides/guide-vmcluster-multiple-retention-setup.html for multi-retention setups 2021-12-01 12:48:30 +02:00
Yurii Kravets
0afec0259b Create guide-vmcluster-multiple-retention-setup.md (#1888)
* Create guide-vmcluster-multiple-retention-setup.md

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-01 12:42:16 +02:00
Aliaksandr Valialkin
ed54b27b85 docs/CHANGELOG.md: document 06eff5a72c
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/487
2021-12-01 12:28:06 +02:00
Aliaksandr Valialkin
2fb5a6ca78 lib/storage: do not take into account -storage.minFreeDiskSpaceBytes during background merges 2021-12-01 11:02:36 +02:00
Nikolay
06eff5a72c removes FileSize from backup part key (#1872)
* removes FileSize from backup part key
it should fix download restoration for backups

* Update lib/backup/common/part.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-12-01 11:01:28 +02:00
Yurii Kravets
5881c4ae48 Add retention-scheme.png 2021-11-30 15:57:44 +02:00
Aliaksandr Valialkin
e625436bca docs/Articles.md: added a link to https://mist.io/blog/2021-11-26-kubernetes-and-victoriametrics-in-Mist-v4-6 2021-11-30 14:59:38 +02:00
Aliaksandr Valialkin
f06495c50a docs/Single-server-VictoriaMetrics.md: document that the deduplication is applied only when exporting data in JSON line format
The exported data isn't de-duplicated by default due to performance reasons.
It is expected that the de-duplication is applied during importing the exported data.

The deduplication is applied only when exporting data via /api/v1/export if `reduce_mem_usage=1` query arg isn't passed to the request.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1837
2021-11-30 13:06:06 +02:00
Aliaksandr Valialkin
d666755159 lib/storage: take into account -storage.minFreeDiskSpaceBytes when performing big merges
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-11-30 12:56:35 +02:00
Aliaksandr Valialkin
f67427ae61 app/vmselect/vmui: make vmui-update 2021-11-30 01:38:24 +02:00
Roman Khavronenko
40fcf667b0 vmalert: continue to print errors for bad config during hot reload (#1871)
Previously, vmalert would print an err message and set vmalert_config_last_reload_successful=0
only once during a hot reload of a bad config. Such behaviour may result into non noticed
event of a bad config reload attempt

Now, it continues to print error messages and keep vmalert_config_last_reload_successful state
until successful attempt will be made or config state will be rolled back to prev state.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-11-30 01:23:49 +02:00
Aliaksandr Valialkin
10bd8b1d86 docs/CHANGELOG.md: document 852a895b70 2021-11-30 01:22:37 +02:00
Roman Khavronenko
852a895b70 vmalert: make notifier.Addr optional (#1870)
For a long time notifier.Addr flag was required. The assumption was that vmalert will
be always used for alerting. However, practice shows that some users need only
recording rules. In this case, requirement of notifier.Addr is ambigious.

The change verifies if loaded config contains recording or alerting rules and
if there are corresponding flags set. This is true for initial config load
and hot reload.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-11-30 01:18:48 +02:00
Aliaksandr Valialkin
ca6fc0265e docs/CHANGELOG.md: document f05cddd2fc
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1830
2021-11-30 01:15:41 +02:00
guidao
f05cddd2fc fix #1830 (#1861)
Co-authored-by: wangfeng <wangfeng@zhihu.com>
2021-11-30 01:12:24 +02:00
Aliaksandr Valialkin
841647643a docs/CHANGELOG.md: document 9dd650f67f 2021-11-30 01:09:10 +02:00
Yury Molodov
9dd650f67f feat: store display type in URL (#1855) 2021-11-30 01:06:26 +02:00
Aliaksandr Valialkin
7e79fc6e3c docs/CHANGELOG.md: document 624ad73705 2021-11-30 01:05:05 +02:00
Yury Molodov
624ad73705 vmui: handle an error in a table with subqueries (#1854)
* fix: handle an error in a table with subqueries

* feat: store display type in URL

* Revert "feat: store display type in URL"

This reverts commit ccc242c69a.
2021-11-30 01:02:11 +02:00
Aliaksandr Valialkin
98d244b288 docs/CHANGELOG.md: document c6d5927281 2021-11-30 01:00:36 +02:00
Yury Molodov
c6d5927281 feat: remove save the last query to local storage (#1853) 2021-11-30 00:58:27 +02:00
Yury Molodov
1b58d126c0 vmui: optimize render (#1852)
* feat: change duration by "enter"

* fix: optimize data processing for chart

* feat: set minimum step to 1ms

* update dependencies

* update package-lock

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-11-30 00:56:48 +02:00
Aliaksandr Valialkin
ba927d1c77 lib/protoparser/prometheus: follow-up for 8e338632a3
Do not spend CPU time on error message formatting if error logger is disabled
2021-11-30 00:50:11 +02:00
Nikolay
8e338632a3 Changes unmarshallRow logger to noop for getRowsDiff (#1835) 2021-11-30 00:48:13 +02:00
Aliaksandr Valialkin
91243ad5cd app/vmselect/vmui: make vmui-update 2021-11-29 21:55:08 +02:00
Aliaksandr Valialkin
d44c585ca4 lib/protoparser: do not log connection reset by peer error when reading the data via InfluxDB, Graphite and OpenTSDB protocols over plain TCP connections
This error is expected, so there is no need in spamming the log with this error.
2021-11-29 21:47:56 +02:00
dependabot[bot]
ee79ab46bb build(deps): bump @types/jest in /app/vmui/packages/vmui (#1886)
Bumps [@types/jest](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/jest) from 27.0.2 to 27.0.3.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/jest)

---
updated-dependencies:
- dependency-name: "@types/jest"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 22:03:28 +03:00
dependabot[bot]
11308767a2 build(deps): bump @mui/material in /app/vmui/packages/vmui (#1885)
Bumps [@mui/material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-material) from 5.2.1 to 5.2.2.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.2/packages/mui-material)

---
updated-dependencies:
- dependency-name: "@mui/material"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 22:03:09 +03:00
dependabot[bot]
079ede79a3 build(deps): bump typescript in /app/vmui/packages/vmui (#1884)
Bumps [typescript](https://github.com/Microsoft/TypeScript) from 4.4.4 to 4.5.2.
- [Release notes](https://github.com/Microsoft/TypeScript/releases)
- [Commits](https://github.com/Microsoft/TypeScript/compare/v4.4.4...v4.5.2)

---
updated-dependencies:
- dependency-name: typescript
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 22:00:32 +03:00
dependabot[bot]
f977aaee41 build(deps): bump @testing-library/jest-dom in /app/vmui/packages/vmui (#1883)
Bumps [@testing-library/jest-dom](https://github.com/testing-library/jest-dom) from 5.15.0 to 5.15.1.
- [Release notes](https://github.com/testing-library/jest-dom/releases)
- [Changelog](https://github.com/testing-library/jest-dom/blob/main/CHANGELOG.md)
- [Commits](https://github.com/testing-library/jest-dom/compare/v5.15.0...v5.15.1)

---
updated-dependencies:
- dependency-name: "@testing-library/jest-dom"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 22:00:19 +03:00
dependabot[bot]
f56456a45c build(deps-dev): bump eslint-plugin-react in /app/vmui/packages/vmui (#1882)
Bumps [eslint-plugin-react](https://github.com/yannickcr/eslint-plugin-react) from 7.26.1 to 7.27.1.
- [Release notes](https://github.com/yannickcr/eslint-plugin-react/releases)
- [Changelog](https://github.com/yannickcr/eslint-plugin-react/blob/master/CHANGELOG.md)
- [Commits](https://github.com/yannickcr/eslint-plugin-react/compare/v7.26.1...v7.27.1)

---
updated-dependencies:
- dependency-name: eslint-plugin-react
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:58:30 +03:00
dependabot[bot]
4da6e28802 build(deps): bump @codemirror/state in /app/vmui/packages/vmui (#1881)
Bumps [@codemirror/state](https://github.com/codemirror/state) from 0.19.4 to 0.19.6.
- [Release notes](https://github.com/codemirror/state/releases)
- [Changelog](https://github.com/codemirror/state/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/state/compare/0.19.4...0.19.6)

---
updated-dependencies:
- dependency-name: "@codemirror/state"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:57:56 +03:00
dependabot[bot]
f85480bb3c build(deps): bump @mui/icons-material in /app/vmui/packages/vmui (#1880)
Bumps [@mui/icons-material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-icons-material) from 5.0.5 to 5.2.0.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.0/packages/mui-icons-material)

---
updated-dependencies:
- dependency-name: "@mui/icons-material"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:57:41 +03:00
dependabot[bot]
bfea7271d5 build(deps): bump @types/react in /app/vmui/packages/vmui (#1879)
Bumps [@types/react](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/react) from 17.0.34 to 17.0.37.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/react)

---
updated-dependencies:
- dependency-name: "@types/react"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:56:35 +03:00
dependabot[bot]
50dac0bd8f build(deps): bump @codemirror/view in /app/vmui/packages/vmui (#1878)
Bumps [@codemirror/view](https://github.com/codemirror/view) from 0.19.20 to 0.19.21.
- [Release notes](https://github.com/codemirror/view/releases)
- [Changelog](https://github.com/codemirror/view/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/view/compare/0.19.20...0.19.21)

---
updated-dependencies:
- dependency-name: "@codemirror/view"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:56:18 +03:00
dependabot[bot]
cb508e9678 build(deps): bump @emotion/react in /app/vmui/packages/vmui (#1877)
Bumps [@emotion/react](https://github.com/emotion-js/emotion) from 11.6.0 to 11.7.0.
- [Release notes](https://github.com/emotion-js/emotion/releases)
- [Changelog](https://github.com/emotion-js/emotion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/emotion-js/emotion/compare/@emotion/react@11.6.0...@emotion/react@11.7.0)

---
updated-dependencies:
- dependency-name: "@emotion/react"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:55:48 +03:00
dependabot[bot]
b78fe28f0b build(deps): bump @mui/lab in /app/vmui/packages/vmui (#1876)
Bumps [@mui/lab](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.55 to 5.0.0-alpha.58.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:55:09 +03:00
dependabot[bot]
26777abd02 build(deps): bump @emotion/styled in /app/vmui/packages/vmui (#1814)
Bumps [@emotion/styled](https://github.com/emotion-js/emotion) from 11.3.0 to 11.6.0.
- [Release notes](https://github.com/emotion-js/emotion/releases)
- [Changelog](https://github.com/emotion-js/emotion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/emotion-js/emotion/compare/@emotion/styled@11.3.0...@emotion/styled@11.6.0)

---
updated-dependencies:
- dependency-name: "@emotion/styled"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:53:31 +03:00
dependabot[bot]
fc67ca5cfa build(deps): bump uplot from 1.6.16 to 1.6.17 in /app/vmui/packages/vmui (#1848)
Bumps [uplot](https://github.com/leeoniya/uPlot) from 1.6.16 to 1.6.17.
- [Release notes](https://github.com/leeoniya/uPlot/releases)
- [Commits](https://github.com/leeoniya/uPlot/compare/1.6.16...1.6.17)

---
updated-dependencies:
- dependency-name: uplot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:53:16 +03:00
dependabot[bot]
6e9f753057 build(deps): bump @mui/styles in /app/vmui/packages/vmui (#1875)
Bumps [@mui/styles](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-styles) from 5.0.2 to 5.2.2.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.2/packages/mui-styles)

---
updated-dependencies:
- dependency-name: "@mui/styles"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:52:36 +03:00
dependabot[bot]
ca3106f3bd build(deps): bump @codemirror/autocomplete in /app/vmui/packages/vmui (#1874)
Bumps [@codemirror/autocomplete](https://github.com/codemirror/autocomplete) from 0.19.4 to 0.19.9.
- [Release notes](https://github.com/codemirror/autocomplete/releases)
- [Changelog](https://github.com/codemirror/autocomplete/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/autocomplete/compare/0.19.4...0.19.9)

---
updated-dependencies:
- dependency-name: "@codemirror/autocomplete"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:51:56 +03:00
dependabot[bot]
b36fe59dd6 build(deps): bump @types/node in /app/vmui/packages/vmui (#1868)
Bumps [@types/node](https://github.com/DefinitelyTyped/DefinitelyTyped/tree/HEAD/types/node) from 16.11.6 to 16.11.10.
- [Release notes](https://github.com/DefinitelyTyped/DefinitelyTyped/releases)
- [Commits](https://github.com/DefinitelyTyped/DefinitelyTyped/commits/HEAD/types/node)

---
updated-dependencies:
- dependency-name: "@types/node"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:51:21 +03:00
dependabot[bot]
5a86354aaa build(deps): bump @mui/material in /app/vmui/packages/vmui (#1866)
Bumps [@mui/material](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-material) from 5.0.6 to 5.2.1.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/v5.2.1/packages/mui-material)

---
updated-dependencies:
- dependency-name: "@mui/material"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-29 21:50:52 +03:00
Aliaksandr Valialkin
e6a0c87c7e vendor: make vendor-update 2021-11-29 12:35:40 +02:00
Aliaksandr Valialkin
ce31e837eb app/vmselect/vmui: make vmui-update 2021-11-29 12:22:59 +02:00
Aliaksandr Valialkin
03509025bc docs/CHANGELOG.md: document 695cb617b2 2021-11-29 12:13:11 +02:00
Denis Golius
37faf1f426 Bumped Alpine linux version to 3.15.0 2021-11-28 20:53:48 +02:00
dependabot[bot]
083044c3e2 build(deps-dev): bump @typescript-eslint/parser (#1843)
Bumps [@typescript-eslint/parser](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/parser) from 5.3.0 to 5.4.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/main/packages/parser/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.4.0/packages/parser)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/parser"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-22 16:43:37 +03:00
dependabot[bot]
96e6f9ecb6 build(deps): bump @emotion/react in /app/vmui/packages/vmui (#1817)
Bumps [@emotion/react](https://github.com/emotion-js/emotion) from 11.5.0 to 11.6.0.
- [Release notes](https://github.com/emotion-js/emotion/releases)
- [Changelog](https://github.com/emotion-js/emotion/blob/main/CHANGELOG.md)
- [Commits](https://github.com/emotion-js/emotion/compare/@emotion/react@11.5.0...@emotion/react@11.6.0)

---
updated-dependencies:
- dependency-name: "@emotion/react"
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-22 16:42:21 +03:00
dependabot[bot]
ad7e225193 build(deps): bump @codemirror/view in /app/vmui/packages/vmui (#1841)
Bumps [@codemirror/view](https://github.com/codemirror/view) from 0.19.14 to 0.19.20.
- [Release notes](https://github.com/codemirror/view/releases)
- [Changelog](https://github.com/codemirror/view/blob/main/CHANGELOG.md)
- [Commits](https://github.com/codemirror/view/compare/0.19.14...0.19.20)

---
updated-dependencies:
- dependency-name: "@codemirror/view"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-22 16:41:50 +03:00
dependabot[bot]
ee2405b042 build(deps): bump @mui/lab in /app/vmui/packages/vmui (#1842)
Bumps [@mui/lab](https://github.com/mui-org/material-ui/tree/HEAD/packages/mui-lab) from 5.0.0-alpha.53 to 5.0.0-alpha.55.
- [Release notes](https://github.com/mui-org/material-ui/releases)
- [Changelog](https://github.com/mui-org/material-ui/blob/master/CHANGELOG.md)
- [Commits](https://github.com/mui-org/material-ui/commits/HEAD/packages/mui-lab)

---
updated-dependencies:
- dependency-name: "@mui/lab"
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-22 16:41:16 +03:00
Michael Fuller
cf8c171f85 vmselect: in promql evaluation, return bytes requested when rollup memory limiter is unable to satisfy the request (#1838)
Co-authored-by: Michael Fuller <mfuller@digitalocean.com>
2021-11-22 13:20:42 +03:00
John Seekins
695cb617b2 Simplify queries to OpenTSDB for migration (#1809)
* Simplify queries to OpenTSDB (and make them properly appear in OpenTSDB query stats) and also tweak defaults a bit

Signed-off-by: John Seekins <jseekins@datto.com>

* remove extraneous printlns

Signed-off-by: John Seekins <jseekins@datto.com>

* remove empty line

Signed-off-by: John Seekins <jseekins@datto.com>

* fix bug in offset calcuation and closer to working with simpler queries

Signed-off-by: John Seekins <jseekins@datto.com>

* fix boolean eval

Signed-off-by: John Seekins <jseekins@datto.com>

* fix casting and check for multiple series

Signed-off-by: John Seekins <jseekins@datto.com>
2021-11-18 20:18:15 +03:00
Aliaksandr Valialkin
9bee043ff2 app/vmselect/promql: consistently return zero from deriv(const) 2021-11-17 18:02:05 +02:00
Aliaksandr Valialkin
b688960db0 lib/persistentqueue: add vm_persistentqueue_read_duration_seconds_total and vm_persistentqueue_write_duration_seconds_total metrics for determining disk usage saturation at vmagent 2021-11-17 16:41:35 +02:00
Aliaksandr Valialkin
b900560b83 app/vmselect/promql: add now() function, which returns the current timestamp as a floating-point value in seconds 2021-11-17 16:35:30 +02:00
Aliaksandr Valialkin
b3c6334fbb go.mod: add missing update after 4b660a7fc9 2021-11-17 13:38:23 +02:00
Aliaksandr Valialkin
4b660a7fc9 vendor: make vendor-update 2021-11-17 13:37:42 +02:00
Aliaksandr Valialkin
284fec8fcd app/vmauth: accept requests with Basic Auth username which is equal to bearer_token value from the -auth.config 2021-11-17 13:31:19 +02:00
Aliaksandr Valialkin
52e19a0577 docs: document -s3ForcePathStyle command-line option
This is a follow-up for b72eed1f5e
2021-11-17 01:09:32 +02:00
Lan
b72eed1f5e Add flag of S3ForcePathStyle (#1802) 2021-11-17 01:03:03 +02:00
vic
1fb3dbcbda Update Cluster-VictoriaMetrics.md (#1806)
replicationFactor flag should be passed to vmselect instead of vminsert for improving query speed:)
2021-11-17 00:59:43 +02:00
Aliaksandr Valialkin
fc534a1e7f app/vmalert/README.md: sync with docs/vmalert.md
This is a follow-up after d8c70903ec
2021-11-17 00:56:04 +02:00
Florian Klink
d8c70903ec docs/vmalert.md: document vmalert url flags a bit more cleanly (#1823)
Describe remoteWrite.url is used to persist rules and alerts state info,
and add an additional paragraph explaining the separation between
-remoteRead.url and -datasource.url.

Fixes #1810.
2021-11-17 00:54:06 +02:00
Aliaksandr Valialkin
f3ac945d74 app/vmauth: add ability to override the username label value for vmauth_user_requests_total metric by specifying name option in -auth.config
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1805
2021-11-17 00:47:34 +02:00
Aliaksandr Valialkin
7fda5d52ae docs/Single-server-VictoriaMetrics.md: add a link to vmalert rules backfilling at Backfilling chapter 2021-11-17 00:23:23 +02:00
Aliaksandr Valialkin
5a180c6659 docs/CHANGELOG.md: document the addition of vm_tenant_used_tenant_bytes metric, which shows the per-tenant disk usage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1605
2021-11-16 23:36:48 +02:00
Aliaksandr Valialkin
09b0641ccb vendor: make vendor-update 2021-11-14 14:06:53 +02:00
Aliaksandr Valialkin
f43586c63c app/vmselect/promql: arrange function names in the code in alphabetical order
This should simplify code maintenance in the future
2021-11-14 13:55:06 +02:00
Aliaksandr Valialkin
b585a550ba app/vmui/Dockerfile-web: update Go builder for vmui from v1.17.1 to v1.17.3 2021-11-14 13:55:05 +02:00
Aliaksandr Valialkin
129b0d2b22 deployment/docker: allow using / chars in ROOT_IMAGE when running make package-*
This fixes the following command:

ROOT_IMAGE=gcr.io/distroless/static make package-victoria-metrics
2021-11-14 13:55:05 +02:00
Denys Holius
49ee952e9a Bumped Alpine linux version to the latest (#1811)
See this https://alpinelinux.org/posts/Alpine-3.14.3-released.html
2021-11-14 12:59:27 +03:00
Aliaksandr Valialkin
c77ff2d293 docs/Articles.md: add a linkt to OSA Con talk about how clickhouse inspired us to build victoriametrics 2021-11-12 14:21:06 +02:00
Aliaksandr Valialkin
9fa098d8e3 app/vmselect/promql: prevent from incorrect calculations for deriv() over multiple samples with identical timestamps 2021-11-12 13:50:43 +02:00
Aliaksandr Valialkin
8b6c89423d docs/CHANGELOG.md: document bugfixes in enteprise versions of vmagent and vmalert 2021-11-12 13:24:07 +02:00
Aliaksandr Valialkin
e2f823fffc docs/Single-server-VictoriaMetrics.md: mention that it is possible to send gzipped data to /api/v1/import/prometheus 2021-11-09 20:45:14 +02:00
Aliaksandr Valialkin
e5d4c7f4a7 app/vmauth: initialize reverse proxy only after flag.Parse() is called
This should properly take into accoun the `-maxIdleConnsPerBackend` command-line flag value.
Previously it was hardcoded to 100.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-11-09 19:22:34 +02:00
Aliaksandr Valialkin
e5ac9d8e57 all: consistently return application/json content-type without charset=utf-8
The `application/json` content-type has utf-8 encoding by default.
See https://stackoverflow.com/questions/9254891/what-does-content-type-application-json-charset-utf-8-really-mean

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/897
2021-11-09 18:04:44 +02:00
Aliaksandr Valialkin
802f05f73f dashboards: consistently use regexp filters for template vars (#1798)
Template vars may contain regexp when `all` is selected (.*) or when multiple values are selected (foo|bar).
So they must be passed to regexp filters.
2021-11-09 16:50:21 +02:00
Aliaksandr Valialkin
5046efb94b docs/vmalert.md: improve wording in Multitenancy chapter 2021-11-09 14:19:52 +02:00
Aliaksandr Valialkin
840ac283ef app/vmselect/promql: properly return durations smaller than one second from duration_over_time() function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1780
2021-11-09 11:41:56 +02:00
Aliaksandr Valialkin
a67518fc6d docs: mention that graphs on the official dashboards contain useful hints 2021-11-08 19:54:10 +02:00
Aliaksandr Valialkin
f39ee8dc95 docs/MetricsQL.md: mention than tlast_over_time() is an alias for timestamp() 2021-11-08 18:29:24 +02:00
Aliaksandr Valialkin
69e655ba7f docs/CHANGELOG.md: cut v1.69.0 2021-11-08 15:47:36 +02:00
Yury Molodov
b78ab88a1c vmui: migration MUI Core v4 to v5 (#1795)
* migration MUI Core v4 to v5

* app/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-11-08 15:45:59 +02:00
Aliaksandr Valialkin
fd596945e7 lib/promscrape: improve logging for scrape_config_files parse errors
Log the actual file path, which led to the parse error.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1789
2021-11-08 13:34:12 +02:00
Aliaksandr Valialkin
3419ac1d36 app/vmselect/promql: add duration_over_time(m[d], max_interval) function
This function calculates the actual lifetime of the time series on the given lookbehdind window `d`

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1780
2021-11-08 13:14:09 +02:00
Aliaksandr Valialkin
1be4838ca0 vendor: make vendor-update 2021-11-08 12:39:57 +02:00
Aliaksandr Valialkin
e44137d46b docs/MetricsQL.md: clarify documentation for lifetime function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1780
2021-11-08 12:35:17 +02:00
Aliaksandr Valialkin
b07010839c Makefile: add TAG=v... make publish-release rule for building and publishing a release for the given TAG 2021-11-08 12:29:10 +02:00
Aliaksandr Valialkin
5edf695bc9 docs/CHANGELOG.md: document b9cdbcb5046315db96e1e7ca9923d09d0f30dc25 2021-11-08 12:11:30 +02:00
Yury Molodov
6d1d558c4f vmui: fix graph reset (#1788)
* feat: add query history

* fix: change detect keyUp for nav query history

* feat: set default query history

* feat: change graph legend

* update dependencies

* update codemirror version

* fix: correct update period time after zoom/pan

* fix: optimize data processing for the graph

* fix: eliminate memory leaks related to mouse events

* fix: correct display of straight line

* Merge branch 'master' into vmui-fix-reset-graph

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-11-05 20:56:57 +02:00
Aliaksandr Valialkin
34b5414ba8 app/{vmalert,vmbackup}/README.md: sync with docs after the commit 47d1612bf8 2021-11-05 20:45:38 +02:00
João Paulo
47d1612bf8 docs: fix multiple typos (#1787)
Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-11-05 20:44:02 +02:00
Aliaksandr Valialkin
237885e0d2 docs/vmalert.md: document the addition of -defaultTenant.prometheus and -defaultTenant.graphite command-line options to enterprise version of vmalert 2021-11-05 20:04:09 +02:00
Aliaksandr Valialkin
24dce03aaa app/vmalert/datasource: use plain string literals instead of constants
This removes the unneeded level of indirection and improves code readability.

The "prometheus" and "graphite" constants aren't going to change in the future, so there is no sense in hiding them behind constants.
2021-11-05 19:57:47 +02:00
Aliaksandr Valialkin
bf814320b0 app/vmalert: remove rule.type config, since it doesnt play well with the upcoming default tenants for -clusterMode
It is better from the consistency point of view to set up rule types at group level where tenant config is set up.
2021-11-05 19:52:32 +02:00
Aliaksandr Valialkin
c43bcdb5fb app/vmagent: allow bigger number of in-memory blocks for big values of -remoteWrite.queues
This should improve the maximum data ingestion speed for highly-loaded vmagent instances
which run on beefy servers with many CPU cores and big amounts of RAM
2021-11-05 15:16:05 +02:00
Aliaksandr Valialkin
cbfc7b7c92 app/{vminsert,vmagent}: hide passwords and auth tokens by default at /config page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-05 14:41:16 +02:00
Aliaksandr Valialkin
e73a82f7a5 lib/promauth: do not show empty values in oauth2 config section at /config page 2021-11-05 12:53:39 +02:00
Aliaksandr Valialkin
3db1f2d550 deployment/dm: update Go builder from Go1.17.2 to Go1.17.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.3+label%3ACherryPickApproved
2021-11-05 11:51:38 +02:00
Denys Holius
cd966bf552 bumped grafana dashboards revisions for guides (#1784) 2021-11-05 11:43:26 +02:00
Aliaksandr Valialkin
faa0eb6b52 docs/FAQ.md: mention that VictoriaMetrics can be queried via Graphite API 2021-11-04 22:37:56 +02:00
Aliaksandr Valialkin
4839d07f34 app/vmagent/remotewrite: fix parallel data sending to remote storage systems at e0d2ba5608 2021-11-04 16:58:28 +02:00
Aliaksandr Valialkin
a69264e885 app/vmagent: add -remoteWrite.maxRowsPerBlock command-line option, which may be used for improving data ingestion performance under high load 2021-11-04 15:39:14 +02:00
Aliaksandr Valialkin
e0d2ba5608 app/vmagent/remotewrite: send data to remote storage systems in parallel
This should improve data ingestion speed when many `-remoteWrite.url` command-line flags are configured
2021-11-04 15:04:16 +02:00
dependabot[bot]
558f77c259 build(deps-dev): bump @typescript-eslint/eslint-plugin (#183)
Bumps [@typescript-eslint/eslint-plugin](https://github.com/typescript-eslint/typescript-eslint/tree/HEAD/packages/eslint-plugin) from 5.2.0 to 5.3.0.
- [Release notes](https://github.com/typescript-eslint/typescript-eslint/releases)
- [Changelog](https://github.com/typescript-eslint/typescript-eslint/blob/master/packages/eslint-plugin/CHANGELOG.md)
- [Commits](https://github.com/typescript-eslint/typescript-eslint/commits/v5.3.0/packages/eslint-plugin)

---
updated-dependencies:
- dependency-name: "@typescript-eslint/eslint-plugin"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-04 12:53:20 +02:00
Aliaksandr Valialkin
2178335618 app/vmselect: make vmui-update 2021-11-04 12:13:12 +02:00
dependabot[bot]
ebaa4e7256 build(deps-dev): bump @babel/plugin-proposal-nullish-coalescing-operator (#1769)
Bumps [@babel/plugin-proposal-nullish-coalescing-operator](https://github.com/babel/babel/tree/HEAD/packages/babel-plugin-proposal-nullish-coalescing-operator) from 7.14.5 to 7.16.0.
- [Release notes](https://github.com/babel/babel/releases)
- [Changelog](https://github.com/babel/babel/blob/main/CHANGELOG.md)
- [Commits](https://github.com/babel/babel/commits/v7.16.0/packages/babel-plugin-proposal-nullish-coalescing-operator)

---
updated-dependencies:
- dependency-name: "@babel/plugin-proposal-nullish-coalescing-operator"
  dependency-type: direct:development
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-04 12:09:34 +02:00
dependabot[bot]
9ed7ead84f build(deps): bump @date-io/dayjs in /app/vmui/packages/vmui (#1770)
Bumps [@date-io/dayjs](https://github.com/dmtrKovalenko/date-io) from 1.3.13 to 2.11.0.
- [Release notes](https://github.com/dmtrKovalenko/date-io/releases)
- [Commits](https://github.com/dmtrKovalenko/date-io/compare/v1.3.13...v2.11.0)

---
updated-dependencies:
- dependency-name: "@date-io/dayjs"
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-04 12:09:11 +02:00
Denys Holius
06114c7bb2 bumped golangci-lint to the latest 1.43 (#1781) 2021-11-04 11:34:08 +02:00
Roman Khavronenko
1e84339df0 docs: make link to logos zip absolute (#1782)
The relative link won't work for github-docs website,
so we're changing it to absolute link.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-11-04 11:32:49 +02:00
Aliaksandr Valialkin
aa534c2582 lib/promscrape: add -promscrape.maxResponseHeadersSize command-line flag for tuning the maximum http response headers size from Prometheus scrape targets 2021-11-03 22:26:56 +02:00
Aliaksandr Valialkin
27044b84d2 app/vmselect/promql: add limit_offset(limit, offset, q) function, which can be used for paging over big number of time series
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1778
2021-11-03 16:02:27 +02:00
Aliaksandr Valialkin
43a58bd618 app/vmselect/promql: add label_graphite_group() function for extracting groups from Graphite metric names 2021-11-03 13:19:08 +02:00
Aliaksandr Valialkin
da2e0e29a4 docs/CHANGELOG.md: document e3a91b186a 2021-11-02 18:39:14 +02:00
Aliaksandr Valialkin
d1eb87c831 app/{vmagent,vminsert}: add ability to restrict access to /config page with authKey query arg
The authKey can be configured via `-configAuthKey` command-line flag.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1764
2021-11-01 16:44:54 +02:00
Aliaksandr Valialkin
28b6456f3b vendor: make vendor-update 2021-11-01 15:59:38 +02:00
Aliaksandr Valialkin
cb3819d44e vendor: update github.com/VictoriaMetrics/metrics from v1.18.0 to v1.18.1 2021-11-01 15:52:53 +02:00
Aliaksandr Valialkin
701973877f docs/Articles.md: add a link to https://valyala.medium.com/how-to-optimize-promql-and-metricsql-queries-85a1b75bf986 2021-10-29 14:02:51 +03:00
Aliaksandr Valialkin
1a16dab9e1 docs/vmauth.md: typo fix 2021-10-28 14:06:00 +03:00
Aliaksandr Valialkin
bb87949d5c lib/protoparser/influx: automatically detect timestamp precision depending on the number of decimal digits in the timestamp 2021-10-28 12:47:22 +03:00
Aliaksandr Valialkin
d0e7c0535e lib/logger: show only explicitly set command-line flags in logs
This reduces initial verbosity in logs
2021-10-28 11:00:52 +03:00
Aliaksandr Valialkin
acfda6d8fd app/vmbackupmanager: fix links to images
This is a follow-up after bd6b8f7e31
2021-10-27 21:35:52 +03:00
Yury Molodov
47ee3744f2 vmui: correct migration material-ui (#1758)
* migration material-ui

* fix: rollback popover

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-27 20:41:26 +03:00
Aliaksandr Valialkin
74b8af9891 lib/promscrape: add collapse and expand buttons per each group of targets from the same scrape job 2021-10-27 20:03:24 +03:00
Aliaksandr Valialkin
6608705652 app/{vmalert,vmagent}: improve the distribution of scrape offsets among targets / rules
Previously only the lower part of 64-bit hash was used for calculating the offset.
This may give uneven distribution in some cases. So let's use all the available 64 bits from the hash
for calculating the offset.
2021-10-27 19:59:16 +03:00
Aliaksandr Valialkin
e3a91b186a lib/protoparser/prometheus: optimize GetRowsDiff() function
This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1745 ,
since the provided profile shows that the majority of CPU and memory is spent in this function
during `streamParse` when `-promscrape.noStaleMarkers` wasn't set.
2021-10-27 18:54:45 +03:00
Aliaksandr Valialkin
95d44157fc lib/protoparser/prometheus: add a benchmark for GetRowsDiff 2021-10-27 18:53:54 +03:00
Aliaksandr Valialkin
1952ab99aa all: fix build issues and tests for Apple M1
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1653
2021-10-27 15:06:34 +03:00
Aliaksandr Valialkin
1ae7ca848c .github/workflows/main.yml: checkout code before installing dependencies
Dependencies depend on Makefile rules from the code, so code checkout must run first
2021-10-26 22:08:58 +03:00
Aliaksandr Valialkin
9ec0175e83 docs/CHANGELOG.md: mention the issue about missing proxy_url config option at /config page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1755
2021-10-26 22:06:35 +03:00
Aliaksandr Valialkin
c560a338e8 .github/workflows/main.yml: re-use makefile rules for installing goling, errcheck and golangci-lint 2021-10-26 21:26:39 +03:00
Aliaksandr Valialkin
4821adfd95 lib/promscrape: properly show proxy_url option value at /config page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1755
2021-10-26 21:23:54 +03:00
Aliaksandr Valialkin
51641c0840 vendor: make vendor-update 2021-10-26 19:36:50 +03:00
Yury Molodov
956cf83e7b vmui: update dependencies (#1754)
* update dependencies

* update codemirror version

* app/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-26 19:31:20 +03:00
Aliaksandr Valialkin
88d42c3ac1 app/vmbackup/README.md: sync with docs/vmbackup.md after e706fb5686 2021-10-26 19:20:47 +03:00
Dima Lazerka
e706fb5686 Fix doc: vmbackup splits by 1 GiB not 100 MB (#1756)
This is a follow-up for bdd0a1cdb2
2021-10-26 19:19:49 +03:00
Denys Holius
d282a7593b fixed wrong path for npm dependabot checks (#1744) 2021-10-26 11:04:32 +03:00
Aliaksandr Valialkin
a7e3cbd6ad docs/CHANGELOG.md: document 3dbdf1632e 2021-10-25 12:16:37 +03:00
Roman Khavronenko
3dbdf1632e vmalert: allow groups with empty rules for compatibility reasons (#1742)
Prometheus allows to have groups with no rules, so we should support
it in vmalert as well for compatibility reasons.
It is also allowed to hot-reload empty groups by adding or removing rules.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-25 12:15:02 +03:00
Aliaksandr Valialkin
d5825f13d3 docs/Cluster-VictoriaMetrics.md: add links with the explanation of active time series and series churn rate 2021-10-24 18:40:19 +03:00
Aliaksandr Valialkin
6b6a4ca51d docs/CaseStudies.md: fix a link to AbisoGaming case study 2021-10-24 18:36:58 +03:00
Aliaksandr Valialkin
df8f967040 app/vmselect/promql: reduce the precision from 15 significant digits to 13 significant digits when comparing float64 results in tests
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1738
2021-10-24 13:31:14 +03:00
Aliaksandr Valialkin
8d0fafc377 docs/CHANGELOG.md: typo fix 2021-10-22 21:11:54 +03:00
Aliaksandr Valialkin
f64f626927 go.mod: remove outdated replacement 2021-10-22 19:46:54 +03:00
Aliaksandr Valialkin
7f7cac20c1 docs/CHANGELOG.md: cut v1.68.0 2021-10-22 19:37:48 +03:00
Aliaksandr Valialkin
b76db7c772 deployment/docker: update Grafana from v8.2.0 to v8.2.2 2021-10-22 19:33:22 +03:00
Aliaksandr Valialkin
8124f202a4 vendor: make vendor-update 2021-10-22 19:27:06 +03:00
Aliaksandr Valialkin
a69f1baa13 docs/vmauth.md: make docs-sync 2021-10-22 19:21:34 +03:00
Aliaksandr Valialkin
013d626889 app/vmauth: add ability to specify http headers to send in requests to backends
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1736
2021-10-22 19:10:29 +03:00
Aliaksandr Valialkin
7fa15f7f86 lib/promscrape: do not populate response body to memory in stream parsing mode if -promscrape.noStaleMarkers is set
The response body isn't used if -promscrape.noStaleMarkers is set after the commit 2876137c92 ,
so there is no sense in pupulating it in memory. This should reduce memory usage when scraping big responses.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728#issuecomment-949630694
2021-10-22 16:44:44 +03:00
Aliaksandr Valialkin
7e88713ca3 docs/CHANGELOG.md: document 43a7984cd8 2021-10-22 14:00:20 +03:00
Aliaksandr Valialkin
6106d4069d lib/promscrape: do not sort original labels and do not intern label string for the original labels before the sharding code is executed
This should reduce CPU and memory usage in shard mode when service discovery finds big number of scrape targets with many long labels.
See https://docs.victoriametrics.com/vmagent.html#scraping-big-number-of-targets

This is a follow-up after 9882cda8b9

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1728
2021-10-22 13:54:30 +03:00
Aliaksandr Valialkin
2876137c92 lib/promscrape: reduce memory usage if -promscrape.noStaleMarkers command-line flag is passed
Do not store in memory the response from the last scrape per each target if -promscrape.noStaleMarkers option is enabled.
This should reduce memory usage when the scraped targets return large responses.
2021-10-22 13:10:29 +03:00
Roman Khavronenko
43a7984cd8 vmalert: correctly calculate alert ID including extra labels (#1734)
Previously, ID for alert entity was generated without alertname or groupname.
This led to collision, when multiple alerting rules within the same group
producing same labelsets. E.g. expr: `sum(metric1) by (job) > 0` and
expr: `sum(metric2) by (job) > 0` could result into same labelset `job: "job"`.

The issue affects only UI and Web API parts of vmalert, because alert ID is used
only for displaying and finding active alerts. It does not affect state restore
procedure, since this label was added right before pushing to remote storage.

The change now adds all extra labels right after receiving response from the datasource.
And removes adding extra labels before pushing to remote storage.

Additionally, change introduces a new flag `Restored` which will be displayed in UI
for alerts which have been restored from remote storage on restart.
2021-10-22 12:30:38 +03:00
Aliaksandr Valialkin
8568003bb1 docs/CHANGELOG.md: document a3684fe3de 2021-10-22 12:28:01 +03:00
Nikolay
a3684fe3de adds tab as second separator for graphite text protocol (#1733)
* adds tab as second separator for graphite text protocol

* changes indexFunc for indexAny

* Update lib/protoparser/graphite/parser_test.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-22 12:23:45 +03:00
Yury Molodov
2b266cb87e vmui: query history (#1732)
* feat: add query history

* fix: change detect keyUp for nav query history

* feat: set default query history

* app/vmselect/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-22 12:21:22 +03:00
Aliaksandr Valialkin
8991c8b589 lib/flagutil: do not expose sensitive info (passwords, keys and urls) at /flags page 2021-10-20 00:51:26 +03:00
Aliaksandr Valialkin
8ad95f0db7 lib/httpserver: expose command-line flags at /flags page
This should simplify debugging.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-20 00:45:09 +03:00
Aliaksandr Valialkin
676ad70d9f lib/envflag: use flag.Set for setting the flags from env vars
This should make visible the set flags at flag.Visit(), which is used later for logging
and exporting the `is_set` label for these flags at /metrics page
2021-10-20 00:41:08 +03:00
Aliaksandr Valialkin
53bb58ed2a lib/storage: log a warning when the -storageDataPath has less than -storage.minFreeDiskSpaceBytes
This should improve the debuggability of the readonly feature.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1727
2021-10-19 23:59:13 +03:00
Roman Khavronenko
bdfac4ff53 vmalert: make group.ID() thread-safe (#1726)
Commit fixes potential race condition when group update
and generating of ID() happens simultaneously.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 16:44:13 +03:00
Roman Khavronenko
dcd881bb7a vmalert: properly init SIGHUP listener before starting group manager (#1725)
Regression was introduced during code refactoring. It potentially
could lead to situation when SIGHUP signals were ignored while
vmalert was still busy with initing group manager.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-19 16:35:27 +03:00
Aliaksandr Valialkin
b8123b862a app/vmauth: fix metric name prefixes: vmagent -> vmauth 2021-10-19 15:29:07 +03:00
Aliaksandr Valialkin
35a5eaeeb1 docs/Single-server-VictoriaMetrics.md: add a link to VMUI at VictoriaMetrics playground 2021-10-19 14:41:53 +03:00
Aliaksandr Valialkin
3408a05d12 lib/promscrape/discovery/kubernetes: log a warning if role: endpoints discovers more than 1000 targets per a single endpoint
In this case `role: endpointslice` must be used instead.

See the following references:

* https://kubernetes.io/docs/reference/labels-annotations-taints/#endpoints-kubernetes-io-over-capacity
* https://github.com/kubernetes/kubernetes/pull/99975
* https://github.com/prometheus/prometheus/issues/7572#issuecomment-934779398
2021-10-19 13:20:40 +03:00
Aliaksandr Valialkin
0d48b89afe docs/CHANGELOG.md: document 146a5b504c 2021-10-19 11:25:02 +03:00
Aliaksandr Valialkin
c64a134146 docs/CHANGELOG.md: document cbcc622786 2021-10-19 08:56:23 +03:00
Aliaksandr Valialkin
ec40affb59 deployment/docker/alerts.yml: formatting fixes after 865a60f13e 2021-10-19 08:53:03 +03:00
Nikolay
cbcc622786 changes job source for /target api (#1723)
use jobNameOriginal instead of relabeled as prometheus does

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1707
2021-10-19 08:49:36 +03:00
Roman Khavronenko
ea8f625b53 dashboards: add cardnilaity limiter panels for vmagent (#1720)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-18 19:15:33 +03:00
Yurii Kravets
865a60f13e Update alerts.yml
Added Series Limit day\hour alerts
2021-10-18 18:14:49 +03:00
Aliaksandr Valialkin
f744c1c6d9 vendor: return back the previous google.golang.org/genproto version, since the latest version leads to compile errors
The following errors:

    vendor/cloud.google.com/go/storage/storage.go:1447:53: o.GetCustomerEncryption().GetKeySha256 undefined (type *"google.golang.org/genproto/googleapis/storage/v2".Object_CustomerEncryption has no field or method GetKeySha256)
    vendor/cloud.google.com/go/storage/writer.go:439:10: q.GetCommittedSize undefined (type *"google.golang.org/genproto/googleapis/storage/v2".QueryWriteStatusResponse has no field or method GetCommittedSize)
2021-10-18 15:37:18 +03:00
Aliaksandr Valialkin
dea8521ab9 vendor: make vendor-update 2021-10-18 15:25:11 +03:00
Yury Molodov
a3e09a57c2 vmui: features (#1711)
* feat: initial uPlot graph

* feat: add zoom/pan for graph

* fix: add zoom by ctrl/mac

* fix: remove unused code

* feat: add toggle cache for fetch

* feat: add fix y-axis limits

* fix: stop point events while panning

* fix: change getting cursor position when scaling

* feat: add cursor tooltip to graph

* fix: uninstall chart.js

* fix: change link for create an issue

* fix: set default cache value to true

* app/vmalert: follow-up after 0e2486df56

* docs/CHANGELOG.md: document 5416e18007

* app/vmui: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-18 15:16:57 +03:00
Roman Khavronenko
146a5b504c vmalert: remove extra / from path in WEB interface (#1717)
The extra `/` may cause issues when additional path prefixes
are configured. Also, removing it makes it consistent
with the rest of declarations.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-18 15:12:47 +03:00
Roman Khavronenko
478854d36d vmctl: follow-up after 95d1d38595 (#1718)
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-18 15:10:44 +03:00
Miro Prasil
5416e18007 vmctl influx convert bool to number (#1714)
vmctl: properly convert influx bools into integer representation

When using vmctl influx, the import would fail importing boolean fields
with:

```
failed to convert value "some".0 to float64: unexpected value type true
```

This converts `true` to `1` and `false` to `0`.

Fixes #1709
2021-10-18 10:29:34 +03:00
Alexander Rickardsson
0e2486df56 vmalert: add disablePathAppend to remote read (#1712)
* vmalert: add disablePathAppend to remoteRead

* docs: add docs for remoteRead.disablePathAppend
2021-10-18 10:24:52 +03:00
Alexander Rickardsson
c0e58ade45 vmalert: Redact passwords from error messages (#1713) 2021-10-18 10:20:26 +03:00
Aliaksandr Valialkin
da97e58979 app/vmselect/promql: randomize the static selection of time series returned from limitk()
Sort series by a hash calculated from the series labels. This should guarantee "random" selection of the returned time series.
Previously the selection could be biased, since time series were sorted alphabetically by label names and label values.
2021-10-16 21:16:49 +03:00
Aliaksandr Valialkin
c37f285466 lib/promscrape: set honor_timestamps: true by default if this option isnt set explicitly in scrape configs
This aligns the behavior to Prometheus - see https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-10-16 20:49:08 +03:00
Aliaksandr Valialkin
dfc719e012 docs/CaseStudies.md: add Smarkets case study 2021-10-16 19:57:23 +03:00
Aliaksandr Valialkin
a1e54fa2c9 docs/CaseStudies.md: add Fly.io case study 2021-10-16 19:45:44 +03:00
Aliaksandr Valialkin
47c6baf5ea docs/CaseStudies.md: add a case study for Razorpay 2021-10-16 19:36:33 +03:00
Aliaksandr Valialkin
3e9ffb6e33 docs/CaseStudies.md: add AbiosGaming 2021-10-16 19:26:04 +03:00
Aliaksandr Valialkin
ede9dd43e8 docs/CaseStudies.md: add Percona case study 2021-10-16 19:10:38 +03:00
Aliaksandr Valialkin
c055bc478c lib/promscrape: expose promscrape_series_limit_max_series and promscrape_series_limit_current_series metrics per each scrape target with the enabled unique series limiter 2021-10-16 18:47:13 +03:00
Aliaksandr Valialkin
9761b7f3ef vendor: update github.com/valyala/gozstd from v1.13.0 to v1.14.1
This should reduce memory usage in vmagent when compressing large scrape responses in stream parsing mode
2021-10-16 18:20:03 +03:00
Aliaksandr Valialkin
06b0982d6b lib/promscrape: always initialize http client for stream parsing mode
Stream parsing mode can be automatically enabled when scraping targets with big response bodies
exceeding the -promscrape.minResponseSizeForStreamParse , so it must be always initialized.
2021-10-16 13:18:23 +03:00
Aliaksandr Valialkin
cae174b11c app/vmselect/promql: typo fix in comment: didsn't -> didn't 2021-10-16 13:00:34 +03:00
Aliaksandr Valialkin
32793adbd9 lib/promscrape: store the last scraped response in compressed form if its size exceeds -promscrape.minResponseSizeForStreamParse
This should reduce memory usage when scraping targets with big response bodies.
2021-10-16 13:00:30 +03:00
Aliaksandr Valialkin
9866dd95c1 lib/promscrape: store the full response in stream parsing mode in scrapeWork.lastScrape byte slice
This allows sending staleness marks and properly calculate scrape_series_added metric in stream parsing mode
at the cost of the increased memory usage, since now the potentially big response is kept
in the lastScrape byte slice per each scrapeWork.

In practice the memory usage increase shouldn't be big, since the response size
is usually much smaller than the parsed metrics from this response after the relabeling,
which usually adds a big pile of target-specific labels per each metric.
2021-10-15 15:39:23 +03:00
Aliaksandr Valialkin
f6d33596ff lib/promscrape/discovery/kubernetes: rename endpointslices.go -> endpointslice.go in order to be consistent with EndpointSlice struct name
This is a follow-up for 31b42b30b6
2021-10-15 12:27:12 +03:00
Aliaksandr Valialkin
0db0410237 docs/FAQ.md: improve wording on why MetricsQL isnt 100% compatible with PromQL 2021-10-14 16:22:43 +03:00
Aliaksandr Valialkin
78425561ce docs/CHANGELOG.md: document the change at 7fcbd3fa4b 2021-10-14 14:37:44 +03:00
Aliaksandr Valialkin
1ac12597fa docs/FAQ.md: add an entry explaining why MetricsQL isn't 100% compatible with PromQL 2021-10-14 12:50:31 +03:00
Aliaksandr Valialkin
bbd34fa15e lib/promscrape: add -promscrape.minResponseSizeForStreamParse command-line option for automatic switching to stream parsing mode when scraping targets with big responses
This should reduce memory usage when vmagent scrapes targets with non-uniform response sizes.
This is common case in Kubernetes monitoring.
2021-10-14 12:29:35 +03:00
Aliaksandr Valialkin
1a7287c408 lib/promscrape: return error if sample_limit or series_limit options are set when stream parsing mode is enabled 2021-10-14 12:11:23 +03:00
Roman Khavronenko
7fcbd3fa4b Adjust http.Transport.MaxIdleConns setting for vmauth/vmalert services (#1704)
* vmalert: adjust `http.Transport.MaxIdleConns` value accordingly to `http.Transport.MaxIdleConnsPerHost`

`http.Transport.MaxIdleConnsPerHost` setting is controlled by `datasource.maxIdleConnections` flag,
while `http.Transport.MaxIdleConns` is inherited from DefaultTransport and is equal to `100`.
The fix adjusts `http.Transport.MaxIdleConns` value if it is lower than `http.Transport.MaxIdleConnsPerHost`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* vmauth: adjust `http.Transport.MaxIdleConns` value accordingly to `http.Transport.MaxIdleConnsPerHost`

`http.Transport.MaxIdleConnsPerHost` setting is controlled by `maxIdleConnsPerBackend` flag,
while `http.Transport.MaxIdleConns` is inherited from DefaultTransport and is equal to `100`.
The fix adjusts `http.Transport.MaxIdleConns` value if it is lower than `http.Transport.MaxIdleConnsPerHost`.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-10-13 17:29:28 +03:00
Aliaksandr Valialkin
1c17fe70e0 docs/CHANGELOG.md: document e3c8304deb 2021-10-13 16:00:50 +03:00
Aliaksandr Valialkin
e3c8304deb lib/promscrape: add ability to show the original labels for discovered targets at /targets page
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1698
2021-10-13 15:59:58 +03:00
Roman Khavronenko
8df3c569c7 vmalert: add Source link to alerts UI (#1701)
The source link is controlled by `external.url` and `external.alert.source`
flags, in the same way as for alertmanager notifications.
The source link is added to Alerts list view, and specific Alert view.
2021-10-13 15:25:11 +03:00
Aliaksandr Valialkin
3d61a10367 docs/MetricsQL.md: add missing blank line before the link to github.com/VictoriaMetrics/metricsql package 2021-10-13 15:10:54 +03:00
Roman Khavronenko
c0a932a55f lib/promscrape: make errcheck happy (#1703) 2021-10-13 14:57:30 +03:00
Aliaksandr Valialkin
9882cda8b9 lib/promscrape: shard targets among cluster nodes after relabeling is applied
This guarantees that targets with the same set of labels go to the same vmagent node.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1687#issuecomment-940629495
2021-10-12 17:06:00 +03:00
Aliaksandr Valialkin
5a58c041c2 app/vmagent: expose -promscrape.config contents at /config page as Prometheus does
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1695
2021-10-12 16:25:37 +03:00
Aliaksandr Valialkin
4f242980be docs/FAQ.md: add a solution against high churn rate 2021-10-12 10:32:39 +03:00
Aliaksandr Valialkin
1eaaf8ad51 vendor: make vendor-update 2021-10-11 21:51:44 +03:00
Roman Khavronenko
2c6d86226f docs: mention "PromQL compliance" in MetricsQL docs (#1691) 2021-10-11 21:21:29 +03:00
Aliaksandr Valialkin
a5001b9c20 app/vmselect/promql: add atan2 binary operator, which is going to be added in Prometheus 2.31
See https://github.com/prometheus/prometheus/pull/9248
2021-10-11 21:15:53 +03:00
Aliaksandr Valialkin
81c6720392 app/vmselect/promql: add missing trigonometric functions, which are going to be added in Prometheus 2.31
See https://github.com/prometheus/prometheus/issues/9233
2021-10-11 21:01:33 +03:00
Aliaksandr Valialkin
8679ba71dd docs/MetricsQL.md: clarify docs for union() function 2021-10-11 17:40:44 +03:00
Aliaksandr Valialkin
873aac584e lib/promscrape: use Prometheus format for target labels at /targets page
This should simplify copy-pasting the labels to/from PromQL / MetricsQL
2021-10-11 12:41:37 +03:00
Denys Holius
dd4038f0e5 Added some fixes (#1690)
* removed not needed description

* added some fixes and fixed typos
2021-10-11 11:21:07 +03:00
Aliaksandr Valialkin
986bed8261 docs/MetricsQL.md: add a link to https://medium.com/@romanhavronenko/victoriametrics-promql-compliance-d4318203f51e 2021-10-11 11:00:56 +03:00
Roman Khavronenko
9b557a88fc docs: add "PromQL compliance" article (#1689)
* docs: add "PromQL compliance" article

* Update docs/Articles.md

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-10-11 10:58:34 +03:00
Aliaksandr Valialkin
83a2a9f2f7 deployment/docker/docker-compose.yml: upgrade Grafana from v8.1.2 to v8.2.0 2021-10-08 20:37:40 +03:00
Aliaksandr Valialkin
92b92d4d2c app/vmselect/promql: consistently return the same set of time series from limitk() function
This is the expected behaviour by most users.
2021-10-08 19:53:52 +03:00
Aliaksandr Valialkin
001750c239 lib/storage: fix unaligned access on 32-bit architectures.
The bug has been introduced at a171916ef5
2021-10-08 19:43:03 +03:00
Denys Holius
4b0cefc4bd Added fixes and improvements (#1677)
* added guide for VM operator

* Update docs/guides/getting-started-with-vm-operator.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/getting-started-with-vm-operator.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Fixed different typos and added improvements from proposals

* move remoteWrite.url to other place

* fixed typo

* rephrased vminsert explanation

* remove not needed parameters for default setup

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2021-10-08 18:57:36 +03:00
Aliaksandr Valialkin
00fe5230e9 deployment/docker: update Go builder version from Go1.17.1 to Go1.17.2
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.2+label%3ACherryPickApproved
2021-10-08 17:42:57 +03:00
Aliaksandr Valialkin
6058edb0d1 vendor: make vendor-update 2021-10-08 16:04:56 +03:00
Aliaksandr Valialkin
0a3a774202 docs/CHANGELOG.md: cut v1.67.0 2021-10-08 16:00:33 +03:00
Aliaksandr Valialkin
0ff8fcac6a app/vmui: follow-up after 7bfb44113e
* Run `vmui-update`
* Document the changes in README.md and CHANGELOG.md
2021-10-08 15:09:29 +03:00
Yury Molodov
7bfb44113e vmui: use uPlot as default engine for graph (#1683)
* feat: initial uPlot graph

* feat: add zoom/pan for graph

* fix: add zoom by ctrl/mac

* fix: remove unused code
2021-10-08 15:07:35 +03:00
Aliaksandr Valialkin
cf5cbd1c70 app/{vminsert,vmstorage}: follow-up after a171916ef5
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269
2021-10-08 14:35:49 +03:00
Nikolay
4290b46e8c Adds read-only mode for vmstorage node (#1680)
* adds read-only mode for vmstorage
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/269

* changes order a bit

* moves isFreeDiskLimitReached var to storage struct
renames functions to be consistent
change protoparser api - with optional storage limit check for given openned storage

* renames freeSpaceLimit to ReadOnly
2021-10-08 14:35:48 +03:00
Aliaksandr Valialkin
2748255c8b app/vmselect/promql: substitute rollupFuncsCannotAdjustWindow with rollupFuncsCanAdjustWindow
The list of functions, which can adjust lookbehind window is more limited than the rest of functions,
so it is better from maintainability and readability PoV using the allowlist instead of blocklist.
2021-10-07 13:18:42 +03:00
Aliaksandr Valialkin
c45210a6f9 app/vmselect/promql: return back the behaviour for deriv() function when the lookbehind window doesnt contain enough points
It is expected that the `deriv(m[d])` returns non-empty value if the lookbehind window `d`
contains less than 2 samples in the same way as `rate()` does.

This is a follow-up after 3e084be06b .
2021-10-07 12:52:27 +03:00
Roman Khavronenko
3e084be06b app/vmselect: make predict_linear and deriv compatible with Prometheus (#1681)
Previously, `predict_linear` returned slightly different results comparing
to Prometheus. The change makes linear regression algorithm compatible
with Prometheus.

`deriv` was excluded from the list of functions which can adjust the time
window for the same reasons.
2021-10-07 12:50:49 +03:00
Aliaksandr Valialkin
a19e7c7ce8 app/vminsert: fix uneven distribution of time series among storage nodes
Use distinct seed for distribution hash calculations on the second level of vminsert nodes.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1672
2021-10-07 12:23:39 +03:00
Aliaksandr Valialkin
a71c9ad650 docs/guides: follow-up after 05a1396247 2021-10-06 14:46:46 +03:00
Thomas Danielsson
05a1396247 fix: typo metric_relabel_configs (#1674)
metric_ralabel_configs -> metric_relabel_configs
2021-10-06 14:45:39 +03:00
Ziqi Zhao
402c995d6d fix some typos (#1678)
Co-authored-by: 柘远 <zzq237937@alibaba-inc.com>
2021-10-06 14:43:10 +03:00
Aliaksandr Valialkin
ec3a87bb46 vendor: make vendor-update 2021-10-05 10:29:12 +03:00
Aliaksandr Valialkin
c7c966d0e9 docs/vmagent.md: update docs after 3e9a939a990c8b608414388c96f68eb062364ae7 2021-10-05 10:23:33 +03:00
Aliaksandr Valialkin
3dea9e02d0 vendor: make vendor-update 2021-09-30 17:52:02 +03:00
Aliaksandr Valialkin
9515e58e28 docs/vmagent.md: document how to write data to Kafka 2021-09-30 17:45:53 +03:00
Aliaksandr Valialkin
6ee66fb6b1 lib/promscrape: reduce memory allocations in mergeLabels() after 48e3e6c8df 2021-09-30 16:56:12 +03:00
Aliaksandr Valialkin
0e3de5a0cc app/vmselect/promql: add topk_last and bottomk_last functions 2021-09-30 13:22:52 +03:00
Roman Khavronenko
a31407006c app/vmselect: fix binary comparison func (#1667)
The fix makes the binary comparison func to check for NaNs
before executing the actual comparison. This prevents VM
to return values for non-existing samples for expressions
which contain bool comparisons. Please see added test
for example.
2021-09-30 12:24:17 +03:00
Roman Khavronenko
344490d89b app/vmselect: fix testRowsEqual func NaN checks (#1666)
It appeared, that `testRowsEqual` NaN comparison was incorrect.
The fix caused some tests to fail. Please see the change and
tests updated.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-30 12:08:47 +03:00
Aliaksandr Valialkin
463a5bf76e lib/protoparser: go fmt 2021-09-29 21:19:00 +03:00
Aliaksandr Valialkin
6d9f1d4227 app/vminsert: document that -relabelConfig is reloaded on SIGHUP signal 2021-09-29 21:18:58 +03:00
Aliaksandr Valialkin
58964d52a5 lib/protoparser/prometheus: compare invalid Prometheus lines in full 2021-09-29 19:41:28 +03:00
Aliaksandr Valialkin
2b623ae302 docs/CHANGELOG.md: link to Kafka integration docs 2021-09-29 12:31:23 +03:00
Aliaksandr Valialkin
d80d72efec app/{vmbackup,vmrestore}: switch from gcs://... to gs://... urls for backups to GCS
The `gs://` urls are commonly used, so prefer them instead of `gcs://` urls,
while leaving support for `gcs://` urls for backwards compatibility.
2021-09-29 12:10:29 +03:00
Aliaksandr Valialkin
396e233ac1 docs/vmagent.md: update Telegraf config in the section about Kafka 2021-09-29 11:21:15 +03:00
Aliaksandr Valialkin
0e5ab52908 docs/vmagent.md: add docs about reading metrics from Kafka 2021-09-29 01:46:12 +03:00
Yury Molodov
893af0a92c vmui: fixed bug with time range (time zone) (#1661)
* fix: set date in query string in utc format

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-29 01:00:44 +03:00
Nikolay
cc72f9428d changes vmagent api (#1656)
* changes vmagent api
adds auth.Token to promremotewrite InsertHandlerReader
changes remoteWrite client constructor, allows to use multiple remoteWriteUrl schemes, like kafka://
changes url path concatenation for tenant remoteWrite

Update app/vmagent/remotewrite/client.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>

* Update app/vmagent/remotewrite/remotewrite.go

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-29 00:52:07 +03:00
Roman Khavronenko
5dc84bf210 app/vmselect: disable time-window adjustment for min/max_over_time funcs (#1658)
Adjustment results into discrepancy between Prometheus and VM on time windows
smaller than scrape interval.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-29 00:43:21 +03:00
Aliaksandr Valialkin
ead59bdebf docs/CHANGELOG.md: document the bugfix from de810031bf 2021-09-29 00:41:35 +03:00
Roman Khavronenko
de810031bf app/vmselect: always return zero for stddev func if there is only one value (#1659)
The fix will always return zero if received set of items consists of one
element only, which also means no deviation.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-29 00:38:55 +03:00
Roman Khavronenko
dd536b475c app/vmselect: return NaN instead of 0 for empty value sets (#1660)
The change affects `count/stddev/stdvar_over_time` funcs and makes
them to return NaN instead of zero when there is no datapoints
in a time window.
This is needed for improving compatibility with Prometheus.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-29 00:37:04 +03:00
Roman Khavronenko
03cd93bf1a app/vmselect: rm quantile_over_time fast-path optimisations (#1662)
The removed fast path optimisations weren't consistent with
`quantile` function behavior and results into discrepancy.
Specifically, results didn't match in cases when:
* 0 < phi > 1;
* values contain only one element.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-29 00:35:14 +03:00
Aliaksandr Valialkin
50ec259750 docs/CHANGELOG.md: document 3d17112a7e 2021-09-29 00:33:08 +03:00
Nikolay
3d17112a7e changes auth validation for openstack (#1663)
* changes auth validation for openstack
must fix https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1655

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-29 00:28:49 +03:00
Aliaksandr Valialkin
91b3c601bc app/{vminsert,vmagent}: add ability to ingest data via DataDog "submit metrics" API
See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/206
2021-09-29 00:13:08 +03:00
Yury Molodov
a64155d91e vmui: use Chart.js as default engine for graph (#1634)
* feat: add Plotly as default engine for graph

* fix: remove unused components

* feat: use Chart.js as default engine graph

* fix: correct styles for loader

* feat: add zoom/pan for chart

* feat: add height for chart

* fix: remove unused code

* fix: remove empty units from duration

* fix: change debounce for pan to 500ms

* fix: add utility for plugins register globally

* fix: optimize render graph

* feat: add buffer data for zoom

* fix: add limits for zoom in/out

* fix: change update data while zooming

* app/vmselect: `make vmui-update`

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-27 22:26:14 +03:00
Aliaksandr Valialkin
8c0283381d app/victoria-metrics/testdata/graphite/max_lookback_unset.json: fix the test after c4c77aa2dd
The commit c4c77aa2dd slightly changed how scrape_interval is detected per-time series,
so the max_lookback_unset test should be updated accordingly.
2021-09-27 21:41:14 +03:00
Aliaksandr Valialkin
2efe0acfc9 app/vmselect/promql: add rollup_scrape_interval(m[d]) function
It calculates the min, max and avg scrape intervals for m over the given lookbehind window d
2021-09-27 19:21:24 +03:00
Aliaksandr Valialkin
c4c77aa2dd app/vmselect/promql: follow-up after 526dd93b32
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1625
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1612
2021-09-27 18:55:39 +03:00
Roman Khavronenko
526dd93b32 app/vmselect: quantile func compatiblity with Prometheus (#1646)
* app/vmselect: `quantile` func compatiblity with Prometheus

The `quantile` func was previously calculated by https://github.com/valyala/histogram
package. The result of such calculation was always the closest real value to
requested quantile. While in Prometheus implementation interpolation is used.
Such difference may result into discrepancy in output between Prometheus and
VictoriaMetrics.

This commit adds a Prometheus-like `quantile` function. It also used by other
functions which depend on it, such as `quantiles`, `quantile_over_time`, `median` etc.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1625

Signed-off-by: hagen1778 <roman@victoriametrics.com>

* app/vmselect: `quantile` review fixes

* quantile functions were split into multiple to provide
different API for already sorted data;
* float64sPool is used for reducing allocations. Items in pool may have
different sizes, but defining a new pool was complicates due to name collisions;

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-27 18:02:41 +03:00
Aliaksandr Valialkin
80b0b92d2f vendor: make vendor-update 2021-09-27 17:57:40 +03:00
Aliaksandr Valialkin
1b23224f9c docs/BestPractices.md
docs/BestPractices.md: update the doc
2021-09-27 17:51:21 +03:00
Aliaksandr Valialkin
8ed95e82c6 app/vmselect/promql: follow-up after 57b3320478 2021-09-24 01:24:18 +03:00
Roman Khavronenko
57b3320478 app/vmselect: make sorting for query result similar to Prometheus (#1647)
* app/vmselect: make sorting for query result similar to Prometheus

Updated sorting allows to get the order of series in result similar or equal
to what Prometheus returns.
The change is needed for compatibility reasons.

* Update app/vmselect/promql/exec_test.go

Co-authored-by: Aliaksandr Valialkin <valyala@victoriametrics.com>
2021-09-24 01:03:12 +03:00
Aliaksandr Valialkin
e564411a62 app/vmselect/promql: align the behavior of or, and and unless operators with on (labels) modifier to Prometheus
Previously VictoriaMetrics could return unexpected result of the right-hand side operand
had multiple time series with the given set of labels mentioned in `on(labels)`.

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1643
2021-09-24 00:46:25 +03:00
Nikolay
3f1e6da1d7 moves prod images build into alpine container with musl (#1640)
adds gcc and musl-dev to builder container
2021-09-24 00:14:11 +03:00
Aliaksandr Valialkin
9f19649672 docs/CHANGELOG.md: cut v1.66.2 2021-09-23 22:53:36 +03:00
Aliaksandr Valialkin
718eca33ab lib/storage: properly handle {__name__=~"prefix(suffix1|suffix2)",other_label="..."} queries
They were broken in the commit 00cbb099b6

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1644
2021-09-23 21:48:51 +03:00
Aliaksandr Valialkin
c5bb95a417 docs: make docs-sync 2021-09-23 20:51:35 +03:00
Aliaksandr Valialkin
f5896b7420 docs/CHANGELOG.md: document 0e35fc9538
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1641
2021-09-23 20:46:23 +03:00
Roman Khavronenko
9dc4d16664 app/vmctl: fix misleading comment about cluster version for native mode (#1648)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1637
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-23 17:57:25 +03:00
Roman Khavronenko
0e35fc9538 app/vmalert: remove unnecessary omitempty tag for interval param (#1649)
`omitempty` tag resulted into skipping this param on marshaling,
which was used as a checksum for groups configuration. Since on
config reload checksums are compared before applying changes,
any change to `interval` only didn't trigger config reload.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1641
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2021-09-23 17:55:59 +03:00
Aaron France
6f061dab19 fix: typo in vmagent.md (#1642) 2021-09-23 16:51:02 +03:00
Aliaksandr Valialkin
176348cbcc vendor: make vendor-update 2021-09-23 15:05:27 +03:00
Aliaksandr Valialkin
a0313c046b lib/promscrape: add vm_promscrape_max_scrape_size_exceeded_errors_total metric for counting of the failed scrapes due to the exceeded response size
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1639
2021-09-23 14:47:54 +03:00
Aliaksandr Valialkin
d5c2741e8f docs/CaseStudies.md: add a case study for Grammarly 2021-09-23 13:11:48 +03:00
Aliaksandr Valialkin
73cd74075d docs/Articles.md: add https://cer6erus.medium.com/superset-bi-with-victoria-metrics-a109d3e91bc6 2021-09-23 10:43:29 +03:00
Aliaksandr Valialkin
00277583f9 vendor: update github.com/valyala/gozstd from v1.12.0 to v1.13.0 2021-09-22 20:06:44 +03:00
Aliaksandr Valialkin
99a6c212e8 docs/CaseStudies.md: fix a link to third-party articles 2021-09-22 03:52:35 +03:00
Aliaksandr Valialkin
9b3d1a1996 docs/CHANGELOG.md: cut v1.66.1 2021-09-22 01:47:05 +03:00
Aliaksandr Valialkin
a13c3de36f docs/CHANGELOG.md: document 9ca1cbced1
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1635
2021-09-21 23:17:08 +03:00
Aliaksandr Valialkin
9ca1cbced1 lib/httpserver: add -enterprise and/or -cluster suffixes to short_version label of vm_app_version metric
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1635
2021-09-21 23:12:42 +03:00
Aliaksandr Valialkin
207c5760ce lib/promrelabel: fix parsing regex: true in relabeling rules 2021-09-21 23:00:53 +03:00
Aliaksandr Valialkin
9884a55f3c vendor: temporarily stick to v0.93.3 for cloud.google.com/go until the binary size bloat issue is resolved
This returns back VictoriaMetrics binary size from 24Mb to 18Mb.

See https://github.com/googleapis/google-cloud-go/issues/4783
2021-09-21 18:13:47 +03:00
Roman Khavronenko
ac1abe2faf app/vmalert: support http.pathPrefix flag in UI (#1636)
The change makes UI to respect `http.pathPrefix` flag
for API or navigation items links.
2021-09-21 14:41:01 +03:00
Aliaksandr Valialkin
a22aa0608b app/vmselect: fix accessing /graphite/* endpoints 2021-09-21 13:56:35 +03:00
Aliaksandr Valialkin
94148d5ad7 docs/vmagent.md: typo fix 2021-09-20 16:49:20 +03:00
Aliaksandr Valialkin
51657b1e04 docs/vmagent.md: typo fixes in Prometheus staleness markers docs 2021-09-20 16:44:09 +03:00
Aliaksandr Valialkin
76811c2f60 docs/CHANGELOG.md: cut v1.66.0 2021-09-20 15:20:25 +03:00
Nikolay
ad08d9dfc0 changes protoparser apis for accepting reading from io.Reader (#1624)
adds InsertHandlerForReader apis to vmagent
2021-09-20 14:49:28 +03:00
Aliaksandr Valialkin
15ea4c6dae vendor: make vendor-update 2021-09-20 14:38:55 +03:00
n4mine
1ac8d55147 fix: typo, dddresses -> addresses (#1630) 2021-09-20 14:28:59 +03:00
Aliaksandr Valialkin
a06ff456f8 docs/Articles.md: add a link to Open-source strategy at VictoriaMetrics 2021-09-20 11:46:10 +03:00
Aliaksandr Valialkin
9a3d0c43b5 app/vmselect/promql: add quantiles_over_time("phiLabel", phi1, ..., phiN, m[d]) function for calculating multiple quantiles at once 2021-09-17 23:35:10 +03:00
Aliaksandr Valialkin
e1e5a20b36 docs/CHANGELOG.md: document 0e09fdb8b0 2021-09-17 18:47:06 +03:00
Nikolay
0e09fdb8b0 makes filters optional for ec2 api requests (#1627)
filters can be applied only for DescribeInstances requests, like prometheus does.
related issue https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1626
2021-09-17 18:00:37 +03:00
Aliaksandr Valialkin
2951dd0a57 app/vmselect/promql: add histogram_quantiles("phiLabel", phi1, ..., phiN, buckets) function
This function calculates multiple quantiles over the given buckets at once

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1573
2021-09-17 13:32:39 +03:00
Aliaksandr Valialkin
8c504d6efa docs/CHANGELOG.md: document the change in enterprise apps, which allows passing -version without -eula flag
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1621
2021-09-17 12:37:45 +03:00
Aliaksandr Valialkin
5a44be0e52 app/vmselect/promql: optimize quantiles() calculation
Calculate quantiles in one go instead of calculating each quantile individually
2021-09-17 12:33:42 +03:00
Aliaksandr Valialkin
948fb638f5 docs/FAQ.md: extend VictoriaMetrics vs TimescaleDB section with real user experience
See also https://github.com/timescale/promscale/issues/427 , which is mentioned in the https://abiosgaming.com/press/high-cardinality-aggregations/
2021-09-16 20:46:25 +03:00
Roman Khavronenko
b75455c650 vmalert: add new metric vmalert_remotewrite_flush_duration_seconds (#1622) 2021-09-16 14:00:16 +03:00
Roman Khavronenko
f83fa31985 docs: fix indentation for FAQ document (#1620) 2021-09-16 13:59:22 +03:00
f41gh7
9375b60c5f adds stub for functions api 2021-09-16 13:49:52 +03:00
Aliaksandr Valialkin
e60dfc96ff app/vmselect/promql: add mad(q) and outliers_mad(tolerance, q) functions to MetricsQL 2021-09-16 13:33:53 +03:00
Aliaksandr Valialkin
eca75cc650 app/vmselect/prometheus: make more clear log messages for errors during sending data to remote clients 2021-09-16 12:56:58 +03:00
Aliaksandr Valialkin
26cd0d36b4 vendor: make vendor-update 2021-09-15 18:22:59 +03:00
Aliaksandr Valialkin
44b01fff13 app/{vminsert,vmselect}: automatically add missing port in -storageNode lists passed to vminsert and vmselect
This should simplify manual setup of the cluster according to https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#cluster-setup
2021-09-15 18:08:30 +03:00
Aliaksandr Valialkin
06ed694ad9 docs/CHANGELOG.md: document 777ff75874 2021-09-15 17:45:08 +03:00
Aliaksandr Valialkin
2f86d4cf38 app/vmui: follow-up after 777ff75874
The commit contains the following changes:

- Show vmui when requesting /graph page in order to be compatible with Prometheus datasource in Grafana.
- Properly encode query args at vmui url.
- Set the number of points on the graph to the number of horizontal pixels divided by 2. Previously it was hardcoded to 30.
- Do not save server url to persistent storage at browser, since it should be always obtained from the url.
- Run `make vmui-update` for updating vmui embedded into VictoriaMetrics.
2021-09-15 17:40:48 +03:00
Yury Molodov
777ff75874 vmui: change query params compatible with prometheus (#1619)
* feat: change url params for compatible prometheus

* style: add comment for TimeParams

* fix: change get default server for single version

* fix: change function for get query string value
2021-09-15 09:42:49 +03:00
Aliaksandr Valialkin
cf9efde50c vendor: update github.com/valyala/quicktemplate from v1.6.3. to v1.7.0 2021-09-15 09:34:07 +03:00
Aliaksandr Valialkin
3cba77765a vendor: update github.com/VictoriaMetrics/fastcache from v1.6.0 to v1.7.0 2021-09-15 09:34:07 +03:00
Aliaksandr Valialkin
77682f516a vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.16 to v1.1.0 2021-09-15 09:34:07 +03:00
Aliaksandr Valialkin
68ea3d18f7 vendor: update github.com/valyala/histogram from v1.1.2 to v1.2.0
This fixes the non-repeatable quantile_over_time() results when the number of input samples exceeds 1000.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1612
2021-09-15 09:34:07 +03:00
Roman Khavronenko
ecd3069b6c vmalert: create basic auth config only if args aren't empty (#1618)
* vmalert: create basic auth config only if args aren't empty

follow-up after 68721f6

* vmalert: make lint happy
2021-09-15 01:53:31 +03:00
Roman Khavronenko
84b41e498f docs: add "Choosing a Time Series Database for High Cardinality Aggregations" article (#1617) 2021-09-15 01:51:56 +03:00
Aliaksandr Valialkin
3e1683756b docs/vmalert.md: follow-up after 68721f6e7d
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1608
2021-09-14 14:47:47 +03:00
Roman Khavronenko
68721f6e7d vmalert: support bearer token for datasource, remotewrite and remoteread (#1614)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1608
2021-09-14 14:32:06 +03:00
Dima Lazerka
56069a3022 Remove port 3000 and add https to play-grafana link (#1616)
* Remove port 3000 and add https to play-grafana link

* Fix typo

Co-authored-by: Dzmitry Lazerka <dlazerka@gmail.com>
2021-09-14 14:24:30 +03:00
Aliaksandr Valialkin
8f685d81c6 lib/storage: follow up after 00cbb099b6 2021-09-14 14:16:25 +03:00
faceair
00cbb099b6 lib/storage: optimize convert multiple values regexp filter to composite tag filter (#1610)
* lib/storage: optimize convert multiple values regexp filter to composite tag filter

* Apply suggestions from code review

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-09-14 12:47:07 +03:00
dependabot[bot]
bc2d05be8e build(deps): bump codecov/codecov-action from 2.0.3 to 2.1.0 (#1615)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 2.0.3 to 2.1.0.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v2.0.3...v2.1.0)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-14 12:23:01 +03:00
Aliaksandr Valialkin
adedc83b3b app/vmauth: do not log invalid auth tokens by default for security reasons
The logging can be enabled by passing `-logInvalidAuthTokens` command-line flag to vmauth
2021-09-14 12:20:03 +03:00
Aliaksandr Valialkin
e46bd9e47f docs/Single-server-VictoriaMetrics.md: link to cardinality limiter docs in vmagent 2021-09-13 21:26:59 +03:00
Aliaksandr Valialkin
07b9c7994f docs/vmagent.md: mention out of order sample errors, which are typically emitted by Thanos, Cortex or Prometheus 2021-09-13 19:36:31 +03:00
Aliaksandr Valialkin
8a6a36429a app/vminsert/netstorage: disable rerouting by default
Production clusters work more stable with the disabled rerouting during rolling restarts and/or
during spikes in time series churn rate. So it would be better disabling the rerouting by default.

The re-routing can be enabled by passing `-disableRerouting=false` command-line flag to `vminsert` nodes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/791
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1054
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1165
2021-09-13 18:51:56 +03:00
Aliaksandr Valialkin
c4f11a49f8 docs/CHANGELOG.md: document 5494bc02a6 2021-09-13 17:11:23 +03:00
Aliaksandr Valialkin
7f0a8d4bdb docs: consistency renaming: Influx -> InfluxDB 2021-09-13 17:05:16 +03:00
Aliaksandr Valialkin
143a3b34ee app/vmui/Dockerfile-web: update Go builder from 1.16.7 to 1.17.1 and Alpine base image from 3.14.1 to 3.14.2 2021-09-13 17:05:16 +03:00
Roman Khavronenko
5494bc02a6 vmalert: add flag to limit the max value for auto-resovle duration for alerts (#1609)
* vmalert: add flag to limit the max value for auto-resovle duration for alerts

The new flag `rule.maxResolveDuration` suppose to limit max value for
alert.End param, which is used by notifiers like Alertmanager for alerts auto resolve.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1586
2021-09-13 15:48:18 +03:00
Aliaksandr Valialkin
b9727a36dc docs/vmbackup.md: update the outdated link to vmbackupmanager 2021-09-13 14:16:53 +03:00
Roman Khavronenko
75f35c3b11 vmalert: display extra filter labels in UI (#1613) 2021-09-13 14:11:38 +03:00
Aliaksandr Valialkin
d1a16e0891 app/vmselect/promql: use Prometheus-compatible label value formatting for count_values function 2021-09-13 13:48:06 +03:00
Aliaksandr Valialkin
fb6ed0ce19 lib/promscrape/discovery/docker: support host networking mode
See https://github.com/prometheus/prometheus/issues/9116
2021-09-13 13:30:16 +03:00
Aliaksandr Valialkin
6295861acd lib/promscrape/discovery/kubernetes: properly use https scheme for wildcard TLS certificates in ingress target discovery 2021-09-13 13:03:42 +03:00
Aliaksandr Valialkin
2814388891 vendor: make vendor-update 2021-09-12 15:26:44 +03:00
Aliaksandr Valialkin
2394b5018b deployment/docker: update Go builder from v1.17.0 to v1.17.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.17.1+label%3ACherryPickApproved
2021-09-12 15:23:53 +03:00
Aliaksandr Valialkin
728c4c3841 lib/promscrape: generate scrape_timeout_seconds metric per each scrape target in the same way as Prometheus 2.30 does
See https://github.com/prometheus/prometheus/pull/9247
2021-09-12 15:20:44 +03:00
Aliaksandr Valialkin
0b4eb0fa7d lib/promscrape: make fmt 2021-09-12 13:34:15 +03:00
Aliaksandr Valialkin
48e3e6c8df lib/promscrape: add ability to configure scrape_timeout and scrape_interval via relabeling
See https://github.com/prometheus/prometheus/pull/8911
2021-09-12 13:33:41 +03:00
Aliaksandr Valialkin
f3e89754a9 lib/promscrape: reduce CPU usage for common case when calculating scrape_series_added metric
Also reduce CPU usage when applying `series_limit` to scrape targets with constant set of metrics.

The main idea is to perform the calculations on scrape_series_added and series_limit
only if the set of metrics exposed by the target has been changed.
Scrape targets rarely change the set of exposed metrics,
so this optimization should reduce CPU usage in general case.
2021-09-12 12:53:14 +03:00
Aliaksandr Valialkin
674a6eee6c docs/Single-server-VictoriaMetrics.md: refer to relabeling section for vmagent
This removes duplicate docs about additional relabeling actions supported by VictoriaMetrics components
2021-09-12 11:39:19 +03:00
Aliaksandr Valialkin
77168e3e94 docs/vmagent.md: sync with app/vmagent/README.md by running make docs-sync 2021-09-11 11:04:42 +03:00
Aliaksandr Valialkin
cebcb15ba4 lib/storage: verify that the tsidsFound contain the needed tsids in tests added at f4dead529f 2021-09-11 10:57:13 +03:00
Aliaksandr Valialkin
9286107e82 lib/promscrape: send stale markers for disappeared metrics like Prometheus does 2021-09-11 10:51:04 +03:00
Aliaksandr Valialkin
cfed015bb6 docs/vmalert.md: typo fix in Multitenancy chapter 2021-09-10 17:57:14 +03:00
Aliaksandr Valialkin
f4dead529f lib/storage: properly search series by multiple tag filters matching empty labels such as foo{bar=~"baz|",x=~"y|"}
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1601
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/395
2021-09-09 21:09:21 +03:00
Aliaksandr Valialkin
ea943911bc app/vmselect/promql: keep metric name in rollup_candlestick results, since they don't change the original series meaning
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1600
2021-09-09 19:21:18 +03:00
Aliaksandr Valialkin
f27980dcb3 docs/CHANGELOG.md: typo fix 2021-09-09 18:57:30 +03:00
Aliaksandr Valialkin
4aeb8db83f lib/promscrape: add ability to set series_limit and stream_parse options via relabeling
This allows managing these options on a per-target basis.

Typical use case: to manage these options for pods via Kubernetes annotations.
2021-09-09 18:49:39 +03:00
Aliaksandr Valialkin
468f941f7e lib/promscrape: add the actual job name to the labels of promscrape_series_limit_rows_dropped_total metric 2021-09-09 17:37:37 +03:00
Aliaksandr Valialkin
086b5d0cf1 lib/promscrape: add scrape_ prefix to job and target labels exported by promscrape_series_limit_rows_dropped_total metric
This is needed in order to prevent from possible clash with the corresponding (job, target) labels for the job, which scrapes this metric.
2021-09-09 17:29:21 +03:00
Aliaksandr Valialkin
d2708a1fb7 docs/vmagent.md: typo fix in Relabeling chapter 2021-09-09 16:39:40 +03:00
Denys Holius
abba6e8370 Bump alpine linux to latest (#1607) 2021-09-09 16:29:15 +03:00
Aliaksandr Valialkin
d6bd956930 lib/promrelabel: add keep_metrics and drop_metrics actions to relabeling rules
These actions simlify metrics filtering. For example,

- action: keep_metrics
  regex: 'foo|bar|baz'

would leave only metrics with `foo`, `bar` and `baz` names, while the rest of metrics will be deleted.

The commit also makes possible to split long regexps into multiple lines. For example, the following config is equivalent to the config above:

- action: keep_metrics
  regex:
  - foo
  - bar
  - baz
2021-09-09 16:18:21 +03:00
Aliaksandr Valialkin
3a827b98cd docs/vmalert.md: make docs-sync after 21f022e5f0 2021-09-09 16:16:25 +03:00
Aliaksandr Valialkin
a8053d9fc6 docs/MetricsQL.md: add a link to VictoriaMetrics github 2021-09-08 00:14:59 +03:00
Aliaksandr Valialkin
e84fa9eb38 app/vmalert: document GroupAlerts
This makes golint happy
2021-09-07 22:50:08 +03:00
Aliaksandr Valialkin
e6c9869d86 app/vmalert: follow-up after 21f022e5f0 2021-09-07 22:43:37 +03:00
Roman Khavronenko
21f022e5f0 vmalert: add initial UI implementation (#1602)
New UI pages:
/ - welcome page with API handlers list;
/groups - list of all rules per group;
/alerts - list of all active alerts;
/groupID/alertID/status - status of the active alert;
2021-09-07 22:39:22 +03:00
Aliaksandr Valialkin
6fbaf8f978 docs/CHANGELOG.md: document 42e07cfaea 2021-09-07 22:34:39 +03:00
dependabot[bot]
0c7110d1a5 build(deps): bump github.com/aws/aws-sdk-go from 1.40.34 to 1.40.37 (#1598)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.34 to 1.40.37.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.34...v1.40.37)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-07 20:55:49 +03:00
Aliaksandr Valialkin
e34f7081d0 docs/vmagent.md: add API path for Prometheus text exposition format 2021-09-07 16:14:51 +03:00
Aliaksandr Valialkin
0166eae7c4 docs/Articles.md: add a link to case studies 2021-09-06 10:57:59 +03:00
Aliaksandr Valialkin
5e3ef376b5 docs/Articles.md: add an url to https://www.vultr.com/docs/install-and-configure-victoriametrics-on-debian 2021-09-03 11:49:37 +03:00
Aliaksandr Valialkin
ef27786e37 docs/Single-server-VictoriaMetrics.md: fix a link multitenancy docs for cluster version in VictoriaMetrics 2021-09-02 17:43:35 +03:00
Aliaksandr Valialkin
f529058d3a docs/CHANGELOG.md: cut v1.65.0 2021-09-01 17:12:32 +03:00
Aliaksandr Valialkin
bddd1c35e2 docs/FAQ.md: add questions on how to migrate data from various systems (Prometheus, InfluxDB, OpenTSDB, Graphite) to VictoriaMetrics 2021-09-01 16:47:30 +03:00
Aliaksandr Valialkin
ed818fceef docs: update -help output for victoria-metrics and vmagent after f77dde837a 2021-09-01 16:34:32 +03:00
Aliaksandr Valialkin
ae90225b46 .github/dependabot.yml: increase check intervals for gomod and docker ecosystems from daily to weekly
Daily checks are too verbose and result into too many automatic pull requests and commits
2021-09-01 16:07:00 +03:00
Aliaksandr Valialkin
f77dde837a lib/promscrape: add the ability to limit the number of unique series per each scrape target
The number of series per target can be limited with the following options:

* Global limit with `-promscrape.maxSeriesPerTarget` command-line option.
* Per-target limit with `max_series: N` option in `scrape_config` section.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1561
2021-09-01 16:03:59 +03:00
Roman Khavronenko
867e426070 dashboards: bump vmagent version requirement 2021-09-01 14:20:50 +03:00
dependabot[bot]
c2d17ec655 build(deps): bump github.com/aws/aws-sdk-go from 1.40.33 to 1.40.34 (#1591)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.33 to 1.40.34.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.33...v1.40.34)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-01 12:53:10 +03:00
dependabot[bot]
4bebafd885 build(deps): bump google.golang.org/api from 0.55.0 to 0.56.0 (#1590)
Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.55.0 to 0.56.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/master/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.55.0...v0.56.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-01 12:52:17 +03:00
dependabot[bot]
1a6b9157e2 build(deps): bump cloud.google.com/go/storage from 1.16.0 to 1.16.1 (#1589)
Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.16.0 to 1.16.1.
- [Release notes](https://github.com/googleapis/google-cloud-go/releases)
- [Changelog](https://github.com/googleapis/google-cloud-go/blob/master/CHANGES.md)
- [Commits](https://github.com/googleapis/google-cloud-go/compare/pubsub/v1.16.0...storage/v1.16.1)

---
updated-dependencies:
- dependency-name: cloud.google.com/go/storage
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-09-01 12:51:42 +03:00
Aliaksandr Valialkin
111ea89a7d docs: make docs-sync 2021-09-01 12:02:34 +03:00
Aliaksandr Valialkin
9e41b05401 docs/CHANGELOG.md: document eff940aa76 2021-09-01 12:00:02 +03:00
Aliaksandr Valialkin
fce87bfe8d docs/CHANGELOG.md: document 7c70dcbe3b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1471
2021-09-01 11:56:23 +03:00
Roman Khavronenko
0f4bcc00b2 Single dashboards upd (#1593)
* dasbhoard: replace `null` datasources

null datasource value may confuse Grafana and make it drop panel query in some
versions.

* docker: bump grafana image version

* dashboards: add URL variable selector to vmagent dashboard

* dashboards: add new panel `Remote write connection saturation` to vmagent dashboard

* alerts: add new alert for `Remote write connection saturation` panel of vmagent dashboard

* dashboards: add "Logging rate" panel to vmagent dashboard
2021-09-01 11:46:22 +03:00
Roman Khavronenko
de26b1d4a2 vmctl: update README and flags description (#1588)
The purpose of update is to make README and flags description more
clear to the reader. Especially, show that vm-account-id flag is required
for clustered version of VM.
2021-09-01 09:31:44 +03:00
Roman Khavronenko
2ed2878a57 docs: fix the link for cluster docker compose 2021-09-01 09:21:45 +03:00
Roman Khavronenko
0d6735106b docs: update docker env description 2021-09-01 09:18:56 +03:00
Roman Khavronenko
cfb6436be5 Vmalert extra params (#1587)
* vmalert: allow extra GET params in datasource package

ExtraParams will be added as GET params to every HTTP request made by datasource.
The `roundDigits` param, for example, was substituted by corresponding extra param.

* vmalert: add nocache=1 param for replay process

The `nocache=1` param is VictoriaMetrics specific parameter which prevents it
from caching and boundaries aligning for queries. We set it to avoid cache
pollution in `replay` mode and also to avoid unnecessary time range boundaries
alignment.

* vmalert: mention nocache=1 in replay description

* vmalert: fix bug with unused param
2021-08-31 14:57:47 +03:00
Nikolay
7c70dcbe3b adds external_labels per group for vmalert (#1485)
* adds external_label per group for vmalert
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1471
2021-08-31 14:52:34 +03:00
Roman Khavronenko
eff940aa76 Vmalert metrics update (#1580)
* vmalert: remove `vmalert_execution_duration_seconds` metric

The summary for `vmalert_execution_duration_seconds` metric gives no additional
value comparing to `vmalert_iteration_duration_seconds` metric.

* vmalert: update config reload success metric properly

Previously, if there was unsuccessfull attempt to reload config and then
rollback to previous version - the metric remained set to 0.

* vmalert: add Grafana dashboard to overview application metrics

* docker: include vmalert target into list for scraping

* vmalert: extend notifier metrics with addr label

The change adds an `addr` label to metrics for alerts_sent and alerts_send_errors
to identify which exact address is having issues.
The according change was made to vmalert dashboard.

* vmalert: update documentation and docker environment for vmalert's dashboard

Mention Grafana's dashboard in vmalert's README in a new section #Monitoring.

Update docker-compose env to automatically add vmalert's dashboard.
Update docker-compose README with additional info about services.
2021-08-31 12:28:02 +03:00
Aliaksandr Valialkin
f41b3d6118 vendor: make vendor-update 2021-08-31 12:03:21 +03:00
Aliaksandr Valialkin
6e085e6dac docs/Single-server-VictoriaMetrics.md: remove outdated link to VictoriaMetrics wiki
VictoriaMetrics wiki became outdated after publishing all the docs at https://docs.victoriametrics.com
2021-08-31 11:48:32 +03:00
Aliaksandr Valialkin
8b228a5873 docs/CHANGELOG.md: add a link to Prometheus staleness tracking 2021-08-31 11:48:32 +03:00
dependabot[bot]
6c388f63b3 build(deps): bump github.com/aws/aws-sdk-go from 1.40.30 to 1.40.33 (#1582)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.30 to 1.40.33.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.30...v1.40.33)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:10:31 +03:00
dependabot[bot]
525e6ae1b8 build(deps): bump google.golang.org/api from 0.54.0 to 0.55.0 (#1583)
Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.54.0 to 0.55.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/master/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.54.0...v0.55.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:09:55 +03:00
dependabot[bot]
462fa70967 build(deps): bump github.com/klauspost/compress from 1.13.4 to 1.13.5 (#1584)
Bumps [github.com/klauspost/compress](https://github.com/klauspost/compress) from 1.13.4 to 1.13.5.
- [Release notes](https://github.com/klauspost/compress/releases)
- [Changelog](https://github.com/klauspost/compress/blob/master/.goreleaser.yml)
- [Commits](https://github.com/klauspost/compress/compare/v1.13.4...v1.13.5)

---
updated-dependencies:
- dependency-name: github.com/klauspost/compress
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-31 11:01:58 +03:00
Aliaksandr Valialkin
5c63d69454 lib/promscrape/discovery/kubernetes: return back support role: endpointslices, since it is used by VictoriaMetrics operator
This is a follow up commit after 31b42b30b6
2021-08-29 12:37:03 +03:00
Aliaksandr Valialkin
db330232ac lib/protoparser/opentsdb: follow-up after 8ee75ca45a 2021-08-29 11:49:21 +03:00
envzhu
8ee75ca45a lib/protoparser/opentsdb: accept multiple spaces between fields in a row as a deliminator. (#1575) 2021-08-29 11:38:32 +03:00
Aliaksandr Valialkin
31b42b30b6 lib/promscrape/discovery/kubernetes: rename role: endpointslices to role: endpointslice to be consistent with Prometheus
See 2ec6c7dbb8/discovery/kubernetes/kubernetes.go (L99)
2021-08-29 11:23:08 +03:00
Aliaksandr Valialkin
2e001db4de lib/promscrape/discovery/kubernetes: use v1 API instead of v1beta1 API for role: ingress and role: endpointslices
This should fix service discovery for these roles in Kubernetes v1.22 and newer versions.
See https://kubernetes.io/docs/reference/using-api/deprecation-guide/#ingress-v122

The corresponding change in Prometheus - https://github.com/prometheus/prometheus/pull/9205
2021-08-29 11:16:59 +03:00
Aliaksandr Valialkin
189507d9d0 docs/Single-server-VictoriaMetrics.md: mention that downsampling doesnt improve query performance on high churn rate 2021-08-27 18:50:26 +03:00
Aliaksandr Valialkin
5ea689d61b app/vmselect/promql: add quantile("phiLabel", phi1, ..., phiN, q) aggregate function to MetricsQL
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1573
2021-08-27 18:37:20 +03:00
Aliaksandr Valialkin
bec18e4fe9 app/vmselect: add -search.disableAutoCacheReset command-line option for disabling automatic cache reset when a sample with old timestamp outside -search.cacheTimestampOffset is inserted
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1570
2021-08-27 17:15:31 +03:00
Aliaksandr Valialkin
67189be1cb docs/{vmgateway,vmbackupmanager}: mention that enterprise binaries are free for download and evaluation 2021-08-27 14:54:09 +03:00
Aliaksandr Valialkin
321da535fa docs: link to active time series, churn rate and high cardinality questions 2021-08-27 14:44:53 +03:00
Aliaksandr Valialkin
c8c153fb91 docs/CHANGELOG.md: document the bugfix for possible timeout error in vmbackupmanager when making snapshots
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1571
2021-08-27 13:04:06 +03:00
Aliaksandr Valialkin
2bc79042f6 docs: mention that enterprise binaries can be downloaded and evaluated for free 2021-08-27 12:48:14 +03:00
Aliaksandr Valialkin
4b3877b798 vendor: make vendor-update 2021-08-26 09:42:23 +03:00
Aliaksandr Valialkin
2bef940add docs/vmagent.md: document the ability to load scrape configs from multiple files
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559
2021-08-26 09:13:14 +03:00
Aliaksandr Valialkin
10f960fa0c lib/promscrape: add ability to load scrape configs from multiple files
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1559
2021-08-26 08:51:16 +03:00
Aliaksandr Valialkin
25ee4a3644 vendor: make vendor-update 2021-08-25 13:41:02 +03:00
dependabot[bot]
66626db92f build(deps): bump codecov/codecov-action from 2.0.2 to 2.0.3 (#1563)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 2.0.2 to 2.0.3.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v2.0.2...v2.0.3)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-25 13:40:21 +03:00
Aliaksandr Valialkin
e24203cde8 docs/CHANGELOG.md: document 48f33d098b 2021-08-25 13:31:35 +03:00
benclive
48f33d098b Remove trailing slash for URLPrefixes with specific path (#1554) 2021-08-25 13:28:50 +03:00
Aliaksandr Valialkin
9fc9d76a7f docs/Cluster-VictoriaMetrics.md: mention that the -replicationFactor at vmselect is an optional parameter 2021-08-25 13:10:57 +03:00
Aliaksandr Valialkin
c27ee35c5c lib/promscrape: expose promscrape_discovery_http_errors_total metric for tracking errors per each http_sd config 2021-08-25 13:05:49 +03:00
Aliaksandr Valialkin
ffc0ab1774 lib/{mergeset,storage}: improve the detection of the needed free space for background merge
This should prevent from possible out of disk space crashes during big merges.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1560
2021-08-25 09:35:44 +03:00
Aliaksandr Valialkin
a287a48634 docs/FAQ.md: add more entries for frequently asked questions
The following topics are covered:

* Active time series
* High cardinality
* High churn rate
* Slow inserts
2021-08-24 11:34:44 +03:00
Aliaksandr Valialkin
8358890e33 docs/MetricsQL.md: typo fix: histogram_qunatile -> histogram_quantile 2021-08-23 23:08:16 +03:00
Aliaksandr Valialkin
6c5760db9c app/vmselect/promql: make fmt after 0078486ea7 2021-08-23 23:06:00 +03:00
Aliaksandr Valialkin
7c57745f40 docs/MetricsQL.md: fix the indentation for median function 2021-08-23 12:04:31 +03:00
Aliaksandr Valialkin
a78672f95a docs/MetricsQL.md: typo fix: convesions->conversions 2021-08-23 12:02:03 +03:00
Aliaksandr Valialkin
17a1241022 docs/MetricsQL.md: typo fixes 2021-08-23 11:59:12 +03:00
Aliaksandr Valialkin
60ac3e1e46 docs/MetricsQL.md: rehaul the documentation on MetricsQL
* Document all the functions supported by MetricsQL, including PromQL functions
* Group functions by their type: rollup functions, transform functions, label manipulation functions and aggregate functions.
* Document implicit query transformations.
2021-08-23 11:45:52 +03:00
Aliaksandr Valialkin
0078486ea7 app/vmselect/promql: rename sign() function to sgn() in order to be consistent with Prometheus
See https://github.com/prometheus/prometheus/pull/8457 for details.
2021-08-23 11:45:51 +03:00
Aliaksandr Valialkin
69c291353b deployment/docker: update Go builder from Go1.16.0 to Go1.17.0
This improves data ingestion and query performance by up to 5% according to benchmarks.

See https://go.dev/blog/go1.17
2021-08-21 22:20:49 +03:00
Aliaksandr Valialkin
d5622b32e2 lib/promscrape: reduce memory and CPU usage when Prometheus staleness tracking is enabled for metrics from deleted / disappeared scrape targets
Store the scraped response body instead of storing the parsed and relabeld metrics.
This should reduce memory usage, since the response body takes less memory than the parsed and relabeled metrics.
This is especially true for Kubernetes service discovery, which adds many long labels for all the scraped metrics.

This should also reduce CPU usage, since the marshaling of the parsed
and relabeld metrics has been substituted by response body copying.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
2021-08-21 21:17:26 +03:00
Aliaksandr Valialkin
2288e75f03 docs/vmalert.md: run make docs-sync after 9ee3d0378f 2021-08-21 20:24:56 +03:00
Aliaksandr Valialkin
6ffbb46aef docs/CHANGELOG.md: document 9ee3d0378f 2021-08-21 20:20:08 +03:00
Aliaksandr Valialkin
89a4e8fd9b vendor: make vendor-update 2021-08-21 20:16:19 +03:00
Roman Khavronenko
9ee3d0378f vmalert: add flag disableAlertgroupLabel for disabling extra label added to series (#1534)
The new label added in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/611
may negatively impact deduplication in Alertmanager. The new flag supposed to give
an option to disable adding this label.

To enable flag just add `-disableAlertgroupLabel` to binary execution command.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1532
2021-08-21 20:08:55 +03:00
Aliaksandr Valialkin
4f3a5742eb app/vmselect/prometheus: do not extend [d] to the detected interval between samples for first_over_time(m[d])
This is for the sake of consistency with similar change for the last_over_time(m[d]) at a724229b5d
2021-08-21 19:56:14 +03:00
dependabot[bot]
41fdfdb895 build(deps): bump github.com/aws/aws-sdk-go from 1.40.25 to 1.40.26 (#1551)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.25 to 1.40.26.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.25...v1.40.26)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-21 19:52:46 +03:00
Alexander Rickardsson
f4cecaf296 vmalert: accept http.StatusOK for remotewrite (#1550) 2021-08-20 11:58:32 +03:00
Aliaksandr Valialkin
f46a73dcdd lib/promscrape: use scrapeTimestamp when storing stale markers for failed scrape
This will make timestamps for stale markers more consistent for timestamps for other samples
2021-08-19 14:18:05 +03:00
Aliaksandr Valialkin
c14edc860b docs/CHANGELOG.md: document b5d6a0e499 2021-08-19 14:03:20 +03:00
Roman Khavronenko
b5d6a0e499 vmselect: update vm_request_duration_seconds value when request fails (#1537)
Before, metric `vm_request_duration_seconds` was update only on successful
attempts which could be misleading. For example, timeout errors on netstorage
request may be not accounted in the metric and won't be visible on dashboards.
Using `defer` statement to update the metric after query arguments validation
may improve the situation.
2021-08-19 13:58:54 +03:00
dependabot[bot]
cbab5f3b42 build(deps): bump github.com/aws/aws-sdk-go from 1.40.22 to 1.40.25 (#1548)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.22 to 1.40.25.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.22...v1.40.25)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-19 13:56:42 +03:00
Aliaksandr Valialkin
80ddade4ed docs/CHANGELOG.md: clarify the change, which adds -search.noStaleMarkers command-line flag 2021-08-19 13:56:04 +03:00
Aliaksandr Valialkin
a724229b5d app/vmselect/promql: do not override [d] at last_over_time(m[d]) if [d] is smaller than scrape_interval
Since most users do not expect the overriding of explicitly set `[d]`.
2021-08-19 10:31:48 +03:00
Aliaksandr Valialkin
ce0c270e75 docs/CHANGELOG.md: cut v1.64.1
This is mostly bugfix release, which includes fixes for staleness handling and a security update for Alpine base image
2021-08-18 22:06:05 +03:00
Aliaksandr Valialkin
c09446a9aa lib/promscrape: send stale markers for the previously scraped metrics on failed scrapes like Prometheus does 2021-08-18 21:59:03 +03:00
Aliaksandr Valialkin
f6e6056c17 vendor: update github.com/valyala/gozstd from v1.11.0 to v1.12.0
This should improve query scalability on systems with big number of CPU cores
2021-08-18 14:57:19 +03:00
Aliaksandr Valialkin
04c3e9916d docs/CHANGELOG.md: document 06bf21c21b 2021-08-18 14:01:04 +03:00
Aliaksandr Valialkin
cdc372bb98 app/vmselect: add -search.noStaleMarkers command-line flag for disabling stale markers handling in queries
This option allows reducing CPU usage a bit when VictoriaMetrics is used
for collecting and processing non-Prometheus data. For example, InfluxDB line protocol, Graphite, OpenTSDB, CSV, etc.
2021-08-18 13:59:02 +03:00
Aliaksandr Valialkin
226143f31b lib/promscrape: add ability to disable sending Prometheus staleness markers with -promscrape.disableStaleMarkers command-line flag
This option can be useful when vmagent consumes too much additional memory
for staleness markers functionality and when staleness markers aren't needed.
2021-08-18 13:43:21 +03:00
Aliaksandr Valialkin
06bf21c21b deployment/docker: upgrade Alpine base docker image from v3.14.0 to v3.14.1
See https://www.alpinelinux.org/posts/Alpine-3.14.1-released.html

This fixes https://vuldb.com/?source_cve.180051
See also https://vuldb.com/?id.180051 and https://snyk.io/vuln/SNYK-ALPINE314-APKTOOLS-1533752
2021-08-18 11:04:11 +03:00
Aliaksandr Valialkin
db1e62495b app/vmselect/promql: add bitmap_and(), bitmap_or() and bitmap_xor() functions to MetricsQL
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1541
2021-08-17 13:21:21 +03:00
Aliaksandr Valialkin
277538d655 docs/Single-server-VictoriaMetrics.md: mention that vmctl can migrate data from OpenTSDB to VictoriaMetrics 2021-08-17 11:12:16 +03:00
Aliaksandr Valialkin
bd14b0887e app/vmselect/promql: move common condition to dropStaleNaNs in order to improve code maintainability 2021-08-17 11:01:16 +03:00
Aliaksandr Valialkin
03c959f1df lib/promscrape: stop scrapers for the removed targets before starting scrapers for the added targets
This should prevent from possible time series overlap when old target is substituted by new target (for example, during Kubernetes deployments).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
2021-08-17 00:55:51 +03:00
Aliaksandr Valialkin
90434ba25b app/vmalert: mention -remoteWrite.disablePathAppend in the description for -remoteWrite.url 2021-08-16 15:22:47 +03:00
Aliaksandr Valialkin
f37b963619 app/vmalert: follow-up for 2400f85761 2021-08-16 15:20:22 +03:00
Aliaksandr Valialkin
4547d4f692 docs/CHANGELOG.md: update urls to Prometheus 2.29 release
Previously these urls were pointing to rc0 release
2021-08-16 14:53:38 +03:00
Aliaksandr Valialkin
ae9f923449 docs/CHANGELOG.md: typo fix: satureated -> saturated 2021-08-16 14:53:38 +03:00
Alexander Rickardsson
2400f85761 vmalert: enable configuring explicit path (#1536)
* vmalert: allow to disable automatically added path to remote write address via disablePathAppend flag
* docs: update docs to include remoteWrite.disablePathAppend
2021-08-16 14:20:57 +03:00
dependabot[bot]
9af8c71975 build(deps): bump github.com/aws/aws-sdk-go from 1.40.21 to 1.40.22 (#1539)
Bumps [github.com/aws/aws-sdk-go](https://github.com/aws/aws-sdk-go) from 1.40.21 to 1.40.22.
- [Release notes](https://github.com/aws/aws-sdk-go/releases)
- [Changelog](https://github.com/aws/aws-sdk-go/blob/main/CHANGELOG.md)
- [Commits](https://github.com/aws/aws-sdk-go/compare/v1.40.21...v1.40.22)

---
updated-dependencies:
- dependency-name: github.com/aws/aws-sdk-go
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-16 11:31:22 +03:00
dependabot[bot]
8297ad8f03 build(deps): bump google.golang.org/api from 0.53.0 to 0.54.0 (#1538)
Bumps [google.golang.org/api](https://github.com/googleapis/google-api-go-client) from 0.53.0 to 0.54.0.
- [Release notes](https://github.com/googleapis/google-api-go-client/releases)
- [Changelog](https://github.com/googleapis/google-api-go-client/blob/master/CHANGES.md)
- [Commits](https://github.com/googleapis/google-api-go-client/compare/v0.53.0...v0.54.0)

---
updated-dependencies:
- dependency-name: google.golang.org/api
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-08-16 11:30:08 +03:00
Aliaksandr Valialkin
c518858145 docs/CHANGELOG.md: cut v1.64.0 2021-08-15 23:52:03 +03:00
Aliaksandr Valialkin
a0e18f06eb lib/promscrape: restore red highlighting for DOWN targets at /targets page
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1461
2021-08-15 16:03:57 +03:00
Aliaksandr Valialkin
40a859a760 docs/CHANGELOG.md: mention the bugfix when more than 27 time series are selected at /vmui 2021-08-15 15:10:41 +03:00
Aliaksandr Valialkin
386ee5b82c docs/CHANGELOG.md: mention that VMUI automatically fills Server URL field
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1506
2021-08-15 14:45:09 +03:00
Artem Navoiev
64d64976e4 add dependency chekcs for (#1535)
- ruby (for docs)
- gomod for monorepo
- npm for vmui
- gomod go small webserver in  vmui
2021-08-15 14:09:34 +03:00
Aliaksandr Valialkin
aefba16d5e app/vmagent/remotewrite: expose vmagent_remotewrite_send_duration_seconds_total metric
This metric can be used for determining high saturation of every connection to remote storage with
an alerting query `rate(vmagent_remotewrite_send_duration_seconds_total) > 0.9s`.
This query triggers when a connection is satureated by more than 90%
2021-08-15 13:34:12 +03:00
Aliaksandr Valialkin
113f0a8a07 app/vmselect/promql: drop staleness marks before calling rollupConfig.Do
This allows dropping staleness marks only once and then calculate multiple rollup functions on the result.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
2021-08-15 13:21:10 +03:00
Aliaksandr Valialkin
25997a70f1 Revert "app/vmselect/promql: properly handle Prometheus staleness marks in removeCounterResets functions"
This reverts commit 94dfcb6747a3b29a11d14e71bea21a2312bb6346.

It is better to remove staleness marks (decimal.StaleNaN) before calling rollupConfig.Do, e.g. in preFunc
2021-08-15 13:19:16 +03:00
Aliaksandr Valialkin
73d7b568da app/vmselect/promql: properly handle Prometheus staleness marks in removeCounterResets functions
Prometheus stalenss marks shouldn't be changed in removeCounterResets. Otherwise they will be converted to an ordinary NaN values,
which couldn't be removed in dropStaleNaNs() function later. This may result in incorrect calculations for rollup functions.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
2021-08-14 12:45:57 +03:00
Aliaksandr Valialkin
6d53620adf vendor: make vendor-update 2021-08-13 13:15:27 +03:00
Aliaksandr Valialkin
2ae2c1dd09 app/victoria-metrics/testdata: fix tests after 4401464c22 2021-08-13 12:21:54 +03:00
Aliaksandr Valialkin
4401464c22 all: add support for Prometheus staleness markers
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1526
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/748
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1509
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1530
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/845
2021-08-13 12:10:17 +03:00
Aliaksandr Valialkin
9a8d1bcec5 docs/Cluster-VictoriaMetrics.md: meniton that vmagent can be used for replicating the data among multiple clusters
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1491
2021-08-12 12:47:32 +03:00
Aliaksandr Valialkin
556c1b36e5 vendor: update github.com/klauspost/compress from v1.13.1 to v1.13.4 2021-08-12 12:40:13 +03:00
Aliaksandr Valialkin
95dd5a48bb app/vmselect: make vmui-update after the commit 4ae14df864a7e327955f44941295a286175423b3 2021-08-11 13:41:41 +03:00
Aliaksandr Valialkin
860b272a95 app/vmui: actualize Dockerfiles 2021-08-11 13:41:41 +03:00
Denys Holius
81e4d644dd added guide for HA monitoring setup in K8s via VM Cluster (#1523)
* added guide for HA monitoring setup in K8s via VM Cluster

* fixed missed divs

* fixed different typos

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

* Update docs/guides/k8s-ha-monitoring-via-vm-cluster.md

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>

Co-authored-by: Roman Khavronenko <hagen1778@gmail.com>
2021-08-11 11:45:42 +03:00
Aliaksandr Valialkin
755f65f4bc app/vminsert: add vm_rpc_send_duration_seconds_total metric per each vminsert->vmstorage link
This metric is useful for determining high link saturation with the following alerting rule:

rate(vm_rpc_send_duration_seconds_total) > 0.9s
2021-08-11 11:44:40 +03:00
Aliaksandr Valialkin
869ff25392 docs/Cluster-VictoriaMetrics.md: update -help output for cluster components after the d375d9b878 2021-08-11 11:44:39 +03:00
Aliaksandr Valialkin
bcffd04e3a docs: make docs-sync after e0ee69797d 2021-08-11 10:53:49 +03:00
Roman Khavronenko
e0ee69797d docs: update "number of open files" tuning recommendation (#1527)
* docs: update "number of open files" tuning recomendation

Make "number of open files" recomendation not only Prometheus specific to avoid
confusion for users who does not use Prometheus.

* docs: mention fstrim in Tuning section
2021-08-11 10:51:02 +03:00
Aliaksandr Valialkin
d375d9b878 lib/envflag: add a link to docs for -envflag.enable 2021-08-11 10:29:33 +03:00
Aliaksandr Valialkin
5716af4636 deployment/dm: update Go builder from Go1.16.6 to Go1.16.7
See https://github.com/golang/go/issues?q=milestone%3AGo1.16.7+label%3ACherryPickApproved
2021-08-06 12:12:03 +03:00
Yury Molodov
236fc7d739 vmui: fix layout and add server url by default (#1519)
* fix: change layout for correctly display big query

* fix: set default server from url

* fix: change get default server url
2021-08-06 12:06:08 +03:00
Aliaksandr Valialkin
5ce531027f docs/CHANGELOG.md: document new metrics added to vmalert at 7416fdaa8b 2021-08-05 10:13:08 +03:00
Aliaksandr Valialkin
c1185363ca app/vmagent: typo fix in the description for -remoteWrite.queues 2021-08-05 10:01:35 +03:00
Roman Khavronenko
7416fdaa8b vmalert: expose new metrics for tracking number of produced samples during last evaluation (#1518)
* vmalert: expose new metrics for tracking number of produced samples during last evaluation

Two new metrics were added to track the number of samples produced during the last evaluation:
* vmalert_recording_rules_last_evaluation_samples
* vmalert_alerting_rules_last_evaluation_samples

The gauge type is used to remain consistent with Prometheus metric
`prometheus_rule_group_last_evaluation_samples` which is on the group level.
However, the counter type was considered as well.

Two metrics instead of one are used to make it easier to separate recording and
alerting rules. It is likely, number of samples produced by recording rules is
more important so people will refer to it more frequently.

The expected usage of the new metric is the following:
```
   - alert: RecordingRuleReturnsEmptyResults
        expr: sum(vmalert_recording_rules_last_evaluation_samples) by(recording) < 1
        annotations:
          summary: Recording rule {{$labels.recording}} returns empty results.
            Please verify expression correctness.
```

Addresses https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1494

* vmalert: rename `vmalert_alerts_error` to `vmalert_alerting_rules_error` to remain consistent with recording rules metrics
2021-08-05 09:59:46 +03:00
Aliaksandr Valialkin
d826352688 app/vmagent: follow-up after fe445f753b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1491
2021-08-05 09:52:32 +03:00
Omar Ghader
46e27d60a6 feature: Add multitenant for vmagent (#1505)
* feature: Add multitenant for vmagent

* Minor fix

* Fix rcs index out of range

* Minor fix

* Fix multi Init

* Fix multi Init

* Fix multi Init

* Add default multi

* Adjust naming

* Add TenantInserted metrics

* Add TenantInserted metrics

* fix: remove unused metrics for vmagent

* fix: remove unused metrics for vmagent

Co-authored-by: mghader <marc.ghader@ubisoft.com>
Co-authored-by: Sebastian YEPES <syepes@gmail.com>
2021-08-05 09:52:31 +03:00
Aliaksandr Valialkin
f07165977a docs/Articles.md: actualize links and re-order some links 2021-08-03 16:11:49 +03:00
Aliaksandr Valialkin
50663ba41f lib/promscrape/discovery/gce: add __meta_gce_interface_ipv4_<name> labels as in Prometheus 2.29
See https://github.com/prometheus/prometheus/pull/8978
2021-08-03 16:11:49 +03:00
Aliaksandr Valialkin
3cad8b4564 lib/promscrape/discovery/ec2: add __meta_ec2_availability_zone_id label as Prometheus 2.29 does 2021-08-03 16:11:49 +03:00
Aliaksandr Valialkin
e92fde7945 app/vmselect/promql: add present_over_time(m[d]) function, which will be available starting from Prometheus 2.29.0
See https://github.com/prometheus/prometheus/releases/tag/v2.29.0-rc.0 and https://github.com/prometheus/prometheus/pull/9097
2021-08-03 16:11:49 +03:00
Qifei Wan
fa9c5c5940 app/vmalert: update config state metrics if config parsed failed (#1507) 2021-08-03 12:55:29 +03:00
Roman Khavronenko
370fe9fa2a docs: add "Scaling to trillions of metric data points" to articles (#1517) 2021-08-03 11:07:45 +03:00
wusphinx
c1ed7b77aa Update TimeSelector.tsx (#1515)
delete garbled code
2021-08-03 10:01:01 +03:00
Nikolay
7bbff7fb86 adds /rules and /alerts api for grafana (#1504)
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-08-02 17:28:09 +03:00
Roman Khavronenko
b385fa622b docs: mention "Push Prometheus metrics to VictoriaMetrics or other exporters" article (#1511) 2021-08-02 17:23:10 +03:00
Roman Khavronenko
a641102ec2 docs: fix indentation for guide articles (#1512) 2021-08-02 17:16:58 +03:00
Roman Khavronenko
408ba43092 Alerts single update (#1510)
* alerts: move `ProcessNearFDLimits` to `vm-health` group since it is relevant for all services

* alerts: add new `TooHighMemoryUsage` alerting rule
2021-08-02 15:51:24 +03:00
Aliaksandr Valialkin
66eb60f20d docs/CaseStudies.md: typo fix: hed->had 2021-07-30 18:32:12 +03:00
Aliaksandr Valialkin
8a3c13fd53 docs/CHANGELOG.md: typo fix 2021-07-30 12:35:57 +03:00
Aliaksandr Valialkin
a3b4fc0474 docs/CHANGELOG.md: document d05cac6c98
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486
2021-07-30 12:19:53 +03:00
Aliaksandr Valialkin
a1911e1330 app/vmselect/netstorage: unpack time series data in mostly local big chunks
This should improve performance on multi-CPU systems for queries selecting time series with big number of raw samples
2021-07-30 12:03:17 +03:00
Aliaksandr Valialkin
d05cac6c98 li/storage: re-use the per-day inverted index search code for searching in global index
This allows removing a big pile of outdated code for global index search.

This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1486
2021-07-30 10:31:37 +03:00
Aliaksandr Valialkin
74ffaa45d9 app/vmselect/netstorage: do not query Go maps with unsafe string keys, since this breaks in Go 1.17 2021-07-30 09:57:53 +03:00
Aliaksandr Valialkin
192dfbfd90 app/vmselect: follow-up for ed95bc9531
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1493
2021-07-29 09:53:28 +03:00
arnoldyahad
00af4ff5a4 Add case prometheus/rules for grafana 8 (#1502) 2021-07-29 09:53:27 +03:00
assassins
a483044557 Performance optimization (#1481)
There are redundant steps
2021-07-28 19:26:20 +03:00
Aliaksandr Valialkin
e20ec090b2 docs: remove SampleSizeCalculations.md, since it is outdated and no longer used
There was a reference to this doc from the old victoriametrics.com site
2021-07-28 19:25:16 +03:00
Aliaksandr Valialkin
8ee8660ac4 app/vmselect: follow-up for 626073bca8
* Rename -search.maxMetricsPointSearch to -search.maxSamplesPerQuery, so it is more consistent with the existing -search.maxSamplesPerSeries
* Move the -search.maxSamplesPerQuery from vmstorage to vmselect, so it could effectively limit the number of raw samples obtained from all the vmstorage nodes
* Document the -search.maxSamplesPerQuery in docs/CHANGELOG.md
2021-07-28 18:00:23 +03:00
Denys Holius
9ffd70a921 Added new guide for monitoring k8s via VictoriaMetrics cluster (#1476)
* renamed and moved screenshots

* fixed cluster guide, updated helm chart versions, added values.yaml for vm single

* renamed guide files

* fixed typo

* add some fixes

* fixed typos,added guide k8s-monitoring-via-vm-cluster

* added fixes for yamls
2021-07-27 18:01:12 +03:00
Aliaksandr Valialkin
8481f4f004 docs/CHANGELOG.md: document 9d45b46f4c 2021-07-27 12:38:31 +03:00
Nikolay
9d45b46f4c adds check for region with custom s3 endpoint (#1465) 2021-07-27 12:35:38 +03:00
Aliaksandr Valialkin
c2deee9911 lib/storage: yet another attempt to properly determine disk space shortage, which prevents from optimal merges
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-27 12:04:50 +03:00
Aliaksandr Valialkin
bb31117555 lib/promrelabel: add tests for verifying that regex works as expected in single quotes and double quotes 2021-07-27 10:50:55 +03:00
Aaron France
fec509fe2d fix: typo in metrics.md docs 2021-07-26 21:53:26 +03:00
dependabot[bot]
0ef150c14b build(deps): bump codecov/codecov-action from 2.0.1 to 2.0.2
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 2.0.1 to 2.0.2.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v2.0.1...v2.0.2)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2021-07-26 20:42:21 +03:00
Aliaksandr Valialkin
ef781cefa7 vendor: make vendor-update 2021-07-26 16:02:46 +03:00
Aliaksandr Valialkin
8b7917cd81 all: add go:build lines for Go1.17
See https://tip.golang.org/doc/go1.17#gofmt for more details
2021-07-26 15:48:21 +03:00
Aliaksandr Valialkin
95aff47330 app/vmselect: prevent from possible deadlock when f callback blocks inside RunParallel 2021-07-26 15:47:30 +03:00
Aliaksandr Valialkin
1318736ad1 lib/promscrape: add missing whitespace at /targets page before up word 2021-07-26 12:22:59 +03:00
Aliaksandr Valialkin
fcaf152480 app/vmselect: make vmui-update after a91d41f12a 2021-07-26 10:31:11 +03:00
Aliaksandr Valialkin
bfb18438ec docs/Articles.md: add links to new articles 2021-07-23 21:06:58 +03:00
dependabot[bot]
bf25a256c5 build(deps): bump codecov/codecov-action from 1.5.2 to 2.0.1 (#1468)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 1.5.2 to 2.0.1.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v1.5.2...v2.0.1)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-07-23 12:01:52 +03:00
Yury Molodov
a91d41f12a Vmui/query editor (#1472)
* fix: move request button to server input

* feat: add switch for query autocomplete

* refactor: rename state for popover open

* feat: add detect os by userAgent

* fix: change hotkey to run query for mac

* fix: change detect mac os

* fix: change div to span inside Typography

Co-authored-by: yury <yurymolodov@victoriametrics.com>
2021-07-23 12:00:44 +03:00
Aliaksandr Valialkin
05672ffc32 app/vmselect/promql: properly handle (a op b) default N if (a op b) returns NaN series
The result should be a series with `N` values and `a op b` labels. Previously such series has been removed from the result.
2021-07-16 01:44:58 +03:00
Aliaksandr Valialkin
ed10141ff8 app/vmselect/netstorage: use more scalable algorithm for ditributing the work among among multiple channels on systems with big number of CPU cores 2021-07-16 00:35:23 +03:00
Aliaksandr Valialkin
ca75432e66 app/vmselect: do not track queries with less than 1ms execution time at /api/v1/status/top_queries
This should improve the readability and usefullness of the /api/v1/status/top_queries when debugging slow queries
or queries that take too much cpu time.
2021-07-15 16:44:28 +03:00
Aliaksandr Valialkin
4ba3fd9e6d lib/workingsetcache: switch from split cache to full cache after the cache size exceeds 95% of split capacity
Previously the switch occurred when the cache size becomes 100% of its capacity. The cache size could never reach 100% capacity.
This could prevent from switching from the split cache to full cache, thus reducing the cache effectiveness.
2021-07-15 16:12:04 +03:00
Aliaksandr Valialkin
f4e81aef7e app/vmselect/netstorage: add -search.maxSamplesPerSeries command-line option for limiting the number of samples a query can process per each series
This should prevent from out of memory crashes like in https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1067
2021-07-15 16:03:28 +03:00
Aliaksandr Valialkin
e6ef97a5ee app/vmselect/netstorage: improve scalability of series unpacking on multi-CPU systems 2021-07-15 15:41:58 +03:00
Aliaksandr Valialkin
171d44acd8 docs/CHANGELOG.md: typo fix: suffxies->suffixes 2021-07-15 15:02:20 +03:00
Aliaksandr Valialkin
f81d972581 app/vmui/README.md: typo fix: naviate->navigate 2021-07-15 15:02:04 +03:00
Aliaksandr Valialkin
61cc13c16f docs/CHANGELOG.md: cut v1.63.0 2021-07-15 14:02:13 +03:00
Aliaksandr Valialkin
b060d8bf53 vendor: make vendor-update 2021-07-15 12:55:40 +03:00
Aliaksandr Valialkin
d472b03e34 lib/storage: make sure the second call to DeduplicateSamples and deduplicateSamplesDuringMerge doesnt change samples 2021-07-15 12:17:45 +03:00
Aliaksandr Valialkin
682662b2ae lib/storage: remove cache directory if it contains reset_cache_on_startup file
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1447
2021-07-13 17:58:51 +03:00
Aliaksandr Valialkin
2b0e3efa5c .github/workflows/wiki.yml: properly copy subdirectories 2021-07-13 17:35:02 +03:00
Aliaksandr Valialkin
8d99e94a52 docs: clarify why spare CPU and RAM resources are needed in capacity planning 2021-07-13 15:48:25 +03:00
Aliaksandr Valialkin
8d0ec47be9 docs/Cluster-VictoriaMetrics.md: clarify the docs about the needed values for -dedup.minScrapeInterval at vmselect during replication when the data is pushed from HA pair 2021-07-13 15:43:02 +03:00
Aliaksandr Valialkin
2df66dad7b lib/httpserver: add is_set label to flag metrics
This label allows determining the set flags with the query `flag{is_set="true"}`
2021-07-13 15:10:13 +03:00
Aliaksandr Valialkin
5bc240bffe vendor: make vendor-update 2021-07-13 14:30:19 +03:00
Aliaksandr Valialkin
244d0fe5d7 deployment/docker: update Go builder from v1.16.5 to v1.16.6
Ths Go release has the following bugfixes: https://github.com/golang/go/issues?q=milestone%3AGo1.16.6+label%3ACherryPickApproved
2021-07-13 14:25:41 +03:00
Aliaksandr Valialkin
a925d5a3e1 app/vmselect/promql: duration handling improvements in MetricsQL queries
- Support durations anywhere in MetricsQL queries. E.g. sum_over_time(m[1h])/1h is equivalent to sum_over_time(m[1h])/3600
- Support durations without suffix. E.g. rate(m[300]) is equivalent to rate(m[5m])
2021-07-12 17:16:41 +03:00
Aliaksandr Valialkin
f9de546139 lib/storage: reset perKeyMisses stats less frequently
This should reduce CPU usage for queries executed with intervals higher than 30 seconds
2021-07-12 14:33:42 +03:00
Aliaksandr Valialkin
4f80b2f230 lib/storage: properly limit the size of storage/date_metricID cache 2021-07-12 14:25:44 +03:00
Aliaksandr Valialkin
f3a5465ece docs/CHANGELOG.md: document the change from bfba4c28a4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1444
2021-07-12 12:44:06 +03:00
Aliaksandr Valialkin
bfba4c28a4 app/vmalert: accept Prometheus-like durations in interval config option inside group section 2021-07-12 12:35:17 +03:00
Aliaksandr Valialkin
a684408e27 docs: update http://slack.victoriametrics.com to https://slack.victoriametrics.com 2021-07-12 10:57:16 +03:00
Aliaksandr Valialkin
8ca2799478 lib/storage: properly determine when the deduplication is needed in needsDedup
Previously needsDedup() could return true if the de-duplication wasn't needed for the following case:

         d < interval
           /     \
   |        v | v        |
     interval   interval

Now it properly returns false for this case
2021-07-12 10:53:30 +03:00
Aliaksandr Valialkin
f539772ca6 docs: sync with the cluster branch 2021-07-10 12:45:38 +03:00
Aliaksandr Valialkin
8c764e88f0 app/vmui: move source code from https://github.com/VictoriaMetrics/vmui to app/vmui
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1413
2021-07-09 17:15:23 +03:00
Aliaksandr Valialkin
2be340a4c9 docs: clarify what does "workload" mean in capacity planning docs 2021-07-09 12:49:53 +03:00
Aliaksandr Valialkin
c5f0b454f0 app/vmselect: follow-up after aa11ef6d3b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1413
2021-07-07 17:43:35 +03:00
tony
e9e35a7d6a add vmui for vmselect component (#1431)
Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-07-07 17:33:02 +03:00
Aliaksandr Valialkin
92b8380fdf docs/Cluster-VictoriaMetrics.md: improve capacity planning recommendations 2021-07-07 16:22:51 +03:00
Aliaksandr Valialkin
1152de4321 vendor: make vendor-update 2021-07-07 16:05:04 +03:00
Aliaksandr Valialkin
7e77980608 vendor: update github.com/VictoriaMetrics/metrics from v1.17.2 to v1.17.3 2021-07-07 16:00:25 +03:00
Aliaksandr Valialkin
6e0553c92e lib/mergeset: cache indexBlock items only on the second request
This should reduce the indexdb/indexBlocks cache size, since it won't contain one-time-wonders items.
2021-07-07 15:23:06 +03:00
Aliaksandr Valialkin
ed944313b0 app/{vminsert,vmselect}: export vminsert_request_duration_seconds and vmselect_request_duration_seconds histograms 2021-07-07 13:25:21 +03:00
Aliaksandr Valialkin
766edbc421 lib/httpserver: print full requestURI in httpserver.Errorf
This should simplify debugging.
2021-07-07 13:09:40 +03:00
Aliaksandr Valialkin
c55a64ba08 docs: clarify capacity planning docs 2021-07-07 12:46:55 +03:00
Aliaksandr Valialkin
e843bd7bd7 lib/storage: do not cache inmemoryBlock entries requested only once (aka one-time-wonder items)
This should reduce the cache size and memory usage for the indexdb/dataBlocks cache
2021-07-07 10:58:51 +03:00
Aliaksandr Valialkin
8b262d4ba7 lib/storage: periodically reset prefetchedMetricIDs cache in order to limit its size under high churn rate 2021-07-07 10:58:51 +03:00
Roman Khavronenko
2f54559c89 alerts: sync alert expression for DiskRunsOutOfSpaceIn3Days with dashboard (#1436) 2021-07-07 10:31:09 +03:00
Aliaksandr Valialkin
a7694092b8 Revert "lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method"
This reverts commit 7c6d3981bf.

Reason for revert: high contention at bucket16Pool on systems with big number of CPU cores.
This slows down query processing significantly.
2021-07-06 18:21:35 +03:00
Aliaksandr Valialkin
8aa9bba9bd lib/{mergeset,storage}: switch from sync.Pool to chan-based pool for inmemoryPart objects
This should reduce memory usage on systems with big number of CPU cores,
since every inmemoryPart object occupies at least 64KB of memory and sync.Pool maintains
a separate pool inmemoryPart objects per each CPU core.

Though the new scheme for the pool worsens per-cpu cache locality, this should be amortized
by big sizes of inmemoryPart objects.
2021-07-06 16:28:41 +03:00
Aliaksandr Valialkin
7c6d3981bf lib/uint64set: allow reusing bucket16 structs inside uint64set.Set via uint64set.Release method
This reduces the load on memory allocator in Go runtime in production workload.
2021-07-06 15:35:03 +03:00
Aliaksandr Valialkin
78c9174682 lib/mergeset: increase pool capacity for inmemoryBlock according to collected profiles from production workload
CPU and memory profiles show that the pool capacity for inmemoryBlock objects is too small.
This results in the increased load on memory allocation code in Go runtime.
Increase the pool capacity in order to reduce the load on Go runtime.
2021-07-06 13:41:34 +03:00
Aliaksandr Valialkin
f71e4d1853 lib/mergeset: limit the frequency for flushCallback calls to once per 10 seconds
This should improve hit ratio for tagFiltersCache when big number of new time series are constantly registered
(aka high churn rate). This, in turn, should reduce CPU usage for queries over such time series.
2021-07-06 12:17:17 +03:00
Aliaksandr Valialkin
f3acf065c9 lib/storage: consistency renaming: tagCache -> tagFiltersCache
This improves code readability
2021-07-06 11:03:51 +03:00
Aliaksandr Valialkin
0020b9f904 lib/workingsetcache: properly update stats for requests and cache misses
Previously the stats for cache misses could be improperly counted, because it had inflated cache misses
if the entry was missing in the curr cache, but was existing in the prev cache.

The same applies to cache requests - they were inflated if the entry was missing in the curr cache.
2021-07-06 10:53:32 +03:00
Roman Khavronenko
75f12bfe78 add option to add Copy button for code snippets (#1433)
To add a Copy button wrap code snippet with the following element:
```
<div class="with-copy" markdown="1">

<your-code-snippet>

</div>
```

See the changes to `Kubernetes monitoring with VictoriaMetrics Single` for details.
2021-07-06 08:23:39 +03:00
Aliaksandr Valialkin
4cf47163c1 lib/workingsetcache: fix cache capacity calculations after 4f0003f182 2021-07-05 17:11:57 +03:00
Aliaksandr Valialkin
4f0003f182 lib/workingsetcache: typo fixes after d0c830039d 2021-07-05 15:35:37 +03:00
Aliaksandr Valialkin
d0c830039d lib/storage: tune cache sizes according to production workload 2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
8f973e34fb lib/workingsetcache: properly switch to whole mode
Previously the switch from `split` to `whole` mode had been performed too early,
e.g. when the current cache size became bigger than 1/4 of the allowed cache size.

Now it is performed when the current cache size becomes bigger than 1/2 of the allowed cache size.

This change can reduce memory usage for data ingestion path when big number of active time series are ingested.
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
43103be011 lib/{storage,mergeset}: increase cache timeout for data and index blocks from a minute to two minutes
One minute cache timeout result in slower queries in some production workloads where the interval
between query execution is in the range 1 minute - 2 minutes.
2021-07-05 15:16:11 +03:00
Aliaksandr Valialkin
54b9e1d3cb lib/cgroup: set GOGC to 50 by default if it isn't set
This should reduce memory usage for typical VictoriaMetrics workloads by up to 50%
2021-07-05 15:16:11 +03:00
Roman Khavronenko
bd6b8f7e31 move github-pages docs to the main repo (#1432)
* move github-pages docs to the main repo

* rm github actions for copying docs to VictoriaMetrics/VictoriaMetrics.github.io
2021-07-05 14:34:10 +03:00
Aliaksandr Valialkin
888d62e40c docs/CHANGELOG.md: document the bugfix for vm_merge_need_free_disk_space metric at 9a83e9018d 2021-07-05 12:01:08 +03:00
Aliaksandr Valialkin
6bef227388 docs/Articles.md: add an url to https://medium.com/ibm-garage/monitoring-of-multiple-openshift-clusters-with-victoriametrics-d4f0979e2544 2021-07-05 11:51:37 +03:00
Aliaksandr Valialkin
9a83e9018d lib/storage: properly detect free disk space shortage during data merge
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1373
2021-07-02 17:40:54 +03:00
Aliaksandr Valialkin
f6bb130898 app/{vmselect,vmstorage}: clarify the description for -dedup.minScrapeInterval command-line flag
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1426
2021-07-02 15:05:57 +03:00
Aliaksandr Valialkin
7088f17494 lib/promscrape/discovery/consul: use case-insensitive comparison for service names
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1424
2021-07-02 14:49:27 +03:00
Aliaksandr Valialkin
6e406083f2 lib/promauth: cache the client TLS certificate for up to a second
This should reduce CPU usage when TLS connections are established at a high rate.
2021-07-02 13:21:51 +03:00
Aliaksandr Valialkin
a93da746c0 vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.15 to v1.0.16
This should help with proper copying of tls.Config struct after it has been set up in lib/promauth.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1420
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2021-07-02 12:03:32 +03:00
Aliaksandr Valialkin
1e52f11e87 docs/Cluster-VictoriaMetrics.md: typo fix: siplify -> simplify 2021-07-02 10:50:55 +03:00
Aliaksandr Valialkin
5beb099ff9 docs/Cluster-VictoriaMetrics.md: add a chapter describing a toy cluster setup on a single host
While at it, refer to available tools, which can simplify cluster setup
2021-07-02 10:48:56 +03:00
Aliaksandr Valialkin
158c50c0ee lib/promauth: reload TLS certificates from disk on every mTLS connection as Prometheus does
This allows updating client certificates without the need to restart vmagent and/or single-node VictoriaMetrics.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1420
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/470
2021-07-01 15:27:47 +03:00
Aliaksandr Valialkin
02ed4b8b46 docs/CHANGELOG.md: document ae485c2bfd 2021-07-01 11:51:19 +03:00
Aliaksandr Valialkin
c25b839078 lib/workingsetcache: reset the cache mode when the cache is reset
This should reduce memory usage if the working set is reduced after the cache reset.
2021-07-01 11:50:11 +03:00
Nikolay
ae485c2bfd fixes /targets button style (#1423)
* fixes /targets button style
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1422

* updates boostrap version
2021-07-01 11:48:07 +03:00
Aliaksandr Valialkin
00bbe1608b deployment/docker: upgrade alpine image from v3.13.5 to v3.14.0 2021-07-01 10:56:56 +03:00
Roman Khavronenko
a38a6fe8ad dashboard: move panel Disk writes/reads to Resource usage row (#1417)
* dashboard: move panel `Disk writes/reads` to `Resource usage` row

* dashboard: make Stats panel consistent with Cluster dashboard
2021-07-01 05:46:26 +03:00
Aliaksandr Valialkin
c93cee8de8 lib/{mergeset,storage}: reduce the maximum lifetime for cached indexdb and data blocks from 2 minutes to a minute
This should reduce memory usage on a system with high number of active time series and a high churn rate.
One minute is enough for caching the blocks needed for repeated queries (e.g. alerting rules, recording rules and dashboard refreshes).
2021-06-29 19:57:07 +03:00
Aliaksandr Valialkin
fc12484734 lib/mergeset: switch from sync.Pool to a channel for a pool for inmemoryBlock structs
This should reduce memory usage for the pool on systems with big number of CPU cores.

The sync.Pool maintains per-CPU pools, so the total number of objects in the pool
is proportional to the number of available CPU cores. The channel limits the number
of pooled objects by its own capacity. This means smaller number of pooled objects on average.
2021-06-29 19:56:59 +03:00
Aliaksandr Valialkin
9ce211a514 lib/promscrape/discovery/docker: fix golint warning: struct field Id should be ID 2021-06-29 13:12:28 +03:00
Aliaksandr Valialkin
5506cff76e lib/storage: put indexDBName into the key for dateTagFilter cache and for uselessTagFilters cache
This should prevent from stats overwriting when the previous indexdb is queried.
2021-06-29 12:40:05 +03:00
Aliaksandr Valialkin
1b0501a09e lib/promscrape: typo fix in /targets output
The typo has been introduced in fb72a2133f

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1408
2021-06-28 21:26:37 +03:00
Aliaksandr Valialkin
3af2162085 docs/vmagent.md: mention about docker_sd_config support 2021-06-25 20:52:15 +03:00
Aliaksandr Valialkin
008033a374 docs/CHANGELOG.md: cut v1.62.0 2021-06-25 13:29:38 +03:00
Aliaksandr Valialkin
cb5453953f lib/promscrape: split docker and dockerswarm service discovery code bases, since they have very little in common
This is a follow up after c85a5b7fcb
2021-06-25 13:20:20 +03:00
Aliaksandr Valialkin
a69045e440 lib/promscrape: consistently sort service discovery routines
This should simplify further maintenance of the code
2021-06-25 12:10:46 +03:00
Lu Jiajing
c85a5b7fcb Support Docker ServiceDiscovery (#1402)
* add docker discovery

* add test

* add labels test and add scrape work

* remove TODO

* refactor to merge apiConfig and sdConfig

* apply suggestion
2021-06-25 11:42:47 +03:00
Nikolay
434e33da9b adds missing MustStop call to do and http sd (#1404) 2021-06-25 11:39:18 +03:00
Aliaksandr Valialkin
b50e2ec88c vendor: make vendor-update 2021-06-24 17:33:42 +03:00
Aliaksandr Valialkin
4345c07777 docs: consistently put the link to articles and slides about VictoriaMetrics after the links to case studies 2021-06-24 15:37:38 +03:00
Aliaksandr Valialkin
b2fca1ab22 docs/CaseStudies.md: add a case study for DFKI 2021-06-24 15:24:39 +03:00
Aliaksandr Valialkin
906fca9e88 docs/CaseStudies.md: add Groove X case study 2021-06-24 15:04:49 +03:00
Aliaksandr Valialkin
3c3694f72a README.md: add missing linke to Sensedia case study 2021-06-24 14:37:49 +03:00
Aliaksandr Valialkin
d25a161579 docs/CaseStudies.md: add Sensedia case study 2021-06-24 14:36:13 +03:00
Aliaksandr Valialkin
ca42410afd docs/CHANGELOG.md: document the bugfix in increase_pure() function from the commit fb4f758715 2021-06-24 12:05:39 +03:00
Aliaksandr Valialkin
5d64ed73c5 lib/protoparser/clusternative: do not pool unmarshalWork structs, since they can occupy big amounts of memory (more than 100MB per each struct)
This should reduce memory usage for vmstorage under high ingestion rate when the vmstorage runs on a system with big number of CPU cores
2021-06-23 15:46:50 +03:00
Aliaksandr Valialkin
cdfae0117a app/vmselect/promql: return the last timestamp for the max / min value from tmax_over_time() and tmin_over_time() function as most users expect 2021-06-23 14:19:00 +03:00
Aliaksandr Valialkin
70e2852376 docs/CHANGELOG.md: document the bugfix for incorrect stats collection for concurrently executed tag filter
Follow up for c22114c6f0
2021-06-23 14:05:28 +03:00
Aliaksandr Valialkin
94f3e40ab3 app/vminsert/netstorage: sort the -storageNode list passed to vminsert nodes
This should reduce resource usage (CPU, RAM, disk IO) at vmstorage nodes
if the addresses of vmstorage nodes are passed in random order to vminsert nodes.
2021-06-23 14:01:41 +03:00
Aliaksandr Valialkin
c22114c6f0 lib/storage: tune tag filters search logic
Tune the logic according to the logs provided at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-864293624

The previous logic had a race when multiple concurrent queries execute the same tag filter without prior stats.
This could result in incorrectly stored stats for such tag filter, which then could result in non-optimal sorting of tag filters
for further queries.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-23 13:29:39 +03:00
Aliaksandr Valialkin
e8a5bb92b7 lib/promscrape/discovery/consul: properly pass namespace to Consul watcher
Follow-up for 58a2989fe7
2021-06-22 17:42:41 +03:00
Aliaksandr Valialkin
ac54f34f9e lib/promscrape/discovery/http: follow up after e307bbb29a 2021-06-22 13:40:33 +03:00
Aliaksandr Valialkin
755040a171 lib/promscrape/discovery: support generic auth configs in Consul service discovery in the same way as Prometheus 2.28 does 2021-06-22 13:34:02 +03:00
Aliaksandr Valialkin
59e7755df9 docs/CHANGELOG.md: document the support for Consul namepsace
See 58a2989fe7
2021-06-22 13:34:02 +03:00
Nikolay
e307bbb29a adds http_sd (#1399)
* adds http_sd

* adds X-Prometheus-Refresh-Interval-Seconds header

* Update lib/promscrape/discovery/http/api.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 13:33:37 +03:00
Nikolay
58a2989fe7 adds consul enterprise namespace support (#1400)
* adds consul enterprise namespace support

* Update lib/promscrape/discovery/consul/consul.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-06-22 12:49:44 +03:00
Aliaksandr Valialkin
3bc83d3b17 docs/PerTenantStatistic.md: document that the per-tenant statistic is a part of cluster version of VictoriaMetrics 2021-06-22 12:43:01 +03:00
Roman Khavronenko
da8c901fab vmctl: add more context to flags description in vm-native mode (#1395) 2021-06-18 19:20:01 +03:00
Aliaksandr Valialkin
a3262daac0 docs/CHANGELOG.md: typo fixes 2021-06-18 19:14:53 +03:00
Aliaksandr Valialkin
83a4db813e app/vmselect: log slow requests to all the /api/v1/* handlers if their execution time exceeds -search.logSlowQueryDuration 2021-06-18 19:04:42 +03:00
Aliaksandr Valialkin
4cfedc5931 docs/Single-server-VictoriaMetrics.md: mention that it is recommended to use a single scrape_interval across all the scrape targets 2021-06-18 15:39:33 +03:00
Aliaksandr Valialkin
570f36b344 app/vmctl: limit JSON line size by 10K samples (#1394)
This should reduce the maximum memory usage at VictoriaMetrics when importing time series with big number of samples.
2021-06-18 15:26:47 +03:00
Aliaksandr Valialkin
eb1af09a04 docs/Cluster-VictoriaMetrics.md: clarify docs about VictoriaMetrics cluster architecture 2021-06-18 14:36:39 +03:00
Aliaksandr Valialkin
cbd9159a22 docs/CHANGELOG.md: document the reduced disk write IO usage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-18 14:02:59 +03:00
Aliaksandr Valialkin
fb72a2133f lib/promscrape: show jobs with empty scrape targets on /targets page 2021-06-18 10:53:52 +03:00
Aliaksandr Valialkin
d8ab409418 docs/{vmgateway,vmbackupmanager}: explicitly mention that these components are a part of an enterprise package 2021-06-17 17:19:49 +03:00
Nikolay
6c434b260e fixes DO service discovery labels (#1389)
adds test for digitalocean sd
2021-06-17 15:12:20 +03:00
Aliaksandr Valialkin
dcbc22552f lib/storage: fix infinite loop introduced in aa9b56a046 2021-06-17 14:28:10 +03:00
Aliaksandr Valialkin
aa9b56a046 lib/{mergeset,storage}: reduce the number of fsync calls on data ingestion path on systems with many cpu cores
VictoriaMetrics maintains a buffer per CPU core for the ingested data. These buffers are flushed to disk every second.
These buffers are flushed to disk in parallel starting from the commit 56b6b893ce .
This resulted in increased write disk IO usage on systems with many cpu cores
as described at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338#issuecomment-863046999 .

This commit merges the per-CPU buffers into bigger in-memory buffers before flushing them to disk.
This should reduce the rate of fsync syscalls and, consequently, the write disk IO on systems with many CPU cores.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
See also https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244
2021-06-17 13:52:08 +03:00
Aliaksandr Valialkin
12a83d25bf app/vmagent/remotewrite: go fmt after 0a796f7c3a 2021-06-17 13:52:06 +03:00
Aliaksandr Valialkin
9eb3fc346f docs/vmagent.md: sync with app/vmagent/README.md via make docs-sync 2021-06-16 12:36:49 +03:00
Aliaksandr Valialkin
6d17a4e12d docs/CHANGELOG.md: document the changed -remoteWrite.queues value
This is a follow-up for 0a796f7c3a

See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1385
2021-06-16 12:35:46 +03:00
Zongyang
0a796f7c3a Change default value of '-remoteWrite.queues' to cgroup.AvailableCPUs * 2 (#1385)
* Change default value of '-remoteWrite.queues' to cgroup.AvailableCPUS() * 2 to reduce scrape interval

Default value of vmagent option '-remotewrite.queues' is 4 and default
size of vmagent ScheudleUnmarshalWorkers is number of CPUs, when available
CPUs is much greater than 4, e.g 32, worker are competing push queues
which will increase scrape interval and may cause scrape timeout.

* Update README and flag description

Co-authored-by: xiaozy <xiaozy01@fenbi.com>
2021-06-16 12:16:44 +03:00
Roman Khavronenko
fb4f758715 promql: fix increase_pure calculation for cases with stale series (#1381)
Due to staleness handling, increase_pure were using incorrect previous value
during calculation in cases where series disappears for period longer
than staleness period and then returns back. The fix suppose to account
for a real datapoint value before staleness takes place. The fix should
remove unexpected spikes while using `increase_pure` for staled series.
2021-06-15 17:37:19 +03:00
Aliaksandr Valialkin
6a8369f0fc docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics works great with APM workloads (aka Application Performance Monitoring) 2021-06-15 17:32:52 +03:00
Aliaksandr Valialkin
84fb59b0ba lib/storage: move deletedMetricIDs set from indexDB to Storage
This makes consitent the list of deleted metricIDs when it is used from both the current indexDB and the previous indexDB (aka extDB).
This should fix the issue, which could lead to storing new samples under deleted metricIDs after indexDB rotation.
See more details at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347#issuecomment-861232136 .

Thanks to @tangqipengleoo for the initial analysis and the pull request - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .

This commit resolves the issue in more generic way compared to https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1383 .

The downside of the commit is the deletedMetricIDs set isn't cleaned from the metricIDs outside the retention. It needs app restart.
This should be OK in most cases.
2021-06-15 15:04:30 +03:00
Aliaksandr Valialkin
e028ad241a lib/protoparser: stop reading the input stream as soon as the callback provided by the caller returns error
This is a follow-up for af90c3c43b
2021-06-14 15:18:49 +03:00
faceair
af90c3c43b lib/protoparser: stop read when callback error (#1380) 2021-06-14 15:10:58 +03:00
Aliaksandr Valialkin
36d55bff66 lib/promscrape: show the number of samples collected during the last scrape at /targets and /api/v1/targets pages
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1377
2021-06-14 14:04:00 +03:00
Aliaksandr Valialkin
7b283ee91c vendor: update github.com/klauspost/compress from v1.13.0 to v1.13.1 2021-06-14 13:42:46 +03:00
Roman Khavronenko
a90012ef26 dashboard: bump version requirements (#1378) 2021-06-14 13:31:59 +03:00
Aliaksandr Valialkin
05bc9667c1 docs/CHANGELOG.md: document the addition of DigitalOcean service discovery
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367
2021-06-14 13:18:19 +03:00
Nikolay
729c4eeb9c adds digital ocean sd (#1376)
* adds digital ocean sd config

* adds digital ocean sd
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1367

* typo fix
2021-06-14 13:15:04 +03:00
Roman Khavronenko
b8526e88d3 Dashboard single (#1374)
* dashboard: update single version dash

The update contains the following changes:
* display anonymous memory usage metric. This metric suppose to reflect
memory usage of the process which can't be freed by OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0

* dashboard: update single version dash tags

* dashboard: update vmagent dash

The update contains the following changes:
* display anonymous memory usage metric. This metric suppose to reflect
memory usage of the process which can't be freed by OS;
* add legends to all panels. This is important for cases when users share
the screenshots;
* modify panels for Grafana v8.0.0
2021-06-14 13:03:23 +03:00
Aliaksandr Valialkin
06b8e7d148 lib/promscrape: increase the duration for reading the full response in stream parsing mode
Increase the duration from 10x to 30x of the configured `scrape_interval'.

This should help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:28:09 +03:00
Aliaksandr Valialkin
48210130ac lib/protoparser: measure the duration for reading the whole block of data instead of a single read operation
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:25:52 +03:00
Aliaksandr Valialkin
3e5b6bae66 docs/Cluster-VictoriaMetrics.md: add lists for command-line flags for cluster components 2021-06-14 12:22:05 +03:00
Aliaksandr Valialkin
3c4366806c lib/protoparser/common: log the duration for reading a block of data in ReadLinesBlockExt on error
This may help debugging issues like https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1365
2021-06-14 12:22:04 +03:00
Aliaksandr Valialkin
8a519f1518 docs/vmalert.md: follow-up after 6d5a8c28cd 2021-06-14 11:37:26 +03:00
Roman Khavronenko
6d5a8c28cd Vmalert docs (#1372)
* vmalert: mention what happens if `for` is set to 0 or omitted

* vmalert: add more context to docs
2021-06-11 13:25:53 +03:00
Aliaksandr Valialkin
2ef2e3d6f8 docs/CHANGELOG.md: cut v1.61.1 2021-06-11 13:01:31 +03:00
Aliaksandr Valialkin
ed83558646 app/vmauth: properly handle http.ErrAbortHandler panic
This panic can be raised by the reverseProxy on aborted request to the backend.
So handle it (e.g. suppress) at reverseProxy.ServeHTTP call.

Do not suppress the panic at lib/httpserver generic HTTP handler,
since it may result in an inconsistent state left after the panicking handler.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1353
2021-06-11 12:50:25 +03:00
Aliaksandr Valialkin
c4f3fbfa5d lib/storage: reset cache on disk during series deletion and during indexdb rotation
This should prevent from inconsistent behavior (aka partially missing data for some time series) after unclean shutdown.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1347
2021-06-11 12:42:28 +03:00
Aliaksandr Valialkin
d979e14da2 docs/CHANGELOG.md: document the bugfix from 7adfe878e1 2021-06-11 11:28:10 +03:00
Roman Khavronenko
7adfe878e1 vmalert: fix mistake with object reuse while parsing response (#1370)
* vmalert: fix mistake with object reuse while parsing response

During the refactoring, the wrong optimisations was applied in
parse function which caused metric fields reset. The change removes
optimisation.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1369

* vmalert: add test to cover multiple metrics in one response
2021-06-11 11:22:05 +03:00
Aliaksandr Valialkin
69b1482bdb lib/storage: consistency renaming: getMaxRawRowsPerPartition -> getMaxRawRowsPerShard 2021-06-11 10:57:23 +03:00
Aliaksandr Valialkin
044ab46824 lib/storage: reduce the amounts of memory which can be occupied by rawRow items during data ingestion on a system with many CPU cores 2021-06-11 10:57:23 +03:00
John Belmonte
67b17cdd68 spelling fix: synonym (#1363) 2021-06-10 08:32:52 +03:00
Aliaksandr Valialkin
b4008c1e65 vendor: make vendor-update 2021-06-09 20:40:50 +03:00
Aliaksandr Valialkin
b83a51366e docs/CHANGELOG.md: cut v1.61.0 2021-06-09 19:04:44 +03:00
Aliaksandr Valialkin
10d105ee25 docs/FAQ.md: add a chapter comparing VictoriaMetrics to QuestDB 2021-06-09 19:03:23 +03:00
Aliaksandr Valialkin
d0dca62026 app/vmselect/promql: typo fix in the comment 2021-06-09 18:34:31 +03:00
Aliaksandr Valialkin
46d2c7d640 docs/Articles.md: update the broken link to https://nordicapis.com/api-monitoring-with-prometheus-grafana-alertmanager-and-victoriametrics/ 2021-06-09 16:39:56 +03:00
Aliaksandr Valialkin
2422b5091f app/vmauth: improve readability for a config with multiple src_paths 2021-06-09 15:39:32 +03:00
Aliaksandr Valialkin
329b6cd146 docs/CHANGELOG.md: document the enterprise bugfix for the target property in Graphite Render API 2021-06-09 13:50:52 +03:00
Aliaksandr Valialkin
db1c548cb4 docs/CHANGELOG.md: document improvements in re-routing handling in vminsert
See the following commits:

* 1c09e71f5b
* 0d067eb112
* 2c6b917749
2021-06-09 13:40:37 +03:00
Aliaksandr Valialkin
f93a31b490 docs/vmagent.md: mention that vmagent supports scrape targets sharding 2021-06-09 12:29:34 +03:00
Aliaksandr Valialkin
ab15bf8c90 docs: document rules replay feature for vmalert
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/836

This is a follow-up for 2a259ef5e7
2021-06-09 12:27:34 +03:00
Roman Khavronenko
2a259ef5e7 vmalert: support rules backfilling (aka replay) (#1358)
* vmalert: support rules backfilling (aka `replay`)

vmalert can `replay` configured rules in the past
and backfill results via remote write protocol.
It supports MetricsQL/PromQL storage as data source,
and can backfill data to remote write compatible
storage.

Supports recording and alerting rules `replay`. See more
details in README.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/836

* vmalert: review fixes

* vmalert: readme fixes
2021-06-09 12:20:38 +03:00
dependabot[bot]
408fc90b40 build(deps): bump codecov/codecov-action from 1.5.0 to 1.5.2 (#1362)
Bumps [codecov/codecov-action](https://github.com/codecov/codecov-action) from 1.5.0 to 1.5.2.
- [Release notes](https://github.com/codecov/codecov-action/releases)
- [Changelog](https://github.com/codecov/codecov-action/blob/master/CHANGELOG.md)
- [Commits](https://github.com/codecov/codecov-action/compare/v1.5.0...v1.5.2)

---
updated-dependencies:
- dependency-name: codecov/codecov-action
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-06-09 12:16:46 +03:00
Roman Khavronenko
5e9f3777bf alerts: add new alert LabelsLimitExceededOnIngestion (#1359) 2021-06-09 12:15:36 +03:00
Aliaksandr Valialkin
c16edf8287 docs/CHANGELOG.md: document the bugfix, which prevents panics for aborted http requests in vmauth
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1353

This is a follow-up for 6b29b955c0
2021-06-09 12:11:53 +03:00
Nikolay
6b29b955c0 disables panic for net/httpAbortHandler (#1355) 2021-06-09 12:08:58 +03:00
Aliaksandr Valialkin
28c44ef065 deployment/docker/docker-compose.yml: update Grafana from v7.5.2 to v8.0.0
See https://github.com/grafana/grafana/releases/tag/v8.0.0
2021-06-09 02:25:24 +03:00
k1rk
668165f53d rename serviceHealth group name to vm-health (#1360)
this causes conflicts in `victoria-metrics-k8s-stack` chart =)
2021-06-08 23:34:38 +03:00
Aliaksandr Valialkin
c0feafefc8 vendor: make vendor-update 2021-06-08 15:49:30 +03:00
Aliaksandr Valialkin
8a7e6ad5cc deployment/docker: update Go builder from v1.16.4 to v1.16.5
See the fixed isses at https://github.com/golang/go/issues?q=milestone%3AGo1.16.5+label%3ACherryPickApproved
2021-06-08 15:45:44 +03:00
Aliaksandr Valialkin
645e18dd88 vendor: update github.com/klauspost/compress from v1.12.3 to v1.13.0 2021-06-08 15:45:43 +03:00
Aliaksandr Valialkin
96b691a0ab lib/storage: properly account the number of loops spent when matching for or suffixes
This may help https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1338
2021-06-08 13:06:12 +03:00
Aliaksandr Valialkin
661d2668f8 lib/promrelabel: add tests for labelsToString() function 2021-06-04 20:42:46 +03:00
Aliaksandr Valialkin
78f83dc5ad app/{vmagent,vminsert}: follow-up after 2fe045e2a4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1343
2021-06-04 20:27:58 +03:00
jelmd
2fe045e2a4 new feature: debug relabeling (#1344)
* new feature: relabel logging

Use scrape_configs[x].relabel_debug = true to log metric names inkl.
labels before and after relabeling. After relabeling related metrics
get dropped, i.e. not submitted to servers.

* vminsert wants relabel logging, too.
2021-06-04 17:50:23 +03:00
Aliaksandr Valialkin
5ac25d2585 docs/CHANGELOG.md: document the bugfix from 6f19bb23a1 2021-06-04 11:53:00 +03:00
Hason Chan
6f19bb23a1 fix eureka_sd_configs HTTPClientConfig incorrect parsing (#1350) 2021-06-04 11:47:17 +03:00
Aliaksandr Valialkin
f963f04d3d app/vminsert: add -disableRerouting command-line flag for disabling re-routing if some vmstorage nodes have lower performance than the others 2021-06-04 04:42:01 +03:00
Aliaksandr Valialkin
2d8bd41f8a lib/storage: reduce memory allocations when syncing dateMetricIDCache 2021-06-03 16:20:42 +03:00
Aliaksandr Valialkin
d2d746c4fc docs/CHANGELOG.md: document that it is possible to build VictoriaMetrics components for Solaris
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1322

This is a follow-up for ddc8022702
2021-05-31 09:33:29 +03:00
Nikolay
ddc8022702 fixes solaris build (#1345) 2021-05-31 09:21:23 +03:00
Aliaksandr Valialkin
de8e4b6223 vendor: update github.com/VictoriaMetrics/fastcache from v1.5.8 to v1.6.0 2021-05-31 09:14:25 +03:00
Aliaksandr Valialkin
b22e380a34 app/vmauth: allow balancing the load among multiple backend nodes by specifying multiple urls in url_prefix config 2021-05-29 01:03:37 +03:00
Aliaksandr Valialkin
ffa4cd6fa5 docs/Articles.md: add a link to https://www.percona.com/blog/2021/05/26/compiling-a-percona-monitoring-and-management-v2-client-in-arm-raspberry-pi-3/ 2021-05-28 14:34:31 +03:00
Aliaksandr Valialkin
9315e3ded3 vendor: make vendor-update 2021-05-28 13:18:37 +03:00
Aliaksandr Valialkin
cecbdee00d docs/MetricsQL.md: add a link to technical details about rate() and increase() calculations in Prometheus and VictoriaMetrics 2021-05-28 13:13:50 +03:00
Aliaksandr Valialkin
634eeac804 docs/Single-server-VictoriaMetrics.md: remove misleading wording about querying Graphite metrics with MetricsQL 2021-05-28 02:39:14 +03:00
Aliaksandr Valialkin
a52a20659a lib/promscrape: fix tests after f0c21b6300 2021-05-28 01:32:50 +03:00
Aliaksandr Valialkin
d088923aef Revert "lib/mergeset: remove a pool for inmemoryBlock structs"
This reverts commit 793fe39921.

Reason to revert: production testing revealed possible slowdown when registering big number of new time series
2021-05-28 01:09:32 +03:00
Aliaksandr Valialkin
793fe39921 lib/mergeset: remove a pool for inmemoryBlock structs
The pool for inmemoryBlock struct doesn't give any performance gains in production workloads,
while it may result in excess memory usage for inmemoryBlock structs inside the pool during
background merge of indexdb.
2021-05-27 21:57:33 +03:00
Aliaksandr Valialkin
60341722d5 docs: document f0c21b6300 2021-05-27 15:03:30 +03:00
faceair
f0c21b6300 lib/promscrape: apply body size & sample limit to stream parse (#1331)
* lib/promscrape: apply body size limit to stream parse

Signed-off-by: faceair <git@faceair.me>

* lib/promscrape: apply sample limit to stream parse

Signed-off-by: faceair <git@faceair.me>
2021-05-27 14:52:44 +03:00
Aliaksandr Valialkin
eda6ee40b6 docs: make docs-sync after 2bbb1cc7c1 2021-05-26 12:32:16 +03:00
Roman Khavronenko
2bbb1cc7c1 Docs review (#1330)
* re-order components by prioritizing Cluster-VictoriaMetrics.md

* drop Home.md since it just duplicates other links
2021-05-26 12:28:58 +03:00
Roman Khavronenko
7ecaa2fe2c update the issue template (#1329)
The main changes are:
* ask for Grafana's dashboard screenshots;
* ask only for non-default cmd-line flags;
* explicitly ask about logs;
2021-05-26 12:26:45 +03:00
Aliaksandr Valialkin
7f531e3a60 docs/CHANGELOG.md: document changes from 2233d6ed8a and d210958fd0 2021-05-26 12:23:22 +03:00
Roman Khavronenko
d210958fd0 vmalert: automatically reload configuration on file change (#1326)
New flag `-rule.configCheckInterval` defines how often `vmalert` will re-read
config file. If it detects any changes, the config will be reloaded.
This behaviour is turned off by default.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/512
2021-05-25 14:27:22 +01:00
Aliaksandr Valialkin
2233d6ed8a lib/uint64set: store pointers to bucket16 instead of bucket16 objects in bucket32
This speeds up bucket32.addBucketAtPos() when bucket32.buckets contains big number of items,
since the copying of bucket16 pointers is much faster than the copying of bucket16 objects.

This is a cpu profile for copying bucket16 objects:

      10ms     13.43s (flat, cum) 32.01% of Total
      10ms      120ms    650:	b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
         .          .    651:	b.b16his[pos] = hi
         .     13.31s    652:	b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
         .          .    653:	b16 := &b.buckets[pos]
         .          .    654:	*b16 = bucket16{}
         .          .    655:	return b16
         .          .    656:}

This is a cpu profile for copying pointers to bucket16:

      10ms      1.14s (flat, cum)  2.19% of Total
         .      100ms    647:	b.b16his = append(b.b16his[:pos+1], b.b16his[pos:]...)
         .          .    648:	b.b16his[pos] = hi
      10ms      700ms    649:	b.buckets = append(b.buckets[:pos+1], b.buckets[pos:]...)
         .      330ms    650:	b16 := &bucket16{}
         .          .    651:	b.buckets[pos] = b16
         .          .    652:	return b16
         .          .    653:}
2021-05-25 14:20:52 +03:00
Aliaksandr Valialkin
fd264477bf snap: update Go builder from Go 1.15 to Go 1.16 2021-05-25 12:12:37 +03:00
Dan Fredell
c78d7dde92 Fix quote difference on label_move example (#1321)
Fix quote difference on label_move example
2021-05-24 23:20:38 +03:00
Aliaksandr Valialkin
08234aa7a0 docs/CHANGELOG.md: cut v1.60.0 2021-05-24 15:55:08 +03:00
Aliaksandr Valialkin
a47d4927d2 docs/Single-server-VictoriaMetrics.md: clarify that the storage size depends on the number of samples per series 2021-05-24 15:47:44 +03:00
Aliaksandr Valialkin
b1e8d92577 docs/vmalert.md: sync with app/vmalert/README.md via make docs-sync 2021-05-24 15:47:20 +03:00
Aliaksandr Valialkin
8e7d1f8824 app/vmagent/remotewrite: use WARN level instead of ERROR level for couldnt send a block with size ... bytes to ... log message
This is really warning, since vmagent re-tries sending the data block until success.
2021-05-24 15:43:59 +03:00
Aliaksandr Valialkin
39ef1e7a51 lib/storage: do not stop data ingestion on the first error in Storage.AddRows
Continue data ingestion for the rest of blocks.
2021-05-24 15:32:47 +03:00
Aliaksandr Valialkin
4b01c9fb2e lib/storage: limit the number of rows per each block in Storage.AddRows()
This should reduce memory usage when ingesting big blocks or rows.
2021-05-24 15:24:07 +03:00
Aliaksandr Valialkin
a4ff4b8e65 lib/storage: allow filling all the rows up to their capacity in rawRowsShard.addRows
This should reduce memory usage a bit on data ingestion path
2021-05-24 15:22:59 +03:00
Aliaksandr Valialkin
a46551245c lib/bloomfilter: fix TestLimiterConcurrent 2021-05-24 05:17:36 +03:00
Aliaksandr Valialkin
eafed5335d vendor: update github.com/valyala/gozstd from v1.10.0 to v1.11.0 2021-05-24 04:59:55 +03:00
Aliaksandr Valialkin
93d81b486d lib/fs: do not pass done callback to tryRemoveAll() func
This improves code readability a bit.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-24 04:51:57 +03:00
Aliaksandr Valialkin
f54133b200 lib/storage: do not populate MetricID->MetricName cache during data ingestion
This cache isn't needed during data ingestion, so there is no need in spending RAM on it.

This reduces RAM usage on data ingestion path by 30%
2021-05-24 03:02:46 +03:00
Aliaksandr Valialkin
24858820b5 docs/CHANGELOG.md: small typo fix 2021-05-23 14:15:01 +03:00
Aliaksandr Valialkin
04eb37a590 docs/CHANGELOG.md: document the addition of extra_filter_labels at 84cc0513e1 2021-05-23 14:10:33 +03:00
Aliaksandr Valialkin
ec79abc382 lib/{mergeset,storage}: reduce the number of IFNO log messages like merged ... items across ... blocks in ... seconds
Log these messages if the merge takes more than 30 seconds instead of 10 seconds.
2021-05-23 14:03:21 +03:00
Roman Khavronenko
84cc0513e1 vmalert: support extra_filter_labels setting per-group (#1319)
The new setting `extra_filter_labels` may be assigned to group.
If it is, then all rules within a group will automatically filter
for configured labels. The feature is well-described here
https://docs.victoriametrics.com#prometheus-querying-api-enhancements

New setting is compatible only with VM datasource.
2021-05-23 00:26:01 +03:00
Aliaksandr Valialkin
78dddfb98f lib/promauth: follow-up after 5b8176c68e 2021-05-22 18:01:11 +03:00
Nikolay
5b8176c68e basic OAuth2 support for remoteWrite and scrape targets (#1316)
* adds OAuth2 support for remoteWrite and scrapping

* adds tests
changes init
2021-05-22 16:20:18 +03:00
Aliaksandr Valialkin
e05dd475f0 lib/fs: concurrently remove up to 1024 blocked NFS directories
Previously the blocked directories were removed sequentially by a single goroutine.
This can be not enough for highly loaded VictoriaMetrics that accepts millions of sample per second,
when big number of LSM parts are created and removed at high rate.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:57:46 +03:00
Aliaksandr Valialkin
8e2985b53d lib/fs: wait for a while before giving up on NFS file removal if the removal queue is full
This should reduce the probability of the panic on a highly loaded VictoriaMetrics
accepting millions of samples per second.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1313
2021-05-21 17:21:00 +03:00
Aliaksandr Valialkin
d173a9348c docs/MetricsQL.md: add a link to a list of supported timezones that can be passed to timezone_offset() function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1306
2021-05-21 16:55:24 +03:00
Aliaksandr Valialkin
14ec2b9f26 docs/CHANGELOG.md: mention the bugfix from d626c5c2a9
Updates https://github.com/VictoriaMetrics/operator/issues/243
2021-05-21 16:37:14 +03:00
Aliaksandr Valialkin
c54bb73867 all: do not skip SIGHUP signal during service initialization
This can lead to stale or incomplete configs like in the https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-21 16:34:06 +03:00
Nikolay
d626c5c2a9 changes vmalert query function (#1307)
* changes vmalert query function
for prometheus rules compatibility its better to use labels as map.
it simplifies template evaluation and allow to ignore can't evaluate field error
because map will return default value.
fixes https://github.com/VictoriaMetrics/operator/issues/243
2021-05-21 13:55:43 +03:00
Aliaksandr Valialkin
060664e221 docs/FAQ.md: re-order questions to be more attractive to visitors 2021-05-20 19:49:34 +03:00
Aliaksandr Valialkin
49ecbc765d app/vmauth: add ability to protect /-/reload endpoint with authKey 2021-05-20 18:47:01 +03:00
Aliaksandr Valialkin
362a49bdd1 vendor: make vendor-update 2021-05-20 18:46:26 +03:00
Aliaksandr Valialkin
4c7bb75fa2 Makefile: update golangci-lint from v1.29.0 to v1.40.1 2021-05-20 18:27:10 +03:00
Aliaksandr Valialkin
0d3e78b9ee docs/CHANGELOG.md: move tip to proper place 2021-05-20 17:56:03 +03:00
Aliaksandr Valialkin
480087944a docs/FAQ.md: add a question on how to run VictoriaMetrics on FreeBSD
The question has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1284
2021-05-20 16:16:35 +03:00
Aliaksandr Valialkin
49c39ab388 docs/FAQ.md: add can I use VictoriaMetrics instead of Prometheus?
The question has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1284
2021-05-20 16:10:36 +03:00
Aliaksandr Valialkin
6cc6a032cd docs/FAQ.md: add a question about memory limits for VictoriaMetrics components
The question has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1284
2021-05-20 16:09:41 +03:00
Aliaksandr Valialkin
d06aae9454 docs/FAQ.md: add a question about multi-tenancy
The question has been extracted from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1284
2021-05-20 15:52:40 +03:00
Aliaksandr Valialkin
e394ff6466 app/vmagent/remotewrite: expose metrics with the current number of active series per day and per hour
These numbers are exposed via the following metrics:

- vmagent_hourly_series_limit_current_series
- vmagent_daily_series_limit_current_series

Expose also the limits via the following metrics:

- vmagent_hourly_series_limit_max_series
- vmagent_daily_series_limit_max_series
2021-05-20 15:28:09 +03:00
Aliaksandr Valialkin
ad73f226ff app/vmstorage: add ability to limit series cardinality via -storage.maxHourlySeries and -storage.maxDailySeries command-line flags 2021-05-20 14:15:19 +03:00
Aliaksandr Valialkin
7e526effaa app/vmagent: add ability to limit series cardinality on a per-hour and per-day basis 2021-05-20 13:13:40 +03:00
Aliaksandr Valialkin
3cd8606abd docs/CHANGELOG.md: document the bugfix in vmctl import for InfluxDB lines with identical names for field and tag
See dcf8803bbd

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1299
2021-05-20 12:06:16 +03:00
Aliaksandr Valialkin
98e425ee09 docs/CHANGELOG.md: refer to the issue related to timezone_offset() function 2021-05-20 12:03:39 +03:00
Roman Khavronenko
dcf8803bbd vmctl: explicitly set ::tag type for labels selector in influx mode (#1310)
The `::tag` type is needed in cases when field and tag names are equal, which
results into unexpected results in InfluxQL. Setting the type explicitly helps
InfluxDB to understand which exact column we apply filter to.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1299
2021-05-20 12:03:16 +03:00
Aliaksandr Valialkin
22585531ad lib/promscrape/discovery/kubernetes: make golangci-lint happy by removing empty branches 2021-05-20 12:00:29 +03:00
Aliaksandr Valialkin
0842bb9294 app/vmselect/promql: add timezone_offset(tz) function
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1306
2021-05-20 11:53:09 +03:00
Aliaksandr Valialkin
009e136d88 lib/storage: remove possible data race when logging dropped labels 2021-05-20 02:47:22 +03:00
Aliaksandr Valialkin
f7e73fbe8b app/vmagent/remotewrite: sort labels before sending the series to per-remoteWrite.url queues 2021-05-20 02:12:36 +03:00
Aliaksandr Valialkin
eb8093ca6b lib/promscrape/discovery/kubernetes: reload objects on object parse error
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 23:25:48 +03:00
Neo He
e8a6c6927d app/{vmbackup,vmrestore},docs/vmrestore.md: typo fix: vbackup -> vmbackup (#1305) 2021-05-18 16:37:28 +03:00
Aliaksandr Valialkin
f4719889da lib/httpserver: typo fix in -http.shutdownDelay command-line flag description: servier -> server 2021-05-18 16:26:16 +03:00
Aliaksandr Valialkin
b30925738b docs/vmalert.md: document multitenant support
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/740
2021-05-18 16:26:14 +03:00
Aliaksandr Valialkin
0cf1fe19e6 lib/promscrape/discovery/kubernetes: simplify the reload logic for urlWatcher.objectsByKey 2021-05-18 15:40:46 +03:00
Aliaksandr Valialkin
bfb61de606 lib/promscrape/discovery/kubernetes: properly update vm_promscrape_discovery_kubernetes_scrape_works metric
Previously it wasn't descreased during config update.
2021-05-18 15:33:45 +03:00
Aliaksandr Valialkin
3166994244 lib/promscrape/discovery/kubernetes: log errors and stop service discovery when unexpected updates are received from Kubernetes API server
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-18 15:10:48 +03:00
Aliaksandr Valialkin
66aba00549 app/vmauth: reload -auth.config on the request to /-/reload
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1194
2021-05-18 02:23:55 +03:00
Aliaksandr Valialkin
3339ea41e7 docs/vmbackup.md: typo fix: snaphosts -> snapshots
Thanks to @jelmd - see 1ab27582a3 (r50884395)
2021-05-18 01:12:36 +03:00
Aliaksandr Valialkin
6c944b86d8 docs: dealay -> delay
Thanks to @jelmd . See 0b7e3510c8 (r50884991)
2021-05-18 01:07:52 +03:00
Aliaksandr Valialkin
ec6a284978 lib/promrelabel: add tests for conditional removal of label on another label match
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1294
2021-05-18 00:23:03 +03:00
Aliaksandr Valialkin
2eed410466 lib/promscrape/discovery/kubernetes: key ScrapeWork objects by urlWatcher instead of namespace
This makes the code less fragile if urlWatcher would depend on additional to namepsace properties.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-17 23:47:08 +03:00
Aliaksandr Valialkin
ede2ba5a45 docs/CHANGELOG.md: document b38edec7ee
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1293
2021-05-17 01:57:34 +03:00
Aliaksandr Valialkin
55ed77760b vendor: make vendor-update 2021-05-17 01:47:33 +03:00
Aliaksandr Valialkin
2ec6f81b10 vendor: update github.com/valyala/gozstd from v1.9.0 to v1.10.0 2021-05-17 01:47:33 +03:00
Roman Khavronenko
b96b19f040 Docs update from victoriaMetrics.github.io (#1302)
* port change from 11ca65677b

* port change from afb41dfa43

* port change from f82e3733c9

* port change from d499ab0502
2021-05-16 18:43:09 +01:00
Roman Khavronenko
b38edec7ee vmalert: use stringified label keys for duplicates map in recroding rules (#1301)
duplicates map helps to determine wheter extra labels has overriden
labels which make time series unique. It was using a sorted hashed
labels sequence as a key. But hashing algorithm could have collisions,
so it is more convenient to not use hashing at all.

Log message for recording rules duplicates was improved as well.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1293
2021-05-15 13:25:57 +03:00
Aliaksandr Valialkin
733706e6c6 lib/promscrape: reload auth tokens from files every second
Previously auth tokens were loaded at startup and couldn't be updated without vmagent restart.
Now there is no need in vmagent restart.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1297
2021-05-14 20:00:08 +03:00
Aliaksandr Valialkin
a15145e597 docs/Articles.md: add a link to https://fly.io/blog/measuring-fly/ 2021-05-14 19:47:43 +03:00
Aliaksandr Valialkin
24cbf8b66c docs/vmauth.md: sync with app/vmauth/README.md after 10a47af631 2021-05-14 18:13:10 +03:00
Aliaksandr Valialkin
10a47af631 app/{vmalert,vmauth}: explicitly set MaxIdleConnsPerHost in net/http.Client.Transport
By default MaxIdleConnsPerHost is set to 2. This limits the possibility to re-use http keep-alive connections.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1300
2021-05-14 18:12:24 +03:00
Aliaksandr Valialkin
e6ec442e96 docs/Single-server-VictoriaMetrics.md: document how to reduce memory usage when importing too long JSON lines into VictoriaMetrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1295
2021-05-14 17:22:08 +03:00
Denys Holius
70165a6758 Fix Cortex typo 2021-05-13 16:26:05 +03:00
Aliaksandr Valialkin
a422165dc6 app/vmagent/remotewrite: clarify the comment explaining why vmagent drops blocks if remote storage returns 400 or 409 status code 2021-05-13 16:16:16 +03:00
Aliaksandr Valialkin
fc3519fa26 lib/promscrape: limit scrape_timeout by scrape_interval like Prometheus does 2021-05-13 16:09:45 +03:00
Aliaksandr Valialkin
1f75ae6006 docs/CHANGELOG.md: document the bugfix from b4f5be8bd8 2021-05-13 11:18:39 +03:00
匠心零度
b4f5be8bd8 fix vagent imbalance problem (#1292)
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=1 -promscrape.config=/path/to/config.yml ...
/path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ...

Co-authored-by: lirenzuo <lirenzuo@shein.com>
2021-05-13 11:14:51 +03:00
Aliaksandr Valialkin
e6fda03e8f vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.14 to v1.0.15
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:44:14 +03:00
Aliaksandr Valialkin
f6a641de62 lib/promscrape: exponentially increase retry interval on unsuccesful requests to scrape targets or to service discovery services
This should reduce CPU load at vmagent and at remote side when the remote side doesn't accept HTTP requests.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1289
2021-05-13 10:38:50 +03:00
Aliaksandr Valialkin
c0ec541559 lib/cgroup: document the ability to detect cgroup v2 memory and cpu limits. This is follow-up for b50024812e 2021-05-13 09:26:20 +03:00
Aliaksandr Valialkin
d7be2753c0 lib/storage: substitute GetTSDBStatusForDate with GetTSDBStatusWithFiltersForDate with nil tfss 2021-05-13 09:02:33 +03:00
Nikolay
b50024812e adds cgroupsv2 support (#1283)
* adds cgroupv2 limits support
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1269

* small fix

* changes Atoi to ParseUint
2021-05-13 09:02:13 +03:00
Aliaksandr Valialkin
a22a17dc66 lib/storage: merge getTSDBStatusForDate with getTSDBStatusWithFiltersForDate
These functions are non-trivial, while their code has minimal differences.
It is better from maintainability PoV to merge these functions into a single function.
2021-05-12 17:56:53 +03:00
Aliaksandr Valialkin
beddc0c0d5 docs/Single-server-VictoriaMetrics.md: typo fix: retuns->returns 2021-05-12 17:22:34 +03:00
Aliaksandr Valialkin
832651c6c2 app/vmselect: follow up after 8a0678678b
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1168
2021-05-12 17:18:30 +03:00
Nikolay
8a0678678b Adds tsdb match filters (#1282)
* init work on filters

* init propose for status filters

* fixes tsdb status
adds test

* fix bug

* removes checks from test
2021-05-12 15:18:45 +03:00
Aliaksandr Valialkin
cfd6aa28e1 lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices scrape targets every 5 seconds, since they may depend on changed service and pod objects
This should make endpoints and endpointslices scrape targets eventually consistent with the maximum delay of 5 seconds after the related service or pod object changes.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-12 14:10:34 +03:00
Aliaksandr Valialkin
afec68ad13 lib/httpserver: add new X-Server-Hostname header instead of overwriting already exsiting header
This makes possible tracking origins of chained requests over multiple hops.
2021-05-11 23:48:59 +03:00
Aliaksandr Valialkin
f2d5c4e2d0 lib/httpserver: return X-Server-Hostname http header in all the responses for better debuggability 2021-05-11 22:03:48 +03:00
Aliaksandr Valialkin
33f7bacb01 lib/storage: properly apply time range when matching an empty filter
It must match all the time series on the given time range.
Previously it was matched to all the time series without the restriction on the given time range.
2021-05-11 01:12:05 +03:00
Aliaksandr Valialkin
3fb3ce2a6d Revert ".github/dependabot.yml: remove automated dependency version checks"
This reverts commit 5b986c95dd.

This check verifies only dependencies needed for github-actions. This is OK.
2021-05-10 12:05:09 +03:00
Aliaksandr Valialkin
8b65920a8b Add make check-licenses rule for the ability to manually check licenses in vendored dependencies
This is a follow-up for c687536956
2021-05-10 11:54:43 +03:00
Aliaksandr Valialkin
5b986c95dd .github/dependabot.yml: remove automated dependency version checks
Dependency updates must be under manual control, since the resulting code diffs must be reviewed manually for the sake of security.
It is done with `make vendor-update` now.
2021-05-10 11:41:23 +03:00
Artem Navoiev
c687536956 Add vendor license checker, update codecov action, add dependbot for … (#1280)
* Add vendor license checker, update codecov action, add dependbot for github actions

* update gitingore, temprorary turn on check

* fix action name

* change action rules to trigger only when vendor changes

* remove obsolete line from main action
2021-05-10 11:38:56 +03:00
Aliaksandr Valialkin
229d9d6dd7 docs/CHANGELOG.md: document -datasource.roundDigits added at 5c448126dc 2021-05-10 11:18:26 +03:00
Roman Khavronenko
5c448126dc vmalert: add support for round_digits param in datasource package (#1278)
Starting from v1.56.0 VM supports `round_digits` which allows to limit
the number of digits after the decimal point in response value. The feature
can be used to reduce entropy of produced by recording rules values
and significantly improve the compression. See more details in link below.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/525
2021-05-10 11:11:45 +03:00
Aliaksandr Valialkin
3b0966c00c docs/CHANGELOG.md: document vmalert fix for state restoration on startup 2021-05-10 11:10:04 +03:00
Roman Khavronenko
4247168a2d vmalert: fix error when rule didn't start if restore failed (#1279)
Previously, `startGroup` could exit on restore errors despite the
`remoteRead.ignoreRestoreErrors` flag value. Now vmalert checks the
flag value before deciding whether to return error or just log it.
2021-05-10 11:06:31 +03:00
Aliaksandr Valialkin
6ff19096be docs/vmagent.md: add stream parsing mode chapter 2021-05-08 23:14:07 +03:00
Aliaksandr Valialkin
cbd0569ce2 docs/CHANGELOG.md: mention the comment, which gives an example of multi-level vminsert setup 2021-05-08 22:57:04 +03:00
Aliaksandr Valialkin
120927ab65 vendor: make vendor-update 2021-05-08 20:21:57 +03:00
Aliaksandr Valialkin
75c2c813fc lib/storage: remove dead code after the commit 3ccf7ea20c 2021-05-08 20:14:11 +03:00
Aliaksandr Valialkin
f8d50e9641 deployment/dm: update Go builder from v1.16.3 to v1.16.4
See https://github.com/golang/go/issues?q=milestone%3AGo1.16.4+label%3ACherryPickApproved for details
2021-05-08 20:04:05 +03:00
Aliaksandr Valialkin
7aea5f58c4 lib/ingestserver: properly close incoming connections during graceful shutdown 2021-05-08 19:52:58 +03:00
Aliaksandr Valialkin
12d733dd5d app/vminsert: add support for data ingestion via other vminsert nodes 2021-05-08 19:52:57 +03:00
Aliaksandr Valialkin
237e9f9fd7 app/vmalert: add missing comment for ErrStateRestore 2021-05-08 15:59:35 +03:00
Aliaksandr Valialkin
d6439490f5 docs/Single-server-VictoriaMetrics.md: add links to vmauth and vmgateway as auth proxy examples 2021-05-07 10:46:05 +03:00
Aliaksandr Valialkin
79ec35cef9 app/vmbackup: make sure that -snapshotName isnt set if -snapshot.createURL is set 2021-05-07 08:44:25 +03:00
Aliaksandr Valialkin
d9e3872b1c docs/CHANGELOG.md: document 904bbffc7f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 20:33:05 +03:00
Aliaksandr Valialkin
4bea1afc6d docs/CHANGELOG.md: document 9cdd4696fe
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1252
2021-05-05 20:28:58 +03:00
Aliaksandr Valialkin
904bbffc7f lib/promscrape/discovery/kubernetes: start watchers for pods and services before starting watchers for endpoints
This should eliminate possible race when an update on endpoints depends on pods and/or services, which are missing in the cache yet.
This could result in missing targets based on endpoints or endpointslices.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-05-05 12:23:50 +03:00
Roman Khavronenko
9cdd4696fe vmalert: add flag to control behaviour on startup for state restore errors (#1265)
Alerting rules now can return specific error type ErrStateRestore to indicate
whether restore state procedure failed. Such errors were returned and logged
before as well. But now user can specify whether to just log these errors
(remoteRead.ignoreRestoreErrors=true) or to stop the process
(remoteRead.ignoreRestoreErrors=false). The latter is important when VM isn't
ready yet to serve queries from vmalert and it needs to wait.

https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1252
2021-05-05 10:07:19 +03:00
Denis Fondras
cdd1b06473 Update to OpenBSD 6.9 and VictoriaMetrics 1.59 (#1263)
* Update to OpenBSD 6.9 and VictoriaMetrics 1.58

While at it, also build vmagent

* Bump to v1.59 and remove zstd dependency
2021-05-03 14:14:13 +03:00
Aliaksandr Valialkin
76027093ca lib/storage: use WARNING instead of INFO level for logging dropped labels 2021-05-03 13:56:43 +03:00
Aliaksandr Valialkin
ce9e163e94 lib/httpserver: stop the process on panics in request handlers
Panics may leave the process in inconsistent state. That's why it is better to stop the process after the panic
instead of recovering from the panic. Unfortunately, the standard net/http.Server recovers panics in request handlers.
See https://github.com/golang/go/issues/16542 . That's lib/httpserver must stop the process on itself after the panic.
2021-05-03 11:59:40 +03:00
Aliaksandr Valialkin
988c3b386f docs/CHANGELOG.md: document the bugfix for proper removal of stale parts (477369b62f) 2021-05-03 11:38:15 +03:00
Nikolay
477369b62f adds stalePartsRemover (#1261)
for new created partitions
2021-05-03 11:34:00 +03:00
Aliaksandr Valialkin
9a930720b3 lib/promrelabel: add tests for removing the specified {label="value"} pair 2021-05-03 11:26:53 +03:00
Aliaksandr Valialkin
0969b446b3 deployment/docker: update base docker image from alpine:3.13.2 to alpine:3.13.5 2021-05-01 10:50:09 +03:00
Aliaksandr Valialkin
fb097ff774 docs/CHANGELOG.md: cut v1.59.0 2021-05-01 09:39:48 +03:00
Aliaksandr Valialkin
4e2ea844ca vendor: make vendor-update 2021-05-01 09:39:26 +03:00
Aliaksandr Valialkin
58d1e6eeea lib/storage: log dropped labels if the number of labels in a metric exceeds -maxLabelsPerTimeseries command-line flag value
This should improve debuggability for this case.
2021-05-01 09:27:55 +03:00
Aliaksandr Valialkin
508f500477 docs/vmalert.md: update docs after afca7b430c 2021-04-30 11:49:01 +03:00
Aliaksandr Valialkin
86bdb5ea95 docs/Cluster-VictoriaMetrics.md: document api/v1/series/count endpoint 2021-04-30 11:46:14 +03:00
Roman Khavronenko
0f988e5a31 vmalert: fix the typo in ApplyParams func (#1259) 2021-04-30 10:31:45 +03:00
Roman Khavronenko
afca7b430c vmalert: use rule's evaluationInterval as step param by default (#1258)
User still can override param by specifying `datasource.queryStep` flag.
2021-04-30 10:01:05 +03:00
Aliaksandr Valialkin
4394dc6cbb docs/CHANGELOG.md: document the change from f3a048288e
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1232
2021-04-30 09:54:41 +03:00
Roman Khavronenko
f3a048288e Vmalert: adjust time param for datasource queries according to evaluationInterval (#1257)
* Simplify arguments list for fn `queryDataSource` to improve readbility

* vmalert: adjust `time` param according to rule evaluation interval

With this change, vmalert will start to use rule's evaluation interval
for truncating the `time` param. This is mostly needed to produce consistent
time series with timestamps unaffected by vmalert start time. Now, timestamp
becomes predictable.
Additionally, adjustment is similar to what Grafana does for plotting range graphs.
Hence, recording rule series and recording rule expression plotted in grafana
suppose to become similar in most of cases.
2021-04-30 09:46:03 +03:00
Aliaksandr Valialkin
6fa5981e68 app/vmagent: list user-visible endpoints at http://vmagent:8429/
While at it, use common WriteAPIHelp function for the listing in vmagent, vmalert and victoria-metrics
2021-04-30 09:36:43 +03:00
Aliaksandr Valialkin
2ab1266593 lib/promscrape/discovery/kubernetes: remove a mutex at urlWatcher - use groupWatcher mutex for accessing all the urlWatcher children
This simplifies the code a bit and reduces the probability of improper mutex handling and deadlocks.
2021-04-29 10:14:26 +03:00
Aliaksandr Valialkin
25b8d71df5 vendor: update github.com/klauspost/compress from v1.12.1 to v1.12.2 2021-04-29 10:12:50 +03:00
Nikolay
4e5a88114a vmagent kubernetes_sd tests (#1253)
* first part of tests for kubernetes sd

* makes linter happy

* added more test cases

* adds pub/sub for tests
2021-04-29 10:10:24 +03:00
Nikolay
15609ee447 changes vmalert Querier with per rule querier (#1249)
* changes vmalert Querier with per rule querier
it allows to changes some parametrs based on rule setting
for instance - alert type, tenant for cluster version or event endpoint url.
2021-04-28 21:41:15 +01:00
Aliaksandr Valialkin
87179c6839 lib/{storage,mergeset}: fix unaligned 64-bit atomic operation panic for 32-bit architectures
The panic has been introduced in 56b6b893ce
2021-04-27 16:41:32 +03:00
Aliaksandr Valialkin
56b6b893ce lib/mergeset: split rows ingestion among multiple shards
This improves rows ingestion on systems with many CPU cores by reducing lock contention.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1244

Thanks to @waldoweng for the original idea and draft implementation at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1243
2021-04-27 15:36:34 +03:00
Aliaksandr Valialkin
2ec7d8b384 lib/promscrape/discovery/kubernetes: fix a deadlock introduced in eddba29664
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240

Thanks to @f41gh7 for providing the initial idea for deadlock fix at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1248
2021-04-27 14:57:51 +03:00
Aliaksandr Valialkin
f89c1f7f49 lib/storage: typo fix in info message when deleting the part outside the configured retention
Previously the message was displaying incorrect retention time
2021-04-27 13:32:46 +03:00
Aliaksandr Valialkin
60947fb2d5 lib/persistentqueue: eliminate possible data race when obtaining vm_persistentqueue_bytes_pending metric value 2021-04-27 00:25:52 +03:00
Roman Khavronenko
87018650dd vmalert: keep the returned timestamp when persisting recording rule (#1245)
Previously, vmalert used `lastExecTime` timestamp when writing recording rules
to the remote storage. This may be incorrect, if vmalert uses `datasource.lookback` flag,
which means rule's expression will be executed at some moment in the past.
To avoid such situations, vmalert now will use returned timestamp instead of `lastExecTime`.
2021-04-26 01:03:22 +03:00
Roman Khavronenko
d6f44977a7 docs: update per tenant stats page (#1246) 2021-04-25 09:34:28 +01:00
Aliaksandr Valialkin
00deb69e28 docs: ordering fix 2021-04-24 02:28:53 +03:00
Aliaksandr Valialkin
bf2439f962 vendor: make vendor-update 2021-04-24 01:34:07 +03:00
Aliaksandr Valialkin
20b77abc17 docs: update docs order 2021-04-24 01:27:13 +03:00
Aliaksandr Valialkin
e9898e1772 docs/Single-server-VictoriaMetrics.md: mention that the native export format can change between releases
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1203
2021-04-24 01:22:54 +03:00
Aliaksandr Valialkin
52b4b1605e app/vmagent/remotewrite: increase the maximum possible number of inmemory blocks for systems with high amounts of RAM
This should reduce the probability of using much slower file-based persistent queue
when vmagent processes metrics at high rate (millions of metrics per second).

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1235
2021-04-23 22:03:51 +03:00
Aliaksandr Valialkin
fb37b853e9 app/vmagent/remotewrite: count maxLabelsPerBlock as 10x of maxRowsPerBlock
This should increase block sizes and subsequently increase the maximum possible bandwidth per each connection to remote storage.
This, in turn, should reduce the probability of storing the data in local buffers.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1235
2021-04-23 21:55:47 +03:00
Aliaksandr Valialkin
908e35affd lib/promscrape: apply scrape_timeout on receiving the first response byte for stream_parse: true scrape targets
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1017#issuecomment-767235047
2021-04-23 21:53:35 +03:00
Aliaksandr Valialkin
eddba29664 lib/promscrape/discovery/kubernetes: refresh role: endpoints targets on service object removal as Prometheus does
This is a follow-up for ae37cfd528

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:26:57 +03:00
Aliaksandr Valialkin
ae37cfd528 lib/promscrape/discovery/kubernetes: refresh endpoints and endpointslices targets on service object update like Prometheus does
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
2021-04-23 20:11:40 +03:00
Aliaksandr Valialkin
f7d7236b44 Makefile: remove trailing whitespace 2021-04-23 11:05:57 +03:00
Aliaksandr Valialkin
478c56f281 docs/Cluster-VictoriaMetrics.md: sync with cluster branch 2021-04-22 14:49:16 +03:00
Aliaksandr Valialkin
758de04440 docs: add missing images for vmbackupmanager.md 2021-04-22 14:49:14 +03:00
Aliaksandr Valialkin
4c4c383f30 docs/vmbackupmanager.md: sync with app/vmbackupmanager/README.md 2021-04-22 14:44:07 +03:00
Aliaksandr Valialkin
bbebdf9ba1 lib/{storage,mergeset}: remove empty directories on startup. Such directories can be left after unclean shutdown on NFS storage
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1142
2021-04-22 13:02:44 +03:00
Aliaksandr Valialkin
1ab27582a3 docs/vmbackup.md: sync with app/vmbackup/README.md 2021-04-22 11:18:51 +03:00
Aliaksandr Valialkin
1fbc04facd app/vmbackup: typo fix: snaphsot -> snapshot
Follow-up for 9de0fa3649
2021-04-22 11:16:56 +03:00
Denys Holius
9de0fa3649 fixed typo; added example of usage vmbackup in docker-compose (#1237) 2021-04-22 11:14:13 +03:00
Aliaksandr Valialkin
7c4e460513 app/vmauth: parse url_prefix only once during config load 2021-04-21 10:55:29 +03:00
Aliaksandr Valialkin
6bc52fe41a all: rename https://victoriametrics.github.io to https://docs.victoriametrics.com 2021-04-20 20:16:17 +03:00
Aliaksandr Valialkin
5720b283fa all: consistency renaming Victoria Metrics -> VictoriaMetrics
VMInsert -> vminsert
VMSelect -> vmselect
VMStorage -> vmstorage
2021-04-20 11:45:48 +03:00
Denys Holius
9f9c99c8c0 removed DS_Store files from VM_logo.zip (#1233) 2021-04-20 11:41:07 +03:00
Aliaksandr Valialkin
187e3ec909 app/vmauth: follow-up for 6a81a89b3d 2021-04-20 10:58:29 +03:00
Nikolay
6a81a89b3d adds query params support for vmauth urlPrefix (#1226)
* adds query params support for vmauth urlPrefix

* Update app/vmauth/example_config.yml

* Update app/vmauth/example_config.yml

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-20 10:51:03 +03:00
Roman Khavronenko
b955fe0038 dashboard: use unit short for Labels limit exceeded panel (#1227) 2021-04-19 13:33:21 +03:00
Aliaksandr Valialkin
cc94afeacc docs/Cluster-VictoriaMetrics.md: sync with cluster README.md 2021-04-19 13:31:55 +03:00
Roman Khavronenko
f80156d9df dashboard: fix avg GC duration expression (#1228)
Previous expression was not correct.
2021-04-19 13:28:41 +03:00
Aliaksandr Valialkin
90bba22c25 .github/workflows: remove CODECOV_TOKEN 2021-04-19 11:11:14 +03:00
Aliaksandr Valialkin
b3d88610fd app/vmctl: update README.md according to bfecd0fd55 2021-04-16 12:09:45 +03:00
Denys Holius
650c143f6d removed MACOSX dirrectory from VM_logo.zip (#1224) 2021-04-16 07:30:15 +01:00
Denys Holius
bfecd0fd55 Added some explanation to vmctl docs (#1217)
Added explanation that vmctl do not make snapshot from Prometheus when run migration data
2021-04-15 18:40:48 +01:00
dereksfoster99
cf726448f2 Update PerTenantStatistic.md 2021-04-14 20:40:53 +03:00
Aliaksandr Valialkin
8d244c5f7f vendor: update github.com/klauspost/compress from v1.11.13 to v1.12.1 2021-04-14 14:20:56 +03:00
Aliaksandr Valialkin
574b80f274 docs/Cluster-VictoriaMetrics.md: mention that sometimes -replicationFactor shouldnt be set at vmselect nodes
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1207
2021-04-14 13:07:59 +03:00
ArtemVoitsekhovskyi
403a7b4a1f Grammar-correction (#1210) 2021-04-14 12:45:22 +03:00
ArtemVoitsekhovskyi
294c8b747b Grammar_correction (#1211) 2021-04-14 12:44:26 +03:00
Aliaksandr Valialkin
9b1999a5ff lib/uint64set: remove memory allocation in getOrCreateSmallPool()
This partially reverts fb82c4b9fa

It has been appeared that the additional memory allocation may result in higher GC pauses.
It is better to spend CPU time on copying bigger bucket16 structs instead of increasing query latencies due to higher GC pauses
2021-04-14 12:30:19 +03:00
f41gh7
e5db518c0f updates per-tenant stats docs
changes docs order
changes per-tenant-stats pic
2021-04-14 12:14:51 +03:00
Artem Navoiev
e9ee2122df [draft] per tenant statistic (#121)
* [draft] per tenant statistic

* updates metric name
update graph
adds link and example config

* quick fix

* adds grafana dashboard
adds example alert

Co-authored-by: f41gh7 <nik@victoriametrics.com>
2021-04-14 11:23:07 +03:00
Aliaksandr Valialkin
251747e253 lib/storage: code clarification: remove caching the found metricName in searchMetricName 2021-04-13 10:22:21 +03:00
Aliaksandr Valialkin
663a91bb82 docs: update -help output after the commit 77be3e3a82 2021-04-12 12:34:59 +03:00
Artem Navoiev
77be3e3a82 improve docs for cli flags (#1202)
* improve docs for cli flags

* improve docs for cli flags.2
2021-04-12 12:28:04 +03:00
Aliaksandr Valialkin
0b7e3510c8 docs: make docs-sync 2021-04-10 19:54:18 +03:00
Artem Navoiev
3ed113b390 update vmgateway.md (#1201) 2021-04-10 19:52:33 +03:00
Lapo Luchini
3d20fd20d0 Allow specifying build date from a variable (#1200)
Just like the existing infrastructure for `BUILDINFO_TAG`, this can ease the production of [reproducible builds](https://wiki.freebsd.org/ReproducibleBuilds).
(e.g. in FreeBSD the date the port was committed is used at build time, not the actual build time, so that an identical port produced at different times produces an identical executable)
2021-04-10 15:46:53 +03:00
Roman Khavronenko
7644efae01 Docs update (#1199)
* docs: drop table of contents for `vmctl`

We already have it autogenerated on .github.io, so no need to keep it.

* docs: mention OpenTSDB migration feature for vmctl

* docs: sync docs for `vmalert`
2021-04-10 15:46:08 +03:00
John Seekins
a9d76b06a7 Improve documentation on OpenTSDB migration tool and fix a bug with hard offsets (#1198)
* add more documentation on OpenTSDB migration explaining what chunking means
* more clarification of OpenTSDB aggregations
* break out what a retention string becomes
* add more docs around retention strings
* add example of running program and fix mistake in how hard offsets are handled
* fix formatting
2021-04-10 13:24:18 +01:00
John Seekins
cd9786c2a7 OpenTSDB migration to VictoriaMetrics (#1089) 2021-04-08 20:58:06 +01:00
Roman Khavronenko
162681e60d add new alerts (#1195)
* alerts: backport `DiskRunsOutOfSpace` alert and some other tweaks from cluster branch

* alerts: add `ServiceDown` alert to detect "dead" services
2021-04-08 18:24:25 +03:00
Roman Khavronenko
4ed8de62ac vmalert: document template functions and mention them in README (#1197) 2021-04-08 18:19:08 +03:00
Aliaksandr Valialkin
06e962f141 docs/Cluster-VictoriaMetrics.md: recommend setting up official alerts for VictoriaMetrics components 2021-04-08 12:15:24 +03:00
Aliaksandr Valialkin
fac1e3810a docs/Single-server-VictoriaMetrics.md: recommend using the official alerts for VictoriaMetrics in Monitoring section 2021-04-08 12:12:45 +03:00
Aliaksandr Valialkin
edd1590ac7 dashboards/victoriametrics.json: typo fix: chur rate -> churn rate 2021-04-08 09:35:50 +03:00
Aliaksandr Valialkin
3f0bcbe067 lib/promscrape: create a single swosFunc per scrape_config 2021-04-08 09:31:48 +03:00
Aliaksandr Valialkin
3eadee6cb7 vendor: make vendor-update 2021-04-08 00:58:22 +03:00
Aliaksandr Valialkin
f1a22b097a docs/CHANGELOG.md: cut v1.58.0 2021-04-08 00:48:16 +03:00
Aliaksandr Valialkin
544821b719 app/vmselect/promql: fix tests after d3fa0ccabd 2021-04-08 00:18:01 +03:00
Aliaksandr Valialkin
d3fa0ccabd app/vmselect/promql: properly detect aggregate topk* and bottomk* aggregate functions in order to disable duplicate sorting
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1189
2021-04-08 00:09:40 +03:00
Aliaksandr Valialkin
cb12a8f0a8 app/vmselect: return data:null instead of data:[] from /api/v1/query_exemplars, since Grafana throws an error otherwise
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1186
2021-04-07 23:34:06 +03:00
Aliaksandr Valialkin
5a0938d807 lib/promscrape: do not spend CPU time on constructing scrapeWork key if clustering is disabled 2021-04-07 21:54:22 +03:00
Aliaksandr Valialkin
1177dca3da app/vmselect: do not sort series returned from topk* and bottomk* functions, since these series are already sorted in user-expected order
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1189
2021-04-07 14:16:08 +03:00
Aliaksandr Valialkin
75f309fbbf docs/Cluster-VictoriaMetrics.md: follow-up after c6a8ebb11f 2021-04-07 13:47:46 +03:00
Roman Khavronenko
c6a8ebb11f docs: update docs ordering and formatting (#1192)
The major change is adding `sort` directive to docs. For those docs which are copied
from internal packages `sort` is added via makefile command. For the rest it is added
manually since they're updated manually as well.

The rest of changes is connected with markdown formatting. For example, changing headers
in some files (`##` => `#`) makes navigation on .github.io to look better. This especially
useful for `changelog` docs.

Table of contents for `vmctl` is dropped, since we already have it autogenerated on .github.io.

No link changes expected. The corresponding PR to `cluster` branch will be made in follow-up PR.
2021-04-07 13:39:16 +03:00
Aliaksandr Valialkin
df32d2836c lib/storage: properly handle big time ranges passed to /api/v1/labels and /api/v1/label/<labelName>/values
It should be faster querying all the labels and/or all the values instead of querying per-day labels/values on time ranges exceeding maxDaysForPerDaySearch
2021-04-07 13:33:46 +03:00
Aliaksandr Valialkin
59f9960992 lib/promscrape/discovery: remove superflouos check in registerPendingAPIWatchers
The check `_, ok := uw.aws[aw]; !ok` isn't needed, since aw cannot exist in uw.aws
because of the check inside subscribeAPIWatcher
2021-04-07 13:07:39 +03:00
Aliaksandr Valialkin
3ec6639bbb lib/promscrape/discovery/kubernetes: register pending apiWatchers in uw.aws 2021-04-06 11:12:13 +03:00
Aliaksandr Valialkin
7d23598b33 app/vmselect: return dumb response on /api/v1/query_exemplars request
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1186
2021-04-05 23:25:08 +03:00
Aliaksandr Valialkin
78d35d4f46 Makefile: prepare arm64 and amd64 release archives for cluster version on make release command 2021-04-05 23:03:19 +03:00
Aliaksandr Valialkin
5f593b0ed3 lib/promscrape/discovery/kubernetes: remove superflouos mustStart and mustStop functions 2021-04-05 22:44:12 +03:00
Aliaksandr Valialkin
4a0d06d1db deployment/docker/docker-compose.yml: update Grafana from v7.5.1 to v7.5.2 2021-04-05 22:30:58 +03:00
Lu Jiajing
b59164cf33 fix access to nil *url.URL (#1180)
* fix access to nil *url.URL

Signed-off-by: Megrez Lu <lujiajing1126@gmail.com>

* Update lib/promscrape/discovery/kubernetes/api_watcher.go

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-05 22:25:31 +03:00
Aliaksandr Valialkin
b46194472f lib/promscrape/discovery/kubernetes: reduce CPU time spent on registering big number of Kubernetes objects shared among big number of scrape jobs
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 22:04:30 +03:00
Aliaksandr Valialkin
a51d0ec6ec lib/promscrape/discovery/kubernetes: load objects missing in local cache from api seriver in getObjectByRole()
This should fix possible race for `role: endpoints` and `role: endpointslices` service discovery,
when the referred `pod` and `service` objects aren't propagated to urlWatcher cache yet.

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182#issuecomment-813353359 for details.
2021-04-05 20:31:17 +03:00
Aliaksandr Valialkin
95dbebf512 lib/persistentqueue: delete corrupted persistent queue instead of throwing a fatal error
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1030
2021-04-05 19:26:11 +03:00
Aliaksandr Valialkin
f010d773d6 lib/promscrape/discovery/kubernetes: synchronously load Kubernetes objects on first access
Remove async registration of apiWatchers, since it breaks discovering `role: endpoints` and `role: endpointslices` targets,
which depend on pod and service objects.

There is no need in reloading `endpoints` and `endpointslices` targets if the referenced `pod` or `service` objects change,
since in this case the corresponding `endpoints` and `endpointslices` objects should also change because they contain
ResourceVersion of the referenced `pod` or `service` objects, which is modified on object update.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1182
2021-04-05 14:20:12 +03:00
Aliaksandr Valialkin
2c25b0322b lib/proxy: typo fix after a5c5b54c22 2021-04-05 13:45:24 +03:00
Aliaksandr Valialkin
a5c5b54c22 lib/proxy: add support for socks5 over tls proxy 2021-04-05 13:00:05 +03:00
Aliaksandr Valialkin
6742839fd6 lib/promscrape: pass X-Prometheus-Scrape-Timeout-Seconds header to scrape targets as Prometheus does 2021-04-05 12:15:24 +03:00
Aliaksandr Valialkin
9ff3ecb991 docs/CHANGELOG.md: explain why -sortLabels is set to false by default 2021-04-04 01:59:25 +03:00
Aliaksandr Valialkin
9a97941e2a docs/vmagent.md: mention that vmagent supports scraping via socks5 proxy 2021-04-04 01:45:52 +03:00
Aliaksandr Valialkin
fc2240fb22 docs/CHANGELOG.md: document the ability to use socks5 proxy
Follow-up for a4c6a3b3e1

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1177
2021-04-04 01:42:00 +03:00
Nikolay
a4c6a3b3e1 adds socks5 support for fasthttp client (#1178)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1177

Co-authored-by: Aliaksandr Valialkin <valyala@gmail.com>
2021-04-04 01:29:54 +03:00
Aliaksandr Valialkin
500e625e8c lib/promscrape: properly send full url in GET request via simple HTTP proxy
This is a follow-up for a0ae0f86666a75ec57b45eab2429da7ab4a7b250

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 01:20:06 +03:00
Aliaksandr Valialkin
5153410ced lib/promscrape: support for simple HTTP proxies without CONNECT method support such as https://github.com/prometheus-community/PushProx
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-04 00:40:40 +03:00
Aliaksandr Valialkin
4c56b1a6dd lib/promscrape: add tests for authorization config, which has been added in df148f48b7 2021-04-03 22:13:22 +03:00
Aliaksandr Valialkin
7a0b964e8d app/vmselect/promql: do not delete dst_label if src_label is empty in label_copy(q, src_label, dst_label) and label_move(q, src_label, dst_label) 2021-04-03 22:05:06 +03:00
Aliaksandr Valialkin
6f3080f9fb docs/CHANGELOG.md: document the change from 3055ab0115 (add ability to pass "label=value" to the third argument to topk_* and bottomk_* functions 2021-04-03 21:42:08 +03:00
Aliaksandr Valialkin
e1d1708fa2 docs/Single-server-VictoriaMetrics.md: add missing link in heading to Graphite Render API usage chapter 2021-04-03 21:34:13 +03:00
Aliaksandr Valialkin
43f9842b6f lib/proxy: log response body on non-200 response code
This should improve debuggability for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1179
2021-04-03 03:03:55 +03:00
Aliaksandr Valialkin
2256b79a89 docs/vmgateway.md: update docs 2021-04-03 00:29:19 +03:00
Aliaksandr Valialkin
b8d9d6c326 docs/CHANGELOG.md: yet another typo fix 2021-04-03 00:25:39 +03:00
Aliaksandr Valialkin
22949911e9 docs/CHANGELOG.md: typo fix 2021-04-03 00:23:50 +03:00
Aliaksandr Valialkin
3055ab0115 app/vmselect/promql: add ability to set label value additionally to label name for the remaining sum of time series returned from topk_* and bottomk_* functions in the form: topk_min(N, m, "label=value") 2021-04-02 23:55:54 +03:00
Aliaksandr Valialkin
25e19c75c7 docs/{vmauth,vmgateway}.md: small fixes 2021-04-02 23:15:00 +03:00
Aliaksandr Valialkin
3c1b39c978 app/vmgateway: publish docs 2021-04-02 23:08:41 +03:00
Aliaksandr Valialkin
0db901617d app: do not process non-GET requests on at / handler 2021-04-02 22:54:06 +03:00
Aliaksandr Valialkin
569e58dcdf vendor: make vendor-update 2021-04-02 22:20:17 +03:00
Aliaksandr Valialkin
b1d0028e79 app/vmauth: add support for authorization via Authorization: Bearer <token> 2021-04-02 22:14:53 +03:00
Aliaksandr Valialkin
b88feb631e docs/vmagent.md: mention about proxy_authorization section 2021-04-02 21:24:11 +03:00
Aliaksandr Valialkin
df148f48b7 lib/promscrape: add support for authorization config in -promscrape.config as Prometheus 2.26 does
See https://github.com/prometheus/prometheus/pull/8512
2021-04-02 21:17:45 +03:00
Aliaksandr Valialkin
7f9c68cdcb lib/promscrape: add follow_redirect option to scrape_configs section like Prometheus does
See https://github.com/prometheus/prometheus/pull/8546
2021-04-02 19:56:40 +03:00
Aliaksandr Valialkin
d1dcbfd0f9 deployment/docker: upgrade Go builder from v1.16.2 to v1.16.3
See https://github.com/golang/go/issues?q=milestone%3AGo1.16.3+label%3ACherryPickApproved
2021-04-02 19:22:32 +03:00
Aliaksandr Valialkin
c79e4a2f90 app/vmselect/promql: remove the limit on the number of time series that can be sorted, since it may confuse users
Always sort time series returned from `/api/v1/query` and `/api/v1/query_range` unless `sort_*` function is used at top level of the query.
2021-04-02 15:02:08 +03:00
Aliaksandr Valialkin
5b08e6fb16 lib/promscrape/discovery/kubernetes: properly track objects with the same names in multiple namespaces
This is a follow-up for 12e4785fe8

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:45:32 +03:00
Aliaksandr Valialkin
12e4785fe8 lib/promscrape/discovery/kubernetes: properly discover targets in multiple namespaces
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1170
2021-04-02 14:28:30 +03:00
Aliaksandr Valialkin
3967bd705a docs/CHANGELOG.md: mention about AWS IAM roles for tasks support for EC2 service discovery
Follow-up for fdb8995642
2021-04-02 13:10:03 +03:00
Nikolay
fdb8995642 Adds aws ECS credentials support (#1175) 2021-04-02 11:56:40 +03:00
Aliaksandr Valialkin
9d237408c6 Makefile: add missing -prod suffix to binary names in *_checksums.txt file
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1171
2021-04-02 11:55:31 +03:00
Aliaksandr Valialkin
759c938870 Makefile: properly generate checksums for *.tar.gz files
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1171
2021-04-02 11:50:04 +03:00
Aliaksandr Valialkin
dc9eafcd02 app/{vminsert,vmagent}: add -sortLabels command-line option for sorting time series labels before ingesting them in the storage
This option can be useful when samples for the same time series are ingested with distinct order of labels.
For example, metric{k1="v1",k2="v2"} and metric{k2="v2",k1="v1"}.
2021-03-31 23:27:58 +03:00
Aliaksandr Valialkin
e1f699bb6c lib/storage: reduce memory usage when ingesting samples for the same time series with distinct order of labels 2021-03-31 21:24:46 +03:00
Lapo Luchini
db963205cc Add vmutils-pure target (#1163) 2021-03-31 17:42:08 +03:00
Aliaksandr Valialkin
48275d8c12 app/vmagent/remotewrite: reduce memory usage when -remoteWrite.queues is set to a big value
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1167
2021-03-31 16:16:33 +03:00
Aliaksandr Valialkin
f888d194da docs/Articles.md: add a link to https://blog.kintone.io/entry/2021/03/31/175256 2021-03-31 15:14:35 +03:00
Aliaksandr Valialkin
33622c409c app/vmagent/remotewrite: reduce memory usage when samples with big number of labels are sent to remote storage 2021-03-31 00:44:54 +03:00
Aliaksandr Valialkin
3be0e6b087 docs/Single-server-VictoriaMetrics.md: update victoria-metrics -help output after e7fdea5953 2021-03-30 21:44:54 +03:00
Aliaksandr Valialkin
e7fdea5953 app/vmselect: add -search.maxStatusRequestDuration command-line flag for limiting the duration of requests to /api/v1/status/* and /api/v1/series/count 2021-03-30 21:41:35 +03:00
Denys Holius
2af39a96c5 deployment: Grafana version updated to 7.5.1 (#1161) 2021-03-30 20:43:54 +03:00
Aliaksandr Valialkin
2d3082bb55 docs/CHANGELOG.md: cut v1.57.1 2021-03-30 15:40:06 +03:00
Aliaksandr Valialkin
0fe8f11090 docs/CHANGELOG.md: mention about returned back type label for vm_tenant_inserted_rows_total metric
See 9b4e608199
2021-03-30 15:16:57 +03:00
Aliaksandr Valialkin
d58d5562f1 app/vmselect: remove -search.storageTimeout command-line flag, since it has the same meaning as -search.maxQueryDuration
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2021-03-30 15:04:54 +03:00
Aliaksandr Valialkin
7962cf1af8 app/vmselect: prevent from possible incomplete query results after timed out query
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/711
2021-03-30 13:35:45 +03:00
Aliaksandr Valialkin
65c60cf413 Makefile: build arm binaries on make release-victoria-metrics
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1147
2021-03-29 23:44:39 +03:00
Aliaksandr Valialkin
75991277fa Makefile: build vmutils for arm on make release-vmutils
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1147
2021-03-29 23:15:29 +03:00
Aliaksandr Valialkin
1db1a29ffa all: increase minimum supported Go version for building VictoriaMetrics components from v1.14 to v1.15
This is needed after the commit c0ac740f93, which uses URL.Redacted() method,
which has been added in v1.15.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1147
2021-03-29 23:04:53 +03:00
Aliaksandr Valialkin
947b37ba8e docs/CHANGELOG.md: cut v1.57.0 2021-03-29 19:14:49 +03:00
Aliaksandr Valialkin
e6fd1f7875 docs/CHANGELOG.md: typo fixes 2021-03-29 15:46:35 +03:00
Aliaksandr Valialkin
5c7c0b2bda docs/CHANGELOG.md: mention about logging of metrics with too old timestamps in a single-node VictoriaMetrics
This is a follow up for aa81039b42
2021-03-29 15:42:46 +03:00
Aliaksandr Valialkin
5f4a9f782f docs/CHANGELOG.md: mention Graphite Render API fixes 2021-03-29 14:27:42 +03:00
Aliaksandr Valialkin
602a3d99ae docs/CHANGELOG.md: mention optimized query performance on systems with many CPU cores 2021-03-29 13:55:10 +03:00
Roman Khavronenko
cfdb6762e6 deployment: add new alert TooHighChurnRate24h (#1154)
Alert `TooHighChurnRate24h` suppose to cover cases when churn rate
is low but results in multiple times higher number than total
number of active series.
2021-03-29 12:38:03 +03:00
Roman Khavronenko
b1e49bab52 Dashboards update (#1153)
* dashboard: update single node dashboard

* add number of new series created over last 24h;
* bump version requirements.

* dashboard: update vmagent dashboard

* add panel for open file descriptors;
* add panel for disk I/O;
* add panel for `vmagent_remotewrite_packets_dropped_total` metric;
* bump version requirements.
2021-03-29 12:37:17 +03:00
Aliaksandr Valialkin
b75c2ce659 lib/uint64set: improve Set.Has() performance scalability on multi-CPU system
Do not update bucket32.hint on Set.Has() call, since it leads to memory ping-pong between CPU cores multi-CPU system
2021-03-29 12:33:47 +03:00
Aliaksandr Valialkin
2601cc0fb0 lib/storage: do not update b.nextIdx if no samples are removed because of retention 2021-03-29 12:00:21 +03:00
hagen1778
0c403bfd29 deployment: fix typo in vmalert docker-compose definition 2021-03-29 11:06:32 +03:00
Aliaksandr Valialkin
78188decf9 docs: document that vmagent drops data blocks when remote storage replies with 400 and 409 http status codes
This is a follow up for 1b7dc1e5a5.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1149
2021-03-26 14:44:06 +02:00
Aliaksandr Valialkin
c54cb3e63c app/vmagent/remotewrite: remove superflouos code after 1b7dc1e5a5 2021-03-26 13:59:46 +02:00
Aliaksandr Valialkin
8fc8ef1aba vendor: update github.com/klauspost/compress from v1.11.12 to v1.11.13 2021-03-26 13:57:01 +02:00
Nikolay
1b7dc1e5a5 Adds blocks drop (#1151)
* adds blocks drop at 400 BadRequest status code
recieved from remote storage,
not expected that remote storage will be able to handle it on retry

* removes error logging for dropped blocks,
its expected error
2021-03-26 14:17:59 +03:00
Aliaksandr Valialkin
f39c84b21f lib/promscrape/discovery/kubernetes: typo fix in error message 2021-03-26 12:46:14 +02:00
Aliaksandr Valialkin
9761ffd161 lib/promscrape/discovery/kubernetes: properly handle too old resource version error message from Kubernetes watch API 2021-03-26 12:28:10 +02:00
Aliaksandr Valialkin
aa81039b42 app/vmselect: log the metric which trigger rollup result cache reset
This should help finding the source of stale metrics
2021-03-25 21:31:39 +02:00
Aliaksandr Valialkin
50f790b5d7 docs/Cluster-VictoriaMetrics.md: mention that vmselect doesnt serve partial responses from export API
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1148
2021-03-25 21:04:13 +02:00
Aliaksandr Valialkin
136fc6217c vendor: make vendor-update 2021-03-25 17:56:10 +02:00
Aliaksandr Valialkin
5ec9e49103 docs/vmagent.md: add an example for -remoteWrite.label 2021-03-25 17:54:55 +02:00
Aliaksandr Valialkin
88f6286df7 docs/Cluster-VictoriaMetrics.md: sync with upstream 2021-03-25 17:18:18 +02:00
Aliaksandr Valialkin
f6529f932a docs: add a link to the repository from build instruction for all the VictoriaMetrics components 2021-03-25 17:14:42 +02:00
Aliaksandr Valialkin
2065d11300 docs/vmagent.md: cosmetic fixes 2021-03-25 17:11:19 +02:00
Aliaksandr Valialkin
35fb9bdee1 docs/vmagent.md: cosmetic fixes 2021-03-25 16:54:03 +02:00
Aliaksandr Valialkin
a647144616 docs/vmagent.md: typo fix: tupically -> typically 2021-03-25 16:48:45 +02:00
Aliaksandr Valialkin
c3c3e51f17 docs/vmalert.md: remove misleading -evaluationInterval=3s from example config args
3s evaluation interval is too small for practical setups. It can result in increased load on datasource.
So it is better to remove it from example config args, which are usually copy-pasted by novice users.
2021-03-25 15:29:06 +02:00
Aliaksandr Valialkin
0b2a66db30 app/vmselect/promql: do not merge time series during requests to /api/v1/query
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1141
2021-03-25 13:56:07 +02:00
Aliaksandr Valialkin
6e855d4b82 lib/storage: tune loopsCountPerMetricNameMatch according to production workload 2021-03-25 13:27:47 +02:00
Aliaksandr Valialkin
d4aadba9fa app/vmagent: add -promscrape.consul.waitTime command-line flag for configuring Consul service discovery wait time
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1144
2021-03-23 19:33:25 +02:00
Aliaksandr Valialkin
edc0a94a3c docs/CHANGELOG.md: mention the feature from 44a6cc5eca 2021-03-23 19:00:18 +02:00
Aliaksandr Valialkin
3a3d2165f9 lib/storage: do not reload metricName for the same metricID in Search.NextMetricBlock
This should speed up Search.NextMetricBlock a bit
2021-03-23 17:56:49 +02:00
Aliaksandr Valialkin
9c566f7db9 app/vmagent: mention -remoteWrite.maxDiskUsagePerURL in the descriptio of -remoteWrite.tmpDataPath flag 2021-03-23 16:38:48 +02:00
Nikolay
29f9ef9b7f changes consul_service label value (#1143)
according to prometheus discovery.
 It should mitigate issue with case sensetive services
https://github.com/hashicorp/consul/issues/5707
2021-03-23 15:35:01 +02:00
Aliaksandr Valialkin
331a6a2015 app/vmselect/graphite: accept and enforce extra_label in all the Graphite APIs 2021-03-23 15:29:16 +02:00
Aliaksandr Valialkin
b521d1d4f2 app/vmselect: move getEnforcedTagFiltersFromRequest to searchtuils, since it will be used in Graphite functions soon 2021-03-23 14:16:29 +02:00
Aliaksandr Valialkin
3cfb3a3683 lib/storage: respect the deadline passed to Storage.SearchMetricNames 2021-03-22 23:03:17 +02:00
Aliaksandr Valialkin
8e2afdf568 lib/storage: improve Search.NextMetricBlock performance by using MetricID->MetricName cache 2021-03-22 22:49:18 +02:00
Aliaksandr Valialkin
e17eb35147 docs/Single-server-VictoriaMetrics.md: sync with README.md 2021-03-22 17:51:43 +02:00
Aliaksandr Valialkin
65a61ff118 docs/Articles.md: add https://blog.cybozu.io/entry/2021/03/18/115743 2021-03-22 17:50:52 +02:00
Aliaksandr Valialkin
71b72304ae app/vmselect: improve description for -search.maxPointsPerTimeseries command-line flag 2021-03-22 16:45:34 +02:00
Aliaksandr Valialkin
44a6cc5eca app/{vminsert,vmagent}: use Influx field as metric name if measurement is empty and -influxSkipSingleField command-line is set
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1139
2021-03-22 13:53:53 +02:00
Aliaksandr Valialkin
910092ca4d lib/storage: tune loopsCountPerMetricNameMatch 2021-03-22 12:53:17 +02:00
Aliaksandr Valialkin
cef010d5f7 app/vmselect/promql: increment key prefix for faster reset for rollup result cache 2021-03-22 11:59:07 +02:00
Aliaksandr Valialkin
648d11b8e0 vendor: update github.com/VictoriaMetrics/metrics from v1.17.0 to v1.17.1 2021-03-18 18:53:07 +02:00
Aliaksandr Valialkin
fb83e97170 app/victoria-metrics: use flag.Parse instead of envflag.Parse for avoiding possible side effects of envflag 2021-03-18 18:20:22 +02:00
Aliaksandr Valialkin
b0c956a178 app/vmselect/graphite: follow-up after 529d7be26b 2021-03-18 16:30:20 +02:00
Nikolay
529d7be26b changes metricsFind api (#1137)
it should be able mitigate crash if label value contains *,[ or { symbols
2021-03-18 16:12:02 +02:00
Aliaksandr Valialkin
726f6ad804 lib/storage: small code simplification after 6cee5338b2 2021-03-18 15:21:13 +02:00
Aliaksandr Valialkin
6cee5338b2 lib/storage: prevent from infinite loop if {__graphite__="..."} filter matches a metric name with *, [ or { chars
The idea has been borrowed from https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1137
2021-03-18 14:53:47 +02:00
Aliaksandr Valialkin
e061a4fa19 docs/Single-server-VictoriaMetrics.md: remove outdated message about experimental mode for -pure builds 2021-03-18 13:49:17 +02:00
Aliaksandr Valialkin
4fb049bcba lib/fs: reduce the frequency of failed to remove directory ... due to NFS lock log warnings
Log `failed to remove directory ... due to NFS lock` warning only if the directory cannot be removed in one second.
2021-03-18 13:24:46 +02:00
Aliaksandr Valialkin
17d4a6e900 vendor: update github.com/VictoriaMetrics/metrics from v1.16.0 to v1.17.0 2021-03-17 23:23:04 +02:00
Aliaksandr Valialkin
904dababcc vendor: update github.com/VictoriaMetrics/metrics from v1.15.3 to v1.16.0
This adds the following new metrics for each VictoriaMetrics app:

* process_resident_memory_anonymous_bytes - the RSS share for memory allocated by the process itself.
  This share cannot be freed by the OS, so it must be taken into account by OOM killer.

* process_resident_memory_pagecache_bytes - the RSS share for page cache memory (aka memory-mapped files).
  This share can be freed by the OS at any time, so it must be ignored by OOM killer.
2021-03-17 17:59:40 +02:00
Aliaksandr Valialkin
45dabfac1b lib/storage: faster move heavy filters to the end of list 2021-03-17 15:12:13 +02:00
Aliaksandr Valialkin
b1713e3fcd app/vmselect/promql: typo fix after 9666834045 2021-03-17 15:12:11 +02:00
Aliaksandr Valialkin
9666834045 app/vmselect/promql: merge adjancent buckets with the smallest summary number of hits in buckets_limit() function
This should improve accuracy for the returned buckets
2021-03-17 14:31:41 +02:00
Aliaksandr Valialkin
20ac89c4e0 docs/CHANGELOG.md: cut v1.56.0 2021-03-17 02:04:55 +02:00
Aliaksandr Valialkin
bd0c6a095e docs/CHANGELOG.md: do not mention reduction in query duration
There was a signficant refactoring in the code responsible for time series search,
so it can result in both speed ups and slow downs depending on used queries.
2021-03-17 01:56:58 +02:00
Aliaksandr Valialkin
7bc728bf53 app/vmselect: add vm_index_search_duration_seconds histogram for monitoring the performance of index search 2021-03-17 01:17:41 +02:00
Aliaksandr Valialkin
828669e4e1 all: make golint happy 2021-03-17 00:49:28 +02:00
Aliaksandr Valialkin
ccfb0ae2d3 lib/storage: limit loops count in order to reduce max CPU usage during filter search 2021-03-17 00:49:26 +02:00
Aliaksandr Valialkin
576a80b3d9 lib/storage: do not modify filterLoopsCount stats with loopsCount stats
Such a modification can result in incorrect filter sorting later
2021-03-17 00:49:26 +02:00
Aliaksandr Valialkin
f104f3eb2a all: make golangci-lint happy after the commit 6378205415 2021-03-17 00:24:40 +02:00
Aliaksandr Valialkin
6378205415 lib/netutil: enable IPv6 UDP listening if -enableTCP6 command-line flag is passed to VictoriaMetrics
This is a follow-up for 18cfc4be7b

See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:16:17 +02:00
Nikolay
18cfc4be7b Adds udp6 support for ingest servers (#1134)
with flag -enableUDP6  https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1131
2021-03-17 00:03:06 +02:00
Aliaksandr Valialkin
0ce557951f app/vmselect/netstorage: reduce mutex contention when unpacking data on a system with high number of CPU cores 2021-03-16 21:51:31 +02:00
Aliaksandr Valialkin
35bb44d317 lib/uint64set: optimize bucket16.addMulti a bit 2021-03-16 21:09:07 +02:00
Aliaksandr Valialkin
aba955fa16 Makefile: prepare vmutils-windows-*.zip archive on make release-vmutils command
The archive contains the following executables for Windows:

* vmagent
* vmalert
* vmauth
* vmctl

Other components - vmbackup, vmrestore, victoria-metrics - aren't supported for Windows yet
2021-03-16 20:52:41 +02:00
Aliaksandr Valialkin
fd86a7dc1d lib/storage: time series search optimization according to production workload profiling
Do not pass filter metric ids to getMetricIDsForTagFilter, since it has been appeared that this slows down
the function by multiple times when it finds big number of metricIDs (tens of millions).
2021-03-16 20:01:43 +02:00
Aliaksandr Valialkin
264d3432ac vendor: make vendor-update 2021-03-16 19:08:36 +02:00
Aliaksandr Valialkin
e36fbfae5b lib/storage: further tuning for time series search 2021-03-16 18:46:22 +02:00
Aliaksandr Valialkin
f0a4157f89 app/vmselect/promql: do not crash if histogram_over_time() function name contains uppercase letters such as Histogram_over_time() 2021-03-16 12:24:21 +02:00
Aliaksandr Valialkin
645cf6746c vendor: update github.com/VictoriaMetrics/metrics from v1.15.2 to v1.15.3 2021-03-16 12:15:50 +02:00
Aliaksandr Valialkin
031d256810 docs/Articles.md: added a link to https://www.youtube.com/watch?v=QgLMztnj7-8 2021-03-15 23:01:49 +02:00
Aliaksandr Valialkin
dd7e82c34f app/vmstorage: add -logNewSeries command-line flag for determining the source of series churn rate 2021-03-15 22:38:50 +02:00
Aliaksandr Valialkin
37323c57c9 lib/influxutils: return response compatible with InfluxDB 1.8.4
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 22:19:59 +02:00
f41gh7
acf2a82d3c typo fix 2021-03-15 23:03:52 +03:00
Aliaksandr Valialkin
85a95bf60c all: various fixes in command-line flag descriptions 2021-03-15 21:59:25 +02:00
Aliaksandr Valialkin
7c37e9aea9 app/{vminsert,vmagent}: a follow-up for b1aa8c3d8f
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 21:37:55 +02:00
Nikolay
b1aa8c3d8f adds fake response for telegraph queries (#1130)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1124
2021-03-15 21:10:47 +02:00
Aliaksandr Valialkin
9c77e34ef9 lib/storage: further tuning for time series selector code 2021-03-15 20:31:34 +02:00
Aliaksandr Valialkin
923cdb0552 app/vmselect/promql: reduce overhead on scrape interval estimation
It should be enough to use the first 20 datapoints instead of 100 datapoints for scrape interval estimation.
2021-03-15 20:31:33 +02:00
Aliaksandr Valialkin
82aab87446 app/vmselect/promql: fix tests after 2dae0a2c47 2021-03-15 20:18:59 +02:00
Aliaksandr Valialkin
fb82c4b9fa lib/uint64set: reduce the size of bucket16 by storing smallPool by pointer.
This reduces CPU time spent on bucket16 copying inside bucket13.addBucketAtPos.
2021-03-15 17:23:31 +02:00
Aliaksandr Valialkin
eb103e1527 lib/uint64set: optimize Set.AddMulti for large sorted sets 2021-03-15 17:10:40 +02:00
Aliaksandr Valialkin
43504ebd14 lib/uint64set: optimize bucket16.add and bucket16.addMulti a bit 2021-03-15 16:58:12 +02:00
Aliaksandr Valialkin
fb935e6e2c docs/CHANGELOG.md: typo fix: FATURE -> FEATURE 2021-03-15 14:58:19 +02:00
Aliaksandr Valialkin
3ccf7ea20c lib/storage: tune per-day index search 2021-03-15 13:31:55 +02:00
Aliaksandr Valialkin
2dae0a2c47 app/vmselect: add round_digits query arg to /api/v1/query and /api/v1/query_range handlers for limiting the number of decimal digits after the point 2021-03-15 12:36:33 +02:00
Roman Khavronenko
b457739f87 Single dashboard (#1126)
* dashboard: update single node dashboard

* add panel `Open FDs` for file descriptors metrics;
* add panel `Disk writes/reads` to show the real read/write
load on storage layer;
* add `process_resident_memory_bytes` metric to memory usage panel;
* add stats panel to show available CPUs, memory and disk space;
* rm flags panel since it didn't prove its usefulness.

* alerts: add alert for reaching FDs limit
2021-03-15 12:04:24 +02:00
Aliaksandr Valialkin
6d91842c83 docs/Cluster-VictoriaMetrics.md: sync with cluster README.md 2021-03-15 11:51:40 +02:00
Aliaksandr Valialkin
c14dafce43 lib/promscrape: an attempt to reduce memory usage when vmagent scrapes targets with varying number of metrics
Do not cache too big byte buffers and too big writeRequestCtx objects,
since it is cheaper to re-create them instead of wasting RAM for their caching.

This reverts 7f6f350ee1
2021-03-15 11:45:39 +02:00
Aliaksandr Valialkin
7f6f350ee1 lib/promscrape: return back the logic for flushing big buffers to storage from the commit 3fd8653b40
This should reduce memory usage when vmagent scrapes targets with big number of metrics and `-promscrape.streamParse` isn't enabled
2021-03-14 22:26:00 +02:00
Aliaksandr Valialkin
b88806ecbf lib/promscrape/discovery/kubernetes: do not start object watcher until initial objects are loaded 2021-03-14 21:55:00 +02:00
Aliaksandr Valialkin
83edbb7cab lib/promscrape: retry service discovery in a few seconds if it starts returning 0 targets
This should reduce recovery time from temporary issues during service discovery
2021-03-14 21:53:23 +02:00
Aliaksandr Valialkin
bf15d6a6a2 lib/promscrape: remove duplicate target word in error message 2021-03-14 21:52:02 +02:00
Aliaksandr Valialkin
d409898515 lib/promscrape/discovery/kubernetes: further optimize kubernetes service discovery for the case with many scrape jobs
Do not re-calculate labels per each scrape job - reuse them instead for scrape jobs with identical Kubernetes role

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-14 21:14:53 +02:00
Aliaksandr Valialkin
7a16e8e3a2 lib/promscrape/discovery: fixes after 133b288681
- Removed a deadlock in addAPIWatcher
- Do not create unused ScrapeWork objects
- Do not spend CPU resources on creating objectByKey map in addAPIWatcher

This work is based on https://github.com/VictoriaMetrics/VictoriaMetrics/pull/1125
2021-03-13 15:18:51 +02:00
Aliaksandr Valialkin
2096c6e464 app/vmselect/prometheus: typo fix after 7c002023d7 2021-03-12 12:19:36 +02:00
Aliaksandr Valialkin
7c002023d7 app/vmselect/prometheus: do not include datapoints with timestamps matching t-d when returning results from /api/v1/query?query=m[d]&time=t as Prometheus does 2021-03-12 12:16:50 +02:00
Aliaksandr Valialkin
43552fa8d3 docs/CaseStudies.md: fix incorrect number of active time series for Zhihu 2021-03-12 11:45:38 +02:00
Aliaksandr Valialkin
def014eb75 lib/promscrape/discovery/kubernetes: remove debug lines left after the commit 133b288681 2021-03-12 11:22:33 +02:00
Aliaksandr Valialkin
ca4d5ce037 lib/proxy: there is no need in cloning tlsCfg, which has been created two lines above 2021-03-12 10:47:02 +02:00
Aliaksandr Valialkin
895d5d1355 lib/proxy: set proxy address in tls.Config.ServerName instead of the target address
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 10:41:33 +02:00
Aliaksandr Valialkin
a6a71ef861 lib/promscrape: add ability to configure proxy options via proxy_tls_config, proxy_basic_auth, proxy_bearer_token and proxy_bearer_token_file
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-12 03:36:19 +02:00
Aliaksandr Valialkin
fa448806a5 deployment/docker: update Go builder from 1.16.1 to 1.16.2
See https://github.com/golang/go/issues?q=milestone%3AGo1.16.2+label%3ACherryPickApproved
2021-03-12 01:53:48 +02:00
Aliaksandr Valialkin
f3d712724c docs/Articles.md: add https://www.sensedia.com/post/monitoring-with-prometheus-alertmanager 2021-03-12 01:22:51 +02:00
Aliaksandr Valialkin
f669531506 lib/storage: further tune filters sorting logic 2021-03-12 00:53:04 +02:00
Aliaksandr Valialkin
133b288681 lib/promscrape/discovery/kubernetes: use a single watcher per apiURL
Previously multiple scrape jobs could create multiple watchers for the same apiURL. Now only a single watcher is used.
This should reduce load on Kubernetes API server when many scrape job configs use Kubernetes service discovery.
2021-03-11 16:43:04 +02:00
Aliaksandr Valialkin
dc0cb54d41 deployment/docker: update Go builder from 1.16.0 to 1.16.1
See https://github.com/golang/go/issues?q=milestone%3AGo1.16.1+label%3ACherryPickApproved
2021-03-11 16:43:03 +02:00
Aliaksandr Valialkin
c0ac740f93 lib/proxy: do not show inline basic auth passwords when logging errors related to proxy_url 2021-03-11 13:43:36 +02:00
Aliaksandr Valialkin
bebcb8130c lib/promscrape/discovery/kubernetes: localize Bookmark parsing code
This is a follow-up for e772d1c920
2021-03-11 13:08:08 +02:00
Aliaksandr Valialkin
971e3d83f7 docs/ExttendedPromQL.md: remove outdated doc 2021-03-11 12:41:20 +02:00
Brensted
238fe7d4e8 Update BestPractices.md (#1123)
update lists, hyperlinks fixed.
2021-03-11 11:39:14 +02:00
Aliaksandr Valialkin
e772d1c920 lib/promscrape/discovery/kubernetes: reduce load on Kubernetes API server by using watch bookmarks
This allows continuing object watch from the last bookbark instead of reloading all the objects
on watch errors or timeouts.

See https://kubernetes.io/docs/reference/using-api/api-concepts/#watch-bookmarks
2021-03-10 15:06:35 +02:00
Aliaksandr Valialkin
bc5c4add89 lib/httpserver: export vm_available_memory_bytes and vm_available_cpu_cores metrics
These metrics are useful for tracking the available memory and CPU cores for VictoriaMetrics apps.
2021-03-10 12:02:42 +02:00
Brensted
e02265cfa7 Add files via upload 2021-03-10 10:16:59 +02:00
Brensted
314f28fb38 Add files via upload 2021-03-10 10:14:34 +02:00
Brensted
45ede6ba98 Add files via upload 2021-03-10 10:08:38 +02:00
Brensted
09dc49e942 Add files via upload 2021-03-10 10:06:49 +02:00
Brensted
96dbe9bcbd Add files via upload 2021-03-10 10:05:18 +02:00
Ihor Borodin
d212f35ae8 Fixing examples of external.alert.source in documentation (#1120)
* Fixing examples of external.alert.source in documentation
2021-03-09 20:49:50 +00:00
Aliaksandr Valialkin
48fb0d1c4b vendor: update github.com/VictoriaMetrics/fasthttp from v1.0.13 to v1.0.14 2021-03-09 21:34:07 +02:00
Aliaksandr Valialkin
2728b25783 docs/CHANGELOG.md: mention about the bugfix from 787242d7b0 2021-03-09 20:56:06 +02:00
Aliaksandr Valialkin
787242d7b0 lib/proxy: pass proxy hostname in Host header of the CONNECT request
This should resolve the following issue when connecting to tls proxy:

  cannot validate certificate for ... because it doesn't contain any IP SANs
2021-03-09 20:39:40 +02:00
Aliaksandr Valialkin
36fd007247 lib/proxy: set missing ServerName in TLS config for proxy_url.
While at it, allow setting Proxy-Authorization for `proxy_url` via `basic_auth` and `bearer_token` configs.
2021-03-09 18:58:18 +02:00
Nikolay
ad34f42467 Changes tlsConfig init for proxy connections (#1121)
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1116
2021-03-09 18:51:00 +02:00
Aliaksandr Valialkin
3fd8653b40 lib/promscrape: apply sample_limit after metric relabeling is applied as Prometheus does
See the description for `sample_limit` option from Prometheus docs:

Per-scrape limit on number of scraped samples that will be accepted.
If more than this number of samples are present after metric relabeling
the entire scrape will be treated as failed. 0 means no limit.

https://prometheus.io/docs/prometheus/latest/configuration/configuration/#scrape_config
2021-03-09 15:47:18 +02:00
Aliaksandr Valialkin
4b18a4f026 lib/promscrape/discovery/kubernetes: remove too verbose logs about starting and stopping the watchers
Log the number of objects loaded per each watch url

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-09 15:05:14 +02:00
Aliaksandr Valialkin
e44532b760 docs/CHANGELOG.md: mention about improved query performance at 18fe0ff14b 2021-03-09 13:12:25 +02:00
Aliaksandr Valialkin
fe8b12fbad app/vmselect/promql: follow up for 433fff0006 2021-03-09 12:48:44 +02:00
Nikolay
433fff0006 duplicate timeseries fix for prometheus_buckets function (#1119)
* try fix for prometheus_buckets

* merge possible end of the bucket collision
2021-03-09 12:26:23 +02:00
Aliaksandr Valialkin
534944b671 vendor: make vendor-update 2021-03-09 12:01:55 +02:00
Aliaksandr Valialkin
f6878eac36 vendor: update github.com/VictoriaMetrics/fasthttp from 1.0.12 to 1.0.13
This should fix a bug in vmagent with high CPU usage during failed scrapes with small `scrape_timeout`.
2021-03-09 11:45:02 +02:00
John Belmonte
364fdf4a56 spelling fix: adjacent (#1115) 2021-03-09 09:18:19 +02:00
Aliaksandr Valialkin
14a399dd06 lib/promscrape: add scrape_offset option to scrape_config
This option can be used for specifying the particular offset per each scrape interval for target scraping
2021-03-08 12:03:33 +02:00
Aliaksandr Valialkin
345980f78f lib/storage: go fmt 2021-03-08 12:03:31 +02:00
Aliaksandr Valialkin
18fe0ff14b lib/storage: tune loopsCount estimations in getMetricIDsForTagFilterSlow
The adjusted estmations give up to 2x lower median response times on 200qps /api/v1/query_range workload
2021-03-07 21:12:35 +02:00
Aliaksandr Valialkin
ab4f090c63 lib/promscrape/discovery/kubernetes: reduce memory usage further when big number of scrape jobs are configured for the same kubernetes_sd_config role
Serialize reloading per-role objects, so they don't occupy too much memory when objects for many scrape jobs are simultaneously refreshed.
Do not reload per-role objects if they were already refreshed by concurrent goroutines. This should reduce load on Kubernetes API server
when big number of scrape jobs are configured for the same Kubernetes role.

This is a follow-up for 17b87725ed

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113
2021-03-07 19:51:03 +02:00
Aliaksandr Valialkin
1187ee5e16 lib/decimal: prevent exponent overflow when processing values close to zero
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1114
2021-03-05 18:52:47 +02:00
Aliaksandr Valialkin
47ac2051bb app/vmauth: allow using regexps in url_map paths
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1112
2021-03-05 18:21:36 +02:00
Aliaksandr Valialkin
17b87725ed lib/promscrape/discovery/kubernetes: reduce memory usage when Kubernetes service discovery is configured on a big number of scrape jobs
Previously vmagent was creating a separate Kubernetes object cache per each scrape job.
This could result in increased memory usage when monitoring a Kubernetes cluster with big number of objects (pods / nodes / services, etc.)
as seen at https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1113

Now it uses a shared map of scrape objects across multiple scrape jobs.
2021-03-05 17:29:55 +02:00
Aliaksandr Valialkin
c9a25e931b lib/promscrape/discovery/kubernetes: move apiWatcher code to a separate file 2021-03-05 12:36:05 +02:00
Aliaksandr Valialkin
536d59914a deployment/docker: update base Docker image from alpine:3.13.1 to alpine:3.13.2
See https://www.alpinelinux.org/posts/Alpine-3.13.2-released.html
2021-03-05 10:35:10 +02:00
Aliaksandr Valialkin
68b8c48c86 docs/Articles.md: add https://dalefro.medium.com/prometheus-victoria-metrics-on-aws-ecs-62448e266090 2021-03-05 09:06:44 +02:00
Aliaksandr Valialkin
444fca89f8 lib/promscrape: make cluster membership calculations consistent across 32-bit and 64-bit architectures 2021-03-05 09:06:17 +02:00
Aliaksandr Valialkin
a14053ffa0 app/vmselect/promql: add histogram_avg(), histogram_stddev() and histogram_stdvar() functions to MetricsQL 2021-03-04 14:12:07 +02:00
Aliaksandr Valialkin
423cd981fb lib/promscrape: add -promscrape.cluster.replicationFactor command-line flag for replicating scrape targets among vmagent instances in the cluster 2021-03-04 10:20:15 +02:00
Aliaksandr Valialkin
d962cdbc13 docs/Single-server-VictoriaMetrics.md: mention that VictoriaMetrics needs free resources for handling workload spikes 2021-03-04 09:57:53 +02:00
Aliaksandr Valialkin
9a48c1b53d lib/promscrape/discovery/kubernetes: fix tests after e154f4a644 2021-03-03 22:41:30 +02:00
Aliaksandr Valialkin
201b685b13 all: bump minimum supported Go version from 1.13 to 1.14 2021-03-03 15:57:13 +02:00
Aliaksandr Valialkin
90d6e94e5b vendor: update github.com/VictoriaMetrics/fastcache from v1.5.7 to v1.5.8 2021-03-03 15:52:51 +02:00
Aliaksandr Valialkin
f391e5a3a0 docs/vmagent.md: remove outdated suggestion for determining labels that lead to duplicate targets
The original labels for duplicate targets is already printed in the error message starting from 71ea4935de
2021-03-03 12:26:11 +02:00
Aliaksandr Valialkin
3a68b94487 docs/CHANGELOG.md: cut v1.55.1 release 2021-03-03 11:49:10 +02:00
Aliaksandr Valialkin
467cb9e3d1 docs/vmagent.md: make docs-sync after the commit 621bf03745 2021-03-03 10:52:40 +02:00
Roman Khavronenko
621bf03745 Vmagent docs upd (#1104)
* vmagent: port changes from https://github.com/VictoriaMetrics/VictoriaMetrics.github.io/pull/1

Thanks to @dereksfoster99 for this patch!

* vmagent: reword to make the meaning clear
2021-03-03 10:51:51 +02:00
Aliaksandr Valialkin
4c3ef78c05 docs/CHANGELOG.md: mention recent bugfixes from commits 7906316741 and e154f4a644 2021-03-03 10:50:31 +02:00
Aliaksandr Valialkin
e09a245b2b app/vmalert/README.md: sync with docs/vmalert.md 2021-03-03 10:43:59 +02:00
Nikolay
e154f4a644 Fix ingress discovery api (#1110) 2021-03-03 10:43:39 +02:00
Aliaksandr Valialkin
7906316741 lib/promscrape/discovery/kubernetes: properly check for nil pointer inside interface
See https://mangatmodi.medium.com/go-check-nil-interface-the-right-way-d142776edef1

This fixes a panic when the ScrapeWork is filtered out in swcFunc.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1108
2021-03-03 10:33:17 +02:00
Alexandros Orfanos
f4f8f21875 vmalert: docs update - status endpoint needs group ID, not group name (#1106) 2021-03-03 07:46:10 +00:00
1605 changed files with 200175 additions and 84415 deletions

View File

@@ -4,3 +4,4 @@ gocache-for-docker
victoria-metrics-data
vmstorage-data
vmselect-cache
.vscode

View File

@@ -4,14 +4,14 @@ about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
It would be great [upgrading](https://victoriametrics.github.io/#how-to-upgrade) to [the latest avaialble release](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
It would be a great [upgrading](https://docs.victoriametrics.com/#how-to-upgrade)
to [the latest available release](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
and verifying whether the bug is reproducible there.
It is also recommended reading [troubleshooting docs](https://victoriametrics.github.io/#troubleshooting).
It is also recommended reading [troubleshooting docs](https://docs.victoriametrics.com/#troubleshooting).
**To Reproduce**
Steps to reproduce the behavior.
@@ -19,9 +19,22 @@ Steps to reproduce the behavior.
**Expected behavior**
A clear and concise description of what you expected to happen.
**Logs**
Check if any warnings or errors were logged by VictoriaMetrics components
or components in communication with VictoriaMetrics (e.g. Prometheus, Grafana).
**Screenshots**
If applicable, add screenshots to help explain your problem.
For VictoriaMetrics health-state issues please provide full-length screenshots
of Grafana dashboards if possible:
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/dashboards/10229)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176)
See how to setup monitoring here:
* [monitoring for single-node VictoriaMetrics](https://docs.victoriametrics.com/#monitoring)
* [montioring for VictoriaMetrics cluster](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#monitoring)
**Version**
The line returned when passing `--version` command line flag to binary. For example:
```
@@ -30,15 +43,5 @@ victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55
```
**Used command-line flags**
Command-line flags are listed as `flag{name="httpListenAddr", value=":443"} 1` lines at `/metrics` page.
See the following docs for details:
Please provide applied command-line flags used for running VictoriaMetrics and its components.
* [monitoring for single-node VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#monitoring)
* [montioring for VictoriaMetrics cluster](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/cluster/README.md#monitoring)
**Additional context**
Add any other context about the problem here such as error logs from VictoriaMetrics and Prometheus,
`/metrics` output, screenshots from the official Grafana dashboards for VictoriaMetrics:
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/dashboards/10229)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176)

26
.github/dependabot.yml vendored Normal file
View File

@@ -0,0 +1,26 @@
version: 2
updates:
- package-ecosystem: "github-actions"
directory: "/"
schedule:
interval: "daily"
- package-ecosystem: "gomod"
directory: "/"
schedule:
interval: "weekly"
- package-ecosystem: "bundler"
directory: "/docs"
schedule:
interval: "daily"
- package-ecosystem: "gomod"
directory: "/app/vmui/packages/vmui/web"
schedule:
interval: "weekly"
- package-ecosystem: "docker"
directory: "/"
schedule:
interval: "daily"
- package-ecosystem: "npm"
directory: "/app/vmui/packages/vmui"
schedule:
interval: "weekly"

23
.github/workflows/check-licenses.yml vendored Normal file
View File

@@ -0,0 +1,23 @@
name: license-check
on:
push:
paths:
- 'vendor'
pull_request:
paths:
- 'vendor'
jobs:
build:
name: Build
runs-on: ubuntu-latest
steps:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.17
id: go
- name: Code checkout
uses: actions/checkout@master
- name: Check License
run: |
make check-licenses

View File

@@ -1,30 +0,0 @@
name: github-pages
on:
push:
paths:
- 'docs/*'
- 'README.md'
branches:
- master
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master
- name: publish
shell: bash
env:
TOKEN: ${{secrets.CI_TOKEN}}
run: |
git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.github.io.git gpages
cp docs/* gpages
cp README.md gpages
cd gpages
git config --local user.email "info@victoriametrics.com"
git config --local user.name "Vika"
git add .
git commit -m "update github pages"
remote_repo="https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.github.io.git"
git push "${remote_repo}"
cd ..
rm -rf gpages

View File

@@ -16,15 +16,15 @@ jobs:
- name: Setup Go
uses: actions/setup-go@main
with:
go-version: 1.16
go-version: 1.17
id: go
- name: Dependencies
run: |
go get -u golang.org/x/lint/golint
go get -u github.com/kisielk/errcheck
curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v1.29.0
- name: Code checkout
uses: actions/checkout@master
- name: Dependencies
run: |
make install-golint
make install-errcheck
make install-golangci-lint
- name: Build
env:
GO111MODULE: on
@@ -60,8 +60,7 @@ jobs:
GOOS=darwin go build -mod=vendor ./app/vmctl
CGO_ENABLED=0 GOOS=windows go build -mod=vendor ./app/vmagent
- name: Publish coverage
uses: codecov/codecov-action@v1.0.6
uses: codecov/codecov-action@v2.1.0
with:
token: ${{secrets.CODECOV_TOKEN}}
file: ./coverage.txt

View File

@@ -16,7 +16,7 @@ jobs:
TOKEN: ${{secrets.CI_TOKEN}}
run: |
git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git wiki
cp docs/* wiki
cp -r docs/* wiki
cd wiki
git config --local user.email "info@victoriametrics.com"
git config --local user.name "Vika"

5
.gitignore vendored
View File

@@ -4,6 +4,7 @@
*.pprof
/bin
.idea
.vscode
*.test
*.swp
/gocache-for-docker
@@ -15,3 +16,7 @@
/package/temp-rpm-*
/package/*.deb
/package/*.rpm
.DS_store
Gemfile.lock
/_site
_site

5
.wwhrd.yml Normal file
View File

@@ -0,0 +1,5 @@
allowlist:
- Apache-2.0
- MIT
- BSD-3-Clause
- BSD-2-Clause

View File

@@ -7,7 +7,7 @@ contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.
appearance, race, religion or sexual identity and orientation.
## Our Standards
@@ -24,9 +24,9 @@ Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Trolling, insulting/derogatory comments and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
* Publishing others' private information, such as physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
@@ -38,26 +38,26 @@ behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
reject comments, commits, code, wiki edits, issues and other contributions
that are not aligned to this Code of Conduct or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
threatening, offensive or harmful.
## Scope
This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
address, posting via an official social media account or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
Instances of abusive, harassing or otherwise unacceptable behavior may be
reported by contacting the project team at info@victoriametrics.com. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
is deemed necessary and appropriate for the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.

View File

@@ -175,7 +175,7 @@
END OF TERMS AND CONDITIONS
Copyright 2019-2021 VictoriaMetrics, Inc.
Copyright 2019-2022 VictoriaMetrics, Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

114
Makefile
View File

@@ -1,5 +1,6 @@
PKG_PREFIX := github.com/VictoriaMetrics/VictoriaMetrics
DATEINFO_TAG ?= $(shell date -u +'%Y%m%d-%H%M%S')
BUILDINFO_TAG ?= $(shell echo $$(git describe --long --all | tr '/' '-')$$( \
git diff-index --quiet HEAD -- || echo '-dirty-'$$(git diff-index -u HEAD | openssl sha1 | cut -c 10-17)))
@@ -8,7 +9,7 @@ ifeq ($(PKG_TAG),)
PKG_TAG := $(BUILDINFO_TAG)
endif
GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(shell date -u +'%Y%m%d-%H%M%S')-$(BUILDINFO_TAG)'
GO_BUILDINFO = -X '$(PKG_PREFIX)/lib/buildinfo.Version=$(APP_NAME)-$(DATEINFO_TAG)-$(BUILDINFO_TAG)'
.PHONY: $(MAKECMDGOALS)
@@ -23,6 +24,8 @@ all: \
include app/*/Makefile
include deployment/*/Makefile
include snap/local/Makefile
clean:
rm -rf bin/*
@@ -53,6 +56,14 @@ vmutils: \
vmrestore \
vmctl
vmutils-pure: \
vmagent-pure \
vmalert-pure \
vmauth-pure \
vmbackup-pure \
vmrestore-pure \
vmctl-pure
vmutils-arm64: \
vmagent-arm64 \
vmalert-arm64 \
@@ -61,9 +72,26 @@ vmutils-arm64: \
vmrestore-arm64 \
vmctl-arm64
release-snap:
snapcraft
snapcraft upload "victoriametrics_$(PKG_TAG)_multi.snap" --release beta,edge,candidate
vmutils-arm: \
vmagent-arm \
vmalert-arm \
vmauth-arm \
vmbackup-arm \
vmrestore-arm \
vmctl-arm
vmutils-windows-amd64: \
vmagent-windows-amd64 \
vmalert-windows-amd64 \
vmauth-windows-amd64 \
vmctl-windows-amd64
publish-release:
git checkout $(TAG) && $(MAKE) release publish && \
git checkout $(TAG)-cluster && $(MAKE) release publish && \
git checkout $(TAG)-enterprise && $(MAKE) release publish && \
git checkout $(TAG)-enterprise-cluster && $(MAKE) release publish
release: \
release-victoria-metrics \
@@ -71,11 +99,15 @@ release: \
release-victoria-metrics: \
release-victoria-metrics-amd64 \
release-victoria-metrics-arm \
release-victoria-metrics-arm64
release-victoria-metrics-amd64:
GOARCH=amd64 $(MAKE) release-victoria-metrics-generic
release-victoria-metrics-arm:
GOARCH=arm $(MAKE) release-victoria-metrics-generic
release-victoria-metrics-arm64:
GOARCH=arm64 $(MAKE) release-victoria-metrics-generic
@@ -85,11 +117,13 @@ release-victoria-metrics-generic: victoria-metrics-$(GOARCH)-prod
victoria-metrics-$(GOARCH)-prod \
&& sha256sum victoria-metrics-$(GOARCH)-$(PKG_TAG).tar.gz \
victoria-metrics-$(GOARCH)-prod \
| sed s/-$(GOARCH)// > victoria-metrics-$(GOARCH)-$(PKG_TAG)_checksums.txt
| sed s/-$(GOARCH)-prod/-prod/ > victoria-metrics-$(GOARCH)-$(PKG_TAG)_checksums.txt
release-vmutils: \
release-vmutils-amd64 \
release-vmutils-arm64
release-vmutils-arm64 \
release-vmutils-arm \
release-vmutils-windows-amd64
release-vmutils-amd64:
GOARCH=amd64 $(MAKE) release-vmutils-generic
@@ -97,6 +131,12 @@ release-vmutils-amd64:
release-vmutils-arm64:
GOARCH=arm64 $(MAKE) release-vmutils-generic
release-vmutils-arm:
GOARCH=arm $(MAKE) release-vmutils-generic
release-vmutils-windows-amd64:
GOARCH=amd64 $(MAKE) release-vmutils-windows-generic
release-vmutils-generic: \
vmagent-$(GOARCH)-prod \
vmalert-$(GOARCH)-prod \
@@ -119,7 +159,26 @@ release-vmutils-generic: \
vmbackup-$(GOARCH)-prod \
vmrestore-$(GOARCH)-prod \
vmctl-$(GOARCH)-prod \
| sed s/-$(GOARCH)// > vmutils-$(GOARCH)-$(PKG_TAG)_checksums.txt
| sed s/-$(GOARCH)-prod/-prod/ > vmutils-$(GOARCH)-$(PKG_TAG)_checksums.txt
release-vmutils-windows-generic: \
vmagent-windows-$(GOARCH)-prod \
vmalert-windows-$(GOARCH)-prod \
vmauth-windows-$(GOARCH)-prod \
vmctl-windows-$(GOARCH)-prod
cd bin && \
zip vmutils-windows-$(GOARCH)-$(PKG_TAG).zip \
vmagent-windows-$(GOARCH)-prod.exe \
vmalert-windows-$(GOARCH)-prod.exe \
vmauth-windows-$(GOARCH)-prod.exe \
vmctl-windows-$(GOARCH)-prod.exe \
&& sha256sum vmutils-windows-$(GOARCH)-$(PKG_TAG).zip \
vmagent-windows-$(GOARCH)-prod.exe \
vmalert-windows-$(GOARCH)-prod.exe \
vmauth-windows-$(GOARCH)-prod.exe \
vmctl-windows-$(GOARCH)-prod.exe \
> vmutils-windows-$(GOARCH)-$(PKG_TAG)_checksums.txt
pprof-cpu:
go tool pprof -trim_path=github.com/VictoriaMetrics/VictoriaMetrics@ $(PPROF_FILE)
@@ -137,7 +196,7 @@ lint: install-golint
golint app/...
install-golint:
which golint || go install golang.org/x/lint/golint
which golint || GO111MODULE=off go get golang.org/x/lint/golint
errcheck: install-errcheck
errcheck -exclude=errcheck_excludes.txt ./lib/...
@@ -152,7 +211,7 @@ errcheck: install-errcheck
errcheck -exclude=errcheck_excludes.txt ./app/vmctl/...
install-errcheck:
which errcheck || go install github.com/kisielk/errcheck
which errcheck || GO111MODULE=off go get github.com/kisielk/errcheck
check-all: fmt vet lint errcheck golangci-lint
@@ -194,24 +253,43 @@ app-local-pure:
app-local-with-goarch:
GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-$(GOARCH)$(RACE) $(PKG_PREFIX)/app/$(APP_NAME)
app-local-windows-with-goarch:
CGO_ENABLED=0 GO111MODULE=on go build $(RACE) -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/$(APP_NAME)-windows-$(GOARCH)$(RACE).exe $(PKG_PREFIX)/app/$(APP_NAME)
quicktemplate-gen: install-qtc
qtc
install-qtc:
which qtc || go install github.com/valyala/quicktemplate/qtc
which qtc || GO111MODULE=off go get github.com/valyala/quicktemplate/qtc
golangci-lint: install-golangci-lint
golangci-lint run --exclude '(SA4003|SA1019|SA5011):' -D errcheck -D structcheck --timeout 2m
install-golangci-lint:
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.29.0
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.43.0
install-wwhrd:
which wwhrd || GO111MODULE=off go get github.com/frapposelli/wwhrd
check-licenses: install-wwhrd
wwhrd check -f .wwhrd.yml
copy-docs:
echo "---\nsort: ${ORDER}\n---\n" > ${DST}
cat ${SRC} >> ${DST}
# Copies docs for all components and adds the order tag.
# Cluster docs are supposed to be ordered as 9th.
# For The rest of docs is ordered manually.t
docs-sync:
cp app/vmagent/README.md docs/vmagent.md
cp app/vmalert/README.md docs/vmalert.md
cp app/vmauth/README.md docs/vmauth.md
cp app/vmbackup/README.md docs/vmbackup.md
cp app/vmrestore/README.md docs/vmrestore.md
cp app/vmctl/README.md docs/vmctl.md
cp README.md docs/Single-server-VictoriaMetrics.md
cp README.md docs/README.md
SRC=README.md DST=docs/Single-server-VictoriaMetrics.md ORDER=1 $(MAKE) copy-docs
SRC=app/vmagent/README.md DST=docs/vmagent.md ORDER=3 $(MAKE) copy-docs
SRC=app/vmalert/README.md DST=docs/vmalert.md ORDER=4 $(MAKE) copy-docs
SRC=app/vmauth/README.md DST=docs/vmauth.md ORDER=5 $(MAKE) copy-docs
SRC=app/vmbackup/README.md DST=docs/vmbackup.md ORDER=6 $(MAKE) copy-docs
SRC=app/vmrestore/README.md DST=docs/vmrestore.md ORDER=7 $(MAKE) copy-docs
SRC=app/vmctl/README.md DST=docs/vmctl.md ORDER=8 $(MAKE) copy-docs
SRC=app/vmgateway/README.md DST=docs/vmgateway.md ORDER=9 $(MAKE) copy-docs
SRC=app/vmbackupmanager/README.md DST=docs/vmbackupmanager.md ORDER=10 $(MAKE) copy-docs

1140
README.md

File diff suppressed because it is too large Load Diff

Binary file not shown.

View File

@@ -3,10 +3,8 @@ package main
import (
"flag"
"fmt"
"io"
"net/http"
"os"
"path"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
@@ -26,9 +24,8 @@ import (
var (
httpListenAddr = flag.String("httpListenAddr", ":8428", "TCP address to listen for http connections")
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Remove superflouos samples from time series if they are located closer to each other than this duration. "+
"This may be useful for reducing overhead when multiple identically configured Prometheus instances write data to the same VictoriaMetrics. "+
"Deduplication is disabled if the -dedup.minScrapeInterval is 0")
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Leave only the first sample in every time series per each discrete interval "+
"equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling")
dryRun = flag.Bool("dryRun", false, "Whether to check only -promscrape.config and then exit. "+
"Unknown config entries are allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse")
)
@@ -54,7 +51,7 @@ func main() {
logger.Infof("starting VictoriaMetrics at %q...", *httpListenAddr)
startTime := time.Now()
storage.SetMinScrapeIntervalForDeduplication(*minScrapeInterval)
storage.SetDedupInterval(*minScrapeInterval)
vmstorage.Init(promql.ResetRollupResultCacheIfNeeded)
vmselect.Init()
vminsert.Init()
@@ -86,14 +83,22 @@ func main() {
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.URL.Path == "/" {
fmt.Fprintf(w, "<h2>Single-node VictoriaMetrics.</h2></br>")
fmt.Fprintf(w, "See docs at <a href='https://victoriametrics.github.io/'>https://victoriametrics.github.io/</a></br>")
fmt.Fprintf(w, "Useful endpoints: </br>")
writeAPIHelp(w, [][]string{
{"/targets", "discovered targets list"},
{"/api/v1/targets", "advanced information about discovered targets in JSON format"},
{"/metrics", "available service metrics"},
{"/api/v1/status/tsdb", "tsdb status page"},
if r.Method != "GET" {
return false
}
fmt.Fprintf(w, "<h2>Single-node VictoriaMetrics</h2></br>")
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/'>https://docs.victoriametrics.com/</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{
{"vmui", "Web UI"},
{"targets", "discovered targets list"},
{"api/v1/targets", "advanced information about discovered targets in JSON format"},
{"config", "-promscrape.config contents"},
{"metrics", "available service metrics"},
{"flags", "command-line flags"},
{"api/v1/status/tsdb", "tsdb status page"},
{"api/v1/status/top_queries", "top queries"},
{"api/v1/status/active_queries", "active queries"},
})
return true
}
@@ -109,20 +114,11 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
return false
}
func writeAPIHelp(w io.Writer, pathList [][]string) {
pathPrefix := httpserver.GetPathPrefix()
for _, p := range pathList {
p, doc := p[0], p[1]
p = path.Join(pathPrefix, p)
fmt.Fprintf(w, "<a href='%s'>%q</a> - %s<br/>", p, p, doc)
}
}
func usage() {
const s = `
victoria-metrics is a time series database and monitoring solution.
See the docs at https://victoriametrics.github.io/
See the docs at https://docs.victoriametrics.com/
`
flagutil.Usage(s)
}

View File

@@ -22,7 +22,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -149,7 +148,7 @@ func setUp() {
}
func processFlags() {
envflag.Parse()
flag.Parse()
for _, fv := range []struct {
flag string
value string

View File

@@ -11,6 +11,6 @@
"status":"success",
"data":{"resultType":"matrix",
"result":[
{"metric":{"item":"y"},"values":[["{TIME_S-1m}","0.5"]]}
{"metric":{"item":"y"},"values":[["{TIME_S-1m}","0.5"], ["{TIME_S}","0.5"]]}
]}}
}

View File

@@ -19,6 +19,7 @@
["{TIME_S-110s}","3"],
["{TIME_S-100s}","3"],
["{TIME_S-90s}","3"],
["{TIME_S-80s}","3"],
["{TIME_S-60s}","2"],
["{TIME_S-50s}","2"],
["{TIME_S-40s}","2"],

View File

@@ -13,6 +13,6 @@
"data":{"resultType":"matrix",
"result":[
{"metric":{"__name__":"not_nan_as_missing_data","item":"x"},"values":[["{TIME_S-2m}","2"]]},
{"metric":{"__name__":"not_nan_as_missing_data","item":"y"},"values":[["{TIME_S-2m}","4"],["{TIME_S-1m}","3"]]}
{"metric":{"__name__":"not_nan_as_missing_data","item":"y"},"values":[["{TIME_S-2m}","4"],["{TIME_S-1m}","3"],["{TIME_S}", "3"]]}
]}}
}

View File

@@ -78,3 +78,9 @@ vmagent-local-with-goarch:
vmagent-pure:
APP_NAME=vmagent $(MAKE) app-local-pure
vmagent-windows-amd64:
GOARCH=amd64 APP_NAME=vmagent $(MAKE) app-local-windows-with-goarch
vmagent-windows-amd64-prod:
APP_NAME=vmagent $(MAKE) app-via-docker-windows-amd64

File diff suppressed because it is too large Load Diff

View File

@@ -5,32 +5,35 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="csvimport"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="csvimport"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="csvimport"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="csvimport"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="csvimport"}`)
)
// InsertHandler processes csv data from req.
func InsertHandler(req *http.Request) error {
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
return insertRows(at, rows, extraLabels)
})
})
}
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -64,8 +67,11 @@ func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
rowsInserted.Add(len(rows))
if at != nil {
rowsTenantInserted.Get(at).Add(len(rows))
}
rowsPerInsert.Update(float64(len(rows)))
return nil
}

View File

@@ -0,0 +1,99 @@
package datadog
import (
"fmt"
"net/http"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="datadog"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="datadog"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="datadog"}`)
)
// InsertHandlerForHTTP processes remote write for DataDog POST /api/v1/series request.
//
// See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
ce := req.Header.Get("Content-Encoding")
return parser.ParseStream(req.Body, ce, func(series []parser.Series) error {
return insertRows(at, series, extraLabels)
})
})
}
func insertRows(at *auth.Token, series []parser.Series, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for i := range series {
ss := &series[i]
rowsTotal += len(ss.Points)
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: ss.Metric,
})
labels = append(labels, prompbmarshal.Label{
Name: "host",
Value: ss.Host,
})
for _, tag := range ss.Tags {
n := strings.IndexByte(tag, ':')
if n < 0 {
return fmt.Errorf("cannot find ':' in tag %q", tag)
}
name := tag[:n]
value := tag[n+1:]
if name == "host" {
name = "exported_host"
}
labels = append(labels, prompbmarshal.Label{
Name: name,
Value: value,
})
}
labels = append(labels, extraLabels...)
samplesLen := len(samples)
for _, pt := range ss.Points {
samples = append(samples, prompbmarshal.Sample{
Timestamp: pt.Timestamp(),
Value: pt.Value(),
})
}
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -8,34 +8,37 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via Influx line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if Influx line contains only a single field")
measurementFieldSeparator = flag.String("influxMeasurementFieldSeparator", "_", "Separator for '{measurement}{separator}{field_name}' metric name when inserted via InfluxDB line protocol")
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses '{measurement}' instead of '{measurement}{separator}{field_name}' for metic name if InfluxDB line contains only a single field")
skipMeasurement = flag.Bool("influxSkipMeasurement", false, "Uses '{field_name}' as a metric name while ignoring '{measurement}' and '-influxMeasurementFieldSeparator'")
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="influx"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="influx"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="influx"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="influx"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="influx"}`)
)
// InsertHandlerForReader processes remote write for influx line protocol.
//
// See https://github.com/influxdata/telegraf/tree/master/plugins/inputs/socket_listener/
func InsertHandlerForReader(r io.Reader) error {
func InsertHandlerForReader(r io.Reader, isGzipped bool) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, false, "", "", func(db string, rows []parser.Row) error {
return insertRows(db, rows, nil)
return parser.ParseStream(r, isGzipped, "", "", func(db string, rows []parser.Row) error {
return insertRows(nil, db, rows, nil)
})
})
}
@@ -43,7 +46,7 @@ func InsertHandlerForReader(r io.Reader) error {
// InsertHandlerForHTTP processes remote write for influx line protocol.
//
// See https://github.com/influxdata/influxdb/blob/4cbdc197b8117fee648d62e2e5be75c6575352f0/tsdb/README.md
func InsertHandlerForHTTP(req *http.Request) error {
func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
@@ -55,12 +58,12 @@ func InsertHandlerForHTTP(req *http.Request) error {
// Read db tag from https://docs.influxdata.com/influxdb/v1.7/tools/api/#write-http-endpoint
db := q.Get("db")
return parser.ParseStream(req.Body, isGzipped, precision, db, func(db string, rows []parser.Row) error {
return insertRows(db, rows, extraLabels)
return insertRows(at, db, rows, extraLabels)
})
})
}
func insertRows(db string, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
func insertRows(at *auth.Token, db string, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := getPushCtx()
defer putPushCtx(ctx)
@@ -96,7 +99,8 @@ func insertRows(db string, rows []parser.Row, extraLabels []prompbmarshal.Label)
if !*skipMeasurement {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, r.Measurement...)
}
skipFieldKey := len(r.Fields) == 1 && *skipSingleField
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1139
skipFieldKey := len(r.Measurement) > 0 && len(r.Fields) == 1 && *skipSingleField
if len(ctx.metricGroupBuf) > 0 && !skipFieldKey {
ctx.metricGroupBuf = append(ctx.metricGroupBuf, *measurementFieldSeparator...)
}
@@ -129,8 +133,11 @@ func insertRows(db string, rows []parser.Row, extraLabels []prompbmarshal.Label)
ctx.ctx.Labels = labels
ctx.ctx.Samples = samples
ctx.commonLabels = commonLabels
remotewrite.Push(&ctx.ctx.WriteRequest)
remotewrite.PushWithAuthToken(at, &ctx.ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil

View File

@@ -3,6 +3,7 @@ package main
import (
"flag"
"fmt"
"io"
"net/http"
"os"
"strings"
@@ -10,6 +11,7 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadog"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/graphite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/influx"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/native"
@@ -19,10 +21,12 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/influxutils"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
influxserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/influx"
opentsdbserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/opentsdb"
@@ -39,13 +43,14 @@ var (
httpListenAddr = flag.String("httpListenAddr", ":8429", "TCP address to listen for http connections. "+
"Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. "+
"Note that /targets and /metrics pages aren't available if -httpListenAddr=''")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for Influx line protocol data. Usually :8189 must be set. Doesn't work if empty. "+
"This flag isn't needed when ingesting data over HTTP - just send it to `http://<vmagent>:8429/write`")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8189 must be set. Doesn't work if empty. "+
"This flag isn't needed when ingesting data over HTTP - just send it to http://<vmagent>:8429/write")
graphiteListenAddr = flag.String("graphiteListenAddr", "", "TCP and UDP address to listen for Graphite plaintext data. Usually :2003 must be set. Doesn't work if empty")
opentsdbListenAddr = flag.String("opentsdbListenAddr", "", "TCP and UDP address to listen for OpentTSDB metrics. "+
"Telnet put messages and HTTP /api/put messages are simultaneously served on TCP port. "+
"Usually :4242 must be set. Doesn't work if empty")
opentsdbHTTPListenAddr = flag.String("opentsdbHTTPListenAddr", "", "TCP address to listen for OpentTSDB HTTP put requests. Usually :4242 must be set. Doesn't work if empty")
configAuthKey = flag.String("configAuthKey", "", "Authorization key for accessing /config page. It must be passed via authKey query arg")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmagent. The following files are checked: "+
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig . "+
"Unknown config entries are allowed in -promscrape.config by default. This can be changed with -promscrape.config.strictParse")
@@ -91,7 +96,9 @@ func main() {
common.StartUnmarshalWorkers()
writeconcurrencylimiter.Init()
if len(*influxListenAddr) > 0 {
influxServer = influxserver.MustStart(*influxListenAddr, influx.InsertHandlerForReader)
influxServer = influxserver.MustStart(*influxListenAddr, func(r io.Reader) error {
return influx.InsertHandlerForReader(r, false)
})
}
if len(*graphiteListenAddr) > 0 {
graphiteServer = graphiteserver.MustStart(*graphiteListenAddr, graphite.InsertHandler)
@@ -144,78 +151,133 @@ func main() {
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
if r.URL.Path == "/" {
fmt.Fprintf(w, "vmagent - see docs at https://victoriametrics.github.io/vmagent.html")
if r.Method != "GET" {
return false
}
fmt.Fprintf(w, "<h2>vmagent</h2>")
fmt.Fprintf(w, "See docs at <a href='https://docs.victoriametrics.com/vmagent.html'>https://docs.victoriametrics.com/vmagent.html</a></br>")
fmt.Fprintf(w, "Useful endpoints:</br>")
httpserver.WriteAPIHelp(w, [][2]string{
{"targets", "discovered targets list"},
{"api/v1/targets", "advanced information about discovered targets in JSON format"},
{"config", "-promscrape.config contents"},
{"metrics", "available service metrics"},
{"flags", "command-line flags"},
{"-/reload", "reload configuration"},
})
return true
}
path := strings.Replace(r.URL.Path, "//", "/", -1)
switch path {
case "/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(r); err != nil {
if err := promremotewrite.InsertHandler(nil, r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(r); err != nil {
if err := vmimport.InsertHandler(nil, r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/csv":
csvimportRequests.Inc()
if err := csvimport.InsertHandler(r); err != nil {
if err := csvimport.InsertHandler(nil, r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/prometheus":
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(r); err != nil {
if err := prometheusimport.InsertHandler(nil, r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(r); err != nil {
if err := native.InsertHandler(nil, r); err != nil {
nativeimportErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/write", "/api/v2/write":
influxWriteRequests.Inc()
if err := influx.InsertHandlerForHTTP(r); err != nil {
if err := influx.InsertHandlerForHTTP(nil, r); err != nil {
influxWriteErrors.Inc()
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "/query":
// Emulate fake response for influx query.
// This is required for TSBS benchmark.
influxQueryRequests.Inc()
fmt.Fprintf(w, `{"results":[{"series":[{"values":[]}]}]}`)
influxutils.WriteDatabaseNames(w)
return true
case "/datadog/api/v1/series":
datadogWriteRequests.Inc()
if err := datadog.InsertHandlerForHTTP(nil, r); err != nil {
datadogWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
// See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/api/v1/validate":
datadogValidateRequests.Inc()
// See https://docs.datadoghq.com/api/latest/authentication/#validate-api-key
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{"valid":true}`)
return true
case "/datadog/api/v1/check_run":
datadogCheckRunRequests.Inc()
// See https://docs.datadoghq.com/api/latest/service-checks/#submit-a-service-check
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/intake/":
datadogIntakeRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{}`)
return true
case "/targets":
promscrapeTargetsRequests.Inc()
promscrape.WriteHumanReadableTargetsStatus(w, r)
return true
case "/config":
if *configAuthKey != "" && r.FormValue("authKey") != *configAuthKey {
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("The provided authKey doesn't match -configAuthKey"),
StatusCode: http.StatusUnauthorized,
}
httpserver.Errorf(w, r, "%s", err)
return true
}
promscrapeConfigRequests.Inc()
w.Header().Set("Content-Type", "text/plain; charset=utf-8")
promscrape.WriteConfigData(w)
return true
case "/api/v1/targets":
promscrapeAPIV1TargetsRequests.Inc()
w.Header().Set("Content-Type", "application/json; charset=utf-8")
w.Header().Set("Content-Type", "application/json")
state := r.FormValue("state")
promscrape.WriteAPIV1Targets(w, state)
return true
@@ -235,9 +297,121 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
return true
}
if remotewrite.MultitenancyEnabled() {
return processMultitenantRequest(w, r, path)
}
return false
}
func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path string) bool {
p, err := httpserver.ParsePath(path)
if err != nil {
// Cannot parse multitenant path. Skip it - probably it will be parsed later.
return false
}
if p.Prefix != "insert" {
httpserver.Errorf(w, r, `unsupported multitenant prefix: %q; expected "insert"`, p.Prefix)
return true
}
at, err := auth.NewToken(p.AuthToken)
if err != nil {
httpserver.Errorf(w, r, "cannot obtain auth token: %s", err)
return true
}
switch p.Suffix {
case "prometheus/", "prometheus", "prometheus/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(at, r); err != nil {
prometheusWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import":
vmimportRequests.Inc()
if err := vmimport.InsertHandler(at, r); err != nil {
vmimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import/csv":
csvimportRequests.Inc()
if err := csvimport.InsertHandler(at, r); err != nil {
csvimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import/prometheus":
prometheusimportRequests.Inc()
if err := prometheusimport.InsertHandler(at, r); err != nil {
prometheusimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "prometheus/api/v1/import/native":
nativeimportRequests.Inc()
if err := native.InsertHandler(at, r); err != nil {
nativeimportErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "influx/write", "influx/api/v2/write":
influxWriteRequests.Inc()
if err := influx.InsertHandlerForHTTP(at, r); err != nil {
influxWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(http.StatusNoContent)
return true
case "influx/query":
influxQueryRequests.Inc()
influxutils.WriteDatabaseNames(w)
return true
case "datadog/api/v1/series":
datadogWriteRequests.Inc()
if err := datadog.InsertHandlerForHTTP(at, r); err != nil {
datadogWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
// See https://docs.datadoghq.com/api/latest/metrics/#submit-metrics
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "datadog/api/v1/validate":
datadogValidateRequests.Inc()
// See https://docs.datadoghq.com/api/latest/authentication/#validate-api-key
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{"valid":true}`)
return true
case "datadog/api/v1/check_run":
datadogCheckRunRequests.Inc()
// See https://docs.datadoghq.com/api/latest/service-checks/#submit-a-service-check
w.Header().Set("Content-Type", "application/json")
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "datadog/intake/":
datadogIntakeRequests.Inc()
w.Header().Set("Content-Type", "application/json")
fmt.Fprintf(w, `{}`)
return true
default:
httpserver.Errorf(w, r, "unsupported multitenant path suffix: %q", p.Suffix)
return true
}
}
var (
prometheusWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/write", protocol="promremotewrite"}`)
prometheusWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/write", protocol="promremotewrite"}`)
@@ -254,14 +428,23 @@ var (
nativeimportRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/import/native", protocol="nativeimport"}`)
nativeimportErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/api/v1/import/native", protocol="nativeimport"}`)
influxWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/write", protocol="influx"}`)
influxWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/write", protocol="influx"}`)
influxWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/influx/write", protocol="influx"}`)
influxWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/influx/write", protocol="influx"}`)
influxQueryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/query", protocol="influx"}`)
influxQueryRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/influx/query", protocol="influx"}`)
datadogWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/datadog/api/v1/series", protocol="datadog"}`)
datadogValidateRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/validate", protocol="datadog"}`)
datadogCheckRunRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/check_run", protocol="datadog"}`)
datadogIntakeRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/intake/", protocol="datadog"}`)
promscrapeTargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/targets"}`)
promscrapeAPIV1TargetsRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/api/v1/targets"}`)
promscrapeConfigRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/config"}`)
promscrapeConfigReloadRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/-/reload"}`)
)
@@ -269,7 +452,7 @@ func usage() {
const s = `
vmagent collects metrics data via popular data ingestion protocols and routes it to VictoriaMetrics.
See the docs at https://victoriametrics.github.io/vmagent.html .
See the docs at https://docs.victoriametrics.com/vmagent.html .
`
flagutil.Usage(s)
}

View File

@@ -5,36 +5,39 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="native"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="native"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="native"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="native"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="native"}`)
)
// InsertHandler processes `/api/v1/import` request.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(block *parser.Block) error {
return insertRows(block, extraLabels)
return insertRows(at, block, extraLabels)
})
})
}
func insertRows(block *parser.Block, extraLabels []prompbmarshal.Label) error {
func insertRows(at *auth.Token, block *parser.Block, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -42,6 +45,9 @@ func insertRows(block *parser.Block, extraLabels []prompbmarshal.Label) error {
// since relabeling can prevent from inserting the rows.
rowsLen := len(block.Values)
rowsInserted.Add(rowsLen)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsLen)
}
rowsPerInsert.Update(float64(rowsLen))
tssDst := ctx.WriteRequest.Timeseries[:0]
@@ -80,6 +86,6 @@ func insertRows(block *parser.Block, extraLabels []prompbmarshal.Label) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
return nil
}

View File

@@ -1,24 +1,28 @@
package prometheusimport
import (
"io"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="prometheus"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="prometheus"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="prometheus"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="prometheus"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="prometheus"}`)
)
// InsertHandler processes `/api/v1/import/prometheus` request.
func InsertHandler(req *http.Request) error {
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
@@ -30,12 +34,21 @@ func InsertHandler(req *http.Request) error {
return writeconcurrencylimiter.Do(func() error {
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return parser.ParseStream(req.Body, defaultTimestamp, isGzipped, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
return insertRows(at, rows, extraLabels)
}, nil)
})
}
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
// InsertHandlerForReader processes metrics from given reader with optional gzip format
func InsertHandlerForReader(r io.Reader, isGzipped bool) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, 0, isGzipped, func(rows []parser.Row) error {
return insertRows(nil, rows, nil)
}, nil)
})
}
func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -69,8 +82,11 @@ func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
rowsInserted.Add(len(rows))
if at != nil {
rowsTenantInserted.Get(at).Add(len(rows))
}
rowsPerInsert.Update(float64(len(rows)))
return nil
}

View File

@@ -1,38 +1,51 @@
package promremotewrite
import (
"io"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/promremotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="promremotewrite"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="promremotewrite"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="promremotewrite"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="promremotewrite"}`)
)
// InsertHandler processes remote write for prometheus.
func InsertHandler(req *http.Request) error {
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(tss []prompb.TimeSeries) error {
return insertRows(tss, extraLabels)
return parser.ParseStream(req.Body, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, extraLabels)
})
})
}
func insertRows(timeseries []prompb.TimeSeries, extraLabels []prompbmarshal.Label) error {
// InsertHandlerForReader processes metrics from given reader
func InsertHandlerForReader(at *auth.Token, r io.Reader) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, func(tss []prompb.TimeSeries) error {
return insertRows(at, tss, nil)
})
})
}
func insertRows(at *auth.Token, timeseries []prompb.TimeSeries, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -68,8 +81,11 @@ func insertRows(timeseries []prompb.TimeSeries, extraLabels []prompbmarshal.Labe
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -2,8 +2,6 @@ package remotewrite
import (
"bytes"
"crypto/tls"
"encoding/base64"
"fmt"
"io/ioutil"
"net/http"
@@ -42,17 +40,36 @@ var (
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
basicAuthPassword = flagutil.NewArray("remoteWrite.basicAuth.password", "Optional basic auth password to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
basicAuthPasswordFile = flagutil.NewArray("remoteWrite.basicAuth.passwordFile", "Optional path to basic auth password to use for -remoteWrite.url. "+
"The file is re-read every second. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
bearerToken = flagutil.NewArray("remoteWrite.bearerToken", "Optional bearer auth token to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
bearerTokenFile = flagutil.NewArray("remoteWrite.bearerTokenFile", "Optional path to bearer token file to use for -remoteWrite.url. "+
"The token is re-read from the file every second. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2ClientID = flagutil.NewArray("remoteWrite.oauth2.clientID", "Optional OAuth2 clientID to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2ClientSecret = flagutil.NewArray("remoteWrite.oauth2.clientSecret", "Optional OAuth2 clientSecret to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2ClientSecretFile = flagutil.NewArray("remoteWrite.oauth2.clientSecretFile", "Optional OAuth2 clientSecretFile to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2TokenURL = flagutil.NewArray("remoteWrite.oauth2.tokenUrl", "Optional OAuth2 tokenURL to use for -remoteWrite.url. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
oauth2Scopes = flagutil.NewArray("remoteWrite.oauth2.scopes", "Optional OAuth2 scopes to use for -remoteWrite.url. Scopes must be delimited by ';'. "+
"If multiple args are set, then they are applied independently for the corresponding -remoteWrite.url")
)
type client struct {
sanitizedURL string
remoteWriteURL string
authHeader string
fq *persistentqueue.FastQueue
hc *http.Client
sendBlock func(block []byte) bool
authCfg *promauth.Config
rl rateLimiter
bytesSent *metrics.Counter
@@ -62,16 +79,18 @@ type client struct {
errorsCount *metrics.Counter
packetsDropped *metrics.Counter
retriesCount *metrics.Counter
sendDuration *metrics.FloatCounter
wg sync.WaitGroup
stopCh chan struct{}
}
func newClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqueue.FastQueue, concurrency int) *client {
tlsCfg, err := getTLSConfig(argIdx)
func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqueue.FastQueue, concurrency int) *client {
authCfg, err := getAuthConfig(argIdx)
if err != nil {
logger.Panicf("FATAL: cannot initialize TLS config: %s", err)
logger.Panicf("FATAL: cannot initialize auth config: %s", err)
}
tlsCfg := authCfg.NewTLSConfig()
tr := &http.Transport{
Dial: statDial,
TLSClientConfig: tlsCfg,
@@ -86,32 +105,16 @@ func newClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqu
if !strings.Contains(pURL, "://") {
logger.Fatalf("cannot parse -remoteWrite.proxyURL=%q: it must start with `http://`, `https://` or `socks5://`", pURL)
}
urlProxy, err := url.Parse(pURL)
pu, err := url.Parse(pURL)
if err != nil {
logger.Fatalf("cannot parse -remoteWrite.proxyURL=%q: %s", pURL, err)
}
tr.Proxy = http.ProxyURL(urlProxy)
}
authHeader := ""
username := basicAuthUsername.GetOptionalArg(argIdx)
password := basicAuthPassword.GetOptionalArg(argIdx)
if len(username) > 0 || len(password) > 0 {
// See https://en.wikipedia.org/wiki/Basic_access_authentication
token := username + ":" + password
token64 := base64.StdEncoding.EncodeToString([]byte(token))
authHeader = "Basic " + token64
}
token := bearerToken.GetOptionalArg(argIdx)
if len(token) > 0 {
if authHeader != "" {
logger.Fatalf("`-remoteWrite.bearerToken`=%q cannot be set when `-remoteWrite.basicAuth.*` flags are set", token)
}
authHeader = "Bearer " + token
tr.Proxy = http.ProxyURL(pu)
}
c := &client{
sanitizedURL: sanitizedURL,
remoteWriteURL: remoteWriteURL,
authHeader: authHeader,
authCfg: authCfg,
fq: fq,
hc: &http.Client{
Transport: tr,
@@ -119,6 +122,11 @@ func newClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqu
},
stopCh: make(chan struct{}),
}
c.sendBlock = c.sendBlockHTTP
return c
}
func (c *client) init(argIdx, concurrency int, sanitizedURL string) {
if bytesPerSec := rateLimit.GetOptionalArgOrDefault(argIdx, 0); bytesPerSec > 0 {
logger.Infof("applying %d bytes per second rate limit for -remoteWrite.url=%q", bytesPerSec, sanitizedURL)
c.rl.perSecondLimit = int64(bytesPerSec)
@@ -132,6 +140,7 @@ func newClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqu
c.errorsCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_errors_total{url=%q}`, c.sanitizedURL))
c.packetsDropped = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_packets_dropped_total{url=%q}`, c.sanitizedURL))
c.retriesCount = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_retries_count_total{url=%q}`, c.sanitizedURL))
c.sendDuration = metrics.GetOrCreateFloatCounter(fmt.Sprintf(`vmagent_remotewrite_send_duration_seconds_total{url=%q}`, c.sanitizedURL))
for i := 0; i < concurrency; i++ {
c.wg.Add(1)
go func() {
@@ -140,7 +149,6 @@ func newClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persistentqu
}()
}
logger.Infof("initialized client for -remoteWrite.url=%q", c.sanitizedURL)
return c
}
func (c *client) MustStop() {
@@ -149,23 +157,48 @@ func (c *client) MustStop() {
logger.Infof("stopped client for -remoteWrite.url=%q", c.sanitizedURL)
}
func getTLSConfig(argIdx int) (*tls.Config, error) {
c := &promauth.TLSConfig{
func getAuthConfig(argIdx int) (*promauth.Config, error) {
username := basicAuthUsername.GetOptionalArg(argIdx)
password := basicAuthPassword.GetOptionalArg(argIdx)
passwordFile := basicAuthPasswordFile.GetOptionalArg(argIdx)
var basicAuthCfg *promauth.BasicAuthConfig
if username != "" || password != "" || passwordFile != "" {
basicAuthCfg = &promauth.BasicAuthConfig{
Username: username,
Password: promauth.NewSecret(password),
PasswordFile: passwordFile,
}
}
token := bearerToken.GetOptionalArg(argIdx)
tokenFile := bearerTokenFile.GetOptionalArg(argIdx)
var oauth2Cfg *promauth.OAuth2Config
clientSecret := oauth2ClientSecret.GetOptionalArg(argIdx)
clientSecretFile := oauth2ClientSecretFile.GetOptionalArg(argIdx)
if clientSecretFile != "" || clientSecret != "" {
oauth2Cfg = &promauth.OAuth2Config{
ClientID: oauth2ClientID.GetOptionalArg(argIdx),
ClientSecret: promauth.NewSecret(clientSecret),
ClientSecretFile: clientSecretFile,
TokenURL: oauth2TokenURL.GetOptionalArg(argIdx),
Scopes: strings.Split(oauth2Scopes.GetOptionalArg(argIdx), ";"),
}
}
tlsCfg := &promauth.TLSConfig{
CAFile: tlsCAFile.GetOptionalArg(argIdx),
CertFile: tlsCertFile.GetOptionalArg(argIdx),
KeyFile: tlsKeyFile.GetOptionalArg(argIdx),
ServerName: tlsServerName.GetOptionalArg(argIdx),
InsecureSkipVerify: tlsInsecureSkipVerify.GetOptionalArg(argIdx),
}
if c.CAFile == "" && c.CertFile == "" && c.KeyFile == "" && c.ServerName == "" && !c.InsecureSkipVerify {
return nil, nil
}
cfg, err := promauth.NewConfig(".", nil, "", "", c)
authCfg, err := promauth.NewConfig(".", nil, basicAuthCfg, token, tokenFile, oauth2Cfg, tlsCfg)
if err != nil {
return nil, fmt.Errorf("cannot populate TLS config: %w", err)
return nil, fmt.Errorf("cannot populate OAuth2 config for remoteWrite idx: %d, err: %w", argIdx, err)
}
tlsCfg := cfg.NewTLSConfig()
return tlsCfg, nil
return authCfg, nil
}
func (c *client) runWorker() {
@@ -178,7 +211,9 @@ func (c *client) runWorker() {
return
}
go func() {
startTime := time.Now()
ch <- c.sendBlock(block)
c.sendDuration.Add(time.Since(startTime).Seconds())
}()
select {
case ok := <-ch:
@@ -207,9 +242,9 @@ func (c *client) runWorker() {
}
}
// sendBlock returns false only if c.stopCh is closed.
// sendBlockHTTP returns false only if c.stopCh is closed.
// Otherwise it tries sending the block to remote storage indefinitely.
func (c *client) sendBlock(block []byte) bool {
func (c *client) sendBlockHTTP(block []byte) bool {
c.rl.register(len(block), c.stopCh)
retryDuration := time.Second
retriesCount := 0
@@ -219,15 +254,15 @@ func (c *client) sendBlock(block []byte) bool {
again:
req, err := http.NewRequest("POST", c.remoteWriteURL, bytes.NewBuffer(block))
if err != nil {
logger.Panicf("BUG: unexected error from http.NewRequest(%q): %s", c.sanitizedURL, err)
logger.Panicf("BUG: unexpected error from http.NewRequest(%q): %s", c.sanitizedURL, err)
}
h := req.Header
h.Set("User-Agent", "vmagent")
h.Set("Content-Type", "application/x-protobuf")
h.Set("Content-Encoding", "snappy")
h.Set("X-Prometheus-Remote-Write-Version", "0.1.0")
if c.authHeader != "" {
req.Header.Set("Authorization", c.authHeader)
if ah := c.authCfg.GetAuthHeader(); ah != "" {
req.Header.Set("Authorization", ah)
}
startTime := time.Now()
@@ -239,7 +274,7 @@ again:
if retryDuration > time.Minute {
retryDuration = time.Minute
}
logger.Errorf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %.3f seconds",
logger.Warnf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %.3f seconds",
len(block), c.sanitizedURL, err, retryDuration.Seconds())
t := timerpool.Get(retryDuration)
select {
@@ -259,13 +294,22 @@ again:
return true
}
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_requests_total{url=%q, status_code="%d"}`, c.sanitizedURL, statusCode)).Inc()
if statusCode == 409 {
// Just drop block on 409 status code like Prometheus does.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873
body, _ := ioutil.ReadAll(resp.Body)
if statusCode == 409 || statusCode == 400 {
body, err := ioutil.ReadAll(resp.Body)
_ = resp.Body.Close()
l := logger.WithThrottler("remoteWriteRejected", 5*time.Second)
if err != nil {
l.Errorf("sending a block with size %d bytes to %q was rejected (skipping the block): status code %d; "+
"failed to read response body: %s",
len(block), c.sanitizedURL, statusCode, err)
} else {
l.Errorf("sending a block with size %d bytes to %q was rejected (skipping the block): status code %d; response body: %s",
len(block), c.sanitizedURL, statusCode, string(body))
}
// Just drop block on 409 and 400 status codes like Prometheus does.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/873
// and https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1149
_ = resp.Body.Close()
logger.Errorf("unexpected status code received when sending a block with size %d bytes to %q: #%d; dropping the block like Prometheus does; "+
"response body=%q", len(block), c.sanitizedURL, statusCode, body)
c.packetsDropped.Inc()
return true
}

View File

@@ -20,13 +20,10 @@ import (
var (
flushInterval = flag.Duration("remoteWrite.flushInterval", time.Second, "Interval for flushing the data to remote storage. "+
"This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url")
maxUnpackedBlockSize = flagutil.NewBytes("remoteWrite.maxBlockSize", 8*1024*1024, "The maximum size in bytes of unpacked request to send to remote storage. "+
"It shouldn't exceed -maxInsertRequestSize from VictoriaMetrics")
maxUnpackedBlockSize = flagutil.NewBytes("remoteWrite.maxBlockSize", 8*1024*1024, "The maximum block size to send to remote storage. Bigger blocks may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxRowsPerBlock")
maxRowsPerBlock = flag.Int("remoteWrite.maxRowsPerBlock", 10000, "The maximum number of samples to send in each block to remote storage. Higher number may improve performance at the cost of the increased memory usage. See also -remoteWrite.maxBlockSize")
)
// the maximum number of rows to send per each block.
const maxRowsPerBlock = 10000
type pendingSeries struct {
mu sync.Mutex
wr writeRequest
@@ -150,10 +147,13 @@ func (wr *writeRequest) adjustSampleValues() {
func (wr *writeRequest) push(src []prompbmarshal.TimeSeries) {
tssDst := wr.tss
maxSamplesPerBlock := *maxRowsPerBlock
// Allow up to 10x of labels per each block on average.
maxLabelsPerBlock := 10 * maxSamplesPerBlock
for i := range src {
tssDst = append(tssDst, prompbmarshal.TimeSeries{})
wr.copyTimeSeries(&tssDst[len(tssDst)-1], &src[i])
if len(wr.samples) >= maxRowsPerBlock {
if len(wr.samples) >= maxSamplesPerBlock || len(wr.labels) >= maxLabelsPerBlock {
wr.tss = tssDst
wr.flush()
tssDst = wr.tss

View File

@@ -14,10 +14,17 @@ import (
var (
unparsedLabelsGlobal = flagutil.NewArray("remoteWrite.label", "Optional label in the form 'name=value' to add to all the metrics before sending them to -remoteWrite.url. "+
"Pass multiple -remoteWrite.label flags in order to add multiple flags to metrics before sending them to remote storage")
relabelConfigPathGlobal = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabel_config entries. These entries are applied to all the metrics "+
"before sending them to -remoteWrite.url. See https://victoriametrics.github.io/vmagent.html#relabeling for details")
relabelConfigPaths = flagutil.NewArray("remoteWrite.urlRelabelConfig", "Optional path to relabel config for the corresponding -remoteWrite.url")
"Pass multiple -remoteWrite.label flags in order to add multiple labels to metrics before sending them to remote storage")
relabelConfigPathGlobal = flag.String("remoteWrite.relabelConfig", "", "Optional path to file with relabel_config entries. "+
"The path can point either to local file or to http url. These entries are applied to all the metrics "+
"before sending them to -remoteWrite.url. See https://docs.victoriametrics.com/vmagent.html#relabeling for details")
relabelDebugGlobal = flag.Bool("remoteWrite.relabelDebug", false, "Whether to log metrics before and after relabeling with -remoteWrite.relabelConfig. "+
"If the -remoteWrite.relabelDebug is enabled, then the metrics aren't sent to remote storage. This is useful for debugging the relabeling configs")
relabelConfigPaths = flagutil.NewArray("remoteWrite.urlRelabelConfig", "Optional path to relabel config for the corresponding -remoteWrite.url. "+
"The path can point either to local file or to http url")
relabelDebug = flagutil.NewArrayBool("remoteWrite.urlRelabelDebug", "Whether to log metrics before and after relabeling with -remoteWrite.urlRelabelConfig. "+
"If the -remoteWrite.urlRelabelDebug is enabled, then the metrics aren't sent to the corresponding -remoteWrite.url. "+
"This is useful for debugging the relabeling configs")
)
var labelsGlobal []prompbmarshal.Label
@@ -31,23 +38,23 @@ func CheckRelabelConfigs() error {
func loadRelabelConfigs() (*relabelConfigs, error) {
var rcs relabelConfigs
if *relabelConfigPathGlobal != "" {
global, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal)
global, err := promrelabel.LoadRelabelConfigs(*relabelConfigPathGlobal, *relabelDebugGlobal)
if err != nil {
return nil, fmt.Errorf("cannot load -remoteWrite.relabelConfig=%q: %w", *relabelConfigPathGlobal, err)
}
rcs.global = global
}
if len(*relabelConfigPaths) > len(*remoteWriteURLs) {
return nil, fmt.Errorf("too many -remoteWrite.urlRelabelConfig args: %d; it mustn't exceed the number of -remoteWrite.url args: %d",
len(*relabelConfigPaths), len(*remoteWriteURLs))
if len(*relabelConfigPaths) > (len(*remoteWriteURLs) + len(*remoteWriteMultitenantURLs)) {
return nil, fmt.Errorf("too many -remoteWrite.urlRelabelConfig args: %d; it mustn't exceed the number of -remoteWrite.url or -remoteWrite.multitenantURL args: %d",
len(*relabelConfigPaths), (len(*remoteWriteURLs) + len(*remoteWriteMultitenantURLs)))
}
rcs.perURL = make([]*promrelabel.ParsedConfigs, len(*remoteWriteURLs))
rcs.perURL = make([]*promrelabel.ParsedConfigs, (len(*remoteWriteURLs) + len(*remoteWriteMultitenantURLs)))
for i, path := range *relabelConfigPaths {
if len(path) == 0 {
// Skip empty relabel config.
continue
}
prc, err := promrelabel.LoadRelabelConfigs(path)
prc, err := promrelabel.LoadRelabelConfigs(path, relabelDebug.GetOptionalArg(i))
if err != nil {
return nil, fmt.Errorf("cannot load relabel configs from -remoteWrite.urlRelabelConfig=%q: %w", path, err)
}

View File

@@ -3,9 +3,15 @@ package remotewrite
import (
"flag"
"fmt"
"net/url"
"strconv"
"sync"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bloomfilter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -13,6 +19,8 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
xxhash "github.com/cespare/xxhash/v2"
)
@@ -20,10 +28,14 @@ import (
var (
remoteWriteURLs = flagutil.NewArray("remoteWrite.url", "Remote storage URL to write data to. It must support Prometheus remote_write API. "+
"It is recommended using VictoriaMetrics as remote storage. Example url: http://<victoriametrics-host>:8428/api/v1/write . "+
"Pass multiple -remoteWrite.url flags in order to write data concurrently to multiple remote storage systems")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored")
queues = flag.Int("remoteWrite.queues", 4, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
"isn't enough for sending high volume of collected data to remote storage")
"Pass multiple -remoteWrite.url flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.multitenantURL")
remoteWriteMultitenantURLs = flagutil.NewArray("remoteWrite.multitenantURL", "Base path for multitenant remote storage URL to write data to. "+
"See https://docs.victoriametrics.com/vmagent.html#multitenancy for details. Example url: http://<vminsert>:8480 . "+
"Pass multiple -remoteWrite.multitenantURL flags in order to replicate data to multiple remote storage systems. See also -remoteWrite.url")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory where temporary data for remote write component is stored. "+
"See also -remoteWrite.maxDiskUsagePerURL")
queues = flag.Int("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
"isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
"It is hidden by default, since it can contain sensitive info such as auth key")
maxPendingBytesPerURL = flagutil.NewBytes("remoteWrite.maxDiskUsagePerURL", 0, "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
@@ -37,16 +49,39 @@ var (
"Examples: -remoteWrite.roundDigits=2 would round 1.236 to 1.24, while -remoteWrite.roundDigits=-1 would round 126.78 to 130. "+
"By default digits rounding is disabled. Set it to 100 for disabling it for a particular remote storage. "+
"This option may be used for improving data compression for the stored metrics")
sortLabels = flag.Bool("sortLabels", false, `Whether to sort labels for incoming samples before writing them to all the configured remote storage systems. `+
`This may be needed for reducing memory usage at remote storage when the order of labels in incoming samples is random. `+
`For example, if m{k1="v1",k2="v2"} may be sent as m{k2="v2",k1="v1"}`+
`Enabled sorting for labels can slow down ingestion performance a bit`)
maxHourlySeries = flag.Int("remoteWrite.maxHourlySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last hour. "+
"Excess series are logged and dropped. This can be useful for limiting series cardinality. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter")
maxDailySeries = flag.Int("remoteWrite.maxDailySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. "+
"Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter")
)
var rwctxs []*remoteWriteCtx
var (
// rwctxsDefault contains statically populated entries when -remoteWrite.url is specified.
rwctxsDefault []*remoteWriteCtx
// rwctxsMap contains dynamically populated entries when -remoteWrite.multitenantURL is specified.
rwctxsMap = make(map[tenantmetrics.TenantID][]*remoteWriteCtx)
rwctxsMapLock sync.Mutex
// Data without tenant id is written to defaultAuthToken if -remoteWrite.multitenantURL is specified.
defaultAuthToken = &auth.Token{}
)
// MultitenancyEnabled returns true if -remoteWrite.multitenantURL is specified.
func MultitenancyEnabled() bool {
return len(*remoteWriteMultitenantURLs) > 0
}
// Contains the current relabelConfigs.
var allRelabelConfigs atomic.Value
// maxQueues limits the maximum value for `-remoteWrite.queues`. There is no sense in setting too high value,
// since it may lead to high memory usage due to big number of buffers.
var maxQueues = cgroup.AvailableCPUs() * 4
var maxQueues = cgroup.AvailableCPUs() * 16
// InitSecretFlags must be called after flag.Parse and before any logging.
func InitSecretFlags() {
@@ -62,8 +97,29 @@ func InitSecretFlags() {
//
// Stop must be called for graceful shutdown.
func Init() {
if len(*remoteWriteURLs) == 0 {
logger.Fatalf("at least one `-remoteWrite.url` command-line flag must be set")
if len(*remoteWriteURLs) == 0 && len(*remoteWriteMultitenantURLs) == 0 {
logger.Fatalf("at least one `-remoteWrite.url` or `-remoteWrite.multitenantURL` command-line flag must be set")
}
if len(*remoteWriteURLs) > 0 && len(*remoteWriteMultitenantURLs) > 0 {
logger.Fatalf("cannot set both `-remoteWrite.url` and `-remoteWrite.multitenantURL` command-line flags")
}
if *maxHourlySeries > 0 {
hourlySeriesLimiter = bloomfilter.NewLimiter(*maxHourlySeries, time.Hour)
_ = metrics.NewGauge(`vmagent_hourly_series_limit_max_series`, func() float64 {
return float64(hourlySeriesLimiter.MaxItems())
})
_ = metrics.NewGauge(`vmagent_hourly_series_limit_current_series`, func() float64 {
return float64(hourlySeriesLimiter.CurrentItems())
})
}
if *maxDailySeries > 0 {
dailySeriesLimiter = bloomfilter.NewLimiter(*maxDailySeries, 24*time.Hour)
_ = metrics.NewGauge(`vmagent_daily_series_limit_max_series`, func() float64 {
return float64(dailySeriesLimiter.MaxItems())
})
_ = metrics.NewGauge(`vmagent_daily_series_limit_current_series`, func() float64 {
return float64(dailySeriesLimiter.CurrentItems())
})
}
if *queues > maxQueues {
*queues = maxQueues
@@ -72,33 +128,23 @@ func Init() {
*queues = 1
}
initLabelsGlobal()
// Register SIGHUP handler for config reload before loadRelabelConfigs.
// This guarantees that the config will be re-read if the signal arrives just after loadRelabelConfig.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
sighupCh := procutil.NewSighupChan()
rcs, err := loadRelabelConfigs()
if err != nil {
logger.Fatalf("cannot load relabel configs: %s", err)
}
allRelabelConfigs.Store(rcs)
maxInmemoryBlocks := memory.Allowed() / len(*remoteWriteURLs) / maxRowsPerBlock / 100
if maxInmemoryBlocks > 200 {
// There is no much sense in keeping higher number of blocks in memory,
// since this means that the producer outperforms consumer and the queue
// will continue growing. It is better storing the queue to file.
maxInmemoryBlocks = 200
}
if maxInmemoryBlocks < 2 {
maxInmemoryBlocks = 2
}
for i, remoteWriteURL := range *remoteWriteURLs {
sanitizedURL := fmt.Sprintf("%d:secret-url", i+1)
if *showRemoteWriteURL {
sanitizedURL = fmt.Sprintf("%d:%s", i+1, remoteWriteURL)
}
rwctx := newRemoteWriteCtx(i, remoteWriteURL, maxInmemoryBlocks, sanitizedURL)
rwctxs = append(rwctxs, rwctx)
if len(*remoteWriteURLs) > 0 {
rwctxsDefault = newRemoteWriteCtxs(nil, *remoteWriteURLs)
}
// Start config reloader.
sighupCh := procutil.NewSighupChan()
configReloaderWG.Add(1)
go func() {
defer configReloaderWG.Done()
@@ -120,6 +166,41 @@ func Init() {
}()
}
func newRemoteWriteCtxs(at *auth.Token, urls []string) []*remoteWriteCtx {
if len(urls) == 0 {
logger.Panicf("BUG: urls must be non-empty")
}
maxInmemoryBlocks := memory.Allowed() / len(urls) / *maxRowsPerBlock / 100
if maxInmemoryBlocks / *queues > 100 {
// There is no much sense in keeping higher number of blocks in memory,
// since this means that the producer outperforms consumer and the queue
// will continue growing. It is better storing the queue to file.
maxInmemoryBlocks = 100 * *queues
}
if maxInmemoryBlocks < 2 {
maxInmemoryBlocks = 2
}
rwctxs := make([]*remoteWriteCtx, len(urls))
for i, remoteWriteURLRaw := range urls {
remoteWriteURL, err := url.Parse(remoteWriteURLRaw)
if err != nil {
logger.Fatalf("invalid -remoteWrite.url=%q: %s", remoteWriteURL, err)
}
sanitizedURL := fmt.Sprintf("%d:secret-url", i+1)
if at != nil {
// Construct full remote_write url for the given tenant according to https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format
remoteWriteURL.Path = fmt.Sprintf("%s/insert/%d:%d/prometheus/api/v1/write", remoteWriteURL.Path, at.AccountID, at.ProjectID)
sanitizedURL = fmt.Sprintf("%s:%d:%d", sanitizedURL, at.AccountID, at.ProjectID)
}
if *showRemoteWriteURL {
sanitizedURL = fmt.Sprintf("%d:%s", i+1, remoteWriteURL)
}
rwctxs[i] = newRemoteWriteCtx(i, at, remoteWriteURL, maxInmemoryBlocks, sanitizedURL)
}
return rwctxs
}
var stopCh = make(chan struct{})
var configReloaderWG sync.WaitGroup
@@ -130,16 +211,62 @@ func Stop() {
close(stopCh)
configReloaderWG.Wait()
for _, rwctx := range rwctxs {
for _, rwctx := range rwctxsDefault {
rwctx.MustStop()
}
rwctxs = nil
rwctxsDefault = nil
// There is no need in locking rwctxsMapLock here, since nobody should call Push during the Stop call.
for _, rwctxs := range rwctxsMap {
for _, rwctx := range rwctxs {
rwctx.MustStop()
}
}
rwctxsMap = nil
if sl := hourlySeriesLimiter; sl != nil {
sl.MustStop()
}
if sl := dailySeriesLimiter; sl != nil {
sl.MustStop()
}
}
// Push sends wr to remote storage systems set via `-remoteWrite.url`.
//
// Note that wr may be modified by Push due to relabeling and rounding.
func Push(wr *prompbmarshal.WriteRequest) {
PushWithAuthToken(nil, wr)
}
// PushWithAuthToken sends wr to remote storage systems set via `-remoteWrite.multitenantURL`.
//
// Note that wr may be modified by Push due to relabeling and rounding.
func PushWithAuthToken(at *auth.Token, wr *prompbmarshal.WriteRequest) {
if at == nil && len(*remoteWriteMultitenantURLs) > 0 {
// Write data to default tenant if at isn't set while -remoteWrite.multitenantURL is set.
at = defaultAuthToken
}
var rwctxs []*remoteWriteCtx
if at == nil {
rwctxs = rwctxsDefault
} else {
if len(*remoteWriteMultitenantURLs) == 0 {
logger.Panicf("BUG: remoteWriteMultitenantURLs must be non-empty for non-nil at")
}
rwctxsMapLock.Lock()
tenantID := tenantmetrics.TenantID{
AccountID: at.AccountID,
ProjectID: at.ProjectID,
}
rwctxs = rwctxsMap[tenantID]
if rwctxs == nil {
rwctxs = newRemoteWriteCtxs(at, *remoteWriteMultitenantURLs)
rwctxsMap[tenantID] = rwctxs
}
rwctxsMapLock.Unlock()
}
var rctx *relabelCtx
rcs := allRelabelConfigs.Load().(*relabelConfigs)
pcsGlobal := rcs.global
@@ -147,14 +274,19 @@ func Push(wr *prompbmarshal.WriteRequest) {
rctx = getRelabelCtx()
}
tss := wr.Timeseries
maxSamplesPerBlock := *maxRowsPerBlock
// Allow up to 10x of labels per each block on average.
maxLabelsPerBlock := 10 * maxSamplesPerBlock
for len(tss) > 0 {
// Process big tss in smaller blocks in order to reduce the maximum memory usage
samplesCount := 0
labelsCount := 0
i := 0
for i < len(tss) {
samplesCount += len(tss[i].Samples)
labelsCount += len(tss[i].Labels)
i++
if samplesCount > maxRowsPerBlock {
if samplesCount >= maxSamplesPerBlock || labelsCount >= maxLabelsPerBlock {
break
}
}
@@ -170,9 +302,9 @@ func Push(wr *prompbmarshal.WriteRequest) {
tssBlock = rctx.applyRelabeling(tssBlock, labelsGlobal, pcsGlobal)
globalRelabelMetricsDropped.Add(tssBlockLen - len(tssBlock))
}
for _, rwctx := range rwctxs {
rwctx.Push(tssBlock)
}
sortLabelsIfNeeded(tssBlock)
tssBlock = limitSeriesCardinality(tssBlock)
pushBlockToRemoteStorages(rwctxs, tssBlock)
if rctx != nil {
rctx.reset()
}
@@ -182,6 +314,106 @@ func Push(wr *prompbmarshal.WriteRequest) {
}
}
func pushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmarshal.TimeSeries) {
if len(tssBlock) == 0 {
// Nothing to push
return
}
// Push block to remote storages in parallel in order to reduce the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
for _, rwctx := range rwctxs {
wg.Add(1)
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
rwctx.Push(tssBlock)
}(rwctx)
}
wg.Wait()
}
// sortLabelsIfNeeded sorts labels if -sortLabels command-line flag is set.
func sortLabelsIfNeeded(tss []prompbmarshal.TimeSeries) {
if !*sortLabels {
return
}
for i := range tss {
promrelabel.SortLabels(tss[i].Labels)
}
}
func limitSeriesCardinality(tss []prompbmarshal.TimeSeries) []prompbmarshal.TimeSeries {
if hourlySeriesLimiter == nil && dailySeriesLimiter == nil {
return tss
}
dst := make([]prompbmarshal.TimeSeries, 0, len(tss))
for i := range tss {
labels := tss[i].Labels
h := getLabelsHash(labels)
if hourlySeriesLimiter != nil && !hourlySeriesLimiter.Add(h) {
hourlySeriesLimitRowsDropped.Add(len(tss[i].Samples))
logSkippedSeries(labels, "-remoteWrite.maxHourlySeries", hourlySeriesLimiter.MaxItems())
continue
}
if dailySeriesLimiter != nil && !dailySeriesLimiter.Add(h) {
dailySeriesLimitRowsDropped.Add(len(tss[i].Samples))
logSkippedSeries(labels, "-remoteWrite.maxDailySeries", dailySeriesLimiter.MaxItems())
continue
}
dst = append(dst, tss[i])
}
return dst
}
var (
hourlySeriesLimiter *bloomfilter.Limiter
dailySeriesLimiter *bloomfilter.Limiter
hourlySeriesLimitRowsDropped = metrics.NewCounter(`vmagent_hourly_series_limit_rows_dropped_total`)
dailySeriesLimitRowsDropped = metrics.NewCounter(`vmagent_daily_series_limit_rows_dropped_total`)
)
func getLabelsHash(labels []prompbmarshal.Label) uint64 {
bb := labelsHashBufPool.Get()
b := bb.B[:0]
for _, label := range labels {
b = append(b, label.Name...)
b = append(b, label.Value...)
}
h := xxhash.Sum64(b)
bb.B = b
labelsHashBufPool.Put(bb)
return h
}
var labelsHashBufPool bytesutil.ByteBufferPool
func logSkippedSeries(labels []prompbmarshal.Label, flagName string, flagValue int) {
select {
case <-logSkippedSeriesTicker.C:
// Do not use logger.WithThrottler() here, since this will increase CPU usage
// because every call to logSkippedSeries will result to a call to labelsToString.
logger.Warnf("skip series %s because %s=%d reached", labelsToString(labels), flagName, flagValue)
default:
}
}
var logSkippedSeriesTicker = time.NewTicker(5 * time.Second)
func labelsToString(labels []prompbmarshal.Label) string {
var b []byte
b = append(b, '{')
for i, label := range labels {
b = append(b, label.Name...)
b = append(b, '=')
b = strconv.AppendQuote(b, label.Value)
if i+1 < len(labels) {
b = append(b, ',')
}
}
b = append(b, '}')
return string(b)
}
var globalRelabelMetricsDropped = metrics.NewCounter("vmagent_remotewrite_global_relabel_metrics_dropped_total")
type remoteWriteCtx struct {
@@ -194,20 +426,38 @@ type remoteWriteCtx struct {
relabelMetricsDropped *metrics.Counter
}
func newRemoteWriteCtx(argIdx int, remoteWriteURL string, maxInmemoryBlocks int, sanitizedURL string) *remoteWriteCtx {
h := xxhash.Sum64([]byte(remoteWriteURL))
path := fmt.Sprintf("%s/persistent-queue/%d_%016X", *tmpDataPath, argIdx+1, h)
fq := persistentqueue.MustOpenFastQueue(path, sanitizedURL, maxInmemoryBlocks, maxPendingBytesPerURL.N)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, path, sanitizedURL), func() float64 {
func newRemoteWriteCtx(argIdx int, at *auth.Token, remoteWriteURL *url.URL, maxInmemoryBlocks int, sanitizedURL string) *remoteWriteCtx {
// strip query params, otherwise changing params resets pq
pqURL := *remoteWriteURL
pqURL.RawQuery = ""
pqURL.Fragment = ""
h := xxhash.Sum64([]byte(pqURL.String()))
queuePath := fmt.Sprintf("%s/persistent-queue/%d_%016X", *tmpDataPath, argIdx+1, h)
fq := persistentqueue.MustOpenFastQueue(queuePath, sanitizedURL, maxInmemoryBlocks, maxPendingBytesPerURL.N)
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_data_bytes{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetPendingBytes())
})
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{path=%q, url=%q}`, path, sanitizedURL), func() float64 {
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_pending_inmemory_blocks{path=%q, url=%q}`, queuePath, sanitizedURL), func() float64 {
return float64(fq.GetInmemoryQueueLen())
})
c := newClient(argIdx, remoteWriteURL, sanitizedURL, fq, *queues)
var c *client
switch remoteWriteURL.Scheme {
case "http", "https":
c = newHTTPClient(argIdx, remoteWriteURL.String(), sanitizedURL, fq, *queues)
default:
logger.Fatalf("unsupported scheme: %s for remoteWriteURL: %s, want `http`, `https`", remoteWriteURL.Scheme, sanitizedURL)
}
c.init(argIdx, *queues, sanitizedURL)
sf := significantFigures.GetOptionalArgOrDefault(argIdx, 0)
rd := roundDigits.GetOptionalArgOrDefault(argIdx, 100)
pss := make([]*pendingSeries, *queues)
pssLen := *queues
if n := cgroup.AvailableCPUs(); pssLen > n {
// There is no sense in running more than availableCPUs concurrent pendingSeries,
// since every pendingSeries can saturate up to a single CPU.
pssLen = n
}
pss := make([]*pendingSeries, pssLen)
for i := range pss {
pss[i] = newPendingSeries(fq.MustWriteBlock, sf, rd)
}
@@ -217,7 +467,7 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL string, maxInmemoryBlocks int,
c: c,
pss: pss,
relabelMetricsDropped: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q, url=%q}`, path, sanitizedURL)),
relabelMetricsDropped: metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_relabel_metrics_dropped_total{path=%q, url=%q}`, queuePath, sanitizedURL)),
}
}

View File

@@ -1,9 +1,7 @@
package remotewrite
import (
"fmt"
"net"
"strings"
"sync/atomic"
"time"
@@ -11,13 +9,8 @@ import (
"github.com/VictoriaMetrics/metrics"
)
func statDial(network, addr string) (conn net.Conn, err error) {
if !strings.HasPrefix(network, "tcp") {
return nil, fmt.Errorf("unexpected network passed to statDial: %q; it must start from `tcp`", network)
}
if !netutil.TCP6Enabled() {
network = "tcp4"
}
func statDial(networkUnused, addr string) (conn net.Conn, err error) {
network := netutil.GetTCPNetwork()
conn, err = net.DialTimeout(network, addr, 5*time.Second)
dialsTotal.Inc()
if err != nil {

View File

@@ -1,40 +1,54 @@
package vmimport
import (
"io"
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
parser "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/writeconcurrencylimiter"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="vmimport"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="vmimport"}`)
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="vmimport"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="vmimport"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="vmimport"}`)
)
// InsertHandler processes `/api/v1/import` request.
//
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6
func InsertHandler(req *http.Request) error {
func InsertHandler(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(req, func(rows []parser.Row) error {
return insertRows(rows, extraLabels)
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
return parser.ParseStream(req.Body, isGzipped, func(rows []parser.Row) error {
return insertRows(at, rows, extraLabels)
})
})
}
func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
// InsertHandlerForReader processes metrics from given reader
func InsertHandlerForReader(r io.Reader, isGzipped bool) error {
return writeconcurrencylimiter.Do(func() error {
return parser.ParseStream(r, isGzipped, func(rows []parser.Row) error {
return insertRows(nil, rows, nil)
})
})
}
func insertRows(at *auth.Token, rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
@@ -74,8 +88,11 @@ func insertRows(rows []parser.Row, extraLabels []prompbmarshal.Label) error {
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
remotewrite.Push(&ctx.WriteRequest)
remotewrite.PushWithAuthToken(at, &ctx.WriteRequest)
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -56,6 +56,7 @@ test-vmalert:
go test -v -race -cover ./app/vmalert/datasource
go test -v -race -cover ./app/vmalert/notifier
go test -v -race -cover ./app/vmalert/config
go test -v -race -cover ./app/vmalert/remotewrite
run-vmalert: vmalert
./bin/vmalert -rule=app/vmalert/config/testdata/rules2-good.rules \
@@ -66,7 +67,17 @@ run-vmalert: vmalert
-remoteRead.url=http://localhost:8428 \
-external.label=cluster=east-1 \
-external.label=replica=a \
-evaluationInterval=3s
-evaluationInterval=3s \
-rule.configCheckInterval=10s
replay-vmalert: vmalert
./bin/vmalert -rule=app/vmalert/config/testdata/rules-replay-good.rules \
-datasource.url=http://localhost:8428 \
-remoteWrite.url=http://localhost:8428 \
-external.label=cluster=east-1 \
-external.label=replica=a \
-replay.timeFrom=2021-05-11T07:21:43Z \
-replay.timeTo=2021-05-29T18:40:43Z
vmalert-amd64:
CGO_ENABLED=1 GOARCH=amd64 $(MAKE) vmalert-local-with-goarch
@@ -88,3 +99,9 @@ vmalert-local-with-goarch:
vmalert-pure:
APP_NAME=vmalert $(MAKE) app-local-pure
vmalert-windows-amd64:
GOARCH=amd64 APP_NAME=vmalert $(MAKE) app-local-windows-with-goarch
vmalert-windows-amd64-prod:
APP_NAME=vmalert $(MAKE) app-via-docker-windows-amd64

View File

@@ -1,30 +1,35 @@
## vmalert
# vmalert
`vmalert` executes a list of given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
`vmalert` executes a list of the given [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
or [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
rules against configured address.
rules against configured `-datasource.url`. For sending alerting notifications
vmalert relies on [Alertmanager]((https://github.com/prometheus/alertmanager)) configured via `-notifier.url` flag.
Recording rules results are persisted via [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
protocol and require `-remoteWrite.url` to be configured.
Vmalert is heavily inspired by [Prometheus](https://prometheus.io/docs/alerting/latest/overview/)
implementation and aims to be compatible with its syntax.
### Features:
## Features
* Integration with [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) TSDB;
* VictoriaMetrics [MetricsQL](https://victoriametrics.github.io/MetricsQL.html)
* VictoriaMetrics [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html)
support and expressions validation;
* Prometheus [alerting rules definition format](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/#defining-alerting-rules)
support;
* Integration with [Alertmanager](https://github.com/prometheus/alertmanager);
* Keeps the alerts [state on restarts](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/app/vmalert#alerts-state-on-restarts);
* Graphite datasource can be used for alerting and recording rules. See [these docs](#graphite) for details.
* Keeps the alerts [state on restarts](#alerts-state-on-restarts);
* Graphite datasource can be used for alerting and recording rules. See [these docs](#graphite);
* Recording and Alerting rules backfilling (aka `replay`). See [these docs](#rules-backfilling);
* Lightweight without extra dependencies.
### Limitations:
* `vmalert` execute queries against remote datasource which has reliability risks because of network.
It is recommended to configure alerts thresholds and rules expressions with understanding that network request
may fail;
* by default, rules execution is sequential within one group, but persisting of execution results to remote
storage is asynchronous. Hence, user shouldn't rely on recording rules chaining when result of previous
recording rule is reused in next one;
* `vmalert` has no UI, just an API for getting groups and rules statuses.
## Limitations
* `vmalert` execute queries against remote datasource which has reliability risks because of the network.
It is recommended to configure alerts thresholds and rules expressions with the understanding that network
requests may fail;
* by default, rules execution is sequential within one group, but persistence of execution results to remote
storage is asynchronous. Hence, user shouldn't rely on chaining of recording rules when result of previous
recording rule is reused in the next one;
### QuickStart
## QuickStart
To build `vmalert` from sources:
```
@@ -32,95 +37,131 @@ git clone https://github.com/VictoriaMetrics/VictoriaMetrics
cd VictoriaMetrics
make vmalert
```
The build binary will be placed to `VictoriaMetrics/bin` folder.
The build binary will be placed in `VictoriaMetrics/bin` folder.
To start using `vmalert` you will need the following things:
* list of rules - PromQL/MetricsQL expressions to execute;
* datasource address - reachable VictoriaMetrics instance for rules execution;
* notifier address - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
aggregating alerts and sending notifications.
* remote write address - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
compatible storage address for storing recording rules results and alerts state in for of timeseries. This is optional.
* datasource address - reachable MetricsQL endpoint to run queries against;
* notifier address [optional] - reachable [Alert Manager](https://github.com/prometheus/alertmanager) instance for processing,
aggregating alerts, and sending notifications.
* remote write address [optional] - [remote write](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations)
compatible storage to persist rules and alerts state info;
* remote read address [optional] - MetricsQL compatible datasource to restore alerts state from.
Then configure `vmalert` accordingly:
```
./bin/vmalert -rule=alert.rules \
./bin/vmalert -rule=alert.rules \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://localhost:8428 \ # PromQL compatible datasource
-notifier.url=http://localhost:9093 \ # AlertManager URL
-notifier.url=http://localhost:9093 \ # AlertManager URL (required if alerting rules are used)
-notifier.url=http://127.0.0.1:9093 \ # AlertManager replica URL
-remoteWrite.url=http://localhost:8428 \ # remote write compatible storage to persist rules
-remoteRead.url=http://localhost:8428 \ # PromQL compatible datasource to restore alerts state from
-remoteWrite.url=http://localhost:8428 \ # Remote write compatible storage to persist rules and alerts state info (required if recording rules are used)
-remoteRead.url=http://localhost:8428 \ # MetricsQL compatible datasource to restore alerts state from
-external.label=cluster=east-1 \ # External label to be applied for each rule
-external.label=replica=a \ # Multiple external labels may be set
-evaluationInterval=3s # Default evaluation interval if not specified in rules group
-external.label=replica=a # Multiple external labels may be set
```
If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget
to specify different `external.label` flags in order to define which `vmalert` generated rules or alerts.
Note there's a separate `remoteRead.url` to allow writing results of
alerting/recording rules into a different storage than the initial data that's
queried. This allows using `vmalert` to aggregate data from a short-term,
high-frequency, high-cardinality storage into a long-term storage with
decreased cardinality and a bigger interval between samples.
Configuration for [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
similar to Prometheus rules and configured using YAML. Configuration examples may be found
See the full list of configuration flags in [configuration](#configuration) section.
If you run multiple `vmalert` services for the same datastore or AlertManager - do not forget
to specify different `external.label` flags in order to define which `vmalert` generated rules or alerts.
Configuration for [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
and [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) rules is very
similar to Prometheus rules and configured using YAML. Configuration examples may be found
in [testdata](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/config/testdata) folder.
Every `rule` belongs to `group` and every configuration file may contain arbitrary number of groups:
Every `rule` belongs to a `group` and every configuration file may contain arbitrary number of groups:
```yaml
groups:
[ - <rule_group> ]
```
#### Groups
### Groups
Each group has following attributes:
Each group has the following attributes:
```yaml
# The name of the group. Must be unique within a file.
name: <string>
# How often rules in the group are evaluated.
[ interval: <duration> | default = global.evaluation_interval ]
[ interval: <duration> | default = -evaluationInterval flag ]
# How many rules execute at once. Increasing concurrency may speed
# up round execution speed.
# How many rules execute at once within a group. Increasing concurrency may speed
# up round execution speed.
[ concurrency: <integer> | default = 1 ]
# Optional type for expressions inside the rules. Supported values: "graphite" and "prometheus".
# By default "prometheus" rule type is used.
# By default "prometheus" type is used.
[ type: <string> ]
# Warning: DEPRECATED
# Please use `params` instead:
# params:
# extra_label: ["job=nodeexporter", "env=prod"]
extra_filter_labels:
[ <labelname>: <labelvalue> ... ]
# Optional list of HTTP URL parameters
# applied for all rules requests within a group
# For example:
# params:
# nocache: ["1"] # disable caching for vmselect
# denyPartialResponse: ["true"] # fail if one or more vmstorage nodes returned an error
# extra_label: ["env=dev"] # apply additional label filter "env=dev" for all requests
# see more details at https://docs.victoriametrics.com#prometheus-querying-api-enhancements
params:
[ <string>: [<string>, ...]]
# Optional list of labels added to every rule within a group.
# It has priority over the external labels.
# Labels are commonly used for adding environment
# or tenant-specific tag.
labels:
[ <labelname>: <labelvalue> ... ]
rules:
[ - <rule> ... ]
```
#### Rules
### Rules
Every rule contains `expr` field for [PromQL](https://prometheus.io/docs/prometheus/latest/querying/basics/)
or [MetricsQL](https://docs.victoriametrics.com/MetricsQL.html) expression. Vmalert will execute the configured
expression and then act according to the Rule type.
There are two types of Rules:
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
Alerting rules allows to define alert conditions via [MetricsQL](https://victoriametrics.github.io/MetricsQL.html)
and to send notifications about firing alerts to [Alertmanager](https://github.com/prometheus/alertmanager).
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
Recording rules allow you to precompute frequently needed or computationally expensive expressions
and save their result as a new set of time series.
* [alerting](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/) -
Alerting rules allow defining alert conditions via `expr` field and to send notifications to
[Alertmanager](https://github.com/prometheus/alertmanager) if execution result is not empty.
* [recording](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/) -
Recording rules allow defining `expr` which result will be then backfilled to configured
`-remoteWrite.url`. Recording rules are used to precompute frequently needed or computationally
expensive expressions and save their result as a new set of time series.
`vmalert` forbids to define duplicates - rules with the same combination of name, expression and labels
within one group.
`vmalert` forbids defining duplicates - rules with the same combination of name, expression, and labels
within one group.
##### Alerting rules
#### Alerting rules
The syntax for alerting rule is following:
The syntax for alerting rule is the following:
```yaml
# The name of the alert. Must be a valid metric name.
alert: <string>
# Optional type for the rule. Supported values: "graphite", "prometheus".
# By default "prometheus" rule type is used.
[ type: <string> ]
# The expression to evaluate. The expression language depends on the type value.
# By default MetricsQL expression is used. If type="graphite", then the expression
# By default PromQL/MetricsQL expression is used. If group.type="graphite", then the expression
# must contain valid Graphite expression.
expr: <string>
# Alerts are considered firing once they have been returned for this long.
# Alerts which have not yet fired for long enough are considered pending.
# Alerts which have not yet been fired for long enough are considered pending.
# If param is omitted or set to 0 then alerts will be immediately considered
# as firing once they return.
[ for: <duration> | default = 0s ]
# Labels to add or overwrite for each alert.
@@ -130,21 +171,22 @@ labels:
# Annotations to add to each alert.
annotations:
[ <labelname>: <tmpl_string> ]
```
```
##### Recording rules
It is allowed to use [Go templating](https://golang.org/pkg/text/template/) in annotations
to format data, iterate over it or execute expressions.
Additionally, `vmalert` provides some extra templating functions
listed [here](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/app/vmalert/notifier/template_func.go).
#### Recording rules
The syntax for recording rules is following:
```yaml
# The name of the time series to output to. Must be a valid metric name.
record: <string>
# Optional type for the rule. Supported values: "graphite", "prometheus".
# By default "prometheus" rule type is used.
[ type: <string> ]
# The expression to evaluate. The expression language depends on the type value.
# By default MetricsQL expression is used. If type="graphite", then the expression
# By default MetricsQL expression is used. If group.type="graphite", then the expression
# must contain valid Graphite expression.
expr: <string>
@@ -153,63 +195,324 @@ labels:
[ <labelname>: <labelvalue> ]
```
For recording rules to work `-remoteWrite.url` must specified.
For recording rules to work `-remoteWrite.url` must be specified.
#### Alerts state on restarts
### Alerts state on restarts
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after reloading of `vmalert`
`vmalert` has no local storage, so alerts state is stored in the process memory. Hence, after restart of `vmalert`
the process alerts state will be lost. To avoid this situation, `vmalert` should be configured via the following flags:
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or VMInsert (Cluster). `vmalert` will persist alerts state
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
These are regular time series and may be queried from VM just as any other time series.
The state stored to the configured address on every rule evaluation.
* `-remoteRead.url` - URL to VictoriaMetrics (Single) or VMSelect (Cluster). `vmalert` will try to restore alerts state
* `-remoteWrite.url` - URL to VictoriaMetrics (Single) or vminsert (Cluster). `vmalert` will persist alerts state
into the configured address in the form of time series named `ALERTS` and `ALERTS_FOR_STATE` via remote-write protocol.
These are regular time series and maybe queried from VM just as any other time series.
The state is stored to the configured address on every rule evaluation.
* `-remoteRead.url` - URL to VictoriaMetrics (Single) or vmselect (Cluster). `vmalert` will try to restore alerts state
from configured address by querying time series with name `ALERTS_FOR_STATE`.
Both flags are required for the proper state restoring. Restore process may fail if time series are missing
in configured `-remoteRead.url`, weren't updated in the last `1h` or received state doesn't match current `vmalert`
rules configuration.
Both flags are required for proper state restoration. Restore process may fail if time series are missing
in configured `-remoteRead.url`, weren't updated in the last `1h` (controlled by `-remoteRead.lookback`)
or received state doesn't match current `vmalert` rules configuration.
#### WEB
### Multitenancy
There are the following approaches exist for alerting and recording rules across
[multiple tenants](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#multitenancy):
* To run a separate `vmalert` instance per each tenant.
The corresponding tenant must be specified in `-datasource.url` command-line flag
according to [these docs](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#url-format).
For example, `/path/to/vmalert -datasource.url=http://vmselect:8481/select/123/prometheus`
would run alerts against `AccountID=123`. For recording rules the `-remoteWrite.url` command-line
flag must contain the url for the specific tenant as well.
For example, `-remoteWrite.url=http://vminsert:8480/insert/123/prometheus` would write recording
rules to `AccountID=123`.
* To specify `tenant` parameter per each alerting and recording group if
[enterprise version of vmalert](https://victoriametrics.com/products/enterprise/) is used
with `-clusterMode` command-line flag. For example:
```yaml
groups:
- name: rules_for_tenant_123
tenant: "123"
rules:
# Rules for accountID=123
- name: rules_for_tenant_456:789
tenant: "456:789"
rules:
# Rules for accountID=456, projectID=789
```
If `-clusterMode` is enabled, then `-datasource.url`, `-remoteRead.url` and `-remoteWrite.url` must
contain only the hostname without tenant id. For example: `-datasource.url=http://vmselect:8481`.
`vmalert` automatically adds the specified tenant to urls per each recording rule in this case.
If `-clusterMode` is enabled and the `tenant` in a particular group is missing, then the tenant value
is obtained from `-defaultTenant.prometheus` or `-defaultTenant.graphite` depending on the `type` of the group.
The enterprise version of vmalert is available in `vmutils-*-enterprise.tar.gz` files
at [release page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) and in `*-enterprise`
tags at [Docker Hub](https://hub.docker.com/r/victoriametrics/vmalert/tags).
### Topology examples
The following sections are showing how `vmalert` may be used and configured
for different scenarios.
Please note, not all flags in examples are required:
* `-remoteWrite.url` and `-remoteRead.url` are optional and are needed only if
you have recording rules or want to store [alerts state](#alerts-state-on-restarts) on `vmalert` restarts;
* `-notifier.url` is optional and is needed only if you have alerting rules.
#### Single-node VictoriaMetrics
The simplest configuration where one single-node VM server is used for
rules execution, storing recording rules results and alerts state.
`vmalert` configuration flags:
```
./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://victoriametrics:8428 \ # VM-single addr for executing rules expressions
-remoteWrite.url=http://victoriametrics:8428 \ # VM-single addr to persist alerts state and recording rules results
-remoteRead.url=http://victoriametrics:8428 \ # VM-single addr for restoring alerts state after restart
-notifier.url=http://alertmanager:9093 # AlertManager addr to send alerts when they trigger
```
<img alt="vmalert single" width="500" src="vmalert_single.png">
#### Cluster VictoriaMetrics
In [cluster mode](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html)
VictoriaMetrics has separate components for writing and reading path:
`vminsert` and `vmselect` components respectively. `vmselect` is used for executing rules expressions
and `vminsert` is used to persist recording rules results and alerts state.
Cluster mode could have multiple `vminsert` and `vmselect` components.
`vmalert` configuration flags:
```
./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://vmselect:8481/select/0/prometheus # vmselect addr for executing rules expressions
-remoteWrite.url=http://vminsert:8480/insert/0/prometheuss # vminsert addr to persist alerts state and recording rules results
-remoteRead.url=http://vmselect:8481/select/0/prometheus # vmselect addr for restoring alerts state after restart
-notifier.url=http://alertmanager:9093 # AlertManager addr to send alerts when they trigger
```
<img alt="vmalert cluster" src="vmalert_cluster.png">
In case when you want to spread the load on these components - add balancers before them and configure
`vmalert` with balancer's addresses. Please, see more about VM's cluster architecture
[here](https://docs.victoriametrics.com/Cluster-VictoriaMetrics.html#architecture-overview).
#### HA vmalert
For HA user can run multiple identically configured `vmalert` instances.
It means all of them will execute the same rules, write state and results to
the same destinations, and send alert notifications to multiple configured
Alertmanagers.
`vmalert` configuration flags:
```
./bin/vmalert -rule=rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://victoriametrics:8428 \ # VM-single addr for executing rules expressions
-remoteWrite.url=http://victoriametrics:8428 \ # VM-single addr to persist alerts state and recording rules results
-remoteRead.url=http://victoriametrics:8428 \ # VM-single addr for restoring alerts state after restart
-notifier.url=http://alertmanager1:9093 \ # Multiple AlertManager addresses to send alerts when they trigger
-notifier.url=http://alertmanagerN:9093 # The same alert will be sent to all configured notifiers
```
<img alt="vmalert ha" width="800px" src="vmalert_ha.png">
To avoid recording rules results and alerts state duplication in VictoriaMetrics server
don't forget to configure [deduplication](https://docs.victoriametrics.com/Single-server-VictoriaMetrics.html#deduplication).
Alertmanager will automatically deduplicate alerts with identical labels, so ensure that
all `vmalert`s are having the same config.
Don't forget to configure [cluster mode](https://prometheus.io/docs/alerting/latest/alertmanager/)
for Alertmanagers for better reliability.
This example uses single-node VM server for the sake of simplicity.
Check how to replace it with [cluster VictoriaMetrics](#cluster-victoriametrics) if needed.
#### Downsampling and aggregation via vmalert
Example shows how to build a topology where `vmalert` will process data from one cluster
and write results into another. Such clusters may be called as "hot" (low retention,
high-speed disks, used for operative monitoring) and "cold" (long term retention,
slower/cheaper disks, low resolution data). With help of `vmalert`, user can setup
recording rules to process raw data from "hot" cluster (by applying additional transformations
or reducing resolution) and push results to "cold" cluster.
`vmalert` configuration flags:
```
./bin/vmalert -rule=downsampling-rules.yml \ # Path to the file with rules configuration. Supports wildcard
-datasource.url=http://raw-cluster-vmselect:8481/select/0/prometheus # vmselect addr for executing recordi ng rules expressions
-remoteWrite.url=http://aggregated-cluster-vminsert:8480/insert/0/prometheuss # vminsert addr to persist recording rules results
```
<img alt="vmalert multi cluster" src="vmalert_multicluster.png">
Please note, [replay](#rules-backfilling) feature may be used for transforming historical data.
Flags `-remoteRead.url` and `-notifier.url` are omitted since we assume only recording rules are used.
### Web
`vmalert` runs a web-server (`-httpListenAddr`) for serving metrics and alerts endpoints:
* `http://<vmalert-addr>` - UI;
* `http://<vmalert-addr>/api/v1/groups` - list of all loaded groups and rules;
* `http://<vmalert-addr>/api/v1/alerts` - list of all active alerts;
* `http://<vmalert-addr>/api/v1/<groupName>/<alertID>/status" ` - get alert status by ID.
* `http://<vmalert-addr>/api/v1/<groupID>/<alertID>/status" ` - get alert status by ID.
Used as alert source in AlertManager.
* `http://<vmalert-addr>/metrics` - application metrics.
* `http://<vmalert-addr>/-/reload` - hot configuration reload.
### Graphite
## Graphite
vmalert sends requests to `<-datasource.url>/render?format=json` during evaluation of alerting and recording rules
if the corresponding group or rule contains `type: "graphite"` config option. It is expected that the `<-datasource.url>/render`
implements [Graphite Render API](https://graphite.readthedocs.io/en/stable/render_api.html) for `format=json`.
When using vmalert with both `graphite` and `prometheus` rules configured against cluster version of VM do not forget
to set `-datasource.appendTypePrefix` flag to `true`, so vmalert can adjust URL prefix automatically based on query type.
to set `-datasource.appendTypePrefix` flag to `true`, so vmalert can adjust URL prefix automatically based on the query type.
## Rules backfilling
vmalert supports alerting and recording rules backfilling (aka `replay`). In replay mode vmalert
can read the same rules configuration as normal, evaluate them on the given time range and backfill
results via remote write to the configured storage. vmalert supports any PromQL/MetricsQL compatible
data source for backfilling.
### How it works
In `replay` mode vmalert works as a cli-tool and exits immediately after work is done.
To run vmalert in `replay` mode:
```
./bin/vmalert -rule=path/to/your.rules \ # path to files with rules you usually use with vmalert
-datasource.url=http://localhost:8428 \ # PromQL/MetricsQL compatible datasource
-remoteWrite.url=http://localhost:8428 \ # remote write compatible storage to persist results
-replay.timeFrom=2021-05-11T07:21:43Z \ # time from begin replay
-replay.timeTo=2021-05-29T18:40:43Z # time to finish replay
```
The output of the command will look like the following:
```
Replay mode:
from: 2021-05-11 07:21:43 +0000 UTC # set by -replay.timeFrom
to: 2021-05-29 18:40:43 +0000 UTC # set by -replay.timeTo
max data points per request: 1000 # set by -replay.maxDatapointsPerQuery
Group "ReplayGroup"
interval: 1m0s
requests to make: 27
max range per request: 16h40m0s
> Rule "type:vm_cache_entries:rate5m" (ID: 1792509946081842725)
27 / 27 [----------------------------------------------------------------------------------------------------] 100.00% 78 p/s
> Rule "go_cgo_calls_count:rate5m" (ID: 17958425467471411582)
27 / 27 [-----------------------------------------------------------------------------------------------------] 100.00% ? p/s
Group "vmsingleReplay"
interval: 30s
requests to make: 54
max range per request: 8h20m0s
> Rule "RequestErrorsToAPI" (ID: 17645863024999990222)
54 / 54 [-----------------------------------------------------------------------------------------------------] 100.00% ? p/s
> Rule "TooManyLogs" (ID: 9042195394653477652)
54 / 54 [-----------------------------------------------------------------------------------------------------] 100.00% ? p/s
2021-06-07T09:59:12.098Z info app/vmalert/replay.go:68 replay finished! Imported 511734 samples
```
In `replay` mode all groups are executed sequentially one-by-one. Rules within the group are
executed sequentially as well (`concurrency` setting is ignored). Vmalert sends rule's expression
to [/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries) endpoint
of the configured `-datasource.url`. Returned data is then processed according to the rule type and
backfilled to `-remoteWrite.url` via [remote Write protocol](https://prometheus.io/docs/prometheus/latest/storage/#remote-storage-integrations).
Vmalert respects `evaluationInterval` value set by flag or per-group during the replay.
Vmalert automatically disables caching on VictoriaMetrics side by sending `nocache=1` param. It allows
to prevent cache pollution and unwanted time range boundaries adjustment during backfilling.
#### Recording rules
The result of recording rules `replay` should match with results of normal rules evaluation.
#### Alerting rules
The result of alerting rules `replay` is time series reflecting [alert's state](#alerts-state-on-restarts).
To see if `replayed` alert has fired in the past use the following PromQL/MetricsQL expression:
```
ALERTS{alertname="your_alertname", alertstate="firing"}
```
Execute the query against storage which was used for `-remoteWrite.url` during the `replay`.
### Additional configuration
There are following non-required `replay` flags:
* `-replay.maxDatapointsPerQuery` - the max number of data points expected to receive in one request.
In two words, it affects the max time range for every `/query_range` request. The higher the value,
the fewer requests will be issued during `replay`.
* `-replay.ruleRetryAttempts` - when datasource fails to respond vmalert will make this number of retries
per rule before giving up.
* `-replay.rulesDelay` - delay between sequential rules execution. Important in cases if there are chaining
(rules which depend on each other) rules. It is expected, that remote storage will be able to persist
previously accepted data during the delay, so data will be available for the subsequent queries.
Keep it equal or bigger than `-remoteWrite.flushInterval`.
See full description for these flags in `./vmalert --help`.
### Limitations
* Graphite engine isn't supported yet;
* `query` template function is disabled for performance reasons (might be changed in future);
### Configuration
## Monitoring
`vmalert` exports various metrics in Prometheus exposition format at `http://vmalert-host:8880/metrics` page.
We recommend setting up regular scraping of this page either through `vmagent` or by Prometheus so that the exported
metrics may be analyzed later.
Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/14950) for `vmalert` overview. Graphs on this dashboard contain useful hints - hover the `i` icon at the top left corner of each graph in order to read it.
If you have suggestions for improvements or have found a bug - please open an issue on github or add
a review to the dashboard.
## Configuration
### Flags
Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
The shortlist of configuration flags is the following:
```
-datasource.appendTypePrefix
Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to VMSelect URL.
Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.
-datasource.basicAuth.password string
Optional basic auth password for -datasource.url
-datasource.basicAuth.passwordFile string
Optional path to basic auth password to use for -datasource.url
-datasource.basicAuth.username string
Optional basic auth username for -datasource.url
-datasource.bearerToken string
Optional bearer auth token to use for -datasource.url.
-datasource.bearerTokenFile string
Optional path to bearer token file to use for -datasource.url.
-datasource.lookback duration
Lookback defines how far to look into past when evaluating queries. For example, if datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.
-datasource.maxIdleConnections int
Defines the number of idle (keep-alive connections) to configured datasource.Consider to set this value equal to the value: groups_total * group.concurrency. Too low value may result into high number of sockets in TIME_WAIT state. (default 100)
Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state. (default 100)
-datasource.queryStep duration
queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query.
queryStep defines how far a value can fallback to when evaluating queries. For example, if datasource.queryStep=15s then param "step" with value "15s" will be added to every query.If queryStep isn't specified, rule's evaluationInterval will be used instead.
-datasource.roundDigits int
Adds "round_digits" GET param to datasource requests. In VM "round_digits" limits the number of digits after the decimal point in response values.
-datasource.tlsCAFile string
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default system CA is used
Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used
-datasource.tlsCertFile string
Optional path to client-side TLS certificate file to use when connecting to -datasource.url
-datasource.tlsInsecureSkipVerify
@@ -217,41 +520,43 @@ The shortlist of configuration flags is the following:
-datasource.tlsKeyFile string
Optional path to client-side TLS certificate key to use when connecting to -datasource.url
-datasource.tlsServerName string
Optional TLS server name to use for connections to -datasource.url. By default the server name from -datasource.url is used
Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used
-datasource.url string
Victoria Metrics or VMSelect url. Required parameter. E.g. http://127.0.0.1:8428
VictoriaMetrics or vmselect url. Required parameter. E.g. http://127.0.0.1:8428
-disableAlertgroupLabel
Whether to disable adding group's name as label to generated alerts and time series.
-dryRun -rule
Whether to check only config files without running vmalert. The rules file are validated. The -rule flag must be specified.
-enableTCP6
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP is used
Whether to enable IPv6 for listening and dialing. By default only IPv4 TCP and UDP is used
-envflag.enable
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set
Whether to enable reading flags from environment variables additionally to command line. Command line flag values have priority over values from environment vars. Flags are read only from command line if this flag isn't set. See https://docs.victoriametrics.com/#environment-variables for more details
-envflag.prefix string
Prefix for environment variables if -envflag.enable is set
-evaluationInterval duration
How often to evaluate the rules (default 1m0s)
-external.alert.source string
External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|pathEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used
-external.label array
Optional label in the form 'name=value' to add to all generated recording rules and alerts. Pass multiple -label flags in order to add multiple label sets.
Supports array of values separated by comma or specified via multiple flags.
Supports an array of values separated by comma or specified via multiple flags.
-external.url string
External URL is used as alert's source for sent alerts to the notifier
-fs.disableMmap
Whether to use pread() instead of mmap() for reading data files. By default mmap() is used for 64-bit arches and pread() is used for 32-bit arches, since they cannot read data files bigger than 2^32 bytes in memory. mmap() is usually faster for reading small data chunks than pread()
-http.connTimeout duration
Incoming http connections are closed after the configured timeout. This may help spreading incoming load among a cluster of services behind load balancer. Note that the real timeout may be bigger by up to 10% as a protection from Thundering herd problem (default 2m0s)
Incoming http connections are closed after the configured timeout. This may help to spread the incoming load among a cluster of services behind a load balancer. Please note that the real timeout may be bigger by up to 10% as a protection against the thundering herd problem (default 2m0s)
-http.disableResponseCompression
Disable compression of HTTP responses for saving CPU resources. By default compression is enabled to save network bandwidth
Disable compression of HTTP responses to save CPU resources. By default compression is enabled to save network bandwidth
-http.idleConnTimeout duration
Timeout for incoming idle http connections (default 1m0s)
-http.maxGracefulShutdownDuration duration
The maximum duration for graceful shutdown of HTTP server. Highly loaded server may require increased value for graceful shutdown (default 7s)
The maximum duration for a graceful shutdown of the HTTP server. A highly loaded server may require increased value for a graceful shutdown (default 7s)
-http.pathPrefix string
An optional prefix to add to all the paths handled by http server. For example, if '-http.pathPrefix=/foo/bar' is set, then all the http requests will be handled on '/foo/bar/*' paths. This may be useful for proxied requests. See https://www.robustperception.io/using-external-urls-and-proxies-with-prometheus
-http.shutdownDelay duration
Optional delay before http server shutdown. During this dealy the servier returns non-OK responses from /health page, so load balancers can route new requests to other servers
Optional delay before http server shutdown. During this delay, the server returns non-OK responses from /health page, so load balancers can route new requests to other servers
-httpAuth.password string
Password for HTTP Basic Auth. The authentication is disabled if -httpAuth.username is empty
-httpAuth.username string
@@ -261,7 +566,7 @@ The shortlist of configuration flags is the following:
-loggerDisableTimestamps
Whether to disable writing timestamps in logs
-loggerErrorsPerSecondLimit int
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, then the remaining errors are suppressed. Zero value disables the rate limit
Per-second limit on the number of ERROR messages. If more than the given number of errors are emitted per second, the remaining errors are suppressed. Zero values disable the rate limit
-loggerFormat string
Format for logs. Possible values: default, json (default "default")
-loggerLevel string
@@ -271,44 +576,54 @@ The shortlist of configuration flags is the following:
-loggerTimezone string
Timezone to use for timestamps in logs. Timezone must be a valid IANA Time Zone. For example: America/New_York, Europe/Berlin, Etc/GMT+3 or Local (default "UTC")
-loggerWarnsPerSecondLimit int
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero value disables the rate limit
-memory.allowedBytes value
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to non-zero value. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage
Supports the following optional suffixes for values: KB, MB, GB, KiB, MiB, GiB (default 0)
Per-second limit on the number of WARN messages. If more than the given number of warns are emitted per second, then the remaining warns are suppressed. Zero values disable the rate limit
-memory.allowedBytes size
Allowed size of system memory VictoriaMetrics caches may occupy. This option overrides -memory.allowedPercent if set to a non-zero value. Too low a value may increase the cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache resulting in higher disk IO usage
Supports the following optional suffixes for size values: KB, MB, GB, KiB, MiB, GiB (default 0)
-memory.allowedPercent float
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low value may increase cache miss rate, which usually results in higher CPU and disk IO usage. Too high value may evict too much data from OS page cache, which will result in higher disk IO usage (default 60)
Allowed percent of system memory VictoriaMetrics caches may occupy. See also -memory.allowedBytes. Too low a value may increase cache miss rate usually resulting in higher CPU and disk IO usage. Too high a value may evict too much data from OS page cache which will result in higher disk IO usage (default 60)
-metricsAuthKey string
Auth key for /metrics. It overrides httpAuth settings
Auth key for /metrics. It must be passed via authKey query arg. It overrides httpAuth.* settings
-notifier.basicAuth.password array
Optional basic auth password for -notifier.url
Supports array of values separated by comma or specified via multiple flags.
Supports an array of values separated by comma or specified via multiple flags.
-notifier.basicAuth.username array
Optional basic auth username for -notifier.url
Supports array of values separated by comma or specified via multiple flags.
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsCAFile array
Optional path to TLS CA file to use for verifying connections to -notifier.url. By default system CA is used
Supports array of values separated by comma or specified via multiple flags.
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsCertFile array
Optional path to client-side TLS certificate file to use when connecting to -notifier.url
Supports array of values separated by comma or specified via multiple flags.
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsInsecureSkipVerify array
Whether to skip tls verification when connecting to -notifier.url
Supports array of values separated by comma or specified via multiple flags.
-notifier.tlsKeyFile array
Optional path to client-side TLS certificate key to use when connecting to -notifier.url
Supports array of values separated by comma or specified via multiple flags.
Supports an array of values separated by comma or specified via multiple flags.
-notifier.tlsServerName array
Optional TLS server name to use for connections to -notifier.url. By default the server name from -notifier.url is used
Supports array of values separated by comma or specified via multiple flags.
Supports an array of values separated by comma or specified via multiple flags.
-notifier.url array
Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093
Supports array of values separated by comma or specified via multiple flags.
Prometheus alertmanager URL, e.g. http://127.0.0.1:9093
Supports an array of values separated by comma or specified via multiple flags.
-pprofAuthKey string
Auth key for /debug/pprof. It overrides httpAuth settings
Auth key for /debug/pprof. It must be passed via authKey query arg. It overrides httpAuth.* settings
-remoteRead.basicAuth.password string
Optional basic auth password for -remoteRead.url
-remoteRead.basicAuth.passwordFile string
Optional path to basic auth password to use for -remoteRead.url
-remoteRead.basicAuth.username string
Optional basic auth username for -remoteRead.url
-remoteRead.bearerToken string
Optional bearer auth token to use for -remoteRead.url.
-remoteRead.bearerTokenFile string
Optional path to bearer token file to use for -remoteRead.url.
-remoteRead.disablePathAppend
Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url.
-remoteRead.ignoreRestoreErrors
Whether to ignore errors from remote storage when restoring alerts state on startup. (default true)
-remoteRead.lookback duration
Lookback defines how far to look into past for alerts timeseries. For example, if lookback=1h then range from now() to now()-1h will be scanned. (default 1h0m0s)
-remoteRead.tlsCAFile string
@@ -322,13 +637,21 @@ The shortlist of configuration flags is the following:
-remoteRead.tlsServerName string
Optional TLS server name to use for connections to -remoteRead.url. By default the server name from -remoteRead.url is used
-remoteRead.url vmalert
Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428
Optional URL to VictoriaMetrics or vmselect that will be used to restore alerts state. This configuration makes sense only if vmalert was configured with `remoteWrite.url` before and has been successfully persisted its state. E.g. http://127.0.0.1:8428. See also -remoteRead.disablePathAppend
-remoteWrite.basicAuth.password string
Optional basic auth password for -remoteWrite.url
-remoteWrite.basicAuth.passwordFile string
Optional path to basic auth password to use for -remoteWrite.url
-remoteWrite.basicAuth.username string
Optional basic auth username for -remoteWrite.url
-remoteWrite.bearerToken string
Optional bearer auth token to use for -remoteWrite.url.
-remoteWrite.bearerTokenFile string
Optional path to bearer token file to use for -remoteWrite.url.
-remoteWrite.concurrency int
Defines number of writers for concurrent writing into remote querier (default 1)
-remoteWrite.disablePathAppend
Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.
-remoteWrite.flushInterval duration
Defines interval of flushes to remote write endpoint (default 5s)
-remoteWrite.maxBatchSize int
@@ -346,16 +669,30 @@ The shortlist of configuration flags is the following:
-remoteWrite.tlsServerName string
Optional TLS server name to use for connections to -remoteWrite.url. By default the server name from -remoteWrite.url is used
-remoteWrite.url string
Optional URL to Victoria Metrics or VMInsert where to persist alerts state and recording rules results in form of timeseries. E.g. http://127.0.0.1:8428
Optional URL to VictoriaMetrics or vminsert where to persist alerts state and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend
-replay.maxDatapointsPerQuery int
Max number of data points expected in one request. The higher the value, the less requests will be made during replay. (default 1000)
-replay.ruleRetryAttempts int
Defines how many retries to make before giving up on rule if request for it returns an error. (default 5)
-replay.rulesDelay duration
Delay between rules evaluation within the group. Could be important if there are chained rules inside of the groupand processing need to wait for previous rule results to be persisted by remote storage before evaluating the next rule.Keep it equal or bigger than -remoteWrite.flushInterval. (default 1s)
-replay.timeFrom string
The time filter in RFC3339 format to select time series with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z'
-replay.timeTo string
The time filter in RFC3339 format to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z'
-rule array
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule="/path/to/file". Path to a single file with alerting rules
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.
Supports array of values separated by comma or specified via multiple flags.
Supports an array of values separated by comma or specified via multiple flags.
-rule.configCheckInterval duration
Interval for checking for changes in '-rule' files. By default the checking is disabled. Send SIGHUP signal in order to force config check for changes
-rule.maxResolveDuration duration
Limits the maximum duration for automatic alert expiration, which is by default equal to 3 evaluation intervals of the parent group.
-rule.validateExpressions
Whether to validate rules expressions via MetricsQL engine (default true)
-rule.validateTemplates
@@ -363,57 +700,77 @@ The shortlist of configuration flags is the following:
-tls
Whether to enable TLS (aka HTTPS) for incoming requests. -tlsCertFile and -tlsKeyFile must be set if -tls is set
-tlsCertFile string
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs, since RSA certs are slow
Path to file with TLS certificate. Used only if -tls is set. Prefer ECDSA certs instead of RSA certs as RSA certs are slower
-tlsKeyFile string
Path to file with TLS key. Used only if -tls is set
-version
Show VictoriaMetrics version
```
Pass `-help` to `vmalert` in order to see the full list of supported
command-line flags with their descriptions.
### Hot config reload
`vmalert` supports "hot" config reload via the following methods:
* send SIGHUP signal to `vmalert` process;
* send GET request to `/-/reload` endpoint;
* configure `-rule.configCheckInterval` flag for periodic reload
on config change.
To reload configuration without `vmalert` restart send SIGHUP signal
or send GET request to `/-/reload` endpoint.
### URL params
### Contributing
To set additional URL params for `datasource.url`, `remoteWrite.url` or `remoteRead.url`
just add them in address: `-datasource.url=http://localhost:8428?nocache=1`.
To set additional URL params for specific [group of rules](#Groups) modify
the `params` group:
```yaml
groups:
- name: TestGroup
params:
denyPartialResponse: ["true"]
extra_label: ["env=dev"]
```
Please note, `params` are used only for executing rules expressions (requests to `datasource.url`).
If there would be a conflict between URL params set in `datasource.url` flag and params in group definition
the latter will have higher priority.
## Contributing
`vmalert` is mostly designed and built by VictoriaMetrics community.
Feel free to share your experience and ideas for improving this
Feel free to share your experience and ideas for improving this
software. Please keep simplicity as the main priority.
### How to build from sources
## How to build from sources
It is recommended using
[binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
It is recommended using
[binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
- `vmalert` is located in `vmutils-*` archives there.
#### Development build
### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.13.
2. Run `make vmalert` from the root folder of the repository.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17.
2. Run `make vmalert` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert` binary and puts it into the `bin` folder.
#### Production build
### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmalert-prod` from the root folder of the repository.
2. Run `make vmalert-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-prod` binary and puts it into the `bin` folder.
#### ARM build
### ARM build
ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://blog.cloudflare.com/arm-takes-wing/).
#### Development ARM build
### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.13.
2. Run `make vmalert-arm` or `make vmalert-arm64` from the root folder of the repository.
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.17.
2. Run `make vmalert-arm` or `make vmalert-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-arm` or `vmalert-arm64` binary respectively and puts it into the `bin` folder.
#### Production ARM build
### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make vmalert-arm-prod` or `make vmalert-arm64-prod` from the root folder of the repository.
2. Run `make vmalert-arm-prod` or `make vmalert-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
It builds `vmalert-arm-prod` or `vmalert-arm64-prod` binary respectively and puts it into the `bin` folder.

View File

@@ -19,15 +19,18 @@ import (
// AlertingRule is basic alert entity
type AlertingRule struct {
Type datasource.Type
RuleID uint64
Name string
Expr string
For time.Duration
Labels map[string]string
Annotations map[string]string
GroupID uint64
GroupName string
Type datasource.Type
RuleID uint64
Name string
Expr string
For time.Duration
Labels map[string]string
Annotations map[string]string
GroupID uint64
GroupName string
EvalInterval time.Duration
q datasource.Querier
// guard status fields
mu sync.RWMutex
@@ -39,6 +42,9 @@ type AlertingRule struct {
// resets on every successful Exec
// may be used as Health state
lastExecError error
// stores the number of samples returned during
// the last evaluation
lastExecSamples int
metrics *alertingRuleMetrics
}
@@ -47,28 +53,35 @@ type alertingRuleMetrics struct {
errors *gauge
pending *gauge
active *gauge
samples *gauge
}
func newAlertingRule(group *Group, cfg config.Rule) *AlertingRule {
func newAlertingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule) *AlertingRule {
ar := &AlertingRule{
Type: cfg.Type,
RuleID: cfg.ID,
Name: cfg.Alert,
Expr: cfg.Expr,
For: cfg.For.Duration(),
Labels: cfg.Labels,
Annotations: cfg.Annotations,
GroupID: group.ID(),
GroupName: group.Name,
alerts: make(map[uint64]*notifier.Alert),
metrics: &alertingRuleMetrics{},
Type: group.Type,
RuleID: cfg.ID,
Name: cfg.Alert,
Expr: cfg.Expr,
For: cfg.For.Duration(),
Labels: cfg.Labels,
Annotations: cfg.Annotations,
GroupID: group.ID(),
GroupName: group.Name,
EvalInterval: group.Interval,
q: qb.BuildWithParams(datasource.QuerierParams{
DataSourceType: &group.Type,
EvaluationInterval: group.Interval,
QueryParams: group.Params,
}),
alerts: make(map[uint64]*notifier.Alert),
metrics: &alertingRuleMetrics{},
}
labels := fmt.Sprintf(`alertname=%q, group=%q, id="%d"`, ar.Name, group.Name, ar.ID())
ar.metrics.pending = getOrCreateGauge(fmt.Sprintf(`vmalert_alerts_pending{%s}`, labels),
func() float64 {
ar.mu.Lock()
defer ar.mu.Unlock()
ar.mu.RLock()
defer ar.mu.RUnlock()
var num int
for _, a := range ar.alerts {
if a.State == notifier.StatePending {
@@ -79,8 +92,8 @@ func newAlertingRule(group *Group, cfg config.Rule) *AlertingRule {
})
ar.metrics.active = getOrCreateGauge(fmt.Sprintf(`vmalert_alerts_firing{%s}`, labels),
func() float64 {
ar.mu.Lock()
defer ar.mu.Unlock()
ar.mu.RLock()
defer ar.mu.RUnlock()
var num int
for _, a := range ar.alerts {
if a.State == notifier.StateFiring {
@@ -89,15 +102,21 @@ func newAlertingRule(group *Group, cfg config.Rule) *AlertingRule {
}
return float64(num)
})
ar.metrics.errors = getOrCreateGauge(fmt.Sprintf(`vmalert_alerts_error{%s}`, labels),
ar.metrics.errors = getOrCreateGauge(fmt.Sprintf(`vmalert_alerting_rules_error{%s}`, labels),
func() float64 {
ar.mu.Lock()
defer ar.mu.Unlock()
ar.mu.RLock()
defer ar.mu.RUnlock()
if ar.lastExecError == nil {
return 0
}
return 1
})
ar.metrics.samples = getOrCreateGauge(fmt.Sprintf(`vmalert_alerting_rules_last_evaluation_samples{%s}`, labels),
func() float64 {
ar.mu.RLock()
defer ar.mu.RUnlock()
return float64(ar.lastExecSamples)
})
return ar
}
@@ -106,6 +125,7 @@ func (ar *AlertingRule) Close() {
metrics.UnregisterMetric(ar.metrics.active.name)
metrics.UnregisterMetric(ar.metrics.pending.name)
metrics.UnregisterMetric(ar.metrics.errors.name)
metrics.UnregisterMetric(ar.metrics.samples.name)
}
// String implements Stringer interface
@@ -119,15 +139,77 @@ func (ar *AlertingRule) ID() uint64 {
return ar.RuleID
}
// ExecRange executes alerting rule on the given time range similarly to Exec.
// It doesn't update internal states of the Rule and meant to be used just
// to get time series for backfilling.
// It returns ALERT and ALERT_FOR_STATE time series as result.
func (ar *AlertingRule) ExecRange(ctx context.Context, start, end time.Time) ([]prompbmarshal.TimeSeries, error) {
series, err := ar.q.QueryRange(ctx, ar.Expr, start, end)
if err != nil {
return nil, err
}
var result []prompbmarshal.TimeSeries
qFn := func(query string) ([]datasource.Metric, error) {
return nil, fmt.Errorf("`query` template isn't supported in replay mode")
}
for _, s := range series {
// set additional labels to identify group and rule name
if ar.Name != "" {
s.SetLabel(alertNameLabel, ar.Name)
}
if !*disableAlertGroupLabel && ar.GroupName != "" {
s.SetLabel(alertGroupNameLabel, ar.GroupName)
}
// extra labels could contain templates, so we expand them first
labels, err := expandLabels(s, qFn, ar)
if err != nil {
return nil, fmt.Errorf("failed to expand labels: %s", err)
}
for k, v := range labels {
// apply extra labels to datasource
// so the hash key will be consistent on restore
s.SetLabel(k, v)
}
a, err := ar.newAlert(s, time.Time{}, qFn) // initial alert
if err != nil {
return nil, fmt.Errorf("failed to create alert: %s", err)
}
if ar.For == 0 { // if alert is instant
a.State = notifier.StateFiring
for i := range s.Values {
result = append(result, ar.alertToTimeSeries(a, s.Timestamps[i])...)
}
continue
}
// if alert with For > 0
prevT := time.Time{}
for i := range s.Values {
at := time.Unix(s.Timestamps[i], 0)
if at.Sub(prevT) > ar.EvalInterval {
// reset to Pending if there are gaps > EvalInterval between DPs
a.State = notifier.StatePending
a.Start = at
} else if at.Sub(a.Start) >= ar.For {
a.State = notifier.StateFiring
}
prevT = at
result = append(result, ar.alertToTimeSeries(a, s.Timestamps[i])...)
}
}
return result, nil
}
// Exec executes AlertingRule expression via the given Querier.
// Based on the Querier results AlertingRule maintains notifier.Alerts
func (ar *AlertingRule) Exec(ctx context.Context, q datasource.Querier, series bool) ([]prompbmarshal.TimeSeries, error) {
qMetrics, err := q.Query(ctx, ar.Expr, ar.Type)
func (ar *AlertingRule) Exec(ctx context.Context) ([]prompbmarshal.TimeSeries, error) {
qMetrics, err := ar.q.Query(ctx, ar.Expr)
ar.mu.Lock()
defer ar.mu.Unlock()
ar.lastExecError = err
ar.lastExecTime = time.Now()
ar.lastExecSamples = len(qMetrics)
if err != nil {
return nil, fmt.Errorf("failed to execute query %q: %w", ar.Expr, err)
}
@@ -139,10 +221,17 @@ func (ar *AlertingRule) Exec(ctx context.Context, q datasource.Querier, series b
}
}
qFn := func(query string) ([]datasource.Metric, error) { return q.Query(ctx, query, ar.Type) }
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query) }
updated := make(map[uint64]struct{})
// update list of active alerts
for _, m := range qMetrics {
// set additional labels to identify group and rule name
if ar.Name != "" {
m.SetLabel(alertNameLabel, ar.Name)
}
if !*disableAlertGroupLabel && ar.GroupName != "" {
m.SetLabel(alertGroupNameLabel, ar.GroupName)
}
// extra labels could contain templates, so we expand them first
labels, err := expandLabels(m, qFn, ar)
if err != nil {
@@ -161,9 +250,9 @@ func (ar *AlertingRule) Exec(ctx context.Context, q datasource.Querier, series b
}
updated[h] = struct{}{}
if a, ok := ar.alerts[h]; ok {
if a.Value != m.Value {
if a.Value != m.Values[0] {
// update Value field with latest value
a.Value = m.Value
a.Value = m.Values[0]
// and re-exec template since Value can be used
// in annotations
a.Annotations, err = a.ExecTemplate(qFn, ar.Annotations)
@@ -201,10 +290,7 @@ func (ar *AlertingRule) Exec(ctx context.Context, q datasource.Querier, series b
alertsFired.Inc()
}
}
if series {
return ar.toTimeSeries(ar.lastExecTime), nil
}
return nil, nil
return ar.toTimeSeries(ar.lastExecTime.Unix()), nil
}
func expandLabels(m datasource.Metric, q notifier.QueryFn, ar *AlertingRule) (map[string]string, error) {
@@ -214,13 +300,13 @@ func expandLabels(m datasource.Metric, q notifier.QueryFn, ar *AlertingRule) (ma
}
tpl := notifier.AlertTplData{
Labels: metricLabels,
Value: m.Value,
Value: m.Values[0],
Expr: ar.Expr,
}
return notifier.ExecTemplate(q, ar.Labels, tpl)
}
func (ar *AlertingRule) toTimeSeries(timestamp time.Time) []prompbmarshal.TimeSeries {
func (ar *AlertingRule) toTimeSeries(timestamp int64) []prompbmarshal.TimeSeries {
var tss []prompbmarshal.TimeSeries
for _, a := range ar.alerts {
if a.State == notifier.StateInactive {
@@ -244,6 +330,8 @@ func (ar *AlertingRule) UpdateWith(r Rule) error {
ar.For = nr.For
ar.Labels = nr.Labels
ar.Annotations = nr.Annotations
ar.EvalInterval = nr.EvalInterval
ar.q = nr.q
return nil
}
@@ -271,13 +359,10 @@ func (ar *AlertingRule) newAlert(m datasource.Metric, start time.Time, qFn notif
GroupID: ar.GroupID,
Name: ar.Name,
Labels: map[string]string{},
Value: m.Value,
Value: m.Values[0],
Start: start,
Expr: ar.Expr,
}
// label defined here to make override possible by
// time series labels.
a.Labels[alertGroupNameLabel] = ar.GroupName
for _, l := range m.Labels {
// drop __name__ to be consistent with Prometheus alerting
if l.Name == "__name__" {
@@ -317,6 +402,7 @@ func (ar *AlertingRule) RuleAPI() APIAlertingRule {
Expression: ar.Expr,
For: ar.For.String(),
LastError: lastErr,
LastSamples: ar.lastExecSamples,
LastExec: ar.lastExecTime,
Labels: ar.Labels,
Annotations: ar.Annotations,
@@ -335,10 +421,11 @@ func (ar *AlertingRule) AlertsAPI() []*APIAlert {
}
func (ar *AlertingRule) newAlertAPI(a notifier.Alert) *APIAlert {
return &APIAlert{
aa := &APIAlert{
// encode as strings to avoid rounding
ID: fmt.Sprintf("%d", a.ID),
GroupID: fmt.Sprintf("%d", a.GroupID),
RuleID: fmt.Sprintf("%d", ar.RuleID),
Name: a.Name,
Expression: ar.Expr,
@@ -346,8 +433,13 @@ func (ar *AlertingRule) newAlertAPI(a notifier.Alert) *APIAlert {
Annotations: a.Annotations,
State: a.State.String(),
ActiveAt: a.Start,
Value: strconv.FormatFloat(a.Value, 'e', -1, 64),
Restored: a.Restored,
Value: strconv.FormatFloat(a.Value, 'f', -1, 32),
}
if alertURLGeneratorFn != nil {
aa.SourceLink = alertURLGeneratorFn(a)
}
return aa
}
const (
@@ -362,43 +454,42 @@ const (
alertStateLabel = "alertstate"
// alertGroupNameLabel defines the label name attached for generated time series.
// attaching this label may be disabled via `-disableAlertgroupLabel` flag.
alertGroupNameLabel = "alertgroup"
)
// alertToTimeSeries converts the given alert with the given timestamp to timeseries
func (ar *AlertingRule) alertToTimeSeries(a *notifier.Alert, timestamp time.Time) []prompbmarshal.TimeSeries {
func (ar *AlertingRule) alertToTimeSeries(a *notifier.Alert, timestamp int64) []prompbmarshal.TimeSeries {
var tss []prompbmarshal.TimeSeries
tss = append(tss, alertToTimeSeries(ar.Name, a, timestamp))
tss = append(tss, alertToTimeSeries(a, timestamp))
if ar.For > 0 {
tss = append(tss, alertForToTimeSeries(ar.Name, a, timestamp))
tss = append(tss, alertForToTimeSeries(a, timestamp))
}
return tss
}
func alertToTimeSeries(name string, a *notifier.Alert, timestamp time.Time) prompbmarshal.TimeSeries {
func alertToTimeSeries(a *notifier.Alert, timestamp int64) prompbmarshal.TimeSeries {
labels := make(map[string]string)
for k, v := range a.Labels {
labels[k] = v
}
labels["__name__"] = alertMetricName
labels[alertNameLabel] = name
labels[alertStateLabel] = a.State.String()
return newTimeSeries(1, labels, timestamp)
return newTimeSeries([]float64{1}, []int64{timestamp}, labels)
}
// alertForToTimeSeries returns a timeseries that represents
// state of active alerts, where value is time when alert become active
func alertForToTimeSeries(name string, a *notifier.Alert, timestamp time.Time) prompbmarshal.TimeSeries {
func alertForToTimeSeries(a *notifier.Alert, timestamp int64) prompbmarshal.TimeSeries {
labels := make(map[string]string)
for k, v := range a.Labels {
labels[k] = v
}
labels["__name__"] = alertForStateMetricName
labels[alertNameLabel] = name
return newTimeSeries(float64(a.Start.Unix()), labels, timestamp)
return newTimeSeries([]float64{float64(a.Start.Unix())}, []int64{timestamp}, labels)
}
// Restore restores the state of active alerts basing on previously written timeseries.
// Restore restores the state of active alerts basing on previously written time series.
// Restore restores only Start field. Field State will be always Pending and supposed
// to be updated on next Exec, as well as Value field.
// Only rules with For > 0 will be restored.
@@ -407,7 +498,7 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
return fmt.Errorf("querier is nil")
}
qFn := func(query string) ([]datasource.Metric, error) { return q.Query(ctx, query, ar.Type) }
qFn := func(query string) ([]datasource.Metric, error) { return ar.q.Query(ctx, query) }
// account for external labels in filter
var labelsFilter string
@@ -420,29 +511,19 @@ func (ar *AlertingRule) Restore(ctx context.Context, q datasource.Querier, lookb
// remote write protocol which is used for state persistence in vmalert.
expr := fmt.Sprintf("last_over_time(%s{alertname=%q%s}[%ds])",
alertForStateMetricName, ar.Name, labelsFilter, int(lookback.Seconds()))
qMetrics, err := q.Query(ctx, expr, ar.Type)
qMetrics, err := q.Query(ctx, expr)
if err != nil {
return err
}
for _, m := range qMetrics {
labels := m.Labels
m.Labels = make([]datasource.Label, 0)
// drop all extra labels, so hash key will
// be identical to time series received in Exec
for _, l := range labels {
if l.Name == alertNameLabel || l.Name == alertGroupNameLabel {
continue
}
m.Labels = append(m.Labels, l)
}
a, err := ar.newAlert(m, time.Unix(int64(m.Value), 0), qFn)
a, err := ar.newAlert(m, time.Unix(int64(m.Values[0]), 0), qFn)
if err != nil {
return fmt.Errorf("failed to create alert: %w", err)
}
a.ID = hash(m)
a.State = notifier.StatePending
a.Restored = true
ar.alerts[a.ID] = a
logger.Infof("alert %q (%d) restored to state at %v", a.Name, a.ID, a.Start)
}

View File

@@ -24,11 +24,10 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
newTestAlertingRule("instant", 0),
&notifier.Alert{State: notifier.StateFiring},
[]prompbmarshal.TimeSeries{
newTimeSeries(1, map[string]string{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StateFiring.String(),
alertNameLabel: "instant",
}, timestamp),
}),
},
},
{
@@ -38,13 +37,12 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
"instance": "bar",
}},
[]prompbmarshal.TimeSeries{
newTimeSeries(1, map[string]string{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StateFiring.String(),
alertNameLabel: "instant extra labels",
"job": "foo",
"instance": "bar",
}, timestamp),
}),
},
},
{
@@ -54,48 +52,47 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
"__name__": "bar",
}},
[]prompbmarshal.TimeSeries{
newTimeSeries(1, map[string]string{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StateFiring.String(),
alertNameLabel: "instant labels override",
}, timestamp),
}),
},
},
{
newTestAlertingRule("for", time.Second),
&notifier.Alert{State: notifier.StateFiring, Start: timestamp.Add(time.Second)},
[]prompbmarshal.TimeSeries{
newTimeSeries(1, map[string]string{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StateFiring.String(),
alertNameLabel: "for",
}, timestamp),
newTimeSeries(float64(timestamp.Add(time.Second).Unix()), map[string]string{
"__name__": alertForStateMetricName,
alertNameLabel: "for",
}, timestamp),
}),
newTimeSeries([]float64{float64(timestamp.Add(time.Second).Unix())},
[]int64{timestamp.UnixNano()},
map[string]string{
"__name__": alertForStateMetricName,
}),
},
},
{
newTestAlertingRule("for pending", 10*time.Second),
&notifier.Alert{State: notifier.StatePending, Start: timestamp.Add(time.Second)},
[]prompbmarshal.TimeSeries{
newTimeSeries(1, map[string]string{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StatePending.String(),
alertNameLabel: "for pending",
}, timestamp),
newTimeSeries(float64(timestamp.Add(time.Second).Unix()), map[string]string{
"__name__": alertForStateMetricName,
alertNameLabel: "for pending",
}, timestamp),
}),
newTimeSeries([]float64{float64(timestamp.Add(time.Second).Unix())},
[]int64{timestamp.UnixNano()},
map[string]string{
"__name__": alertForStateMetricName,
}),
},
},
}
for _, tc := range testCases {
t.Run(tc.rule.Name, func(t *testing.T) {
tc.rule.alerts[tc.alert.ID] = tc.alert
tss := tc.rule.toTimeSeries(timestamp)
tss := tc.rule.toTimeSeries(timestamp.Unix())
if err := compareTimeSeries(t, tc.expTS, tss); err != nil {
t.Fatalf("timeseries missmatch: %s", err)
}
@@ -105,23 +102,27 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
func TestAlertingRule_Exec(t *testing.T) {
const defaultStep = 5 * time.Millisecond
type testAlert struct {
labels []string
alert *notifier.Alert
}
testCases := []struct {
rule *AlertingRule
steps [][]datasource.Metric
expAlerts map[uint64]*notifier.Alert
expAlerts []testAlert
}{
{
newTestAlertingRule("empty", 0),
[][]datasource.Metric{},
map[uint64]*notifier.Alert{},
nil,
},
{
newTestAlertingRule("empty labels", 0),
[][]datasource.Metric{
{datasource.Metric{}},
{datasource.Metric{Values: []float64{1}, Timestamps: []int64{1}}},
},
map[uint64]*notifier.Alert{
hash(datasource.Metric{}): {State: notifier.StateFiring},
[]testAlert{
{alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -129,8 +130,8 @@ func TestAlertingRule_Exec(t *testing.T) {
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateFiring},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -139,8 +140,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{metricWithLabels(t, "name", "foo")},
{},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateInactive},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
},
},
{
@@ -150,8 +151,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{},
{metricWithLabels(t, "name", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateFiring},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -162,8 +163,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{metricWithLabels(t, "name", "foo")},
{},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateInactive},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
},
},
{
@@ -175,7 +176,7 @@ func TestAlertingRule_Exec(t *testing.T) {
{},
{},
},
map[uint64]*notifier.Alert{},
nil,
},
{
newTestAlertingRule("single-firing=>inactive=>firing=>inactive=>empty=>firing", 0),
@@ -187,8 +188,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{},
{metricWithLabels(t, "name", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateFiring},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -200,10 +201,10 @@ func TestAlertingRule_Exec(t *testing.T) {
metricWithLabels(t, "name", "foo2"),
},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateFiring},
hash(metricWithLabels(t, "name", "foo1")): {State: notifier.StateFiring},
hash(metricWithLabels(t, "name", "foo2")): {State: notifier.StateFiring},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
{labels: []string{"name", "foo1"}, alert: &notifier.Alert{State: notifier.StateFiring}},
{labels: []string{"name", "foo2"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -216,9 +217,9 @@ func TestAlertingRule_Exec(t *testing.T) {
// 1: fire first alert
// 2: fire second alert, set first inactive
// 3: fire third alert, set second inactive, delete first one
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo1")): {State: notifier.StateInactive},
hash(metricWithLabels(t, "name", "foo2")): {State: notifier.StateFiring},
[]testAlert{
{labels: []string{"name", "foo1"}, alert: &notifier.Alert{State: notifier.StateInactive}},
{labels: []string{"name", "foo2"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -226,8 +227,8 @@ func TestAlertingRule_Exec(t *testing.T) {
[][]datasource.Metric{
{metricWithLabels(t, "name", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StatePending},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}},
},
},
{
@@ -236,8 +237,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateFiring},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
{
@@ -248,7 +249,7 @@ func TestAlertingRule_Exec(t *testing.T) {
// empty step to reset and delete pending alerts
{},
},
map[uint64]*notifier.Alert{},
nil,
},
{
newTestAlertingRule("for-pending=>firing=>inactive", defaultStep),
@@ -258,8 +259,8 @@ func TestAlertingRule_Exec(t *testing.T) {
// empty step to reset pending alerts
{},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateInactive},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateInactive}},
},
},
{
@@ -271,8 +272,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{},
{metricWithLabels(t, "name", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StatePending},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StatePending}},
},
},
{
@@ -285,8 +286,8 @@ func TestAlertingRule_Exec(t *testing.T) {
{metricWithLabels(t, "name", "foo")},
{metricWithLabels(t, "name", "foo")},
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "name", "foo")): {State: notifier.StateFiring},
[]testAlert{
{labels: []string{"name", "foo"}, alert: &notifier.Alert{State: notifier.StateFiring}},
},
},
}
@@ -294,11 +295,12 @@ func TestAlertingRule_Exec(t *testing.T) {
for _, tc := range testCases {
t.Run(tc.rule.Name, func(t *testing.T) {
fq := &fakeQuerier{}
tc.rule.q = fq
tc.rule.GroupID = fakeGroup.ID()
for _, step := range tc.steps {
fq.reset()
fq.add(step...)
if _, err := tc.rule.Exec(context.TODO(), fq, false); err != nil {
if _, err := tc.rule.Exec(context.TODO()); err != nil {
t.Fatalf("unexpected err: %s", err)
}
// artificial delay between applying steps
@@ -307,7 +309,15 @@ func TestAlertingRule_Exec(t *testing.T) {
if len(tc.rule.alerts) != len(tc.expAlerts) {
t.Fatalf("expected %d alerts; got %d", len(tc.expAlerts), len(tc.rule.alerts))
}
for key, exp := range tc.expAlerts {
expAlerts := make(map[uint64]*notifier.Alert)
for _, ta := range tc.expAlerts {
labels := ta.labels
labels = append(labels, alertNameLabel)
labels = append(labels, tc.rule.Name)
h := hash(metricWithLabels(t, labels...))
expAlerts[h] = ta.alert
}
for key, exp := range expAlerts {
got, ok := tc.rule.alerts[key]
if !ok {
t.Fatalf("expected to have key %d", key)
@@ -320,6 +330,171 @@ func TestAlertingRule_Exec(t *testing.T) {
}
}
func TestAlertingRule_ExecRange(t *testing.T) {
testCases := []struct {
rule *AlertingRule
data []datasource.Metric
expAlerts []*notifier.Alert
}{
{
newTestAlertingRule("empty", 0),
[]datasource.Metric{},
nil,
},
{
newTestAlertingRule("empty labels", 0),
[]datasource.Metric{
{Values: []float64{1}, Timestamps: []int64{1}},
},
[]*notifier.Alert{
{State: notifier.StateFiring},
},
},
{
newTestAlertingRule("single-firing", 0),
[]datasource.Metric{
metricWithLabels(t, "name", "foo"),
},
[]*notifier.Alert{
{
Labels: map[string]string{"name": "foo"},
State: notifier.StateFiring,
},
},
},
{
newTestAlertingRule("single-firing-on-range", 0),
[]datasource.Metric{
{Values: []float64{1, 1, 1}, Timestamps: []int64{1e3, 2e3, 3e3}},
},
[]*notifier.Alert{
{State: notifier.StateFiring},
{State: notifier.StateFiring},
{State: notifier.StateFiring},
},
},
{
newTestAlertingRule("for-pending", time.Second),
[]datasource.Metric{
{Values: []float64{1, 1, 1}, Timestamps: []int64{1, 3, 5}},
},
[]*notifier.Alert{
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StatePending, Start: time.Unix(3, 0)},
{State: notifier.StatePending, Start: time.Unix(5, 0)},
},
},
{
newTestAlertingRule("for-firing", 3*time.Second),
[]datasource.Metric{
{Values: []float64{1, 1, 1}, Timestamps: []int64{1, 3, 5}},
},
[]*notifier.Alert{
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StateFiring, Start: time.Unix(1, 0)},
},
},
{
newTestAlertingRule("for=>pending=>firing=>pending=>firing=>pending", time.Second),
[]datasource.Metric{
{Values: []float64{1, 1, 1, 1, 1}, Timestamps: []int64{1, 2, 5, 6, 20}},
},
[]*notifier.Alert{
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StateFiring, Start: time.Unix(1, 0)},
{State: notifier.StatePending, Start: time.Unix(5, 0)},
{State: notifier.StateFiring, Start: time.Unix(5, 0)},
{State: notifier.StatePending, Start: time.Unix(20, 0)},
},
},
{
newTestAlertingRule("multi-series-for=>pending=>pending=>firing", 3*time.Second),
[]datasource.Metric{
{Values: []float64{1, 1, 1}, Timestamps: []int64{1, 3, 5}},
{Values: []float64{1, 1}, Timestamps: []int64{1, 5},
Labels: []datasource.Label{{Name: "foo", Value: "bar"}},
},
},
[]*notifier.Alert{
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StatePending, Start: time.Unix(1, 0)},
{State: notifier.StateFiring, Start: time.Unix(1, 0)},
//
{State: notifier.StatePending, Start: time.Unix(1, 0),
Labels: map[string]string{
"foo": "bar",
}},
{State: notifier.StatePending, Start: time.Unix(5, 0),
Labels: map[string]string{
"foo": "bar",
}},
},
},
{
newTestRuleWithLabels("multi-series-firing", "source", "vm"),
[]datasource.Metric{
{Values: []float64{1, 1}, Timestamps: []int64{1, 100}},
{Values: []float64{1, 1}, Timestamps: []int64{1, 5},
Labels: []datasource.Label{{Name: "foo", Value: "bar"}},
},
},
[]*notifier.Alert{
{State: notifier.StateFiring, Labels: map[string]string{
"source": "vm",
}},
{State: notifier.StateFiring, Labels: map[string]string{
"source": "vm",
}},
//
{State: notifier.StateFiring, Labels: map[string]string{
"foo": "bar",
"source": "vm",
}},
{State: notifier.StateFiring, Labels: map[string]string{
"foo": "bar",
"source": "vm",
}},
},
},
}
fakeGroup := Group{Name: "TestRule_ExecRange"}
for _, tc := range testCases {
t.Run(tc.rule.Name, func(t *testing.T) {
fq := &fakeQuerier{}
tc.rule.q = fq
tc.rule.GroupID = fakeGroup.ID()
fq.add(tc.data...)
gotTS, err := tc.rule.ExecRange(context.TODO(), time.Now(), time.Now())
if err != nil {
t.Fatalf("unexpected err: %s", err)
}
var expTS []prompbmarshal.TimeSeries
var j int
for _, series := range tc.data {
for _, timestamp := range series.Timestamps {
a := tc.expAlerts[j]
if a.Labels == nil {
a.Labels = make(map[string]string)
}
a.Labels[alertNameLabel] = tc.rule.Name
expTS = append(expTS, tc.rule.alertToTimeSeries(tc.expAlerts[j], timestamp)...)
j++
}
}
if len(gotTS) != len(expTS) {
t.Fatalf("expected %d time series; got %d", len(expTS), len(gotTS))
}
for i := range expTS {
got, exp := gotTS[i], expTS[i]
if !reflect.DeepEqual(got, exp) {
t.Fatalf("%d: expected \n%v but got \n%v", i, exp, got)
}
}
})
}
}
func TestAlertingRule_Restore(t *testing.T) {
testCases := []struct {
rule *AlertingRule
@@ -331,7 +506,6 @@ func TestAlertingRule_Restore(t *testing.T) {
[]datasource.Metric{
metricWithValueAndLabels(t, float64(time.Now().Truncate(time.Hour).Unix()),
"__name__", alertForStateMetricName,
alertNameLabel, "",
),
},
map[uint64]*notifier.Alert{
@@ -344,7 +518,7 @@ func TestAlertingRule_Restore(t *testing.T) {
[]datasource.Metric{
metricWithValueAndLabels(t, float64(time.Now().Truncate(time.Hour).Unix()),
"__name__", alertForStateMetricName,
alertNameLabel, "",
alertNameLabel, "metric labels",
alertGroupNameLabel, "groupID",
"foo", "bar",
"namespace", "baz",
@@ -352,6 +526,8 @@ func TestAlertingRule_Restore(t *testing.T) {
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t,
alertNameLabel, "metric labels",
alertGroupNameLabel, "groupID",
"foo", "bar",
"namespace", "baz",
)): {State: notifier.StatePending,
@@ -363,7 +539,6 @@ func TestAlertingRule_Restore(t *testing.T) {
[]datasource.Metric{
metricWithValueAndLabels(t, float64(time.Now().Truncate(time.Hour).Unix()),
"__name__", alertForStateMetricName,
alertNameLabel, "",
"foo", "bar",
"namespace", "baz",
// extra labels set by rule
@@ -410,6 +585,7 @@ func TestAlertingRule_Restore(t *testing.T) {
t.Run(tc.rule.Name, func(t *testing.T) {
fq := &fakeQuerier{}
tc.rule.GroupID = fakeGroup.ID()
tc.rule.q = fq
fq.add(tc.metrics...)
if err := tc.rule.Restore(context.TODO(), fq, time.Hour, nil); err != nil {
t.Fatalf("unexpected err: %s", err)
@@ -437,17 +613,18 @@ func TestAlertingRule_Exec_Negative(t *testing.T) {
fq := &fakeQuerier{}
ar := newTestAlertingRule("test", 0)
ar.Labels = map[string]string{"job": "test"}
ar.q = fq
// successful attempt
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "bar"))
_, err := ar.Exec(context.TODO(), fq, false)
_, err := ar.Exec(context.TODO())
if err != nil {
t.Fatal(err)
}
// label `job` will collide with rule extra label and will make both time series equal
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "baz"))
_, err = ar.Exec(context.TODO(), fq, false)
_, err = ar.Exec(context.TODO())
if !errors.Is(err, errDuplicate) {
t.Fatalf("expected to have %s error; got %s", errDuplicate, err)
}
@@ -456,7 +633,7 @@ func TestAlertingRule_Exec_Negative(t *testing.T) {
expErr := "connection reset by peer"
fq.setErr(errors.New(expErr))
_, err = ar.Exec(context.TODO(), fq, false)
_, err = ar.Exec(context.TODO())
if err == nil {
t.Fatalf("expected to get err; got nil")
}
@@ -478,20 +655,20 @@ func TestAlertingRule_Template(t *testing.T) {
metricWithValueAndLabels(t, 1, "instance", "bar"),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "region", "east", "instance", "foo")): {
hash(metricWithLabels(t, alertNameLabel, "common", "region", "east", "instance", "foo")): {
Annotations: map[string]string{},
Labels: map[string]string{
alertGroupNameLabel: "",
"region": "east",
"instance": "foo",
alertNameLabel: "common",
"region": "east",
"instance": "foo",
},
},
hash(metricWithLabels(t, "region", "east", "instance", "bar")): {
hash(metricWithLabels(t, alertNameLabel, "common", "region", "east", "instance", "bar")): {
Annotations: map[string]string{},
Labels: map[string]string{
alertGroupNameLabel: "",
"region": "east",
"instance": "bar",
alertNameLabel: "common",
"region": "east",
"instance": "bar",
},
},
},
@@ -514,22 +691,22 @@ func TestAlertingRule_Template(t *testing.T) {
metricWithValueAndLabels(t, 10, "instance", "bar"),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, "region", "east", "instance", "foo")): {
hash(metricWithLabels(t, alertNameLabel, "override label", "region", "east", "instance", "foo")): {
Labels: map[string]string{
alertGroupNameLabel: "",
"instance": "foo",
"region": "east",
alertNameLabel: "override label",
"instance": "foo",
"region": "east",
},
Annotations: map[string]string{
"summary": `Too high connection number for "foo" for region east`,
"description": `It is 2 connections for "foo"`,
},
},
hash(metricWithLabels(t, "region", "east", "instance", "bar")): {
hash(metricWithLabels(t, alertNameLabel, "override label", "region", "east", "instance", "bar")): {
Labels: map[string]string{
alertGroupNameLabel: "",
"instance": "bar",
"region": "east",
alertNameLabel: "override label",
"instance": "bar",
"region": "east",
},
Annotations: map[string]string{
"summary": `Too high connection number for "bar" for region east`,
@@ -538,14 +715,53 @@ func TestAlertingRule_Template(t *testing.T) {
},
},
},
{
&AlertingRule{
Name: "ExtraTemplating",
GroupName: "Testing",
Labels: map[string]string{
"name": "alert_{{ $labels.alertname }}",
"group": "group_{{ $labels.alertgroup }}",
"instance": "{{ $labels.instance }}",
},
Annotations: map[string]string{
"summary": `Alert "{{ $labels.alertname }}({{ $labels.alertgroup }})" for instance {{ $labels.instance }}`,
"description": `Alert "{{ $labels.name }}({{ $labels.group }})" for instance {{ $labels.instance }}`,
},
alerts: make(map[uint64]*notifier.Alert),
},
[]datasource.Metric{
metricWithValueAndLabels(t, 1, "instance", "foo"),
},
map[uint64]*notifier.Alert{
hash(metricWithLabels(t, alertNameLabel, "ExtraTemplating",
"name", "alert_ExtraTemplating",
alertGroupNameLabel, "Testing",
"group", "group_Testing",
"instance", "foo")): {
Labels: map[string]string{
alertNameLabel: "ExtraTemplating",
"name": "alert_ExtraTemplating",
alertGroupNameLabel: "Testing",
"group": "group_Testing",
"instance": "foo",
},
Annotations: map[string]string{
"summary": `Alert "ExtraTemplating(Testing)" for instance foo`,
"description": `Alert "alert_ExtraTemplating(group_Testing)" for instance foo`,
},
},
},
},
}
fakeGroup := Group{Name: "TestRule_Exec"}
for _, tc := range testCases {
t.Run(tc.rule.Name, func(t *testing.T) {
fq := &fakeQuerier{}
tc.rule.GroupID = fakeGroup.ID()
tc.rule.q = fq
fq.add(tc.metrics...)
if _, err := tc.rule.Exec(context.TODO(), fq, false); err != nil {
if _, err := tc.rule.Exec(context.TODO()); err != nil {
t.Fatalf("unexpected err: %s", err)
}
for hash, expAlert := range tc.expAlerts {
@@ -575,5 +791,5 @@ func newTestRuleWithLabels(name string, labels ...string) *AlertingRule {
}
func newTestAlertingRule(name string, waitFor time.Duration) *AlertingRule {
return &AlertingRule{Name: name, alerts: make(map[uint64]*notifier.Alert), For: waitFor}
return &AlertingRule{Name: name, alerts: make(map[uint64]*notifier.Alert), For: waitFor, EvalInterval: waitFor}
}

View File

@@ -5,19 +5,18 @@ import (
"fmt"
"hash/fnv"
"io/ioutil"
"net/url"
"path/filepath"
"sort"
"strings"
"time"
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envtemplate"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/metricsql"
"gopkg.in/yaml.v2"
)
// Group contains list of Rules grouped into
@@ -25,13 +24,23 @@ import (
type Group struct {
Type datasource.Type `yaml:"type,omitempty"`
File string
Name string `yaml:"name"`
Interval time.Duration `yaml:"interval,omitempty"`
Rules []Rule `yaml:"rules"`
Concurrency int `yaml:"concurrency"`
Name string `yaml:"name"`
Interval utils.PromDuration `yaml:"interval"`
Rules []Rule `yaml:"rules"`
Concurrency int `yaml:"concurrency"`
// ExtraFilterLabels is a list label filters applied to every rule
// request withing a group. Is compatible only with VM datasources.
// See https://docs.victoriametrics.com#prometheus-querying-api-enhancements
// DEPRECATED: use Params field instead
ExtraFilterLabels map[string]string `yaml:"extra_filter_labels"`
// Labels is a set of label value pairs, that will be added to every rule.
// It has priority over the external labels.
Labels map[string]string `yaml:"labels"`
// Checksum stores the hash of yaml definition for this group.
// May be used to detect any changes like rules re-ordering etc.
Checksum string
// Optional HTTP URL parameters added to each rule request
Params url.Values `yaml:"params"`
// Catches all undefined fields and must be empty after parsing.
XXX map[string]interface{} `yaml:",inline"`
@@ -51,12 +60,20 @@ func (g *Group) UnmarshalYAML(unmarshal func(interface{}) error) error {
if g.Type.Get() == "" {
g.Type.Set(datasource.NewPrometheusType())
}
// update rules with empty type.
for i, r := range g.Rules {
if r.Type.Get() == "" {
r.Type.Set(g.Type)
r.ID = HashRule(r)
g.Rules[i] = r
// backward compatibility with deprecated `ExtraFilterLabels` param
if len(g.ExtraFilterLabels) > 0 {
if g.Params == nil {
g.Params = url.Values{}
}
// Sort extraFilters for consistent order for query args across runs.
extraFilters := make([]string, 0, len(g.ExtraFilterLabels))
for k, v := range g.ExtraFilterLabels {
extraFilters = append(extraFilters, fmt.Sprintf("%s=%s", k, v))
}
sort.Strings(extraFilters)
for _, extraFilter := range extraFilters {
g.Params.Add("extra_label", extraFilter)
}
}
@@ -71,9 +88,6 @@ func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
if g.Name == "" {
return fmt.Errorf("group name must be set")
}
if len(g.Rules) == 0 {
return fmt.Errorf("group %q can't contain no rules", g.Name)
}
uniqueRules := map[uint64]struct{}{}
for _, r := range g.Rules {
@@ -92,9 +106,6 @@ func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
// its needed only for tests.
// because correct types must be inherited after unmarshalling.
exprValidator := g.Type.ValidateExpr
if r.Type.Get() != "" {
exprValidator = r.Type.ValidateExpr
}
if err := exprValidator(r.Expr); err != nil {
return fmt.Errorf("invalid expression for rule %q.%q: %w", g.Name, ruleName, err)
}
@@ -115,54 +126,17 @@ func (g *Group) Validate(validateAnnotations, validateExpressions bool) error {
// recording rule or alerting rule.
type Rule struct {
ID uint64
Type datasource.Type `yaml:"type,omitempty"`
Record string `yaml:"record,omitempty"`
Alert string `yaml:"alert,omitempty"`
Expr string `yaml:"expr"`
For PromDuration `yaml:"for"`
Labels map[string]string `yaml:"labels,omitempty"`
Annotations map[string]string `yaml:"annotations,omitempty"`
Record string `yaml:"record,omitempty"`
Alert string `yaml:"alert,omitempty"`
Expr string `yaml:"expr"`
For utils.PromDuration `yaml:"for"`
Labels map[string]string `yaml:"labels,omitempty"`
Annotations map[string]string `yaml:"annotations,omitempty"`
// Catches all undefined fields and must be empty after parsing.
XXX map[string]interface{} `yaml:",inline"`
}
// PromDuration is Prometheus duration.
type PromDuration struct {
milliseconds int64
}
// NewPromDuration returns PromDuration for given d.
func NewPromDuration(d time.Duration) PromDuration {
return PromDuration{
milliseconds: d.Milliseconds(),
}
}
// MarshalYAML implements yaml.Marshaler interface.
func (pd PromDuration) MarshalYAML() (interface{}, error) {
return pd.Duration().String(), nil
}
// UnmarshalYAML implements yaml.Unmarshaler interface.
func (pd *PromDuration) UnmarshalYAML(unmarshal func(interface{}) error) error {
var s string
if err := unmarshal(&s); err != nil {
return err
}
ms, err := metricsql.DurationValue(s, 0)
if err != nil {
return err
}
pd.milliseconds = ms
return nil
}
// Duration returns duration for pd.
func (pd *PromDuration) Duration() time.Duration {
return time.Duration(pd.milliseconds) * time.Millisecond
}
// UnmarshalYAML implements the yaml.Unmarshaler interface.
func (r *Rule) UnmarshalYAML(unmarshal func(interface{}) error) error {
type rule Rule
@@ -193,7 +167,6 @@ func HashRule(r Rule) uint64 {
h.Write([]byte("alerting"))
h.Write([]byte(r.Alert))
}
h.Write([]byte(r.Type.Get()))
kv := sortMap(r.Labels)
for _, i := range kv {
h.Write([]byte(i.key))
@@ -225,6 +198,7 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
fp = append(fp, matches...)
}
errGroup := new(utils.ErrGroup)
var isExtraFilterLabelsUsed bool
var groups []Group
for _, file := range fp {
uniqueGroups := map[string]struct{}{}
@@ -244,6 +218,9 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
}
uniqueGroups[g.Name] = struct{}{}
g.File = file
if len(g.ExtraFilterLabels) > 0 {
isExtraFilterLabelsUsed = true
}
groups = append(groups, g)
}
}
@@ -253,6 +230,9 @@ func Parse(pathPatterns []string, validateAnnotations, validateExpressions bool)
if len(groups) < 1 {
logger.Warnf("no groups found in %s", strings.Join(pathPatterns, ";"))
}
if isExtraFilterLabelsUsed {
logger.Warnf("field `extra_filter_labels` is deprecated - use `params` instead")
}
return groups, nil
}

View File

@@ -7,11 +7,11 @@ import (
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)
func TestMain(m *testing.M) {
@@ -96,10 +96,6 @@ func TestGroup_Validate(t *testing.T) {
group: &Group{},
expErr: "group name must be set",
},
{
group: &Group{Name: "test"},
expErr: "contain no rules",
},
{
group: &Group{Name: "test",
Rules: []Rule{
@@ -264,11 +260,10 @@ func TestGroup_Validate(t *testing.T) {
Rules: []Rule{
{
Expr: "sumSeries(time('foo.bar',10))",
For: PromDuration{milliseconds: 10},
For: utils.NewPromDuration(10 * time.Millisecond),
},
{
Expr: "sum(up == 0 ) by (host)",
Type: datasource.NewPrometheusType(),
},
},
},
@@ -280,11 +275,10 @@ func TestGroup_Validate(t *testing.T) {
Rules: []Rule{
{
Expr: "sum(up == 0 ) by (host)",
For: PromDuration{milliseconds: 10},
For: utils.NewPromDuration(10 * time.Millisecond),
},
{
Expr: "sumSeries(time('foo.bar',10))",
Type: datasource.NewPrometheusType(),
},
},
},
@@ -348,7 +342,7 @@ func TestHashRule(t *testing.T) {
true,
},
{
Rule{Alert: "alert", Expr: "up == 1", For: NewPromDuration(time.Minute)},
Rule{Alert: "alert", Expr: "up == 1", For: utils.NewPromDuration(time.Minute)},
Rule{Alert: "alert", Expr: "up == 1"},
true,
},
@@ -437,7 +431,7 @@ rules:
`)
})
t.Run("Ok, `for` must change cs", func(t *testing.T) {
t.Run("`for` change", func(t *testing.T) {
f(t, `
name: TestGroup
rules:
@@ -446,10 +440,118 @@ rules:
for: 5m
`, `
name: TestGroup
rules:
- alert: ExampleAlertWithFor
expr: sum by(job) (up == 1)
`)
})
t.Run("`interval` change", func(t *testing.T) {
f(t, `
name: TestGroup
interval: 2s
rules:
- alert: ExampleAlertWithFor
expr: sum by(job) (up == 1)
`, `
name: TestGroup
interval: 4s
rules:
- alert: ExampleAlertWithFor
expr: sum by(job) (up == 1)
`)
})
t.Run("`concurrency` change", func(t *testing.T) {
f(t, `
name: TestGroup
concurrency: 2
rules:
- alert: ExampleAlertWithFor
expr: sum by(job) (up == 1)
`, `
name: TestGroup
concurrency: 16
rules:
- alert: ExampleAlertWithFor
expr: sum by(job) (up == 1)
`)
})
t.Run("`params` change", func(t *testing.T) {
f(t, `
name: TestGroup
params:
nocache: ["1"]
rules:
- alert: foo
expr: sum by(job) (up == 1)
`, `
name: TestGroup
params:
nocache: ["0"]
rules:
- alert: foo
expr: sum by(job) (up == 1)
`)
})
}
func TestGroupParams(t *testing.T) {
f := func(t *testing.T, data string, expParams url.Values) {
t.Helper()
var g Group
if err := yaml.Unmarshal([]byte(data), &g); err != nil {
t.Fatalf("failed to unmarshal: %s", err)
}
got, exp := g.Params.Encode(), expParams.Encode()
if got != exp {
t.Fatalf("expected to have %q; got %q", exp, got)
}
}
t.Run("no params", func(t *testing.T) {
f(t, `
name: TestGroup
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
`, url.Values{})
})
t.Run("params", func(t *testing.T) {
f(t, `
name: TestGroup
params:
nocache: ["1"]
denyPartialResponse: ["true"]
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
`, url.Values{"nocache": {"1"}, "denyPartialResponse": {"true"}})
})
t.Run("extra labels", func(t *testing.T) {
f(t, `
name: TestGroup
extra_filter_labels:
job: victoriametrics
env: prod
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
`, url.Values{"extra_label": {"env=prod", "job=victoriametrics"}})
})
t.Run("extra labels and params", func(t *testing.T) {
f(t, `
name: TestGroup
extra_filter_labels:
job: victoriametrics
params:
nocache: ["1"]
extra_label: ["env=prod"]
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
`, url.Values{"nocache": {"1"}, "extra_label": {"env=prod", "job=victoriametrics"}})
})
}

View File

@@ -1,13 +0,0 @@
groups:
- name: TestUpdateGroup
interval: 2s
concurrency: 2
type: prometheus
rules:
- alert: up
expr: up == 0
for: 30s
- alert: up graphite
expr: filterSeries(time('host.1',20),'>','0')
for: 30s
type: graphite

View File

@@ -1,12 +0,0 @@
groups:
- name: TestUpdateGroup
interval: 30s
type: graphite
rules:
- alert: up
expr: filterSeries(time('host.2',20),'>','0')
for: 30s
- alert: up graphite
expr: filterSeries(time('host.1',20),'>','0')
for: 30s
type: graphite

View File

@@ -1,5 +1,7 @@
groups:
- name: duplicatedGroupDiffFiles
labels:
dc: gcp
rules:
- alert: VMRows
for: 5m

View File

@@ -0,0 +1,15 @@
groups:
- name: alertmanager.rules
rules:
- alert: AlertmanagerConfigInconsistent
annotations:
message: |
The configuration of the instances of the Alertmanager cluster `{{ $labels.namespace }}/{{ $labels.service }}` are out of sync.
{{ range printf "alertmanager_config_hash{namespace=\"%s\",service=\"%s\"}" $labels.namespace $labels.service | query }}
Configuration hash for pod {{ .Labels.pod }} is "{{ printf "%.f" .Value }}"
{{ end }}
expr: |
count by(namespace,service) (count_values by(namespace,service) ("config_hash", alertmanager_config_hash{job="alertmanager-main",namespace="openshift-monitoring"})) != 1
for: 5m
labels:
severity: critical

View File

@@ -0,0 +1,39 @@
groups:
- name: ReplayGroup
interval: 1m
concurrency: 1
rules:
- record: type:vm_cache_entries:rate5m
expr: sum(rate(vm_cache_entries[5m])) by (type)
labels:
recording: true
- record: go_cgo_calls_count:rate5m
expr: rate(go_cgo_calls_count{job="vmdb"}[5m])
labels:
recording: true
- name: vmsingleReplay
interval: 30s
concurrency: 2
rules:
- alert: RequestErrorsToAPI
expr: increase(vm_http_request_errors_total[5m]) > 0
for: 15m
labels:
severity: warning
annotations:
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=35&var-instance={{ $labels.instance }}"
summary: "Too many errors served for path {{ $labels.path }} (instance {{ $labels.instance }})"
description: "Requests to path {{ $labels.path }} are receiving errors.
Please verify if clients are sending correct requests."
- alert: TooManyLogs
expr: sum(increase(vm_log_messages_total{level!="info"}[5m])) by (job, instance) > 0
for: 15m
labels:
severity: warning
annotations:
dashboard: "http://localhost:3000/d/wNf0q_kZk?viewPanel=67&var-instance={{ $labels.instance }}"
summary: "Too many logs printed for job \"{{ $labels.job }}\" ({{ $labels.instance }})"
description: "Logging rate for job \"{{ $labels.job }}\" ({{ $labels.instance }}) is {{ $value }} for last 15m.\n
Worth to check logs for specific error messages."

View File

@@ -1,5 +1,8 @@
groups:
- name: groupGorSingleAlert
params:
nocache: ["1"]
denyPartialResponse: ["true"]
rules:
- alert: VMRows
for: 10s

View File

@@ -2,6 +2,11 @@ groups:
- name: TestGroup
interval: 2s
concurrency: 2
extra_filter_labels: # deprecated param, use `params` instead
job: victoriametrics
params:
denyPartialResponse: ["true"]
extra_label: ["env=dev"]
rules:
- alert: Conns
expr: sum(vm_tcplistener_conns) by(instance) > 1

View File

@@ -21,10 +21,3 @@ groups:
annotations:
summary: Too high connection number for {{$labels.instance}}
description: "It is {{ $value }} connections for {{$labels.instance}}"
- alert: HostDown
type: graphite
expr: filterSeries(sumSeries(host.receiver.interface.up),'last','=', 0)
for: 3m
annotations:
summary: Too high connection number for {{$labels.instance}}
description: "It is {{ $value }} connections for {{$labels.instance}}"

View File

@@ -0,0 +1,8 @@
groups:
- name: TestEmptyRules
interval: 2s
concurrency: 2
rules:
- name: TestNoRules
type: prometheus

View File

@@ -2,21 +2,33 @@ package datasource
import (
"context"
"net/url"
"time"
)
// Querier interface wraps Query method which
// executes given query and returns list of Metrics
// as result
// Querier interface wraps Query and QueryRange methods
type Querier interface {
Query(ctx context.Context, query string, engine Type) ([]Metric, error)
Query(ctx context.Context, query string) ([]Metric, error)
QueryRange(ctx context.Context, query string, from, to time.Time) ([]Metric, error)
}
// QuerierBuilder builds Querier with given params.
type QuerierBuilder interface {
BuildWithParams(params QuerierParams) Querier
}
// QuerierParams params for Querier.
type QuerierParams struct {
DataSourceType *Type
EvaluationInterval time.Duration
QueryParams url.Values
}
// Metric is the basic entity which should be return by datasource
// It represents single data point with full list of labels
type Metric struct {
Labels []Label
Timestamp int64
Value float64
Labels []Label
Timestamps []int64
Values []float64
}
// SetLabel adds or updates existing one label

View File

@@ -0,0 +1,18 @@
package datasource
import "testing"
func TestMetric_Label(t *testing.T) {
m := &Metric{}
m.AddLabel("foo", "bar")
checkEqualString(t, "bar", m.Label("foo"))
m.SetLabel("foo", "baz")
checkEqualString(t, "baz", m.Label("foo"))
m.SetLabel("qux", "quux")
checkEqualString(t, "quux", m.Label("qux"))
checkEqualString(t, "", m.Label("non-existing"))
}

View File

@@ -4,43 +4,79 @@ import (
"flag"
"fmt"
"net/http"
"net/url"
"strings"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)
var (
addr = flag.String("datasource.url", "", "Victoria Metrics or VMSelect url. Required parameter."+
" E.g. http://127.0.0.1:8428")
appendTypePrefix = flag.Bool("datasource.appendTypePrefix", false, "Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to VMSelect URL.")
basicAuthUsername = flag.String("datasource.basicAuth.username", "", "Optional basic auth username for -datasource.url")
basicAuthPassword = flag.String("datasource.basicAuth.password", "", "Optional basic auth password for -datasource.url")
addr = flag.String("datasource.url", "", "VictoriaMetrics or vmselect url. Required parameter. "+
"E.g. http://127.0.0.1:8428")
appendTypePrefix = flag.Bool("datasource.appendTypePrefix", false, "Whether to add type prefix to -datasource.url based on the query type. Set to true if sending different query types to the vmselect URL.")
basicAuthUsername = flag.String("datasource.basicAuth.username", "", "Optional basic auth username for -datasource.url")
basicAuthPassword = flag.String("datasource.basicAuth.password", "", "Optional basic auth password for -datasource.url")
basicAuthPasswordFile = flag.String("datasource.basicAuth.passwordFile", "", "Optional path to basic auth password to use for -datasource.url")
bearerToken = flag.String("datasource.bearerToken", "", "Optional bearer auth token to use for -datasource.url.")
bearerTokenFile = flag.String("datasource.bearerTokenFile", "", "Optional path to bearer token file to use for -datasource.url.")
tlsInsecureSkipVerify = flag.Bool("datasource.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -datasource.url")
tlsCertFile = flag.String("datasource.tlsCertFile", "", "Optional path to client-side TLS certificate file to use when connecting to -datasource.url")
tlsKeyFile = flag.String("datasource.tlsKeyFile", "", "Optional path to client-side TLS certificate key to use when connecting to -datasource.url")
tlsCAFile = flag.String("datasource.tlsCAFile", "", "Optional path to TLS CA file to use for verifying connections to -datasource.url. "+
"By default system CA is used")
tlsServerName = flag.String("datasource.tlsServerName", "", "Optional TLS server name to use for connections to -datasource.url. "+
"By default the server name from -datasource.url is used")
tlsCAFile = flag.String("datasource.tlsCAFile", "", `Optional path to TLS CA file to use for verifying connections to -datasource.url. By default, system CA is used`)
tlsServerName = flag.String("datasource.tlsServerName", "", `Optional TLS server name to use for connections to -datasource.url. By default, the server name from -datasource.url is used`)
lookBack = flag.Duration("datasource.lookback", 0, "Lookback defines how far to look into past when evaluating queries. "+
"For example, if datasource.lookback=5m then param \"time\" with value now()-5m will be added to every query.")
lookBack = flag.Duration("datasource.lookback", 0, `Lookback defines how far into the past to look when evaluating queries. For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.`)
queryStep = flag.Duration("datasource.queryStep", 0, "queryStep defines how far a value can fallback to when evaluating queries. "+
"For example, if datasource.queryStep=15s then param \"step\" with value \"15s\" will be added to every query.")
maxIdleConnections = flag.Int("datasource.maxIdleConnections", 100, "Defines the number of idle (keep-alive connections) to configured datasource."+
"Consider to set this value equal to the value: groups_total * group.concurrency. Too low value may result into high number of sockets in TIME_WAIT state.")
"For example, if datasource.queryStep=15s then param \"step\" with value \"15s\" will be added to every query."+
"If queryStep isn't specified, rule's evaluationInterval will be used instead.")
maxIdleConnections = flag.Int("datasource.maxIdleConnections", 100, `Defines the number of idle (keep-alive connections) to each configured datasource. Consider setting this value equal to the value: groups_total * group.concurrency. Too low a value may result in a high number of sockets in TIME_WAIT state.`)
roundDigits = flag.Int("datasource.roundDigits", 0, `Adds "round_digits" GET param to datasource requests. `+
`In VM "round_digits" limits the number of digits after the decimal point in response values.`)
)
// Param represents an HTTP GET param
type Param struct {
Key, Value string
}
// Init creates a Querier from provided flag values.
func Init() (Querier, error) {
// Provided extraParams will be added as GET params for
// each request.
func Init(extraParams url.Values) (QuerierBuilder, error) {
if *addr == "" {
return nil, fmt.Errorf("datasource.url is empty")
}
tr, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
tr.MaxIdleConns = *maxIdleConnections
c := &http.Client{Transport: tr}
return NewVMStorage(*addr, *basicAuthUsername, *basicAuthPassword, *lookBack, *queryStep, *appendTypePrefix, c), nil
tr.MaxIdleConnsPerHost = *maxIdleConnections
if tr.MaxIdleConns != 0 && tr.MaxIdleConns < tr.MaxIdleConnsPerHost {
tr.MaxIdleConns = tr.MaxIdleConnsPerHost
}
if extraParams == nil {
extraParams = url.Values{}
}
if *roundDigits > 0 {
extraParams.Set("round_digits", fmt.Sprintf("%d", *roundDigits))
}
authCfg, err := utils.AuthConfig(*basicAuthUsername, *basicAuthPassword, *basicAuthPasswordFile, *bearerToken, *bearerTokenFile)
if err != nil {
return nil, fmt.Errorf("failed to configure auth: %w", err)
}
return &VMStorage{
c: &http.Client{Transport: tr},
authCfg: authCfg,
datasourceURL: strings.TrimSuffix(*addr, "/"),
appendTypePrefix: *appendTypePrefix,
lookBack: *lookBack,
queryStep: *queryStep,
dataSourceType: NewPrometheusType(),
extraParams: extraParams,
}, nil
}

View File

@@ -7,9 +7,6 @@ import (
"github.com/VictoriaMetrics/metricsql"
)
const graphiteType = "graphite"
const prometheusType = "prometheus"
// Type represents data source type
type Type struct {
name string
@@ -17,12 +14,16 @@ type Type struct {
// NewPrometheusType returns prometheus datasource type
func NewPrometheusType() Type {
return Type{name: prometheusType}
return Type{
name: "prometheus",
}
}
// NewGraphiteType returns graphite datasource type
func NewGraphiteType() Type {
return Type{name: graphiteType}
return Type{
name: "graphite",
}
}
// NewRawType returns datasource type from raw string
@@ -44,19 +45,19 @@ func (t *Type) Set(d Type) {
// String implements String interface with default value.
func (t Type) String() string {
if t.name == "" {
return prometheusType
return "prometheus"
}
return t.name
}
// ValidateExpr validates query expression with datasource ql.
func (t *Type) ValidateExpr(expr string) error {
switch t.name {
case graphiteType:
switch t.String() {
case "graphite":
if _, err := graphiteql.Parse(expr); err != nil {
return fmt.Errorf("bad graphite expr: %q, err: %w", expr, err)
}
case "", prometheusType:
case "prometheus":
if _, err := metricsql.Parse(expr); err != nil {
return fmt.Errorf("bad prometheus expr: %q, err: %w", expr, err)
}
@@ -72,12 +73,13 @@ func (t *Type) UnmarshalYAML(unmarshal func(interface{}) error) error {
if err := unmarshal(&s); err != nil {
return err
}
if s == "" {
s = "prometheus"
}
switch s {
case "":
s = prometheusType
case graphiteType, prometheusType:
case "graphite", "prometheus":
default:
return fmt.Errorf("unknown datasource type=%q, want %q or %q", s, prometheusType, graphiteType)
return fmt.Errorf("unknown datasource type=%q, want %q or %q", s, "prometheus", "graphite")
}
t.name = s
return nil

View File

@@ -2,205 +2,157 @@ package datasource
import (
"context"
"encoding/json"
"fmt"
"io/ioutil"
"net/http"
"strconv"
"net/url"
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
)
type response struct {
Status string `json:"status"`
Data struct {
ResultType string `json:"resultType"`
Result []struct {
Labels map[string]string `json:"metric"`
TV [2]interface{} `json:"value"`
} `json:"result"`
} `json:"data"`
ErrorType string `json:"errorType"`
Error string `json:"error"`
}
func (r response) metrics() ([]Metric, error) {
var ms []Metric
var m Metric
var f float64
var err error
for i, res := range r.Data.Result {
f, err = strconv.ParseFloat(res.TV[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", res, res.TV[1], err)
}
m.Labels = nil
for k, v := range r.Data.Result[i].Labels {
m.AddLabel(k, v)
}
m.Timestamp = int64(res.TV[0].(float64))
m.Value = f
ms = append(ms, m)
}
return ms, nil
}
type graphiteResponse []graphiteResponseTarget
type graphiteResponseTarget struct {
Target string `json:"target"`
Tags map[string]string `json:"tags"`
DataPoints [][2]float64 `json:"datapoints"`
}
func (r graphiteResponse) metrics() []Metric {
var ms []Metric
for _, res := range r {
if len(res.DataPoints) < 1 {
continue
}
var m Metric
// add only last value to the result.
last := res.DataPoints[len(res.DataPoints)-1]
m.Value = last[0]
m.Timestamp = int64(last[1])
for k, v := range res.Tags {
m.AddLabel(k, v)
}
ms = append(ms, m)
}
return ms
}
// VMStorage represents vmstorage entity with ability to read and write metrics
type VMStorage struct {
c *http.Client
authCfg *promauth.Config
datasourceURL string
basicAuthUser string
basicAuthPass string
appendTypePrefix bool
lookBack time.Duration
queryStep time.Duration
dataSourceType Type
evaluationInterval time.Duration
extraParams url.Values
disablePathAppend bool
}
const queryPath = "/api/v1/query"
const graphitePath = "/render"
// Clone makes clone of VMStorage, shares http client.
func (s *VMStorage) Clone() *VMStorage {
return &VMStorage{
c: s.c,
authCfg: s.authCfg,
datasourceURL: s.datasourceURL,
lookBack: s.lookBack,
queryStep: s.queryStep,
appendTypePrefix: s.appendTypePrefix,
dataSourceType: s.dataSourceType,
disablePathAppend: s.disablePathAppend,
}
}
const prometheusPrefix = "/prometheus"
const graphitePrefix = "/graphite"
// ApplyParams - changes given querier params.
func (s *VMStorage) ApplyParams(params QuerierParams) *VMStorage {
if params.DataSourceType != nil {
s.dataSourceType = *params.DataSourceType
}
s.evaluationInterval = params.EvaluationInterval
s.extraParams = params.QueryParams
return s
}
// BuildWithParams - implements interface.
func (s *VMStorage) BuildWithParams(params QuerierParams) Querier {
return s.Clone().ApplyParams(params)
}
// NewVMStorage is a constructor for VMStorage
func NewVMStorage(baseURL, basicAuthUser, basicAuthPass string, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client) *VMStorage {
func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client, disablePathAppend bool) *VMStorage {
return &VMStorage{
c: c,
basicAuthUser: basicAuthUser,
basicAuthPass: basicAuthPass,
datasourceURL: strings.TrimSuffix(baseURL, "/"),
appendTypePrefix: appendTypePrefix,
lookBack: lookBack,
queryStep: queryStep,
c: c,
authCfg: authCfg,
datasourceURL: strings.TrimSuffix(baseURL, "/"),
appendTypePrefix: appendTypePrefix,
lookBack: lookBack,
queryStep: queryStep,
dataSourceType: NewPrometheusType(),
disablePathAppend: disablePathAppend,
}
}
// Query reads metrics from datasource by given query and type
func (s *VMStorage) Query(ctx context.Context, query string, dataSourceType Type) ([]Metric, error) {
switch dataSourceType.name {
case "", prometheusType:
return s.queryDataSource(ctx, query, s.setPrometheusReqParams, parsePrometheusResponse)
case graphiteType:
return s.queryDataSource(ctx, query, s.setGraphiteReqParams, parseGraphiteResponse)
// Query executes the given query and returns parsed response
func (s *VMStorage) Query(ctx context.Context, query string) ([]Metric, error) {
req, err := s.newRequestPOST()
if err != nil {
return nil, err
}
ts := time.Now()
switch s.dataSourceType.String() {
case "prometheus":
s.setPrometheusInstantReqParams(req, query, ts)
case "graphite":
s.setGraphiteReqParams(req, query, ts)
default:
return nil, fmt.Errorf("engine not found: %q", dataSourceType)
return nil, fmt.Errorf("engine not found: %q", s.dataSourceType.name)
}
resp, err := s.do(ctx, req)
if err != nil {
return nil, err
}
defer func() {
_ = resp.Body.Close()
}()
parseFn := parsePrometheusResponse
if s.dataSourceType.name != "prometheus" {
parseFn = parseGraphiteResponse
}
return parseFn(req, resp)
}
func (s *VMStorage) queryDataSource(
ctx context.Context,
query string,
setReqParams func(r *http.Request, query string),
processResponse func(r *http.Request, resp *http.Response,
) ([]Metric, error)) ([]Metric, error) {
// QueryRange executes the given query on the given time range.
// For Prometheus type see https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries
// Graphite type isn't supported.
func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end time.Time) ([]Metric, error) {
if s.dataSourceType.name != "prometheus" {
return nil, fmt.Errorf("%q is not supported for QueryRange", s.dataSourceType.name)
}
req, err := s.newRequestPOST()
if err != nil {
return nil, err
}
if start.IsZero() {
return nil, fmt.Errorf("start param is missing")
}
if end.IsZero() {
return nil, fmt.Errorf("end param is missing")
}
s.setPrometheusRangeReqParams(req, query, start, end)
resp, err := s.do(ctx, req)
if err != nil {
return nil, err
}
defer func() {
_ = resp.Body.Close()
}()
return parsePrometheusResponse(req, resp)
}
func (s *VMStorage) do(ctx context.Context, req *http.Request) (*http.Response, error) {
resp, err := s.c.Do(req.WithContext(ctx))
if err != nil {
return nil, fmt.Errorf("error getting response from %s: %w", req.URL.Redacted(), err)
}
if resp.StatusCode != http.StatusOK {
body, _ := ioutil.ReadAll(resp.Body)
_ = resp.Body.Close()
return nil, fmt.Errorf("unexpected response code %d for %s. Response body %s", resp.StatusCode, req.URL.Redacted(), body)
}
return resp, nil
}
func (s *VMStorage) newRequestPOST() (*http.Request, error) {
req, err := http.NewRequest("POST", s.datasourceURL, nil)
if err != nil {
return nil, err
}
req.Header.Set("Content-Type", "application/json; charset=utf-8")
if s.basicAuthPass != "" {
req.SetBasicAuth(s.basicAuthUser, s.basicAuthPass)
req.Header.Set("Content-Type", "application/json")
if s.authCfg != nil {
if auth := s.authCfg.GetAuthHeader(); auth != "" {
req.Header.Set("Authorization", auth)
}
}
setReqParams(req, query)
resp, err := s.c.Do(req.WithContext(ctx))
if err != nil {
return nil, fmt.Errorf("error getting response from %s: %w", req.URL, err)
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusOK {
body, _ := ioutil.ReadAll(resp.Body)
return nil, fmt.Errorf("datasource returns unexpected response code %d for %s. Response body %s", resp.StatusCode, req.URL, body)
}
return processResponse(req, resp)
}
func (s *VMStorage) setPrometheusReqParams(r *http.Request, query string) {
if s.appendTypePrefix {
r.URL.Path += prometheusPrefix
}
r.URL.Path += queryPath
q := r.URL.Query()
q.Set("query", query)
if s.lookBack > 0 {
lookBack := time.Now().Add(-s.lookBack)
q.Set("time", fmt.Sprintf("%d", lookBack.Unix()))
}
if s.queryStep > 0 {
q.Set("step", s.queryStep.String())
}
r.URL.RawQuery = q.Encode()
}
func (s *VMStorage) setGraphiteReqParams(r *http.Request, query string) {
if s.appendTypePrefix {
r.URL.Path += graphitePrefix
}
r.URL.Path += graphitePath
q := r.URL.Query()
q.Set("format", "json")
q.Set("target", query)
from := "-5min"
if s.lookBack > 0 {
lookBack := time.Now().Add(-s.lookBack)
from = strconv.FormatInt(lookBack.Unix(), 10)
}
q.Set("from", from)
q.Set("until", "now")
r.URL.RawQuery = q.Encode()
}
const (
statusSuccess, statusError, rtVector = "success", "error", "vector"
)
func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
r := &response{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("error parsing prometheus metrics for %s: %w", req.URL, err)
}
if r.Status == statusError {
return nil, fmt.Errorf("response error, query: %s, errorType: %s, error: %s", req.URL, r.ErrorType, r.Error)
}
if r.Status != statusSuccess {
return nil, fmt.Errorf("unknown status: %s, Expected success or error ", r.Status)
}
if r.Data.ResultType != rtVector {
return nil, fmt.Errorf("unknown result type:%s. Expected vector", r.Data.ResultType)
}
return r.metrics()
}
func parseGraphiteResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
r := &graphiteResponse{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("error parsing graphite metrics for %s: %w", req.URL, err)
}
return r.metrics(), nil
return req, nil
}

View File

@@ -0,0 +1,75 @@
package datasource
import (
"encoding/json"
"fmt"
"net/http"
"strconv"
"time"
)
type graphiteResponse []graphiteResponseTarget
type graphiteResponseTarget struct {
Target string `json:"target"`
Tags map[string]string `json:"tags"`
DataPoints [][2]float64 `json:"datapoints"`
}
func (r graphiteResponse) metrics() []Metric {
var ms []Metric
for _, res := range r {
if len(res.DataPoints) < 1 {
continue
}
var m Metric
// add only last value to the result.
last := res.DataPoints[len(res.DataPoints)-1]
m.Values = append(m.Values, last[0])
m.Timestamps = append(m.Timestamps, int64(last[1]))
for k, v := range res.Tags {
m.AddLabel(k, v)
}
ms = append(ms, m)
}
return ms
}
func parseGraphiteResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
r := &graphiteResponse{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("error parsing graphite metrics for %s: %w", req.URL.Redacted(), err)
}
return r.metrics(), nil
}
const (
graphitePath = "/render"
graphitePrefix = "/graphite"
)
func (s *VMStorage) setGraphiteReqParams(r *http.Request, query string, timestamp time.Time) {
if s.appendTypePrefix {
r.URL.Path += graphitePrefix
}
r.URL.Path += graphitePath
q := r.URL.Query()
for k, vs := range s.extraParams {
if q.Has(k) { // extraParams are prior to params in URL
q.Del(k)
}
for _, v := range vs {
q.Add(k, v)
}
}
q.Set("format", "json")
q.Set("target", query)
from := "-5min"
if s.lookBack > 0 {
lookBack := timestamp.Add(-s.lookBack)
from = strconv.FormatInt(lookBack.Unix(), 10)
}
q.Set("from", from)
q.Set("until", "now")
r.URL.RawQuery = q.Encode()
}

View File

@@ -0,0 +1,173 @@
package datasource
import (
"encoding/json"
"fmt"
"net/http"
"strconv"
"time"
)
type promResponse struct {
Status string `json:"status"`
ErrorType string `json:"errorType"`
Error string `json:"error"`
Data struct {
ResultType string `json:"resultType"`
Result json.RawMessage `json:"result"`
} `json:"data"`
}
type promInstant struct {
Result []struct {
Labels map[string]string `json:"metric"`
TV [2]interface{} `json:"value"`
} `json:"result"`
}
type promRange struct {
Result []struct {
Labels map[string]string `json:"metric"`
TVs [][2]interface{} `json:"values"`
} `json:"result"`
}
func (r promInstant) metrics() ([]Metric, error) {
var result []Metric
for i, res := range r.Result {
f, err := strconv.ParseFloat(res.TV[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", res, res.TV[1], err)
}
var m Metric
for k, v := range r.Result[i].Labels {
m.AddLabel(k, v)
}
m.Timestamps = append(m.Timestamps, int64(res.TV[0].(float64)))
m.Values = append(m.Values, f)
result = append(result, m)
}
return result, nil
}
func (r promRange) metrics() ([]Metric, error) {
var result []Metric
for i, res := range r.Result {
var m Metric
for _, tv := range res.TVs {
f, err := strconv.ParseFloat(tv[1].(string), 64)
if err != nil {
return nil, fmt.Errorf("metric %v, unable to parse float64 from %s: %w", res, tv[1], err)
}
m.Values = append(m.Values, f)
m.Timestamps = append(m.Timestamps, int64(tv[0].(float64)))
}
if len(m.Values) < 1 || len(m.Timestamps) < 1 {
return nil, fmt.Errorf("metric %v contains no values", res)
}
m.Labels = nil
for k, v := range r.Result[i].Labels {
m.AddLabel(k, v)
}
result = append(result, m)
}
return result, nil
}
const (
statusSuccess, statusError = "success", "error"
rtVector, rtMatrix = "vector", "matrix"
)
func parsePrometheusResponse(req *http.Request, resp *http.Response) ([]Metric, error) {
r := &promResponse{}
if err := json.NewDecoder(resp.Body).Decode(r); err != nil {
return nil, fmt.Errorf("error parsing prometheus metrics for %s: %w", req.URL.Redacted(), err)
}
if r.Status == statusError {
return nil, fmt.Errorf("response error, query: %s, errorType: %s, error: %s", req.URL.Redacted(), r.ErrorType, r.Error)
}
if r.Status != statusSuccess {
return nil, fmt.Errorf("unknown status: %s, Expected success or error ", r.Status)
}
switch r.Data.ResultType {
case rtVector:
var pi promInstant
if err := json.Unmarshal(r.Data.Result, &pi.Result); err != nil {
return nil, fmt.Errorf("umarshal err %s; \n %#v", err, string(r.Data.Result))
}
return pi.metrics()
case rtMatrix:
var pr promRange
if err := json.Unmarshal(r.Data.Result, &pr.Result); err != nil {
return nil, err
}
return pr.metrics()
default:
return nil, fmt.Errorf("unknown result type %q", r.Data.ResultType)
}
}
const (
prometheusInstantPath = "/api/v1/query"
prometheusRangePath = "/api/v1/query_range"
prometheusPrefix = "/prometheus"
)
func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string, timestamp time.Time) {
if s.appendTypePrefix {
r.URL.Path += prometheusPrefix
}
if !s.disablePathAppend {
r.URL.Path += prometheusInstantPath
}
q := r.URL.Query()
if s.lookBack > 0 {
timestamp = timestamp.Add(-s.lookBack)
}
if s.evaluationInterval > 0 {
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1232
timestamp = timestamp.Truncate(s.evaluationInterval)
}
q.Set("time", fmt.Sprintf("%d", timestamp.Unix()))
r.URL.RawQuery = q.Encode()
s.setPrometheusReqParams(r, query)
}
func (s *VMStorage) setPrometheusRangeReqParams(r *http.Request, query string, start, end time.Time) {
if s.appendTypePrefix {
r.URL.Path += prometheusPrefix
}
if !s.disablePathAppend {
r.URL.Path += prometheusRangePath
}
q := r.URL.Query()
q.Add("start", fmt.Sprintf("%d", start.Unix()))
q.Add("end", fmt.Sprintf("%d", end.Unix()))
r.URL.RawQuery = q.Encode()
s.setPrometheusReqParams(r, query)
}
func (s *VMStorage) setPrometheusReqParams(r *http.Request, query string) {
q := r.URL.Query()
for k, vs := range s.extraParams {
if q.Has(k) { // extraParams are prior to params in URL
q.Del(k)
}
for _, v := range vs {
q.Add(k, v)
}
}
q.Set("query", query)
if s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.evaluationInterval.Seconds())))
}
if s.queryStep > 0 { // override step with user-specified value
// always convert to seconds to keep compatibility with older
// Prometheus versions. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1943
q.Set("step", fmt.Sprintf("%ds", int(s.queryStep.Seconds())))
}
r.URL.RawQuery = q.Encode()
}

View File

@@ -2,22 +2,32 @@ package datasource
import (
"context"
"fmt"
"net/http"
"net/http/httptest"
"net/url"
"reflect"
"strconv"
"strings"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
)
var (
ctx = context.Background()
basicAuthName = "foo"
basicAuthPass = "bar"
query = "vm_rows"
queryRender = "constantLine(10)"
baCfg = &promauth.BasicAuthConfig{
Username: basicAuthName,
Password: promauth.NewSecret(basicAuthPass),
}
query = "vm_rows"
queryRender = "constantLine(10)"
)
func TestVMSelectQuery(t *testing.T) {
func TestVMInstantQuery(t *testing.T) {
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Errorf("should not be called")
@@ -63,32 +73,141 @@ func TestVMSelectQuery(t *testing.T) {
case 5:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix"}}`))
case 6:
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows"},"value":[1583786142,"13763"]}]}}`))
w.Write([]byte(`{"status":"success","data":{"resultType":"vector","result":[{"metric":{"__name__":"vm_rows"},"value":[1583786142,"13763"]},{"metric":{"__name__":"vm_requests"},"value":[1583786140,"2000"]}]}}`))
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
am := NewVMStorage(srv.URL, basicAuthName, basicAuthPass, time.Minute, 0, false, srv.Client())
if _, err := am.Query(ctx, query, NewPrometheusType()); err == nil {
authCfg, err := promauth.NewConfig(".", nil, baCfg, "", "", nil, nil)
if err != nil {
t.Fatalf("unexpected: %s", err)
}
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client(), false)
p := NewPrometheusType()
pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
if _, err := pq.Query(ctx, query); err == nil {
t.Fatalf("expected connection error got nil")
}
if _, err := am.Query(ctx, query, NewPrometheusType()); err == nil {
if _, err := pq.Query(ctx, query); err == nil {
t.Fatalf("expected invalid response status error got nil")
}
if _, err := am.Query(ctx, query, NewPrometheusType()); err == nil {
if _, err := pq.Query(ctx, query); err == nil {
t.Fatalf("expected response body error got nil")
}
if _, err := am.Query(ctx, query, NewPrometheusType()); err == nil {
if _, err := pq.Query(ctx, query); err == nil {
t.Fatalf("expected error status got nil")
}
if _, err := am.Query(ctx, query, NewPrometheusType()); err == nil {
if _, err := pq.Query(ctx, query); err == nil {
t.Fatalf("expected unknown status got nil")
}
if _, err := am.Query(ctx, query, NewPrometheusType()); err == nil {
if _, err := pq.Query(ctx, query); err == nil {
t.Fatalf("expected non-vector resultType error got nil")
}
m, err := am.Query(ctx, query, NewPrometheusType())
m, err := pq.Query(ctx, query)
if err != nil {
t.Fatalf("unexpected %s", err)
}
if len(m) != 2 {
t.Fatalf("expected 2 metrics got %d in %+v", len(m), m)
}
expected := []Metric{
{
Labels: []Label{{Value: "vm_rows", Name: "__name__"}},
Timestamps: []int64{1583786142},
Values: []float64{13763},
},
{
Labels: []Label{{Value: "vm_requests", Name: "__name__"}},
Timestamps: []int64{1583786140},
Values: []float64{2000},
},
}
if !reflect.DeepEqual(m, expected) {
t.Fatalf("unexpected metric %+v want %+v", m, expected)
}
g := NewGraphiteType()
gq := s.BuildWithParams(QuerierParams{DataSourceType: &g})
m, err = gq.Query(ctx, queryRender)
if err != nil {
t.Fatalf("unexpected %s", err)
}
if len(m) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
exp := Metric{
Labels: []Label{{Value: "constantLine(10)", Name: "name"}},
Timestamps: []int64{1611758403},
Values: []float64{10},
}
if !reflect.DeepEqual(m[0], exp) {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
}
}
func TestVMRangeQuery(t *testing.T) {
mux := http.NewServeMux()
mux.HandleFunc("/", func(_ http.ResponseWriter, _ *http.Request) {
t.Errorf("should not be called")
})
c := -1
mux.HandleFunc("/api/v1/query_range", func(w http.ResponseWriter, r *http.Request) {
c++
if r.Method != http.MethodPost {
t.Errorf("expected POST method got %s", r.Method)
}
if name, pass, _ := r.BasicAuth(); name != basicAuthName || pass != basicAuthPass {
t.Errorf("expected %s:%s as basic auth got %s:%s", basicAuthName, basicAuthPass, name, pass)
}
if r.URL.Query().Get("query") != query {
t.Errorf("expected %s in query param, got %s", query, r.URL.Query().Get("query"))
}
startTS := r.URL.Query().Get("start")
if startTS == "" {
t.Errorf("expected 'start' in query param, got nil instead")
}
if _, err := strconv.ParseInt(startTS, 10, 64); err != nil {
t.Errorf("failed to parse 'start' query param: %s", err)
}
endTS := r.URL.Query().Get("end")
if endTS == "" {
t.Errorf("expected 'end' in query param, got nil instead")
}
if _, err := strconv.ParseInt(endTS, 10, 64); err != nil {
t.Errorf("failed to parse 'end' query param: %s", err)
}
switch c {
case 0:
w.Write([]byte(`{"status":"success","data":{"resultType":"matrix","result":[{"metric":{"__name__":"vm_rows"},"values":[[1583786142,"13763"]]}]}}`))
}
})
srv := httptest.NewServer(mux)
defer srv.Close()
authCfg, err := promauth.NewConfig(".", nil, baCfg, "", "", nil, nil)
if err != nil {
t.Fatalf("unexpected: %s", err)
}
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client(), false)
p := NewPrometheusType()
pq := s.BuildWithParams(QuerierParams{DataSourceType: &p, EvaluationInterval: 15 * time.Second})
_, err = pq.QueryRange(ctx, query, time.Now(), time.Time{})
expectError(t, err, "is missing")
_, err = pq.QueryRange(ctx, query, time.Time{}, time.Now())
expectError(t, err, "is missing")
start, end := time.Now().Add(-time.Minute), time.Now()
m, err := pq.QueryRange(ctx, query, start, end)
if err != nil {
t.Fatalf("unexpected %s", err)
}
@@ -96,32 +215,321 @@ func TestVMSelectQuery(t *testing.T) {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
}
expected := Metric{
Labels: []Label{{Value: "vm_rows", Name: "__name__"}},
Timestamp: 1583786142,
Value: 13763,
Labels: []Label{{Value: "vm_rows", Name: "__name__"}},
Timestamps: []int64{1583786142},
Values: []float64{13763},
}
if m[0].Timestamp != expected.Timestamp &&
m[0].Value != expected.Value &&
m[0].Labels[0].Value != expected.Labels[0].Value &&
m[0].Labels[0].Name != expected.Labels[0].Name {
if !reflect.DeepEqual(m[0], expected) {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
}
m, err = am.Query(ctx, queryRender, NewGraphiteType())
g := NewGraphiteType()
gq := s.BuildWithParams(QuerierParams{DataSourceType: &g})
_, err = gq.QueryRange(ctx, queryRender, start, end)
expectError(t, err, "is not supported")
}
func TestRequestParams(t *testing.T) {
authCfg, err := promauth.NewConfig(".", nil, baCfg, "", "", nil, nil)
if err != nil {
t.Fatalf("unexpected %s", err)
t.Fatalf("unexpected: %s", err)
}
if len(m) != 1 {
t.Fatalf("expected 1 metric got %d in %+v", len(m), m)
query := "up"
timestamp := time.Date(2001, 2, 3, 4, 5, 6, 0, time.UTC)
testCases := []struct {
name string
queryRange bool
vm *VMStorage
checkFn func(t *testing.T, r *http.Request)
}{
{
"prometheus path",
false,
&VMStorage{
dataSourceType: NewPrometheusType(),
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusInstantPath, r.URL.Path)
},
},
{
"prometheus path with disablePathAppend",
false,
&VMStorage{
dataSourceType: NewPrometheusType(),
disablePathAppend: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, "", r.URL.Path)
},
},
{
"prometheus prefix",
false,
&VMStorage{
dataSourceType: NewPrometheusType(),
appendTypePrefix: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusPrefix+prometheusInstantPath, r.URL.Path)
},
},
{
"prometheus prefix with disablePathAppend",
false,
&VMStorage{
dataSourceType: NewPrometheusType(),
appendTypePrefix: true,
disablePathAppend: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusPrefix, r.URL.Path)
},
},
{
"prometheus range path",
true,
&VMStorage{
dataSourceType: NewPrometheusType(),
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusRangePath, r.URL.Path)
},
},
{
"prometheus range path with disablePathAppend",
true,
&VMStorage{
dataSourceType: NewPrometheusType(),
disablePathAppend: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, "", r.URL.Path)
},
},
{
"prometheus range prefix",
true,
&VMStorage{
dataSourceType: NewPrometheusType(),
appendTypePrefix: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusPrefix+prometheusRangePath, r.URL.Path)
},
},
{
"prometheus range prefix with disablePathAppend",
true,
&VMStorage{
dataSourceType: NewPrometheusType(),
appendTypePrefix: true,
disablePathAppend: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, prometheusPrefix, r.URL.Path)
},
},
{
"graphite path",
false,
&VMStorage{
dataSourceType: NewGraphiteType(),
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, graphitePath, r.URL.Path)
},
},
{
"graphite prefix",
false,
&VMStorage{
dataSourceType: NewGraphiteType(),
appendTypePrefix: true,
},
func(t *testing.T, r *http.Request) {
checkEqualString(t, graphitePrefix+graphitePath, r.URL.Path)
},
},
{
"default params",
false,
&VMStorage{},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&time=%d", query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"default range params",
true,
&VMStorage{},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("end=%d&query=%s&start=%d", timestamp.Unix(), query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"basic auth",
false,
&VMStorage{authCfg: authCfg},
func(t *testing.T, r *http.Request) {
u, p, _ := r.BasicAuth()
checkEqualString(t, "foo", u)
checkEqualString(t, "bar", p)
},
},
{
"basic auth range",
true,
&VMStorage{authCfg: authCfg},
func(t *testing.T, r *http.Request) {
u, p, _ := r.BasicAuth()
checkEqualString(t, "foo", u)
checkEqualString(t, "bar", p)
},
},
{
"lookback",
false,
&VMStorage{
lookBack: time.Minute,
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&time=%d", query, timestamp.Add(-time.Minute).Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"evaluation interval",
false,
&VMStorage{
evaluationInterval: 15 * time.Second,
},
func(t *testing.T, r *http.Request) {
evalInterval := 15 * time.Second
tt := timestamp.Truncate(evalInterval)
exp := fmt.Sprintf("query=%s&step=%v&time=%d", query, evalInterval, tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"lookback + evaluation interval",
false,
&VMStorage{
lookBack: time.Minute,
evaluationInterval: 15 * time.Second,
},
func(t *testing.T, r *http.Request) {
evalInterval := 15 * time.Second
tt := timestamp.Add(-time.Minute)
tt = tt.Truncate(evalInterval)
exp := fmt.Sprintf("query=%s&step=%v&time=%d", query, evalInterval, tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"step override",
false,
&VMStorage{
queryStep: time.Minute,
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&step=%ds&time=%d", query, int(time.Minute.Seconds()), timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"step to seconds",
false,
&VMStorage{
evaluationInterval: 3 * time.Hour,
},
func(t *testing.T, r *http.Request) {
evalInterval := 3 * time.Hour
tt := timestamp.Truncate(evalInterval)
exp := fmt.Sprintf("query=%s&step=%ds&time=%d", query, int(evalInterval.Seconds()), tt.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"prometheus extra params",
false,
&VMStorage{
extraParams: url.Values{"round_digits": {"10"}},
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("query=%s&round_digits=10&time=%d", query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"prometheus extra params range",
true,
&VMStorage{
extraParams: url.Values{
"nocache": {"1"},
"max_lookback": {"1h"},
},
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("end=%d&max_lookback=1h&nocache=1&query=%s&start=%d",
timestamp.Unix(), query, timestamp.Unix())
checkEqualString(t, exp, r.URL.RawQuery)
},
},
{
"graphite extra params",
false,
&VMStorage{
dataSourceType: NewGraphiteType(),
extraParams: url.Values{
"nocache": {"1"},
"max_lookback": {"1h"},
},
},
func(t *testing.T, r *http.Request) {
exp := fmt.Sprintf("format=json&from=-5min&max_lookback=1h&nocache=1&target=%s&until=now", query)
checkEqualString(t, exp, r.URL.RawQuery)
},
},
}
expected = Metric{
Labels: []Label{{Value: "constantLine(10)", Name: "name"}},
Timestamp: 1611758403,
Value: 10,
}
if m[0].Timestamp != expected.Timestamp &&
m[0].Value != expected.Value &&
m[0].Labels[0].Value != expected.Labels[0].Value &&
m[0].Labels[0].Name != expected.Labels[0].Name {
t.Fatalf("unexpected metric %+v want %+v", m[0], expected)
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
req, err := tc.vm.newRequestPOST()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
switch tc.vm.dataSourceType.String() {
case "prometheus":
if tc.queryRange {
tc.vm.setPrometheusRangeReqParams(req, query, timestamp, timestamp)
} else {
tc.vm.setPrometheusInstantReqParams(req, query, timestamp)
}
case "graphite":
tc.vm.setGraphiteReqParams(req, query, timestamp)
}
tc.checkFn(t, req)
})
}
}
func checkEqualString(t *testing.T, exp, got string) {
t.Helper()
if got != exp {
t.Errorf("expected to get: \n%q; \ngot: \n%q", exp, got)
}
}
func expectError(t *testing.T, err error, exp string) {
t.Helper()
if err == nil {
t.Errorf("expected non-nil error")
}
if !strings.Contains(err.Error(), exp) {
t.Errorf("expected error %q to contain %q", err, exp)
}
}

View File

@@ -4,6 +4,7 @@ import (
"context"
"fmt"
"hash/fnv"
"net/url"
"sync"
"time"
@@ -27,6 +28,9 @@ type Group struct {
Concurrency int
Checksum string
Labels map[string]string
Params url.Values
doneCh chan struct{}
finishedCh chan struct{}
// channel accepts new Group obj
@@ -49,17 +53,37 @@ func newGroupMetrics(name, file string) *groupMetrics {
return m
}
func newGroup(cfg config.Group, defaultInterval time.Duration, labels map[string]string) *Group {
// merges group rule labels into result map
// set2 has priority over set1.
func mergeLabels(groupName, ruleName string, set1, set2 map[string]string) map[string]string {
r := map[string]string{}
for k, v := range set1 {
r[k] = v
}
for k, v := range set2 {
if prevV, ok := r[k]; ok {
logger.Infof("label %q=%q for rule %q.%q overwritten with external label %q=%q",
k, prevV, groupName, ruleName, k, v)
}
r[k] = v
}
return r
}
func newGroup(cfg config.Group, qb datasource.QuerierBuilder, defaultInterval time.Duration, labels map[string]string) *Group {
g := &Group{
Type: cfg.Type,
Name: cfg.Name,
File: cfg.File,
Interval: cfg.Interval,
Interval: cfg.Interval.Duration(),
Concurrency: cfg.Concurrency,
Checksum: cfg.Checksum,
doneCh: make(chan struct{}),
finishedCh: make(chan struct{}),
updateCh: make(chan *Group),
Params: cfg.Params,
Labels: cfg.Labels,
doneCh: make(chan struct{}),
finishedCh: make(chan struct{}),
updateCh: make(chan *Group),
}
g.metrics = newGroupMetrics(g.Name, g.File)
if g.Interval == 0 {
@@ -70,33 +94,39 @@ func newGroup(cfg config.Group, defaultInterval time.Duration, labels map[string
}
rules := make([]Rule, len(cfg.Rules))
for i, r := range cfg.Rules {
// override rule labels with external labels
for k, v := range labels {
if prevV, ok := r.Labels[k]; ok {
logger.Infof("label %q=%q for rule %q.%q overwritten with external label %q=%q",
k, prevV, g.Name, r.Name(), k, v)
}
if r.Labels == nil {
r.Labels = map[string]string{}
}
r.Labels[k] = v
var extraLabels map[string]string
// apply external labels
if len(labels) > 0 {
extraLabels = labels
}
rules[i] = g.newRule(r)
// apply group labels, it has priority on external labels
if len(cfg.Labels) > 0 {
extraLabels = mergeLabels(g.Name, r.Name(), extraLabels, g.Labels)
}
// apply rules labels, it has priority on other labels
if len(extraLabels) > 0 {
r.Labels = mergeLabels(g.Name, r.Name(), extraLabels, r.Labels)
}
rules[i] = g.newRule(qb, r)
}
g.Rules = rules
return g
}
func (g *Group) newRule(rule config.Rule) Rule {
func (g *Group) newRule(qb datasource.QuerierBuilder, rule config.Rule) Rule {
if rule.Alert != "" {
return newAlertingRule(g, rule)
return newAlertingRule(qb, g, rule)
}
return newRecordingRule(g, rule)
return newRecordingRule(qb, g, rule)
}
// ID return unique group ID that consists of
// rules file and group name
func (g *Group) ID() uint64 {
g.mu.RLock()
defer g.mu.RUnlock()
hash := fnv.New64a()
hash.Write([]byte(g.File))
hash.Write([]byte("\xff"))
@@ -106,7 +136,8 @@ func (g *Group) ID() uint64 {
}
// Restore restores alerts state for group rules
func (g *Group) Restore(ctx context.Context, q datasource.Querier, lookback time.Duration, labels map[string]string) error {
func (g *Group) Restore(ctx context.Context, qb datasource.QuerierBuilder, lookback time.Duration, labels map[string]string) error {
labels = mergeLabels(g.Name, "", labels, g.Labels)
for _, rule := range g.Rules {
rr, ok := rule.(*AlertingRule)
if !ok {
@@ -115,6 +146,9 @@ func (g *Group) Restore(ctx context.Context, q datasource.Querier, lookback time
if rr.For < 1 {
continue
}
// ignore g.ExtraFilterLabels on purpose, so it
// won't affect the restore procedure.
q := qb.BuildWithParams(datasource.QuerierParams{})
if err := rr.Restore(ctx, q, lookback, labels); err != nil {
return fmt.Errorf("error while restoring rule %q: %w", rule, err)
}
@@ -160,19 +194,18 @@ func (g *Group) updateWith(newGroup *Group) error {
for _, nr := range rulesRegistry {
newRules = append(newRules, nr)
}
// note that g.Interval is not updated here
// so the value can be compared later in
// group.Start function
g.Type = newGroup.Type
g.Concurrency = newGroup.Concurrency
g.Params = newGroup.Params
g.Labels = newGroup.Labels
g.Checksum = newGroup.Checksum
g.Rules = newRules
return nil
}
var (
alertsFired = metrics.NewCounter(`vmalert_alerts_fired_total`)
alertsSent = metrics.NewCounter(`vmalert_alerts_sent_total`)
alertsSendErrors = metrics.NewCounter(`vmalert_alerts_send_errors_total`)
)
func (g *Group) close() {
if g.doneCh == nil {
return
@@ -189,12 +222,12 @@ func (g *Group) close() {
var skipRandSleepOnGroupStart bool
func (g *Group) start(ctx context.Context, querier datasource.Querier, nts []notifier.Notifier, rw *remotewrite.Client) {
func (g *Group) start(ctx context.Context, nts []notifier.Notifier, rw *remotewrite.Client) {
defer func() { close(g.finishedCh) }()
// Spread group rules evaluation over time in order to reduce load on VictoriaMetrics.
if !skipRandSleepOnGroupStart {
randSleep := uint64(float64(g.Interval) * (float64(uint32(g.ID())) / (1 << 32)))
randSleep := uint64(float64(g.Interval) * (float64(g.ID()) / (1 << 64)))
sleepOffset := uint64(time.Now().UnixNano()) % uint64(g.Interval)
if randSleep < sleepOffset {
randSleep += uint64(g.Interval)
@@ -213,7 +246,16 @@ func (g *Group) start(ctx context.Context, querier datasource.Querier, nts []not
}
logger.Infof("group %q started; interval=%v; concurrency=%d", g.Name, g.Interval, g.Concurrency)
e := &executor{querier, nts, rw}
e := &executor{rw: rw}
for _, nt := range nts {
ent := eNotifier{
Notifier: nt,
alertsSent: getOrCreateCounter(fmt.Sprintf("vmalert_alerts_sent_total{addr=%q}", nt.Addr())),
alertsSendErrors: getOrCreateCounter(fmt.Sprintf("vmalert_alerts_send_errors_total{addr=%q}", nt.Addr())),
}
e.notifiers = append(e.notifiers, ent)
}
t := time.NewTicker(g.Interval)
defer t.Stop()
for {
@@ -242,36 +284,48 @@ func (g *Group) start(ctx context.Context, querier datasource.Querier, nts []not
case <-t.C:
g.metrics.iterationTotal.Inc()
iterationStart := time.Now()
errs := e.execConcurrently(ctx, g.Rules, g.Concurrency, g.Interval)
for err := range errs {
if err != nil {
logger.Errorf("group %q: %s", g.Name, err)
if len(g.Rules) > 0 {
resolveDuration := getResolveDuration(g.Interval)
errs := e.execConcurrently(ctx, g.Rules, g.Concurrency, resolveDuration)
for err := range errs {
if err != nil {
logger.Errorf("group %q: %s", g.Name, err)
}
}
}
g.metrics.iterationDuration.UpdateDuration(iterationStart)
}
}
}
// resolveDuration for alerts is equal to 3 interval evaluations
// so in case if vmalert stops sending updates for some reason,
// notifier could automatically resolve the alert.
func getResolveDuration(groupInterval time.Duration) time.Duration {
resolveInterval := groupInterval * 3
if *maxResolveDuration > 0 && (resolveInterval > *maxResolveDuration) {
return *maxResolveDuration
}
return resolveInterval
}
type executor struct {
querier datasource.Querier
notifiers []notifier.Notifier
notifiers []eNotifier
rw *remotewrite.Client
}
func (e *executor) execConcurrently(ctx context.Context, rules []Rule, concurrency int, interval time.Duration) chan error {
res := make(chan error, len(rules))
var returnSeries bool
if e.rw != nil {
returnSeries = true
}
type eNotifier struct {
notifier.Notifier
alertsSent *counter
alertsSendErrors *counter
}
func (e *executor) execConcurrently(ctx context.Context, rules []Rule, concurrency int, resolveDuration time.Duration) chan error {
res := make(chan error, len(rules))
if concurrency == 1 {
// fast path
for _, rule := range rules {
res <- e.exec(ctx, rule, returnSeries, interval)
res <- e.exec(ctx, rule, resolveDuration)
}
close(res)
return res
@@ -284,7 +338,7 @@ func (e *executor) execConcurrently(ctx context.Context, rules []Rule, concurren
sem <- struct{}{}
wg.Add(1)
go func(r Rule) {
res <- e.exec(ctx, r, returnSeries, interval)
res <- e.exec(ctx, r, resolveDuration)
<-sem
wg.Done()
}(rule)
@@ -296,21 +350,19 @@ func (e *executor) execConcurrently(ctx context.Context, rules []Rule, concurren
}
var (
execTotal = metrics.NewCounter(`vmalert_execution_total`)
execErrors = metrics.NewCounter(`vmalert_execution_errors_total`)
execDuration = metrics.NewSummary(`vmalert_execution_duration_seconds`)
alertsFired = metrics.NewCounter(`vmalert_alerts_fired_total`)
execTotal = metrics.NewCounter(`vmalert_execution_total`)
execErrors = metrics.NewCounter(`vmalert_execution_errors_total`)
remoteWriteErrors = metrics.NewCounter(`vmalert_remotewrite_errors_total`)
remoteWriteTotal = metrics.NewCounter(`vmalert_remotewrite_total`)
)
func (e *executor) exec(ctx context.Context, rule Rule, returnSeries bool, interval time.Duration) error {
func (e *executor) exec(ctx context.Context, rule Rule, resolveDuration time.Duration) error {
execTotal.Inc()
execStart := time.Now()
defer func() {
execDuration.UpdateDuration(execStart)
}()
tss, err := rule.Exec(ctx, e.querier, returnSeries)
tss, err := rule.Exec(ctx)
if err != nil {
execErrors.Inc()
return fmt.Errorf("rule %q: failed to execute: %w", rule, err)
@@ -318,6 +370,7 @@ func (e *executor) exec(ctx context.Context, rule Rule, returnSeries bool, inter
if len(tss) > 0 && e.rw != nil {
for _, ts := range tss {
remoteWriteTotal.Inc()
if err := e.rw.Push(ts); err != nil {
remoteWriteErrors.Inc()
return fmt.Errorf("rule %q: remote write failure: %w", rule, err)
@@ -333,10 +386,7 @@ func (e *executor) exec(ctx context.Context, rule Rule, returnSeries bool, inter
for _, a := range ar.alerts {
switch a.State {
case notifier.StateFiring:
// set End to execStart + 3 intervals
// so notifier can resolve it automatically if `vmalert`
// won't be able to send resolve for some reason
a.End = time.Now().Add(3 * interval)
a.End = time.Now().Add(resolveDuration)
alerts = append(alerts, *a)
case notifier.StateInactive:
// set End to execStart to notify
@@ -349,11 +399,11 @@ func (e *executor) exec(ctx context.Context, rule Rule, returnSeries bool, inter
return nil
}
alertsSent.Add(len(alerts))
errGr := new(utils.ErrGroup)
for _, nt := range e.notifiers {
nt.alertsSent.Add(len(alerts))
if err := nt.Send(ctx, alerts); err != nil {
alertsSendErrors.Inc()
nt.alertsSendErrors.Inc()
errGr.Add(fmt.Errorf("rule %q: failed to send alerts: %w", rule, err))
}
}

View File

@@ -2,12 +2,14 @@ package main
import (
"context"
"fmt"
"sort"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)
func init() {
@@ -32,7 +34,7 @@ func TestUpdateWith(t *testing.T) {
[]config.Rule{{
Alert: "foo",
Expr: "up > 0",
For: config.NewPromDuration(time.Second),
For: utils.NewPromDuration(time.Second),
Labels: map[string]string{
"bar": "baz",
},
@@ -44,7 +46,7 @@ func TestUpdateWith(t *testing.T) {
[]config.Rule{{
Alert: "foo",
Expr: "up > 10",
For: config.NewPromDuration(time.Second),
For: utils.NewPromDuration(time.Second),
Labels: map[string]string{
"baz": "bar",
},
@@ -110,15 +112,16 @@ func TestUpdateWith(t *testing.T) {
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
g := &Group{Name: "test"}
qb := &fakeQuerier{}
for _, r := range tc.currentRules {
r.ID = config.HashRule(r)
g.Rules = append(g.Rules, g.newRule(r))
g.Rules = append(g.Rules, g.newRule(qb, r))
}
ng := &Group{Name: "test"}
for _, r := range tc.newRules {
r.ID = config.HashRule(r)
ng.Rules = append(ng.Rules, ng.newRule(r))
ng.Rules = append(ng.Rules, ng.newRule(qb, r))
}
err := g.updateWith(ng)
@@ -156,11 +159,11 @@ func TestGroupStart(t *testing.T) {
t.Fatalf("failed to parse rules: %s", err)
}
const evalInterval = time.Millisecond
g := newGroup(groups[0], evalInterval, map[string]string{"cluster": "east-1"})
g.Concurrency = 2
fn := &fakeNotifier{}
fs := &fakeQuerier{}
fn := &fakeNotifier{}
g := newGroup(groups[0], fs, evalInterval, map[string]string{"cluster": "east-1"})
g.Concurrency = 2
const inst1, inst2, job = "foo", "bar", "baz"
m1 := metricWithLabels(t, "instance", inst1, "job", job)
@@ -177,7 +180,14 @@ func TestGroupStart(t *testing.T) {
// add rule labels - see config/testdata/rules1-good.rules
alert1.Labels["label"] = "bar"
alert1.Labels["host"] = inst1
alert1.ID = hash(m1)
// add service labels
alert1.Labels[alertNameLabel] = alert1.Name
alert1.Labels[alertGroupNameLabel] = g.Name
var labels1 []string
for k, v := range alert1.Labels {
labels1 = append(labels1, k, v)
}
alert1.ID = hash(metricWithLabels(t, labels1...))
alert2, err := r.newAlert(m2, time.Now(), nil)
if err != nil {
@@ -189,13 +199,20 @@ func TestGroupStart(t *testing.T) {
// add rule labels - see config/testdata/rules1-good.rules
alert2.Labels["label"] = "bar"
alert2.Labels["host"] = inst2
alert2.ID = hash(m2)
// add service labels
alert2.Labels[alertNameLabel] = alert2.Name
alert2.Labels[alertGroupNameLabel] = g.Name
var labels2 []string
for k, v := range alert2.Labels {
labels2 = append(labels2, k, v)
}
alert2.ID = hash(metricWithLabels(t, labels2...))
finished := make(chan struct{})
fs.add(m1)
fs.add(m2)
go func() {
g.start(context.Background(), fs, []notifier.Notifier{fn}, nil)
g.start(context.Background(), []notifier.Notifier{fn}, nil)
close(finished)
}()
@@ -221,3 +238,27 @@ func TestGroupStart(t *testing.T) {
g.close()
<-finished
}
func TestResolveDuration(t *testing.T) {
testCases := []struct {
groupInterval time.Duration
maxDuration time.Duration
expected time.Duration
}{
{time.Minute, 0, 3 * time.Minute},
{3 * time.Minute, 0, 9 * time.Minute},
{time.Minute, 2 * time.Minute, 2 * time.Minute},
{0, 0, 0},
}
defaultResolveDuration := *maxResolveDuration
defer func() { *maxResolveDuration = defaultResolveDuration }()
for _, tc := range testCases {
t.Run(fmt.Sprintf("%v-%v-%v", tc.groupInterval, tc.expected, tc.maxDuration), func(t *testing.T) {
*maxResolveDuration = tc.maxDuration
got := getResolveDuration(tc.groupInterval)
if got != tc.expected {
t.Errorf("expected to have %v; got %v", tc.expected, got)
}
})
}
}

View File

@@ -7,6 +7,7 @@ import (
"sort"
"sync"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
@@ -38,7 +39,15 @@ func (fq *fakeQuerier) add(metrics ...datasource.Metric) {
fq.Unlock()
}
func (fq *fakeQuerier) Query(_ context.Context, _ string, _ datasource.Type) ([]datasource.Metric, error) {
func (fq *fakeQuerier) BuildWithParams(_ datasource.QuerierParams) datasource.Querier {
return fq
}
func (fq *fakeQuerier) QueryRange(ctx context.Context, q string, _, _ time.Time) ([]datasource.Metric, error) {
return fq.Query(ctx, q)
}
func (fq *fakeQuerier) Query(_ context.Context, _ string) ([]datasource.Metric, error) {
fq.Lock()
defer fq.Unlock()
if fq.err != nil {
@@ -54,6 +63,7 @@ type fakeNotifier struct {
alerts []notifier.Alert
}
func (*fakeNotifier) Addr() string { return "" }
func (fn *fakeNotifier) Send(_ context.Context, alerts []notifier.Alert) error {
fn.Lock()
defer fn.Unlock()
@@ -68,9 +78,16 @@ func (fn *fakeNotifier) getAlerts() []notifier.Alert {
}
func metricWithValueAndLabels(t *testing.T, value float64, labels ...string) datasource.Metric {
return metricWithValuesAndLabels(t, []float64{value}, labels...)
}
func metricWithValuesAndLabels(t *testing.T, values []float64, labels ...string) datasource.Metric {
t.Helper()
m := metricWithLabels(t, labels...)
m.Value = value
m.Values = values
for i := range values {
m.Timestamps = append(m.Timestamps, int64(i))
}
return m
}
@@ -79,7 +96,7 @@ func metricWithLabels(t *testing.T, labels ...string) datasource.Metric {
if len(labels) == 0 || len(labels)%2 != 0 {
t.Fatalf("expected to get even number of labels")
}
m := datasource.Metric{}
m := datasource.Metric{Values: []float64{1}, Timestamps: []int64{1}}
for i := 0; i < len(labels); i += 2 {
m.Labels = append(m.Labels, datasource.Label{
Name: labels[i],
@@ -160,6 +177,9 @@ func compareAlertingRules(t *testing.T, a, b *AlertingRule) error {
if !reflect.DeepEqual(a.Labels, b.Labels) {
return fmt.Errorf("expected to have labels %#v; got %#v", a.Labels, b.Labels)
}
if a.Type.String() != b.Type.String() {
return fmt.Errorf("expected to have Type %#v; got %#v", a.Type.String(), b.Type.String())
}
return nil
}
@@ -185,7 +205,8 @@ func compareTimeSeries(t *testing.T, a, b []prompbmarshal.TimeSeries) error {
}*/
}
if len(expTS.Labels) != len(gotTS.Labels) {
return fmt.Errorf("expected number of labels %d; got %d", len(expTS.Labels), len(gotTS.Labels))
return fmt.Errorf("expected number of labels %d (%v); got %d (%v)",
len(expTS.Labels), expTS.Labels, len(gotTS.Labels), gotTS.Labels)
}
for i, exp := range expTS.Labels {
got := gotTS.Labels[i]

View File

@@ -26,31 +26,41 @@ import (
)
var (
rulePath = flagutil.NewArray("rule", `Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
rulePath = flagutil.NewArray("rule", `Path to the file with alert rules.
Supports patterns. Flag can be specified multiple times.
Examples:
-rule="/path/to/file". Path to a single file with alerting rules
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
-rule="dir/*.yaml" -rule="/*.yaml". Relative path to all .yaml files in "dir" folder,
absolute path to all .yaml files in root.
Rule files may contain %{ENV_VAR} placeholders, which are substituted by the corresponding env vars.`)
rulesCheckInterval = flag.Duration("rule.configCheckInterval", 0, "Interval for checking for changes in '-rule' files. "+
"By default the checking is disabled. Send SIGHUP signal in order to force config check for changes")
httpListenAddr = flag.String("httpListenAddr", ":8880", "Address to listen for http connections")
evaluationInterval = flag.Duration("evaluationInterval", time.Minute, "How often to evaluate the rules")
validateTemplates = flag.Bool("rule.validateTemplates", true, "Whether to validate annotation and label templates")
validateExpressions = flag.Bool("rule.validateExpressions", true, "Whether to validate rules expressions via MetricsQL engine")
maxResolveDuration = flag.Duration("rule.maxResolveDuration", 0, "Limits the maximum duration for automatic alert expiration, "+
"which is by default equal to 3 evaluation intervals of the parent group.")
externalURL = flag.String("external.url", "", "External URL is used as alert's source for sent alerts to the notifier")
externalAlertSource = flag.String("external.alert.source", "", `External Alert Source allows to override the Source link for alerts sent to AlertManager for cases where you want to build a custom link to Grafana, Prometheus or any other service.
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|pathEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used`)
eg. 'explore?orgId=1&left=[\"now-1h\",\"now\",\"VictoriaMetrics\",{\"expr\": \"{{$expr|quotesEscape|crlfEscape|queryEscape}}\"},{\"mode\":\"Metrics\"},{\"ui\":[true,true,true,\"none\"]}]'.If empty '/api/v1/:groupID/alertID/status' is used`)
externalLabels = flagutil.NewArray("external.label", "Optional label in the form 'name=value' to add to all generated recording rules and alerts. "+
"Pass multiple -label flags in order to add multiple label sets.")
remoteReadLookBack = flag.Duration("remoteRead.lookback", time.Hour, "Lookback defines how far to look into past for alerts timeseries."+
" For example, if lookback=1h then range from now() to now()-1h will be scanned.")
remoteReadIgnoreRestoreErrors = flag.Bool("remoteRead.ignoreRestoreErrors", true, "Whether to ignore errors from remote storage when restoring alerts state on startup.")
disableAlertGroupLabel = flag.Bool("disableAlertgroupLabel", false, "Whether to disable adding group's name as label to generated alerts and time series.")
dryRun = flag.Bool("dryRun", false, "Whether to check only config files without running vmalert. The rules file are validated. The `-rule` flag must be specified.")
)
var alertURLGeneratorFn notifier.AlertURLGenerator
func main() {
// Write flags and help message to stdout, since it is easier to grep or pipe.
flag.CommandLine.SetOutput(os.Stdout)
@@ -64,42 +74,71 @@ func main() {
notifier.InitTemplateFunc(u)
groups, err := config.Parse(*rulePath, true, true)
if err != nil {
logger.Fatalf(err.Error())
logger.Fatalf("failed to parse %q: %s", *rulePath, err)
}
if len(groups) == 0 {
logger.Fatalf("No rules for validation. Please specify path to file(s) with alerting and/or recording rules using `-rule` flag")
}
return
}
eu, err := getExternalURL(*externalURL, *httpListenAddr, httpserver.IsTLS())
if err != nil {
logger.Fatalf("failed to init `external.url`: %s", err)
}
notifier.InitTemplateFunc(eu)
alertURLGeneratorFn, err = getAlertURLGenerator(eu, *externalAlertSource, *validateTemplates)
if err != nil {
logger.Fatalf("failed to init `external.alert.source`: %s", err)
}
if *replayFrom != "" || *replayTo != "" {
rw, err := remotewrite.Init(context.Background())
if err != nil {
logger.Fatalf("failed to init remoteWrite: %s", err)
}
if rw == nil {
logger.Fatalf("remoteWrite.url can't be empty in replay mode")
}
notifier.InitTemplateFunc(eu)
groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
if err != nil {
logger.Fatalf("cannot parse configuration file: %s", err)
}
// prevent queries from caching and boundaries aligning
// when querying VictoriaMetrics datasource.
q, err := datasource.Init(url.Values{"nocache": {"1"}})
if err != nil {
logger.Fatalf("failed to init datasource: %s", err)
}
if err := replay(groupsCfg, q, rw); err != nil {
logger.Fatalf("replay failed: %s", err)
}
return
}
ctx, cancel := context.WithCancel(context.Background())
manager, err := newManager(ctx)
if err != nil {
logger.Fatalf("failed to init: %s", err)
}
if err := manager.start(ctx, *rulePath, *validateTemplates, *validateExpressions); err != nil {
logger.Infof("reading rules configuration file from %q", strings.Join(*rulePath, ";"))
groupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
if err != nil {
logger.Fatalf("cannot parse configuration file: %s", err)
}
// Register SIGHUP handler for config re-read just before manager.start call.
// This guarantees that the config will be re-read if the signal arrives during manager.start call.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1240
sighupCh := procutil.NewSighupChan()
if err := manager.start(ctx, groupsCfg); err != nil {
logger.Fatalf("failed to start: %s", err)
}
go func() {
// init reload metrics with positive values to improve alerting conditions
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
sigHup := procutil.NewSighupChan()
for {
<-sigHup
configReloads.Inc()
logger.Infof("SIGHUP received. Going to reload rules %q ...", *rulePath)
if err := manager.update(ctx, *rulePath, *validateTemplates, *validateExpressions, false); err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("error while reloading rules: %s", err)
continue
}
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
logger.Infof("Rules reloaded successfully from %q", *rulePath)
}
}()
go configReload(ctx, manager, groupsCfg, sighupCh)
rh := &requestHandler{m: manager}
go httpserver.Serve(*httpListenAddr, rh.handler)
@@ -121,29 +160,19 @@ var (
)
func newManager(ctx context.Context) (*manager, error) {
q, err := datasource.Init()
q, err := datasource.Init(nil)
if err != nil {
return nil, fmt.Errorf("failed to init datasource: %w", err)
}
eu, err := getExternalURL(*externalURL, *httpListenAddr, httpserver.IsTLS())
if err != nil {
return nil, fmt.Errorf("failed to init `external.url`: %w", err)
}
notifier.InitTemplateFunc(eu)
aug, err := getAlertURLGenerator(eu, *externalAlertSource, *validateTemplates)
if err != nil {
return nil, fmt.Errorf("failed to init `external.alert.source`: %w", err)
}
nts, err := notifier.Init(aug)
nts, err := notifier.Init(alertURLGeneratorFn)
if err != nil {
return nil, fmt.Errorf("failed to init notifier: %w", err)
}
manager := &manager{
groups: make(map[uint64]*Group),
querier: q,
notifiers: nts,
labels: map[string]string{},
groups: make(map[uint64]*Group),
querierBuilder: q,
notifiers: nts,
labels: map[string]string{},
}
rw, err := remotewrite.Init(ctx)
if err != nil {
@@ -218,7 +247,66 @@ func usage() {
const s = `
vmalert processes alerts and recording rules.
See the docs at https://victoriametrics.github.io/vmalert.html .
See the docs at https://docs.victoriametrics.com/vmalert.html .
`
flagutil.Usage(s)
}
func configReload(ctx context.Context, m *manager, groupsCfg []config.Group, sighupCh <-chan os.Signal) {
var configCheckCh <-chan time.Time
if *rulesCheckInterval > 0 {
ticker := time.NewTicker(*rulesCheckInterval)
configCheckCh = ticker.C
defer ticker.Stop()
}
// init reload metrics with positive values to improve alerting conditions
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
for {
select {
case <-ctx.Done():
return
case <-sighupCh:
logger.Infof("SIGHUP received. Going to reload rules %q ...", *rulePath)
configReloads.Inc()
case <-configCheckCh:
}
newGroupsCfg, err := config.Parse(*rulePath, *validateTemplates, *validateExpressions)
if err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("cannot parse configuration file: %s", err)
continue
}
if configsEqual(newGroupsCfg, groupsCfg) {
// set success to 1 since previous reload
// could have been unsuccessful
configSuccess.Set(1)
// config didn't change - skip it
continue
}
if err := m.update(ctx, newGroupsCfg, false); err != nil {
configReloadErrors.Inc()
configSuccess.Set(0)
logger.Errorf("error while reloading rules: %s", err)
continue
}
groupsCfg = newGroupsCfg
configSuccess.Set(1)
configTimestamp.Set(fasttime.UnixTimestamp())
logger.Infof("Rules reloaded successfully from %q", *rulePath)
}
}
func configsEqual(a, b []config.Group) bool {
if len(a) != len(b) {
return false
}
for i := range a {
if a[i].Checksum != b[i].Checksum {
return false
}
}
return true
}

View File

@@ -1,12 +1,17 @@
package main
import (
"context"
"fmt"
"io/ioutil"
"net/url"
"os"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
)
func TestGetExternalURL(t *testing.T) {
@@ -51,3 +56,105 @@ func TestGetAlertURLGenerator(t *testing.T) {
t.Errorf("unexpected url want %s, got %s", exp, fn(testAlert))
}
}
func TestConfigReload(t *testing.T) {
originalRulePath := *rulePath
defer func() {
*rulePath = originalRulePath
}()
const (
rules1 = `
groups:
- name: group-1
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
- record: handler:requests:rate5m
expr: sum(rate(prometheus_http_requests_total[5m])) by (handler)
`
rules2 = `
groups:
- name: group-1
rules:
- alert: ExampleAlertAlwaysFiring
expr: sum by(job) (up == 1)
- name: group-2
rules:
- record: handler:requests:rate5m
expr: sum(rate(prometheus_http_requests_total[5m])) by (handler)
`
)
f, err := ioutil.TempFile("", "")
if err != nil {
t.Fatal(err)
}
writeToFile(t, f.Name(), rules1)
*rulesCheckInterval = 200 * time.Millisecond
*rulePath = []string{f.Name()}
ctx, cancel := context.WithCancel(context.Background())
m := &manager{
querierBuilder: &fakeQuerier{},
groups: make(map[uint64]*Group),
labels: map[string]string{},
notifiers: []notifier.Notifier{&fakeNotifier{}},
rw: &remotewrite.Client{},
}
syncCh := make(chan struct{})
sighupCh := procutil.NewSighupChan()
go func() {
configReload(ctx, m, nil, sighupCh)
close(syncCh)
}()
lenLocked := func(m *manager) int {
m.groupsMu.RLock()
defer m.groupsMu.RUnlock()
return len(m.groups)
}
time.Sleep(*rulesCheckInterval * 2)
groupsLen := lenLocked(m)
if groupsLen != 1 {
t.Fatalf("expected to have exactly 1 group loaded; got %d", groupsLen)
}
writeToFile(t, f.Name(), rules2)
time.Sleep(*rulesCheckInterval * 2)
groupsLen = lenLocked(m)
if groupsLen != 2 {
fmt.Println(m.groups)
t.Fatalf("expected to have exactly 2 groups loaded; got %d", groupsLen)
}
writeToFile(t, f.Name(), rules1)
procutil.SelfSIGHUP()
time.Sleep(*rulesCheckInterval / 2)
groupsLen = lenLocked(m)
if groupsLen != 1 {
t.Fatalf("expected to have exactly 1 group loaded; got %d", groupsLen)
}
writeToFile(t, f.Name(), `corrupted`)
procutil.SelfSIGHUP()
time.Sleep(*rulesCheckInterval / 2)
groupsLen = lenLocked(m)
if groupsLen != 1 { // should remain unchanged
t.Fatalf("expected to have exactly 1 group loaded; got %d", groupsLen)
}
cancel()
<-syncCh
}
func writeToFile(t *testing.T, file, b string) {
t.Helper()
err := ioutil.WriteFile(file, []byte(b), 0644)
if err != nil {
t.Fatal(err)
}
}

View File

@@ -3,7 +3,8 @@ package main
import (
"context"
"fmt"
"strings"
"net/url"
"sort"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
@@ -15,11 +16,12 @@ import (
// manager controls group states
type manager struct {
querier datasource.Querier
notifiers []notifier.Notifier
querierBuilder datasource.QuerierBuilder
notifiers []notifier.Notifier
rw *remotewrite.Client
rr datasource.Querier
// remote read builder.
rr datasource.QuerierBuilder
wg sync.WaitGroup
labels map[string]string
@@ -49,8 +51,8 @@ func (m *manager) AlertAPI(gID, aID uint64) (*APIAlert, error) {
return nil, fmt.Errorf("can't find alert with id %q in group %q", aID, g.Name)
}
func (m *manager) start(ctx context.Context, path []string, validateTpl, validateExpr bool) error {
return m.update(ctx, path, validateTpl, validateExpr, true)
func (m *manager) start(ctx context.Context, groupsCfg []config.Group) error {
return m.update(ctx, groupsCfg, true)
}
func (m *manager) close() {
@@ -63,10 +65,13 @@ func (m *manager) close() {
m.wg.Wait()
}
func (m *manager) startGroup(ctx context.Context, group *Group, restore bool) {
func (m *manager) startGroup(ctx context.Context, group *Group, restore bool) error {
if restore && m.rr != nil {
err := group.Restore(ctx, m.rr, *remoteReadLookBack, m.labels)
if err != nil {
if !*remoteReadIgnoreRestoreErrors {
return fmt.Errorf("failed to restore state for group %q: %w", group.Name, err)
}
logger.Errorf("error while restoring state for group %q: %s", group.Name, err)
}
}
@@ -74,25 +79,39 @@ func (m *manager) startGroup(ctx context.Context, group *Group, restore bool) {
m.wg.Add(1)
id := group.ID()
go func() {
group.start(ctx, m.querier, m.notifiers, m.rw)
group.start(ctx, m.notifiers, m.rw)
m.wg.Done()
}()
m.groups[id] = group
return nil
}
func (m *manager) update(ctx context.Context, path []string, validateTpl, validateExpr, restore bool) error {
logger.Infof("reading rules configuration file from %q", strings.Join(path, ";"))
groupsCfg, err := config.Parse(path, validateTpl, validateExpr)
if err != nil {
return fmt.Errorf("cannot parse configuration file: %w", err)
}
func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore bool) error {
var rrPresent, arPresent bool
groupsRegistry := make(map[uint64]*Group)
for _, cfg := range groupsCfg {
ng := newGroup(cfg, *evaluationInterval, m.labels)
for _, r := range cfg.Rules {
if rrPresent && arPresent {
continue
}
if r.Record != "" {
rrPresent = true
}
if r.Alert != "" {
arPresent = true
}
}
ng := newGroup(cfg, m.querierBuilder, *evaluationInterval, m.labels)
groupsRegistry[ng.ID()] = ng
}
if rrPresent && m.rw == nil {
return fmt.Errorf("config contains recording rules but `-remoteWrite.url` isn't set")
}
if arPresent && m.notifiers == nil {
return fmt.Errorf("config contains alerting rules but `-notifier.url` isn't set")
}
type updateItem struct {
old *Group
new *Group
@@ -116,7 +135,9 @@ func (m *manager) update(ctx context.Context, path []string, validateTpl, valida
}
}
for _, ng := range groupsRegistry {
m.startGroup(ctx, ng, restore)
if err := m.startGroup(ctx, ng, restore); err != nil {
return err
}
}
m.groupsMu.Unlock()
@@ -140,12 +161,15 @@ func (g *Group) toAPI() APIGroup {
ag := APIGroup{
// encode as string to avoid rounding
ID: fmt.Sprintf("%d", g.ID()),
ID: fmt.Sprintf("%d", g.ID()),
Name: g.Name,
Type: g.Type.String(),
File: g.File,
Interval: g.Interval.String(),
Concurrency: g.Concurrency,
Params: urlValuesToStrings(g.Params),
Labels: g.Labels,
}
for _, r := range g.Rules {
switch v := r.(type) {
@@ -157,3 +181,24 @@ func (g *Group) toAPI() APIGroup {
}
return ag
}
func urlValuesToStrings(values url.Values) []string {
if len(values) < 1 {
return nil
}
keys := make([]string, 0, len(values))
for k := range values {
keys = append(keys, k)
}
sort.Strings(keys)
var res []string
for _, k := range keys {
params := values[k]
for _, v := range params {
res = append(res, fmt.Sprintf("%s=%s", k, v))
}
}
return res
}

View File

@@ -5,13 +5,15 @@ import (
"math/rand"
"net/url"
"os"
"strings"
"sync"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
)
func TestMain(m *testing.M) {
@@ -25,9 +27,8 @@ func TestMain(m *testing.M) {
// starting with empty rules folder
func TestManagerEmptyRulesDir(t *testing.T) {
m := &manager{groups: make(map[uint64]*Group)}
path := []string{"foo/bar"}
err := m.update(context.Background(), path, true, true, false)
if err != nil {
cfg := loadCfg(t, []string{"foo/bar"}, true, true)
if err := m.update(context.Background(), cfg, false); err != nil {
t.Fatalf("expected to load succesfully with empty rules dir; got err instead: %v", err)
}
}
@@ -37,9 +38,9 @@ func TestManagerEmptyRulesDir(t *testing.T) {
// Should be executed with -race flag
func TestManagerUpdateConcurrent(t *testing.T) {
m := &manager{
groups: make(map[uint64]*Group),
querier: &fakeQuerier{},
notifiers: []notifier.Notifier{&fakeNotifier{}},
groups: make(map[uint64]*Group),
querierBuilder: &fakeQuerier{},
notifiers: []notifier.Notifier{&fakeNotifier{}},
}
paths := []string{
"config/testdata/dir/rules0-good.rules",
@@ -50,8 +51,11 @@ func TestManagerUpdateConcurrent(t *testing.T) {
"config/testdata/rules1-good.rules",
"config/testdata/rules2-good.rules",
}
evalInterval := *evaluationInterval
defer func() { *evaluationInterval = evalInterval }()
*evaluationInterval = time.Millisecond
if err := m.start(context.Background(), []string{paths[0]}, true, true); err != nil {
cfg := loadCfg(t, []string{paths[0]}, true, true)
if err := m.start(context.Background(), cfg); err != nil {
t.Fatalf("failed to start: %s", err)
}
@@ -64,8 +68,11 @@ func TestManagerUpdateConcurrent(t *testing.T) {
defer wg.Done()
for i := 0; i < iterations; i++ {
rnd := rand.Intn(len(paths))
path := []string{paths[rnd]}
_ = m.update(context.Background(), path, true, true, false)
cfg, err := config.Parse([]string{paths[rnd]}, true, true)
if err != nil { // update can fail and this is expected
continue
}
_ = m.update(context.Background(), cfg, false)
}
}()
}
@@ -108,18 +115,6 @@ func TestManagerUpdate(t *testing.T) {
Name: "ExampleAlertAlwaysFiring",
Expr: "sum by(job) (up == 1)",
}
ExampleAlertGraphite = &AlertingRule{
Name: "up graphite",
Expr: "filterSeries(time('host.1',20),'>','0')",
Type: datasource.NewGraphiteType(),
For: defaultEvalInterval,
}
ExampleAlertGraphite2 = &AlertingRule{
Name: "up",
Expr: "filterSeries(time('host.2',20),'>','0')",
Type: datasource.NewGraphiteType(),
For: defaultEvalInterval,
}
)
testCases := []struct {
@@ -143,7 +138,7 @@ func TestManagerUpdate(t *testing.T) {
Name: "VMRows",
Expr: "vm_rows > 0",
For: 5 * time.Minute,
Labels: map[string]string{"label": "bar"},
Labels: map[string]string{"dc": "gcp", "label": "bar"},
Annotations: map[string]string{
"summary": "{{ $value }}",
"description": "{{$labels}}",
@@ -221,35 +216,25 @@ func TestManagerUpdate(t *testing.T) {
},
},
},
{
name: "update prometheus to graphite type",
initPath: "config/testdata/dir/rules-update0-good.rules",
updatePath: "config/testdata/dir/rules-update1-good.rules",
want: []*Group{
{
File: "config/testdata/dir/rules-update1-good.rules",
Interval: defaultEvalInterval,
Type: datasource.NewGraphiteType(),
Name: "TestUpdateGroup",
Rules: []Rule{
ExampleAlertGraphite2,
ExampleAlertGraphite,
},
},
},
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
ctx, cancel := context.WithCancel(context.TODO())
m := &manager{groups: make(map[uint64]*Group), querier: &fakeQuerier{}}
path := []string{tc.initPath}
if err := m.update(ctx, path, true, true, false); err != nil {
m := &manager{
groups: make(map[uint64]*Group),
querierBuilder: &fakeQuerier{},
notifiers: []notifier.Notifier{&fakeNotifier{}},
}
cfgInit := loadCfg(t, []string{tc.initPath}, true, true)
if err := m.update(ctx, cfgInit, false); err != nil {
t.Fatalf("failed to complete initial rules update: %s", err)
}
path = []string{tc.updatePath}
_ = m.update(ctx, path, true, true, false)
cfgUpdate, err := config.Parse([]string{tc.updatePath}, true, true)
if err == nil { // update can fail and that's expected
_ = m.update(ctx, cfgUpdate, false)
}
if len(tc.want) != len(m.groups) {
t.Fatalf("\nwant number of groups: %d;\ngot: %d ", len(tc.want), len(m.groups))
}
@@ -267,3 +252,84 @@ func TestManagerUpdate(t *testing.T) {
})
}
}
func TestManagerUpdateNegative(t *testing.T) {
testCases := []struct {
notifiers []notifier.Notifier
rw *remotewrite.Client
cfg config.Group
expErr string
}{
{
nil,
nil,
config.Group{Name: "Recording rule only",
Rules: []config.Rule{
{Record: "record", Expr: "max(up)"},
},
},
"contains recording rules",
},
{
nil,
nil,
config.Group{Name: "Alerting rule only",
Rules: []config.Rule{
{Alert: "alert", Expr: "up > 0"},
},
},
"contains alerting rules",
},
{
[]notifier.Notifier{&fakeNotifier{}},
nil,
config.Group{Name: "Recording and alerting rules",
Rules: []config.Rule{
{Alert: "alert1", Expr: "up > 0"},
{Alert: "alert2", Expr: "up > 0"},
{Record: "record", Expr: "max(up)"},
},
},
"contains recording rules",
},
{
nil,
&remotewrite.Client{},
config.Group{Name: "Recording and alerting rules",
Rules: []config.Rule{
{Record: "record1", Expr: "max(up)"},
{Record: "record2", Expr: "max(up)"},
{Alert: "alert", Expr: "up > 0"},
},
},
"contains alerting rules",
},
}
for _, tc := range testCases {
t.Run(tc.cfg.Name, func(t *testing.T) {
m := &manager{
groups: make(map[uint64]*Group),
querierBuilder: &fakeQuerier{},
notifiers: tc.notifiers,
rw: tc.rw,
}
err := m.update(context.Background(), []config.Group{tc.cfg}, false)
if err == nil {
t.Fatalf("expected to get error; got nil")
}
if !strings.Contains(err.Error(), tc.expErr) {
t.Fatalf("expected err to contain %q; got %q", tc.expErr, err)
}
})
}
}
func loadCfg(t *testing.T, path []string, validateAnnotations, validateExpressions bool) []config.Group {
t.Helper()
cfg, err := config.Parse(path, validateAnnotations, validateExpressions)
if err != nil {
t.Fatal(err)
}
return cfg
}

View File

@@ -14,17 +14,28 @@ import (
// Alert the triggered alert
// TODO: Looks like alert name isn't unique
type Alert struct {
GroupID uint64
Name string
Labels map[string]string
// GroupID contains the ID of the parent rules group
GroupID uint64
// Name represents Alert name
Name string
// Labels is the list of label-value pairs attached to the Alert
Labels map[string]string
// Annotations is the list of annotations generated on Alert evaluation
Annotations map[string]string
State AlertState
Expr string
// State represents the current state of the Alert
State AlertState
// Expr contains expression that was executed to generate the Alert
Expr string
// Start defines the moment of time when Alert has triggered
Start time.Time
End time.Time
// End defines the moment of time when Alert supposed to expire
End time.Time
// Value stores the value returned from evaluating expression from Expr field
Value float64
ID uint64
// ID is the unique identifer for the Alert
ID uint64
// Restored is true if Alert was restored after restart
Restored bool
}
// AlertState type indicates the Alert state
@@ -90,13 +101,13 @@ func templateAnnotations(annotations map[string]string, data AlertTplData, funcs
eg := new(utils.ErrGroup)
r := make(map[string]string, len(annotations))
for key, text := range annotations {
r[key] = text
buf.Reset()
builder.Reset()
builder.Grow(len(tplHeader) + len(text))
builder.WriteString(tplHeader)
builder.WriteString(text)
if err := templateAnnotation(&buf, builder.String(), data, funcs); err != nil {
r[key] = text
eg.Add(fmt.Errorf("key %q, template %q: %w", key, text, err))
continue
}

View File

@@ -83,14 +83,16 @@ func TestAlert_ExecTemplate(t *testing.T) {
{Name: "foo", Value: "bar"},
{Name: "baz", Value: "qux"},
},
Value: 1,
Values: []float64{1},
Timestamps: []int64{1},
},
{
Labels: []datasource.Label{
{Name: "foo", Value: "garply"},
{Name: "baz", Value: "fred"},
},
Value: 2,
Values: []float64{2},
Timestamps: []int64{1},
},
}, nil
}

View File

@@ -12,6 +12,7 @@ import (
// AlertManager represents integration provider with Prometheus alert manager
// https://github.com/prometheus/alertmanager
type AlertManager struct {
addr string
alertURL string
basicAuthUser string
basicAuthPass string
@@ -19,6 +20,9 @@ type AlertManager struct {
client *http.Client
}
// Addr returns address where alerts are sent.
func (am AlertManager) Addr() string { return am.addr }
// Send an alert or resolve message
func (am *AlertManager) Send(ctx context.Context, alerts []Alert) error {
b := &bytes.Buffer{}
@@ -28,7 +32,7 @@ func (am *AlertManager) Send(ctx context.Context, alerts []Alert) error {
if err != nil {
return err
}
req.Header.Set("Content-Type", "application/json; charset=utf-8")
req.Header.Set("Content-Type", "application/json")
req = req.WithContext(ctx)
if am.basicAuthPass != "" {
req.SetBasicAuth(am.basicAuthUser, am.basicAuthPass)
@@ -57,9 +61,10 @@ const alertManagerPath = "/api/v2/alerts"
// NewAlertManager is a constructor for AlertManager
func NewAlertManager(alertManagerURL, user, pass string, fn AlertURLGenerator, c *http.Client) *AlertManager {
addr := strings.TrimSuffix(alertManagerURL, "/") + alertManagerPath
url := strings.TrimSuffix(alertManagerURL, "/") + alertManagerPath
return &AlertManager{
alertURL: addr,
addr: alertManagerURL,
alertURL: url,
argFunc: fn,
client: c,
basicAuthUser: user,

View File

@@ -10,6 +10,14 @@ import (
"time"
)
func TestAlertManager_Addr(t *testing.T) {
const addr = "http://localhost"
am := NewAlertManager(addr, "", "", nil, nil)
if am.Addr() != addr {
t.Errorf("expected to have %q; got %q", addr, am.Addr())
}
}
func TestAlertManager_Send(t *testing.T) {
const baUser, baPass = "foo", "bar"
mux := http.NewServeMux()

View File

@@ -9,7 +9,7 @@ import (
)
var (
addrs = flagutil.NewArray("notifier.url", "Prometheus alertmanager URL. Required parameter. e.g. http://127.0.0.1:9093")
addrs = flagutil.NewArray("notifier.url", "Prometheus alertmanager URL, e.g. http://127.0.0.1:9093")
basicAuthUsername = flagutil.NewArray("notifier.basicAuth.username", "Optional basic auth username for -notifier.url")
basicAuthPassword = flagutil.NewArray("notifier.basicAuth.password", "Optional basic auth password for -notifier.url")
@@ -24,10 +24,6 @@ var (
// Init creates a Notifier object based on provided flags.
func Init(gen AlertURLGenerator) ([]Notifier, error) {
if len(*addrs) == 0 {
return nil, fmt.Errorf("at least one `-notifier.url` must be set")
}
var notifiers []Notifier
for i, addr := range *addrs {
cert, key := tlsCertFile.GetOptionalArg(i), tlsKeyFile.GetOptionalArg(i)

View File

@@ -2,7 +2,12 @@ package notifier
import "context"
// Notifier is common interface for alert manager provider
// Notifier is a common interface for alert manager provider
type Notifier interface {
// Send sends the given list of alerts.
// Returns an error if fails to send the alerts.
// Must unblock if the given ctx is cancelled.
Send(ctx context.Context, alerts []Alert) error
// Addr returns address where alerts are sent.
Addr() string
}

View File

@@ -17,6 +17,7 @@ import (
"errors"
"fmt"
"math"
"net"
"net/url"
"regexp"
"strings"
@@ -26,46 +27,95 @@ import (
textTpl "text/template"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/metricsql"
)
// metric is private copy of datasource.Metric,
// it is used for templating annotations,
// Labels as map simplifies templates evaluation.
type metric struct {
Labels map[string]string
Timestamp int64
Value float64
}
// datasourceMetricsToTemplateMetrics converts Metrics from datasource package to private copy for templating.
func datasourceMetricsToTemplateMetrics(ms []datasource.Metric) []metric {
mss := make([]metric, 0, len(ms))
for _, m := range ms {
labelsMap := make(map[string]string, len(m.Labels))
for _, labelValue := range m.Labels {
labelsMap[labelValue.Name] = labelValue.Value
}
mss = append(mss, metric{
Labels: labelsMap,
Timestamp: m.Timestamps[0],
Value: m.Values[0]})
}
return mss
}
// QueryFn is used to wrap a call to datasource into simple-to-use function
// for templating functions.
type QueryFn func(query string) ([]datasource.Metric, error)
func funcsWithQuery(query QueryFn) textTpl.FuncMap {
fm := make(textTpl.FuncMap)
for k, fn := range tmplFunc {
fm[k] = fn
}
fm["query"] = func(q string) ([]datasource.Metric, error) {
return query(q)
}
return fm
}
var tmplFunc textTpl.FuncMap
// InitTemplateFunc initiates template helper functions
func InitTemplateFunc(externalURL *url.URL) {
// See https://prometheus.io/docs/prometheus/latest/configuration/template_reference/
tmplFunc = textTpl.FuncMap{
"args": func(args ...interface{}) map[string]interface{} {
result := make(map[string]interface{})
for i, a := range args {
result[fmt.Sprintf("arg%d", i)] = a
}
return result
},
/* Strings */
// reReplaceAll ReplaceAllString returns a copy of src, replacing matches of the Regexp with
// the replacement string repl. Inside repl, $ signs are interpreted as in Expand,
// so for instance $1 represents the text of the first submatch.
// alias for https://golang.org/pkg/regexp/#Regexp.ReplaceAllString
"reReplaceAll": func(pattern, repl, text string) string {
re := regexp.MustCompile(pattern)
return re.ReplaceAllString(text, repl)
},
"safeHtml": func(text string) htmlTpl.HTML {
return htmlTpl.HTML(text)
},
"match": regexp.MatchString,
"title": strings.Title,
// match reports whether the string s
// contains any match of the regular expression pattern.
// alias for https://golang.org/pkg/regexp/#MatchString
"match": regexp.MatchString,
// title returns a copy of the string s with all Unicode letters
// that begin words mapped to their Unicode title case.
// alias for https://golang.org/pkg/strings/#Title
"title": strings.Title,
// toUpper returns s with all Unicode letters mapped to their upper case.
// alias for https://golang.org/pkg/strings/#ToUpper
"toUpper": strings.ToUpper,
// toLower returns s with all Unicode letters mapped to their lower case.
// alias for https://golang.org/pkg/strings/#ToLower
"toLower": strings.ToLower,
// stripPort splits string into host and port, then returns only host.
"stripPort": func(hostPort string) string {
host, _, err := net.SplitHostPort(hostPort)
if err != nil {
return hostPort
}
return host
},
// parseDuration parses a duration string such as "1h" into the number of seconds it represents
"parseDuration": func(d string) (float64, error) {
ms, err := metricsql.DurationValue(d, 0)
if err != nil {
return 0, err
}
return float64(ms) / 1000, nil
},
/* Numbers */
// humanize converts given number to a human readable format
// by adding metric prefixes https://en.wikipedia.org/wiki/Metric_prefix
"humanize": func(v float64) string {
if v == 0 || math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
@@ -91,6 +141,8 @@ func InitTemplateFunc(externalURL *url.URL) {
}
return fmt.Sprintf("%.4g%s", v, prefix)
},
// humanize1024 converts given number to a human readable format with 1024 as base
"humanize1024": func(v float64) string {
if math.Abs(v) <= 1 || math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
@@ -105,6 +157,8 @@ func InitTemplateFunc(externalURL *url.URL) {
}
return fmt.Sprintf("%.4g%s", v, prefix)
},
// humanizeDuration converts given seconds to a human readable duration
"humanizeDuration": func(v float64) string {
if math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
@@ -145,9 +199,13 @@ func InitTemplateFunc(externalURL *url.URL) {
}
return fmt.Sprintf("%.4g%ss", v, prefix)
},
// humanizePercentage converts given ratio value to a fraction of 100
"humanizePercentage": func(v float64) string {
return fmt.Sprintf("%.4g%%", v*100)
},
// humanizeTimestamp converts given timestamp to a human readable time equivalent
"humanizeTimestamp": func(v float64) string {
if math.IsNaN(v) || math.IsInf(v, 0) {
return fmt.Sprintf("%.4g", v)
@@ -155,48 +213,115 @@ func InitTemplateFunc(externalURL *url.URL) {
t := TimeFromUnixNano(int64(v * 1e9)).Time().UTC()
return fmt.Sprint(t)
},
"pathPrefix": func() string {
return externalURL.Path
},
/* URLs */
// externalURL returns value of `external.url` flag
"externalURL": func() string {
return externalURL.String()
},
// pathPrefix returns a Path segment from the URL value in `external.url` flag
"pathPrefix": func() string {
return externalURL.Path
},
// pathEscape escapes the string so it can be safely placed inside a URL path segment,
// replacing special characters (including /) with %XX sequences as needed.
// alias for https://golang.org/pkg/net/url/#PathEscape
"pathEscape": func(u string) string {
return url.PathEscape(u)
},
// queryEscape escapes the string so it can be safely placed
// inside a URL query.
// alias for https://golang.org/pkg/net/url/#QueryEscape
"queryEscape": func(q string) string {
return url.QueryEscape(q)
},
// crlfEscape replaces new line chars to skip URL encoding.
// see https://github.com/VictoriaMetrics/VictoriaMetrics/issues/890
"crlfEscape": func(q string) string {
q = strings.Replace(q, "\n", `\n`, -1)
return strings.Replace(q, "\r", `\r`, -1)
},
// quotesEscape escapes quote char
"quotesEscape": func(q string) string {
return strings.Replace(q, `"`, `\"`, -1)
},
// query function supposed to be substituted at funcsWithQuery().
// it is present here only for validation purposes, when there is no
// provided datasource.
"query": func(q string) ([]datasource.Metric, error) {
// query executes the MetricsQL/PromQL query against
// configured `datasource.url` address.
// For example, {{ query "foo" | first | value }} will
// execute "/api/v1/query?query=foo" request and will return
// the first value in response.
"query": func(q string) ([]metric, error) {
// query function supposed to be substituted at funcsWithQuery().
// it is present here only for validation purposes, when there is no
// provided datasource.
//
// return non-empty slice to pass validation with chained functions in template
// see issue #989 for details
return []datasource.Metric{{}}, nil
return []metric{{}}, nil
},
"first": func(metrics []datasource.Metric) (datasource.Metric, error) {
// first returns the first by order element from the given metrics list.
// usually used alongside with `query` template function.
"first": func(metrics []metric) (metric, error) {
if len(metrics) > 0 {
return metrics[0], nil
}
return datasource.Metric{}, errors.New("first() called on vector with no elements")
return metric{}, errors.New("first() called on vector with no elements")
},
"label": func(label string, m datasource.Metric) string {
return m.Label(label)
// label returns the value of the given label name for the given metric.
// usually used alongside with `query` template function.
"label": func(label string, m metric) string {
return m.Labels[label]
},
"value": func(m datasource.Metric) float64 {
// value returns the value of the given metric.
// usually used alongside with `query` template function.
"value": func(m metric) float64 {
return m.Value
},
/* Helpers */
// Converts a list of objects to a map with keys arg0, arg1 etc.
// This is intended to allow multiple arguments to be passed to templates.
"args": func(args ...interface{}) map[string]interface{} {
result := make(map[string]interface{})
for i, a := range args {
result[fmt.Sprintf("arg%d", i)] = a
}
return result
},
// safeHtml marks string as HTML not requiring auto-escaping.
"safeHtml": func(text string) htmlTpl.HTML {
return htmlTpl.HTML(text)
},
}
}
func funcsWithQuery(query QueryFn) textTpl.FuncMap {
fm := make(textTpl.FuncMap)
for k, fn := range tmplFunc {
fm[k] = fn
}
fm["query"] = func(q string) ([]metric, error) {
result, err := query(q)
if err != nil {
return nil, err
}
return datasourceMetricsToTemplateMetrics(result), nil
}
return fm
}
// Time is the number of milliseconds since the epoch
// (1970-01-01 00:00 UTC) excluding leap seconds.
type Time int64

View File

@@ -3,8 +3,8 @@ package main
import (
"context"
"fmt"
"hash/fnv"
"sort"
"strings"
"sync"
"time"
@@ -25,6 +25,8 @@ type RecordingRule struct {
Labels map[string]string
GroupID uint64
q datasource.Querier
// guard status fields
mu sync.RWMutex
// stores last moment of time Exec was called
@@ -33,12 +35,16 @@ type RecordingRule struct {
// resets on every successful Exec
// may be used as Health state
lastExecError error
// stores the number of samples returned during
// the last evaluation
lastExecSamples int
metrics *recordingRuleMetrics
}
type recordingRuleMetrics struct {
errors *gauge
errors *gauge
samples *gauge
}
// String implements Stringer interface
@@ -52,81 +58,117 @@ func (rr *RecordingRule) ID() uint64 {
return rr.RuleID
}
func newRecordingRule(group *Group, cfg config.Rule) *RecordingRule {
func newRecordingRule(qb datasource.QuerierBuilder, group *Group, cfg config.Rule) *RecordingRule {
rr := &RecordingRule{
Type: cfg.Type,
Type: group.Type,
RuleID: cfg.ID,
Name: cfg.Record,
Expr: cfg.Expr,
Labels: cfg.Labels,
GroupID: group.ID(),
metrics: &recordingRuleMetrics{},
q: qb.BuildWithParams(datasource.QuerierParams{
DataSourceType: &group.Type,
EvaluationInterval: group.Interval,
QueryParams: group.Params,
}),
}
labels := fmt.Sprintf(`recording=%q, group=%q, id="%d"`, rr.Name, group.Name, rr.ID())
rr.metrics.errors = getOrCreateGauge(fmt.Sprintf(`vmalert_recording_rules_error{%s}`, labels),
func() float64 {
rr.mu.Lock()
defer rr.mu.Unlock()
rr.mu.RLock()
defer rr.mu.RUnlock()
if rr.lastExecError == nil {
return 0
}
return 1
})
rr.metrics.samples = getOrCreateGauge(fmt.Sprintf(`vmalert_recording_rules_last_evaluation_samples{%s}`, labels),
func() float64 {
rr.mu.RLock()
defer rr.mu.RUnlock()
return float64(rr.lastExecSamples)
})
return rr
}
// Close unregisters rule metrics
func (rr *RecordingRule) Close() {
metrics.UnregisterMetric(rr.metrics.errors.name)
metrics.UnregisterMetric(rr.metrics.samples.name)
}
// Exec executes RecordingRule expression via the given Querier.
func (rr *RecordingRule) Exec(ctx context.Context, q datasource.Querier, series bool) ([]prompbmarshal.TimeSeries, error) {
if !series {
return nil, nil
}
qMetrics, err := q.Query(ctx, rr.Expr, rr.Type)
rr.mu.Lock()
defer rr.mu.Unlock()
rr.lastExecTime = time.Now()
rr.lastExecError = err
// ExecRange executes recording rule on the given time range similarly to Exec.
// It doesn't update internal states of the Rule and meant to be used just
// to get time series for backfilling.
func (rr *RecordingRule) ExecRange(ctx context.Context, start, end time.Time) ([]prompbmarshal.TimeSeries, error) {
series, err := rr.q.QueryRange(ctx, rr.Expr, start, end)
if err != nil {
return nil, fmt.Errorf("failed to execute query %q: %w", rr.Expr, err)
return nil, err
}
duplicates := make(map[uint64]prompbmarshal.TimeSeries, len(qMetrics))
duplicates := make(map[string]struct{}, len(series))
var tss []prompbmarshal.TimeSeries
for _, r := range qMetrics {
ts := rr.toTimeSeries(r, rr.lastExecTime)
h := hashTimeSeries(ts)
if _, ok := duplicates[h]; ok {
rr.lastExecError = errDuplicate
return nil, errDuplicate
for _, s := range series {
ts := rr.toTimeSeries(s)
key := stringifyLabels(ts)
if _, ok := duplicates[key]; ok {
return nil, fmt.Errorf("original metric %v; resulting labels %q: %w", s.Labels, key, errDuplicate)
}
duplicates[h] = ts
duplicates[key] = struct{}{}
tss = append(tss, ts)
}
return tss, nil
}
func hashTimeSeries(ts prompbmarshal.TimeSeries) uint64 {
hash := fnv.New64a()
labels := ts.Labels
sort.Slice(labels, func(i, j int) bool {
return labels[i].Name < labels[j].Name
})
for _, l := range labels {
hash.Write([]byte(l.Name))
hash.Write([]byte(l.Value))
hash.Write([]byte("\xff"))
// Exec executes RecordingRule expression via the given Querier.
func (rr *RecordingRule) Exec(ctx context.Context) ([]prompbmarshal.TimeSeries, error) {
qMetrics, err := rr.q.Query(ctx, rr.Expr)
rr.mu.Lock()
defer rr.mu.Unlock()
rr.lastExecTime = time.Now()
rr.lastExecError = err
rr.lastExecSamples = len(qMetrics)
if err != nil {
return nil, fmt.Errorf("failed to execute query %q: %w", rr.Expr, err)
}
return hash.Sum64()
duplicates := make(map[string]struct{}, len(qMetrics))
var tss []prompbmarshal.TimeSeries
for _, r := range qMetrics {
ts := rr.toTimeSeries(r)
key := stringifyLabels(ts)
if _, ok := duplicates[key]; ok {
rr.lastExecError = errDuplicate
return nil, fmt.Errorf("original metric %v; resulting labels %q: %w", r, key, errDuplicate)
}
duplicates[key] = struct{}{}
tss = append(tss, ts)
}
return tss, nil
}
func (rr *RecordingRule) toTimeSeries(m datasource.Metric, timestamp time.Time) prompbmarshal.TimeSeries {
func stringifyLabels(ts prompbmarshal.TimeSeries) string {
labels := ts.Labels
if len(labels) > 1 {
sort.Slice(labels, func(i, j int) bool {
return labels[i].Name < labels[j].Name
})
}
b := strings.Builder{}
for i, l := range labels {
b.WriteString(l.Name)
b.WriteString("=")
b.WriteString(l.Value)
if i != len(labels)-1 {
b.WriteString(",")
}
}
return b.String()
}
func (rr *RecordingRule) toTimeSeries(m datasource.Metric) prompbmarshal.TimeSeries {
labels := make(map[string]string)
for _, l := range m.Labels {
labels[l.Name] = l.Value
@@ -136,12 +178,10 @@ func (rr *RecordingRule) toTimeSeries(m datasource.Metric, timestamp time.Time)
for k, v := range rr.Labels {
labels[k] = v
}
return newTimeSeries(m.Value, labels, timestamp)
return newTimeSeries(m.Values, m.Timestamps, labels)
}
// UpdateWith copies all significant fields.
// alerts state isn't copied since
// it should be updated in next 2 Execs
func (rr *RecordingRule) UpdateWith(r Rule) error {
nr, ok := r.(*RecordingRule)
if !ok {
@@ -149,6 +189,7 @@ func (rr *RecordingRule) UpdateWith(r Rule) error {
}
rr.Expr = nr.Expr
rr.Labels = nr.Labels
rr.q = nr.q
return nil
}
@@ -161,13 +202,14 @@ func (rr *RecordingRule) RuleAPI() APIRecordingRule {
}
return APIRecordingRule{
// encode as strings to avoid rounding
ID: fmt.Sprintf("%d", rr.ID()),
GroupID: fmt.Sprintf("%d", rr.GroupID),
Name: rr.Name,
Type: rr.Type.String(),
Expression: rr.Expr,
LastError: lastErr,
LastExec: rr.lastExecTime,
Labels: rr.Labels,
ID: fmt.Sprintf("%d", rr.ID()),
GroupID: fmt.Sprintf("%d", rr.GroupID),
Name: rr.Name,
Type: rr.Type.String(),
Expression: rr.Expr,
LastError: lastErr,
LastSamples: rr.lastExecSamples,
LastExec: rr.lastExecTime,
Labels: rr.Labels,
}
}

View File

@@ -11,7 +11,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
func TestRecoridngRule_ToTimeSeries(t *testing.T) {
func TestRecoridngRule_Exec(t *testing.T) {
timestamp := time.Now()
testCases := []struct {
rule *RecordingRule
@@ -24,9 +24,9 @@ func TestRecoridngRule_ToTimeSeries(t *testing.T) {
"__name__", "bar",
)},
[]prompbmarshal.TimeSeries{
newTimeSeries(10, map[string]string{
newTimeSeries([]float64{10}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "foo",
}, timestamp),
}),
},
},
{
@@ -37,18 +37,18 @@ func TestRecoridngRule_ToTimeSeries(t *testing.T) {
metricWithValueAndLabels(t, 3, "__name__", "baz", "job", "baz"),
},
[]prompbmarshal.TimeSeries{
newTimeSeries(1, map[string]string{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "foobarbaz",
"job": "foo",
}, timestamp),
newTimeSeries(2, map[string]string{
}),
newTimeSeries([]float64{2}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "foobarbaz",
"job": "bar",
}, timestamp),
newTimeSeries(3, map[string]string{
}),
newTimeSeries([]float64{3}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "foobarbaz",
"job": "baz",
}, timestamp),
}),
},
},
{
@@ -59,16 +59,16 @@ func TestRecoridngRule_ToTimeSeries(t *testing.T) {
metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "foo"),
metricWithValueAndLabels(t, 1, "__name__", "bar", "job", "bar")},
[]prompbmarshal.TimeSeries{
newTimeSeries(2, map[string]string{
newTimeSeries([]float64{2}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "job:foo",
"job": "foo",
"source": "test",
}, timestamp),
newTimeSeries(1, map[string]string{
}),
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "job:foo",
"job": "bar",
"source": "test",
}, timestamp),
}),
},
},
}
@@ -76,7 +76,8 @@ func TestRecoridngRule_ToTimeSeries(t *testing.T) {
t.Run(tc.rule.Name, func(t *testing.T) {
fq := &fakeQuerier{}
fq.add(tc.metrics...)
tss, err := tc.rule.Exec(context.TODO(), fq, true)
tc.rule.q = fq
tss, err := tc.rule.Exec(context.TODO())
if err != nil {
t.Fatalf("unexpected Exec err: %s", err)
}
@@ -87,7 +88,88 @@ func TestRecoridngRule_ToTimeSeries(t *testing.T) {
}
}
func TestRecoridngRule_ToTimeSeriesNegative(t *testing.T) {
func TestRecoridngRule_ExecRange(t *testing.T) {
timestamp := time.Now()
testCases := []struct {
rule *RecordingRule
metrics []datasource.Metric
expTS []prompbmarshal.TimeSeries
}{
{
&RecordingRule{Name: "foo"},
[]datasource.Metric{metricWithValuesAndLabels(t, []float64{10, 20, 30},
"__name__", "bar",
)},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{10, 20, 30},
[]int64{timestamp.UnixNano(), timestamp.UnixNano(), timestamp.UnixNano()},
map[string]string{
"__name__": "foo",
}),
},
},
{
&RecordingRule{Name: "foobarbaz"},
[]datasource.Metric{
metricWithValuesAndLabels(t, []float64{1}, "__name__", "foo", "job", "foo"),
metricWithValuesAndLabels(t, []float64{2, 3}, "__name__", "bar", "job", "bar"),
metricWithValuesAndLabels(t, []float64{4, 5, 6}, "__name__", "baz", "job", "baz"),
},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "foobarbaz",
"job": "foo",
}),
newTimeSeries([]float64{2, 3}, []int64{timestamp.UnixNano(), timestamp.UnixNano()}, map[string]string{
"__name__": "foobarbaz",
"job": "bar",
}),
newTimeSeries([]float64{4, 5, 6},
[]int64{timestamp.UnixNano(), timestamp.UnixNano(), timestamp.UnixNano()},
map[string]string{
"__name__": "foobarbaz",
"job": "baz",
}),
},
},
{
&RecordingRule{Name: "job:foo", Labels: map[string]string{
"source": "test",
}},
[]datasource.Metric{
metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "foo"),
metricWithValueAndLabels(t, 1, "__name__", "bar", "job", "bar")},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{2}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "job:foo",
"job": "foo",
"source": "test",
}),
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": "job:foo",
"job": "bar",
"source": "test",
}),
},
},
}
for _, tc := range testCases {
t.Run(tc.rule.Name, func(t *testing.T) {
fq := &fakeQuerier{}
fq.add(tc.metrics...)
tc.rule.q = fq
tss, err := tc.rule.ExecRange(context.TODO(), time.Now(), time.Now())
if err != nil {
t.Fatalf("unexpected Exec err: %s", err)
}
if err := compareTimeSeries(t, tc.expTS, tss); err != nil {
t.Fatalf("timeseries missmatch: %s", err)
}
})
}
}
func TestRecoridngRule_ExecNegative(t *testing.T) {
rr := &RecordingRule{Name: "job:foo", Labels: map[string]string{
"job": "test",
}}
@@ -95,8 +177,8 @@ func TestRecoridngRule_ToTimeSeriesNegative(t *testing.T) {
fq := &fakeQuerier{}
expErr := "connection reset by peer"
fq.setErr(errors.New(expErr))
_, err := rr.Exec(context.TODO(), fq, true)
rr.q = fq
_, err := rr.Exec(context.TODO())
if err == nil {
t.Fatalf("expected to get err; got nil")
}
@@ -111,7 +193,7 @@ func TestRecoridngRule_ToTimeSeriesNegative(t *testing.T) {
fq.add(metricWithValueAndLabels(t, 1, "__name__", "foo", "job", "foo"))
fq.add(metricWithValueAndLabels(t, 2, "__name__", "foo", "job", "bar"))
_, err = rr.Exec(context.TODO(), fq, true)
_, err = rr.Exec(context.TODO())
if err == nil {
t.Fatalf("expected to get err; got nil")
}

View File

@@ -10,11 +10,15 @@ import (
)
var (
addr = flag.String("remoteRead.url", "", "Optional URL to Victoria Metrics or VMSelect that will be used to restore alerts"+
" state. This configuration makes sense only if `vmalert` was configured with `remoteWrite.url` before and has been successfully persisted its state."+
" E.g. http://127.0.0.1:8428")
addr = flag.String("remoteRead.url", "", "Optional URL to VictoriaMetrics or vmselect that will be used to restore alerts "+
"state. This configuration makes sense only if `vmalert` was configured with `remoteWrite.url` before and has been successfully persisted its state. "+
"E.g. http://127.0.0.1:8428. See also -remoteRead.disablePathAppend")
basicAuthUsername = flag.String("remoteRead.basicAuth.username", "", "Optional basic auth username for -remoteRead.url")
basicAuthPassword = flag.String("remoteRead.basicAuth.password", "", "Optional basic auth password for -remoteRead.url")
basicAuthPasswordFile = flag.String("remoteRead.basicAuth.passwordFile", "", "Optional path to basic auth password to use for -remoteRead.url")
bearerToken = flag.String("remoteRead.bearerToken", "", "Optional bearer auth token to use for -remoteRead.url.")
bearerTokenFile = flag.String("remoteRead.bearerTokenFile", "", "Optional path to bearer token file to use for -remoteRead.url.")
tlsInsecureSkipVerify = flag.Bool("remoteRead.tlsInsecureSkipVerify", false, "Whether to skip tls verification when connecting to -remoteRead.url")
tlsCertFile = flag.String("remoteRead.tlsCertFile", "", "Optional path to client-side TLS certificate file to use when connecting to -remoteRead.url")
tlsKeyFile = flag.String("remoteRead.tlsKeyFile", "", "Optional path to client-side TLS certificate key to use when connecting to -remoteRead.url")
@@ -22,11 +26,12 @@ var (
"By default system CA is used")
tlsServerName = flag.String("remoteRead.tlsServerName", "", "Optional TLS server name to use for connections to -remoteRead.url. "+
"By default the server name from -remoteRead.url is used")
disablePathAppend = flag.Bool("remoteRead.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/query' path to the configured -remoteRead.url.")
)
// Init creates a Querier from provided flag values.
// Returns nil if addr flag wasn't set.
func Init() (datasource.Querier, error) {
func Init() (datasource.QuerierBuilder, error) {
if *addr == "" {
return nil, nil
}
@@ -34,6 +39,10 @@ func Init() (datasource.Querier, error) {
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
authCfg, err := utils.AuthConfig(*basicAuthUsername, *basicAuthPassword, *basicAuthPasswordFile, *bearerToken, *bearerTokenFile)
if err != nil {
return nil, fmt.Errorf("failed to configure auth: %w", err)
}
c := &http.Client{Transport: tr}
return datasource.NewVMStorage(*addr, *basicAuthUsername, *basicAuthPassword, 0, 0, false, c), nil
return datasource.NewVMStorage(*addr, authCfg, 0, 0, false, c, *disablePathAppend), nil
}

View File

@@ -10,10 +10,14 @@ import (
)
var (
addr = flag.String("remoteWrite.url", "", "Optional URL to Victoria Metrics or VMInsert where to persist alerts state"+
" and recording rules results in form of timeseries. E.g. http://127.0.0.1:8428")
basicAuthUsername = flag.String("remoteWrite.basicAuth.username", "", "Optional basic auth username for -remoteWrite.url")
basicAuthPassword = flag.String("remoteWrite.basicAuth.password", "", "Optional basic auth password for -remoteWrite.url")
addr = flag.String("remoteWrite.url", "", "Optional URL to VictoriaMetrics or vminsert where to persist alerts state "+
"and recording rules results in form of timeseries. For example, if -remoteWrite.url=http://127.0.0.1:8428 is specified, "+
"then the alerts state will be written to http://127.0.0.1:8428/api/v1/write . See also -remoteWrite.disablePathAppend")
basicAuthUsername = flag.String("remoteWrite.basicAuth.username", "", "Optional basic auth username for -remoteWrite.url")
basicAuthPassword = flag.String("remoteWrite.basicAuth.password", "", "Optional basic auth password for -remoteWrite.url")
basicAuthPasswordFile = flag.String("remoteWrite.basicAuth.passwordFile", "", "Optional path to basic auth password to use for -remoteWrite.url")
bearerToken = flag.String("remoteWrite.bearerToken", "", "Optional bearer auth token to use for -remoteWrite.url.")
bearerTokenFile = flag.String("remoteWrite.bearerTokenFile", "", "Optional path to bearer token file to use for -remoteWrite.url.")
maxQueueSize = flag.Int("remoteWrite.maxQueueSize", 1e5, "Defines the max number of pending datapoints to remote write endpoint")
maxBatchSize = flag.Int("remoteWrite.maxBatchSize", 1e3, "Defines defines max number of timeseries to be flushed at once")
@@ -27,6 +31,7 @@ var (
"By default system CA is used")
tlsServerName = flag.String("remoteWrite.tlsServerName", "", "Optional TLS server name to use for connections to -remoteWrite.url. "+
"By default the server name from -remoteWrite.url is used")
disablePathAppend = flag.Bool("remoteWrite.disablePathAppend", false, "Whether to disable automatic appending of '/api/v1/write' path to the configured -remoteWrite.url.")
)
// Init creates Client object from given flags.
@@ -41,14 +46,19 @@ func Init(ctx context.Context) (*Client, error) {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
authCfg, err := utils.AuthConfig(*basicAuthUsername, *basicAuthPassword, *basicAuthPasswordFile, *bearerToken, *bearerTokenFile)
if err != nil {
return nil, fmt.Errorf("failed to configure auth: %w", err)
}
return NewClient(ctx, Config{
Addr: *addr,
Concurrency: *concurrency,
MaxQueueSize: *maxQueueSize,
MaxBatchSize: *maxBatchSize,
FlushInterval: *flushInterval,
BasicAuthUser: *basicAuthUsername,
BasicAuthPass: *basicAuthPassword,
Transport: t,
Addr: *addr,
AuthCfg: authCfg,
Concurrency: *concurrency,
MaxQueueSize: *maxQueueSize,
MaxBatchSize: *maxBatchSize,
FlushInterval: *flushInterval,
DisablePathAppend: *disablePathAppend,
Transport: t,
})
}

View File

@@ -6,26 +6,30 @@ import (
"fmt"
"io/ioutil"
"net/http"
"path"
"strings"
"sync"
"time"
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
// Client is an asynchronous HTTP client for writing
// timeseries via remote write protocol.
type Client struct {
addr string
c *http.Client
input chan prompbmarshal.TimeSeries
baUser, baPass string
flushInterval time.Duration
maxBatchSize int
maxQueueSize int
addr string
c *http.Client
authCfg *promauth.Config
input chan prompbmarshal.TimeSeries
flushInterval time.Duration
maxBatchSize int
maxQueueSize int
disablePathAppend bool
wg sync.WaitGroup
doneCh chan struct{}
@@ -34,10 +38,8 @@ type Client struct {
// Config is config for remote write.
type Config struct {
// Addr of remote storage
Addr string
BasicAuthUser string
BasicAuthPass string
Addr string
AuthCfg *promauth.Config
// Concurrency defines number of readers that
// concurrently read from the queue and flush data
@@ -56,6 +58,8 @@ type Config struct {
WriteTimeout time.Duration
// Transport will be used by the underlying http.Client
Transport *http.Transport
// DisablePathAppend can be used to not automatically append '/api/v1/write' to the remote write url
DisablePathAppend bool
}
const (
@@ -89,24 +93,25 @@ func NewClient(ctx context.Context, cfg Config) (*Client, error) {
if cfg.Transport == nil {
cfg.Transport = http.DefaultTransport.(*http.Transport).Clone()
}
cc := defaultConcurrency
if cfg.Concurrency > 0 {
cc = cfg.Concurrency
}
c := &Client{
c: &http.Client{
Timeout: cfg.WriteTimeout,
Transport: cfg.Transport,
},
addr: strings.TrimSuffix(cfg.Addr, "/"),
baUser: cfg.BasicAuthUser,
baPass: cfg.BasicAuthPass,
flushInterval: cfg.FlushInterval,
maxBatchSize: cfg.MaxBatchSize,
maxQueueSize: cfg.MaxQueueSize,
doneCh: make(chan struct{}),
input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize),
}
cc := defaultConcurrency
if cfg.Concurrency > 0 {
cc = cfg.Concurrency
addr: strings.TrimSuffix(cfg.Addr, "/"),
authCfg: cfg.AuthCfg,
flushInterval: cfg.FlushInterval,
maxBatchSize: cfg.MaxBatchSize,
maxQueueSize: cfg.MaxQueueSize,
doneCh: make(chan struct{}),
input: make(chan prompbmarshal.TimeSeries, cfg.MaxQueueSize),
disablePathAppend: cfg.DisablePathAppend,
}
for i := 0; i < cc; i++ {
c.run(ctx)
}
@@ -179,10 +184,11 @@ func (c *Client) run(ctx context.Context) {
}
var (
sentRows = metrics.NewCounter(`vmalert_remotewrite_sent_rows_total`)
sentBytes = metrics.NewCounter(`vmalert_remotewrite_sent_bytes_total`)
droppedRows = metrics.NewCounter(`vmalert_remotewrite_dropped_rows_total`)
droppedBytes = metrics.NewCounter(`vmalert_remotewrite_dropped_bytes_total`)
sentRows = metrics.NewCounter(`vmalert_remotewrite_sent_rows_total`)
sentBytes = metrics.NewCounter(`vmalert_remotewrite_sent_bytes_total`)
droppedRows = metrics.NewCounter(`vmalert_remotewrite_dropped_rows_total`)
droppedBytes = metrics.NewCounter(`vmalert_remotewrite_dropped_bytes_total`)
bufferFlushDuration = metrics.NewHistogram(`vmalert_remotewrite_flush_duration_seconds`)
)
// flush is a blocking function that marshals WriteRequest and sends
@@ -193,6 +199,7 @@ func (c *Client) flush(ctx context.Context, wr *prompbmarshal.WriteRequest) {
return
}
defer prompbmarshal.ResetWriteRequest(wr)
defer bufferFlushDuration.UpdateDuration(time.Now())
data, err := wr.Marshal()
if err != nil {
@@ -228,20 +235,24 @@ func (c *Client) send(ctx context.Context, data []byte) error {
if err != nil {
return fmt.Errorf("failed to create new HTTP request: %w", err)
}
if c.baPass != "" {
req.SetBasicAuth(c.baUser, c.baPass)
if c.authCfg != nil {
if auth := c.authCfg.GetAuthHeader(); auth != "" {
req.Header.Set("Authorization", auth)
}
}
if !c.disablePathAppend {
req.URL.Path = path.Join(req.URL.Path, writePath)
}
req.URL.Path += writePath
resp, err := c.c.Do(req.WithContext(ctx))
if err != nil {
return fmt.Errorf("error while sending request to %s: %w; Data len %d(%d)",
req.URL, err, len(data), r.Size())
req.URL.Redacted(), err, len(data), r.Size())
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode != http.StatusNoContent {
if resp.StatusCode != http.StatusNoContent && resp.StatusCode != http.StatusOK {
body, _ := ioutil.ReadAll(resp.Body)
return fmt.Errorf("unexpected response code %d for %s. Response body %q",
resp.StatusCode, req.URL, body)
resp.StatusCode, req.URL.Redacted(), body)
}
return nil
}

160
app/vmalert/replay.go Normal file
View File

@@ -0,0 +1,160 @@
package main
import (
"context"
"flag"
"fmt"
"strings"
"time"
"github.com/cheggaaa/pb/v3"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
var (
replayFrom = flag.String("replay.timeFrom", "",
"The time filter in RFC3339 format to select time series with timestamp equal or higher than provided value. E.g. '2020-01-01T20:07:00Z'")
replayTo = flag.String("replay.timeTo", "",
"The time filter in RFC3339 format to select timeseries with timestamp equal or lower than provided value. E.g. '2020-01-01T20:07:00Z'")
replayRulesDelay = flag.Duration("replay.rulesDelay", time.Second,
"Delay between rules evaluation within the group. Could be important if there are chained rules inside of the group"+
"and processing need to wait for previous rule results to be persisted by remote storage before evaluating the next rule."+
"Keep it equal or bigger than -remoteWrite.flushInterval.")
replayMaxDatapoints = flag.Int("replay.maxDatapointsPerQuery", 1e3,
"Max number of data points expected in one request. The higher the value, the less requests will be made during replay.")
replayRuleRetryAttempts = flag.Int("replay.ruleRetryAttempts", 5,
"Defines how many retries to make before giving up on rule if request for it returns an error.")
)
func replay(groupsCfg []config.Group, qb datasource.QuerierBuilder, rw *remotewrite.Client) error {
if *replayMaxDatapoints < 1 {
return fmt.Errorf("replay.maxDatapointsPerQuery can't be lower than 1")
}
tFrom, err := time.Parse(time.RFC3339, *replayFrom)
if err != nil {
return fmt.Errorf("failed to parse %q: %s", *replayFrom, err)
}
tTo, err := time.Parse(time.RFC3339, *replayTo)
if err != nil {
return fmt.Errorf("failed to parse %q: %s", *replayTo, err)
}
if !tTo.After(tFrom) {
return fmt.Errorf("replay.timeTo must be bigger than replay.timeFrom")
}
labels := make(map[string]string)
for _, s := range *externalLabels {
if len(s) == 0 {
continue
}
n := strings.IndexByte(s, '=')
if n < 0 {
return fmt.Errorf("missing '=' in `-label`. It must contain label in the form `name=value`; got %q", s)
}
labels[s[:n]] = s[n+1:]
}
fmt.Printf("Replay mode:"+
"\nfrom: \t%v "+
"\nto: \t%v "+
"\nmax data points per request: %d\n",
tFrom, tTo, *replayMaxDatapoints)
var total int
for _, cfg := range groupsCfg {
ng := newGroup(cfg, qb, *evaluationInterval, labels)
total += ng.replay(tFrom, tTo, rw)
}
logger.Infof("replay finished! Imported %d samples", total)
if rw != nil {
return rw.Close()
}
return nil
}
func (g *Group) replay(start, end time.Time, rw *remotewrite.Client) int {
var total int
step := g.Interval * time.Duration(*replayMaxDatapoints)
ri := rangeIterator{start: start, end: end, step: step}
iterations := int(end.Sub(start)/step) + 1
fmt.Printf("\nGroup %q"+
"\ninterval: \t%v"+
"\nrequests to make: \t%d"+
"\nmax range per request: \t%v\n",
g.Name, g.Interval, iterations, step)
for _, rule := range g.Rules {
fmt.Printf("> Rule %q (ID: %d)\n", rule, rule.ID())
bar := pb.StartNew(iterations)
ri.reset()
for ri.next() {
n, err := replayRule(rule, ri.s, ri.e, rw)
if err != nil {
logger.Fatalf("rule %q: %s", rule, err)
}
total += n
bar.Increment()
}
bar.Finish()
// sleep to let remote storage to flush data on-disk
// so chained rules could be calculated correctly
time.Sleep(*replayRulesDelay)
}
return total
}
func replayRule(rule Rule, start, end time.Time, rw *remotewrite.Client) (int, error) {
var err error
var tss []prompbmarshal.TimeSeries
for i := 0; i < *replayRuleRetryAttempts; i++ {
tss, err = rule.ExecRange(context.Background(), start, end)
if err == nil {
break
}
logger.Errorf("attempt %d to execute rule %q failed: %s", i+1, rule, err)
time.Sleep(time.Second)
}
if err != nil { // means all attempts failed
return 0, err
}
if len(tss) < 1 {
return 0, nil
}
var n int
for _, ts := range tss {
if err := rw.Push(ts); err != nil {
return n, fmt.Errorf("remote write failure: %s", err)
}
n += len(ts.Samples)
}
return n, nil
}
type rangeIterator struct {
step time.Duration
start, end time.Time
iter int
s, e time.Time
}
func (ri *rangeIterator) reset() {
ri.iter = 0
ri.s, ri.e = time.Time{}, time.Time{}
}
func (ri *rangeIterator) next() bool {
ri.s = ri.start.Add(ri.step * time.Duration(ri.iter))
if !ri.end.After(ri.s) {
return false
}
ri.e = ri.s.Add(ri.step)
if ri.e.After(ri.end) {
ri.e = ri.end
}
ri.iter++
return true
}

250
app/vmalert/replay_test.go Normal file
View File

@@ -0,0 +1,250 @@
package main
import (
"context"
"fmt"
"testing"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/config"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)
type fakeReplayQuerier struct {
fakeQuerier
registry map[string]map[string]struct{}
}
func (fr *fakeReplayQuerier) BuildWithParams(_ datasource.QuerierParams) datasource.Querier {
return fr
}
func (fr *fakeReplayQuerier) QueryRange(_ context.Context, q string, from, to time.Time) ([]datasource.Metric, error) {
key := fmt.Sprintf("%s+%s", from.Format("15:04:05"), to.Format("15:04:05"))
dps, ok := fr.registry[q]
if !ok {
return nil, fmt.Errorf("unexpected query received: %q", q)
}
_, ok = dps[key]
if !ok {
return nil, fmt.Errorf("unexpected time range received: %q", key)
}
delete(dps, key)
if len(fr.registry[q]) < 1 {
delete(fr.registry, q)
}
return nil, nil
}
func TestReplay(t *testing.T) {
testCases := []struct {
name string
from, to string
maxDP int
cfg []config.Group
qb *fakeReplayQuerier
}{
{
name: "one rule + one response",
from: "2021-01-01T12:00:00.000Z",
to: "2021-01-01T12:02:00.000Z",
maxDP: 10,
cfg: []config.Group{
{Rules: []config.Rule{{Record: "foo", Expr: "sum(up)"}}},
},
qb: &fakeReplayQuerier{
registry: map[string]map[string]struct{}{
"sum(up)": {"12:00:00+12:02:00": {}},
},
},
},
{
name: "one rule + multiple responses",
from: "2021-01-01T12:00:00.000Z",
to: "2021-01-01T12:02:30.000Z",
maxDP: 1,
cfg: []config.Group{
{Rules: []config.Rule{{Record: "foo", Expr: "sum(up)"}}},
},
qb: &fakeReplayQuerier{
registry: map[string]map[string]struct{}{
"sum(up)": {
"12:00:00+12:01:00": {},
"12:01:00+12:02:00": {},
"12:02:00+12:02:30": {},
},
},
},
},
{
name: "datapoints per step",
from: "2021-01-01T12:00:00.000Z",
to: "2021-01-01T15:02:30.000Z",
maxDP: 60,
cfg: []config.Group{
{Interval: utils.NewPromDuration(time.Minute), Rules: []config.Rule{{Record: "foo", Expr: "sum(up)"}}},
},
qb: &fakeReplayQuerier{
registry: map[string]map[string]struct{}{
"sum(up)": {
"12:00:00+13:00:00": {},
"13:00:00+14:00:00": {},
"14:00:00+15:00:00": {},
"15:00:00+15:02:30": {},
},
},
},
},
{
name: "multiple recording rules + multiple responses",
from: "2021-01-01T12:00:00.000Z",
to: "2021-01-01T12:02:30.000Z",
maxDP: 1,
cfg: []config.Group{
{Rules: []config.Rule{{Record: "foo", Expr: "sum(up)"}}},
{Rules: []config.Rule{{Record: "bar", Expr: "max(up)"}}},
},
qb: &fakeReplayQuerier{
registry: map[string]map[string]struct{}{
"sum(up)": {
"12:00:00+12:01:00": {},
"12:01:00+12:02:00": {},
"12:02:00+12:02:30": {},
},
"max(up)": {
"12:00:00+12:01:00": {},
"12:01:00+12:02:00": {},
"12:02:00+12:02:30": {},
},
},
},
},
{
name: "multiple alerting rules + multiple responses",
from: "2021-01-01T12:00:00.000Z",
to: "2021-01-01T12:02:30.000Z",
maxDP: 1,
cfg: []config.Group{
{Rules: []config.Rule{{Alert: "foo", Expr: "sum(up) > 1"}}},
{Rules: []config.Rule{{Alert: "bar", Expr: "max(up) < 1"}}},
},
qb: &fakeReplayQuerier{
registry: map[string]map[string]struct{}{
"sum(up) > 1": {
"12:00:00+12:01:00": {},
"12:01:00+12:02:00": {},
"12:02:00+12:02:30": {},
},
"max(up) < 1": {
"12:00:00+12:01:00": {},
"12:01:00+12:02:00": {},
"12:02:00+12:02:30": {},
},
},
},
},
}
from, to, maxDP := *replayFrom, *replayTo, *replayMaxDatapoints
retries, delay := *replayRuleRetryAttempts, *replayRulesDelay
defer func() {
*replayFrom, *replayTo = from, to
*replayMaxDatapoints, *replayRuleRetryAttempts = maxDP, retries
*replayRulesDelay = delay
}()
*replayRuleRetryAttempts = 1
*replayRulesDelay = time.Millisecond
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
*replayFrom = tc.from
*replayTo = tc.to
*replayMaxDatapoints = tc.maxDP
if err := replay(tc.cfg, tc.qb, nil); err != nil {
t.Fatalf("replay failed: %s", err)
}
if len(tc.qb.registry) > 0 {
t.Fatalf("not all requests were sent: %#v", tc.qb.registry)
}
})
}
}
func TestRangeIterator(t *testing.T) {
testCases := []struct {
ri rangeIterator
result [][2]time.Time
}{
{
ri: rangeIterator{
start: parseTime(t, "2021-01-01T12:00:00.000Z"),
end: parseTime(t, "2021-01-01T12:30:00.000Z"),
step: 5 * time.Minute,
},
result: [][2]time.Time{
{parseTime(t, "2021-01-01T12:00:00.000Z"), parseTime(t, "2021-01-01T12:05:00.000Z")},
{parseTime(t, "2021-01-01T12:05:00.000Z"), parseTime(t, "2021-01-01T12:10:00.000Z")},
{parseTime(t, "2021-01-01T12:10:00.000Z"), parseTime(t, "2021-01-01T12:15:00.000Z")},
{parseTime(t, "2021-01-01T12:15:00.000Z"), parseTime(t, "2021-01-01T12:20:00.000Z")},
{parseTime(t, "2021-01-01T12:20:00.000Z"), parseTime(t, "2021-01-01T12:25:00.000Z")},
{parseTime(t, "2021-01-01T12:25:00.000Z"), parseTime(t, "2021-01-01T12:30:00.000Z")},
},
},
{
ri: rangeIterator{
start: parseTime(t, "2021-01-01T12:00:00.000Z"),
end: parseTime(t, "2021-01-01T12:30:00.000Z"),
step: 45 * time.Minute,
},
result: [][2]time.Time{
{parseTime(t, "2021-01-01T12:00:00.000Z"), parseTime(t, "2021-01-01T12:30:00.000Z")},
{parseTime(t, "2021-01-01T12:30:00.000Z"), parseTime(t, "2021-01-01T12:30:00.000Z")},
},
},
{
ri: rangeIterator{
start: parseTime(t, "2021-01-01T12:00:12.000Z"),
end: parseTime(t, "2021-01-01T12:00:17.000Z"),
step: time.Second,
},
result: [][2]time.Time{
{parseTime(t, "2021-01-01T12:00:12.000Z"), parseTime(t, "2021-01-01T12:00:13.000Z")},
{parseTime(t, "2021-01-01T12:00:13.000Z"), parseTime(t, "2021-01-01T12:00:14.000Z")},
{parseTime(t, "2021-01-01T12:00:14.000Z"), parseTime(t, "2021-01-01T12:00:15.000Z")},
{parseTime(t, "2021-01-01T12:00:15.000Z"), parseTime(t, "2021-01-01T12:00:16.000Z")},
{parseTime(t, "2021-01-01T12:00:16.000Z"), parseTime(t, "2021-01-01T12:00:17.000Z")},
},
},
}
for i, tc := range testCases {
t.Run(fmt.Sprintf("case %d", i), func(t *testing.T) {
var j int
for tc.ri.next() {
if len(tc.result) < j+1 {
t.Fatalf("unexpected result for iterator on step %d: %v - %v",
j, tc.ri.s, tc.ri.e)
}
s, e := tc.ri.s, tc.ri.e
expS, expE := tc.result[j][0], tc.result[j][1]
if s != expS {
t.Fatalf("expected to get start=%v; got %v", expS, s)
}
if e != expE {
t.Fatalf("expected to get end=%v; got %v", expE, e)
}
j++
}
})
}
}
func parseTime(t *testing.T, s string) time.Time {
t.Helper()
tt, err := time.Parse("2006-01-02T15:04:05.000Z", s)
if err != nil {
t.Fatal(err)
}
return tt
}

View File

@@ -3,22 +3,21 @@ package main
import (
"context"
"errors"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"time"
)
// Rule represents alerting or recording rule
// that has unique ID, can be Executed and
// updated with other Rule.
type Rule interface {
// Returns unique ID that may be used for
// ID returns unique ID that may be used for
// identifying this Rule among others.
ID() uint64
// Exec executes the rule with given context
// and Querier. If returnSeries is true, Exec
// may return TimeSeries as result of execution
Exec(ctx context.Context, q datasource.Querier, returnSeries bool) ([]prompbmarshal.TimeSeries, error)
Exec(ctx context.Context) ([]prompbmarshal.TimeSeries, error)
// ExecRange executes the rule on the given time range
ExecRange(ctx context.Context, start, end time.Time) ([]prompbmarshal.TimeSeries, error)
// UpdateWith performs modification of current Rule
// with fields of the given Rule.
UpdateWith(Rule) error

View File

@@ -0,0 +1,36 @@
{% func Footer() %}
</main>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/js/bootstrap.bundle.min.js" integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM" crossorigin="anonymous"></script>
<script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
<script type="text/javascript">
function expandAll() {
$('.collapse').addClass('show');
}
function collapseAll() {
$('.collapse').removeClass('show');
}
$(document).ready(function() {
// prevent collapse logic on link click
$(".group-heading a").click(function(e) {
e.stopPropagation();
});
$(".group-heading").click(function(e) {
let target = $(this).attr('data-bs-target');
let el = $('#'+target);
new bootstrap.Collapse(el, {
toggle: true
});
});
var hash = window.location.hash.substr(1);
let group = $('#'+hash);
if (group.length > 0) {
group.click();
}
});
</script>
</body>
</html>
{% endfunc %}

View File

@@ -0,0 +1,86 @@
// Code generated by qtc from "footer.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmalert/tpl/footer.qtpl:1
package tpl
//line app/vmalert/tpl/footer.qtpl:1
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmalert/tpl/footer.qtpl:1
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmalert/tpl/footer.qtpl:1
func StreamFooter(qw422016 *qt422016.Writer) {
//line app/vmalert/tpl/footer.qtpl:1
qw422016.N().S(`
</main>
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/js/bootstrap.bundle.min.js" integrity="sha384-MrcW6ZMFYlzcLA8Nl+NtUVF0sA7MsXsP1UyJoMp4YLEuNSfAP+JcXn/tWtIaxVXM" crossorigin="anonymous"></script>
<script src="https://code.jquery.com/jquery-3.3.1.min.js"></script>
<script type="text/javascript">
function expandAll() {
$('.collapse').addClass('show');
}
function collapseAll() {
$('.collapse').removeClass('show');
}
$(document).ready(function() {
// prevent collapse logic on link click
$(".group-heading a").click(function(e) {
e.stopPropagation();
});
$(".group-heading").click(function(e) {
let target = $(this).attr('data-bs-target');
let el = $('#'+target);
new bootstrap.Collapse(el, {
toggle: true
});
});
var hash = window.location.hash.substr(1);
let group = $('#'+hash);
if (group.length > 0) {
group.click();
}
});
</script>
</body>
</html>
`)
//line app/vmalert/tpl/footer.qtpl:36
}
//line app/vmalert/tpl/footer.qtpl:36
func WriteFooter(qq422016 qtio422016.Writer) {
//line app/vmalert/tpl/footer.qtpl:36
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/tpl/footer.qtpl:36
StreamFooter(qw422016)
//line app/vmalert/tpl/footer.qtpl:36
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/tpl/footer.qtpl:36
}
//line app/vmalert/tpl/footer.qtpl:36
func Footer() string {
//line app/vmalert/tpl/footer.qtpl:36
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/tpl/footer.qtpl:36
WriteFooter(qb422016)
//line app/vmalert/tpl/footer.qtpl:36
qs422016 := string(qb422016.B)
//line app/vmalert/tpl/footer.qtpl:36
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/tpl/footer.qtpl:36
return qs422016
//line app/vmalert/tpl/footer.qtpl:36
}

View File

@@ -0,0 +1,43 @@
{% func Header(title string, pages []NavItem) %}
<!DOCTYPE html>
<html lang="en">
<head>
<title>vmalert{% if title != "" %} - {%s title %}{% endif %}</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
<style>
body{
min-height: 75rem;
padding-top: 4.5rem;
}
pre {
overflow: scroll;
max-width: 600px;
min-height: 30px;
}
.group-heading {
cursor: pointer;
padding: 5px;
margin-top: 5px;
position: relative;
}
.group-heading .anchor {
position:absolute;
top:-60px;
}
.group-heading span {
float: right;
margin-left: 5px;
margin-right: 5px;
}
.group-heading:hover {
background-color: #f8f9fa!important;
}
.table .error-cell{
word-break: break-word;
}
</style>
</head>
<body>
{%= PrintNavItems(title, pages) %}
<main class="px-2">
{% endfunc %}

View File

@@ -0,0 +1,107 @@
// Code generated by qtc from "header.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmalert/tpl/header.qtpl:1
package tpl
//line app/vmalert/tpl/header.qtpl:1
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmalert/tpl/header.qtpl:1
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmalert/tpl/header.qtpl:1
func StreamHeader(qw422016 *qt422016.Writer, title string, pages []NavItem) {
//line app/vmalert/tpl/header.qtpl:1
qw422016.N().S(`
<!DOCTYPE html>
<html lang="en">
<head>
<title>vmalert`)
//line app/vmalert/tpl/header.qtpl:5
if title != "" {
//line app/vmalert/tpl/header.qtpl:5
qw422016.N().S(` - `)
//line app/vmalert/tpl/header.qtpl:5
qw422016.E().S(title)
//line app/vmalert/tpl/header.qtpl:5
}
//line app/vmalert/tpl/header.qtpl:5
qw422016.N().S(`</title>
<link href="https://cdn.jsdelivr.net/npm/bootstrap@5.0.2/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-EVSTQN3/azprG1Anm3QDgpJLIm9Nao0Yz1ztcQTwFspd3yD65VohhpuuCOmLASjC" crossorigin="anonymous">
<style>
body{
min-height: 75rem;
padding-top: 4.5rem;
}
pre {
overflow: scroll;
max-width: 600px;
min-height: 30px;
}
.group-heading {
cursor: pointer;
padding: 5px;
margin-top: 5px;
position: relative;
}
.group-heading .anchor {
position:absolute;
top:-60px;
}
.group-heading span {
float: right;
margin-left: 5px;
margin-right: 5px;
}
.group-heading:hover {
background-color: #f8f9fa!important;
}
.table .error-cell{
word-break: break-word;
}
</style>
</head>
<body>
`)
//line app/vmalert/tpl/header.qtpl:41
StreamPrintNavItems(qw422016, title, pages)
//line app/vmalert/tpl/header.qtpl:41
qw422016.N().S(`
<main class="px-2">
`)
//line app/vmalert/tpl/header.qtpl:43
}
//line app/vmalert/tpl/header.qtpl:43
func WriteHeader(qq422016 qtio422016.Writer, title string, pages []NavItem) {
//line app/vmalert/tpl/header.qtpl:43
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/tpl/header.qtpl:43
StreamHeader(qw422016, title, pages)
//line app/vmalert/tpl/header.qtpl:43
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/tpl/header.qtpl:43
}
//line app/vmalert/tpl/header.qtpl:43
func Header(title string, pages []NavItem) string {
//line app/vmalert/tpl/header.qtpl:43
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/tpl/header.qtpl:43
WriteHeader(qb422016, title, pages)
//line app/vmalert/tpl/header.qtpl:43
qs422016 := string(qb422016.B)
//line app/vmalert/tpl/header.qtpl:43
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/tpl/header.qtpl:43
return qs422016
//line app/vmalert/tpl/header.qtpl:43
}

25
app/vmalert/tpl/nav.qtpl Normal file
View File

@@ -0,0 +1,25 @@
{% code
type NavItem struct {
Name string
Url string
}
%}
{% func PrintNavItems(current string, items []NavItem) %}
<nav class="navbar navbar-expand-md navbar-dark fixed-top bg-dark">
<div class="container-fluid">
<div class="collapse navbar-collapse" id="navbarCollapse">
<ul class="navbar-nav me-auto mb-2 mb-md-0">
{% for _, item := range items %}
<li class="nav-item">
<a class="nav-link{% if current == item.Name %} active{% endif %}" href="{%s item.Url %}">
{%s item.Name %}
</a>
</li>
{% endfor %}
</ul>
</div>
</nav>
{% endfunc %}

View File

@@ -0,0 +1,96 @@
// Code generated by qtc from "nav.qtpl". DO NOT EDIT.
// See https://github.com/valyala/quicktemplate for details.
//line app/vmalert/tpl/nav.qtpl:1
package tpl
//line app/vmalert/tpl/nav.qtpl:1
import (
qtio422016 "io"
qt422016 "github.com/valyala/quicktemplate"
)
//line app/vmalert/tpl/nav.qtpl:1
var (
_ = qtio422016.Copy
_ = qt422016.AcquireByteBuffer
)
//line app/vmalert/tpl/nav.qtpl:2
type NavItem struct {
Name string
Url string
}
//line app/vmalert/tpl/nav.qtpl:8
func StreamPrintNavItems(qw422016 *qt422016.Writer, current string, items []NavItem) {
//line app/vmalert/tpl/nav.qtpl:8
qw422016.N().S(`
<nav class="navbar navbar-expand-md navbar-dark fixed-top bg-dark">
<div class="container-fluid">
<div class="collapse navbar-collapse" id="navbarCollapse">
<ul class="navbar-nav me-auto mb-2 mb-md-0">
`)
//line app/vmalert/tpl/nav.qtpl:13
for _, item := range items {
//line app/vmalert/tpl/nav.qtpl:13
qw422016.N().S(`
<li class="nav-item">
<a class="nav-link`)
//line app/vmalert/tpl/nav.qtpl:15
if current == item.Name {
//line app/vmalert/tpl/nav.qtpl:15
qw422016.N().S(` active`)
//line app/vmalert/tpl/nav.qtpl:15
}
//line app/vmalert/tpl/nav.qtpl:15
qw422016.N().S(`" href="`)
//line app/vmalert/tpl/nav.qtpl:15
qw422016.E().S(item.Url)
//line app/vmalert/tpl/nav.qtpl:15
qw422016.N().S(`">
`)
//line app/vmalert/tpl/nav.qtpl:16
qw422016.E().S(item.Name)
//line app/vmalert/tpl/nav.qtpl:16
qw422016.N().S(`
</a>
</li>
`)
//line app/vmalert/tpl/nav.qtpl:19
}
//line app/vmalert/tpl/nav.qtpl:19
qw422016.N().S(`
</ul>
</div>
</nav>
`)
//line app/vmalert/tpl/nav.qtpl:23
}
//line app/vmalert/tpl/nav.qtpl:23
func WritePrintNavItems(qq422016 qtio422016.Writer, current string, items []NavItem) {
//line app/vmalert/tpl/nav.qtpl:23
qw422016 := qt422016.AcquireWriter(qq422016)
//line app/vmalert/tpl/nav.qtpl:23
StreamPrintNavItems(qw422016, current, items)
//line app/vmalert/tpl/nav.qtpl:23
qt422016.ReleaseWriter(qw422016)
//line app/vmalert/tpl/nav.qtpl:23
}
//line app/vmalert/tpl/nav.qtpl:23
func PrintNavItems(current string, items []NavItem) string {
//line app/vmalert/tpl/nav.qtpl:23
qb422016 := qt422016.AcquireByteBuffer()
//line app/vmalert/tpl/nav.qtpl:23
WritePrintNavItems(qb422016, current, items)
//line app/vmalert/tpl/nav.qtpl:23
qs422016 := string(qb422016.B)
//line app/vmalert/tpl/nav.qtpl:23
qt422016.ReleaseByteBuffer(qb422016)
//line app/vmalert/tpl/nav.qtpl:23
return qs422016
//line app/vmalert/tpl/nav.qtpl:23
}

View File

@@ -7,17 +7,21 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
func newTimeSeries(value float64, labels map[string]string, timestamp time.Time) prompbmarshal.TimeSeries {
ts := prompbmarshal.TimeSeries{}
ts.Samples = append(ts.Samples, prompbmarshal.Sample{
Value: value,
Timestamp: timestamp.UnixNano() / 1e6,
})
func newTimeSeries(values []float64, timestamps []int64, labels map[string]string) prompbmarshal.TimeSeries {
ts := prompbmarshal.TimeSeries{
Samples: make([]prompbmarshal.Sample, len(values)),
}
for i := range values {
ts.Samples[i] = prompbmarshal.Sample{
Value: values[i],
Timestamp: time.Unix(timestamps[i], 0).UnixNano() / 1e6,
}
}
keys := make([]string, 0, len(labels))
for k := range labels {
keys = append(keys, k)
}
sort.Strings(keys)
sort.Strings(keys) // make order deterministic
for _, key := range keys {
ts.Labels = append(ts.Labels, prompbmarshal.Label{
Name: key,

18
app/vmalert/utils/auth.go Normal file
View File

@@ -0,0 +1,18 @@
package utils
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
)
// AuthConfig returns promauth.Config based on the given params
func AuthConfig(baUser, baPass, baFile, bearerToken, bearerTokenFile string) (*promauth.Config, error) {
var baCfg *promauth.BasicAuthConfig
if baUser != "" || baPass != "" || baFile != "" {
baCfg = &promauth.BasicAuthConfig{
Username: baUser,
Password: promauth.NewSecret(baPass),
PasswordFile: baFile,
}
}
return promauth.NewConfig(".", nil, baCfg, bearerToken, bearerTokenFile, nil, nil)
}

View File

@@ -0,0 +1,43 @@
package utils
import (
"time"
"github.com/VictoriaMetrics/metricsql"
)
// PromDuration is Prometheus duration.
type PromDuration struct {
milliseconds int64
}
// NewPromDuration returns PromDuration for given d.
func NewPromDuration(d time.Duration) PromDuration {
return PromDuration{
milliseconds: d.Milliseconds(),
}
}
// MarshalYAML implements yaml.Marshaler interface.
func (pd PromDuration) MarshalYAML() (interface{}, error) {
return pd.Duration().String(), nil
}
// UnmarshalYAML implements yaml.Unmarshaler interface.
func (pd *PromDuration) UnmarshalYAML(unmarshal func(interface{}) error) error {
var s string
if err := unmarshal(&s); err != nil {
return err
}
ms, err := metricsql.DurationValue(s, 0)
if err != nil {
return err
}
pd.milliseconds = ms
return nil
}
// Duration returns duration for pd.
func (pd *PromDuration) Duration() time.Duration {
return time.Duration(pd.milliseconds) * time.Millisecond
}

View File

@@ -13,6 +13,7 @@ func TestTLSConfig(t *testing.T) {
}
if tlsCfg == nil {
t.Errorf("expected tlsConfig to be set, got nil")
return
}
if tlsCfg.ServerName != serverName {
t.Errorf("unexpected ServerName, want %s, got %s", serverName, tlsCfg.ServerName)

View File

@@ -0,0 +1,525 @@
{
"type": "excalidraw",
"version": 2,
"source": "https://excalidraw.com",
"elements": [
{
"type": "rectangle",
"version": 794,
"versionNonce": 1855937036,
"isDeleted": false,
"id": "VgBUzo0blGR-Ijd2mQEEf",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 422.3502197265625,
"y": 215.55953979492188,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 123.7601318359375,
"height": 72.13211059570312,
"seed": 1194011660,
"groupIds": [
"iBaXgbpyifSwPplm_GO5b"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"miEbzHxOPXe4PEYvXiJp5",
"rcmiQfIWtfbTTlwxqr1sl",
"P-dpWlSTtnsux-zr5oqgF",
"oAToSPttH7aWoD_AqXGFX",
"Bpy5by47XGKB4yS99ZkuA",
"wRO0q9xKPHc8e8XPPsQWh",
"sxEhnxlbT7ldlSsmHDUHp"
],
"updated": 1638348083348
},
{
"type": "text",
"version": 659,
"versionNonce": 247957684,
"isDeleted": false,
"id": "e9TDm09y-GhPm84XWt0Jv",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 443.89678955078125,
"y": 236.64378356933594,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 82,
"height": 24,
"seed": 327273100,
"groupIds": [
"iBaXgbpyifSwPplm_GO5b"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638347948032,
"fontSize": 20,
"fontFamily": 3,
"text": "vmalert",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "middle"
},
{
"type": "rectangle",
"version": 1670,
"versionNonce": 2021681972,
"isDeleted": false,
"id": "dd52BjHfPMPRji9Tws7U-",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 750.3317260742188,
"y": 226.5509033203125,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 171.99359130859375,
"height": 44.74725341796875,
"seed": 1779959692,
"groupIds": [
"2Lijjn3PwPQW_8KrcDmdu"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"Bpy5by47XGKB4yS99ZkuA"
],
"updated": 1638348054411
},
{
"type": "text",
"version": 1311,
"versionNonce": 1283453068,
"isDeleted": false,
"id": "9TEzv0sVCHAkc46ou0oNF",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 759.2862243652344,
"y": 238.68240356445312,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 152,
"height": 24,
"seed": 1617178804,
"groupIds": [
"2Lijjn3PwPQW_8KrcDmdu"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638348054411,
"fontSize": 20,
"fontFamily": 3,
"text": "vminsert:8480",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 897,
"versionNonce": 1983434892,
"isDeleted": false,
"id": "Sa4OBd1ZjD6itohm7Ll8z",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 61.744873046875,
"y": 224.9600830078125,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 171.99359130859375,
"height": 44.74725341796875,
"seed": 126267060,
"groupIds": [
"ek-pq3umtz1yN-J_-preq"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh"
],
"updated": 1638348077724
},
{
"type": "text",
"version": 719,
"versionNonce": 457402292,
"isDeleted": false,
"id": "we766A079lfGYu2_aC4Pl",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 70.69937133789062,
"y": 237.33523559570312,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 152,
"height": 24,
"seed": 478660236,
"groupIds": [
"ek-pq3umtz1yN-J_-preq"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638348077725,
"fontSize": 20,
"fontFamily": 3,
"text": "vmselect:8481",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 1098,
"versionNonce": 1480603788,
"isDeleted": false,
"id": "8-XFSbd6Zw96EUSJbJXZv",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 371.7434387207031,
"y": 398.50787353515625,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 240.10644531249997,
"height": 44.74725341796875,
"seed": 99322124,
"groupIds": [
"6obQBPHIfExBKfejeLLVO"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"sxEhnxlbT7ldlSsmHDUHp"
],
"updated": 1638348083348
},
{
"type": "text",
"version": 864,
"versionNonce": 1115813900,
"isDeleted": false,
"id": "GUs816aggGqUSdoEsSmea",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 393.73809814453125,
"y": 410.5976257324219,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 199,
"height": 24,
"seed": 1194745268,
"groupIds": [
"6obQBPHIfExBKfejeLLVO"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638348009180,
"fontSize": 20,
"fontFamily": 3,
"text": "alertmanager:9093",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "arrow",
"version": 2405,
"versionNonce": 959767732,
"isDeleted": false,
"id": "Bpy5by47XGKB4yS99ZkuA",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 556.6860961914062,
"y": 252.2582773408825,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 184.9195556640625,
"height": 1.6022679018915937,
"seed": 357577356,
"groupIds": [],
"strokeSharpness": "round",
"boundElementIds": [],
"updated": 1638348054411,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"focus": 0.0344528515859526,
"gap": 10.57574462890625
},
"endBinding": {
"elementId": "dd52BjHfPMPRji9Tws7U-",
"focus": -0.039393828258510157,
"gap": 8.72607421875
},
"lastCommittedPoint": null,
"startArrowhead": null,
"endArrowhead": "arrow",
"points": [
[
0,
0
],
[
184.9195556640625,
-1.6022679018915937
]
]
},
{
"type": "arrow",
"version": 1173,
"versionNonce": 1248255756,
"isDeleted": false,
"id": "wRO0q9xKPHc8e8XPPsQWh",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 406.0439244722469,
"y": 246.80533728741074,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 157.86774649373126,
"height": 0.1417938392881979,
"seed": 656189364,
"groupIds": [],
"strokeSharpness": "round",
"boundElementIds": [],
"updated": 1638348077725,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"focus": 0.13736472619498497,
"gap": 16.306295254315614
},
"endBinding": {
"elementId": "Sa4OBd1ZjD6itohm7Ll8z",
"focus": -0.013200835330936087,
"gap": 14.437713623046875
},
"lastCommittedPoint": null,
"startArrowhead": null,
"endArrowhead": "arrow",
"points": [
[
0,
0
],
[
-157.86774649373126,
0.1417938392881979
]
]
},
{
"type": "text",
"version": 557,
"versionNonce": 995289780,
"isDeleted": false,
"id": "RbVSa4PnOgAMtzoKb-DhW",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 552.4987182617188,
"y": 212.27996826171875,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 188,
"height": 76,
"seed": 1989838604,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638348059525,
"fontSize": 16,
"fontFamily": 3,
"text": "persist alerts state\n\n\nand recording rules",
"baseline": 72,
"textAlign": "left",
"verticalAlign": "top"
},
{
"type": "text",
"version": 803,
"versionNonce": 1576507444,
"isDeleted": false,
"id": "ia2QzZNl_tuvfY3ymLjyJ",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 290.34130859375,
"y": 210.56927490234375,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 122,
"height": 19,
"seed": 157304972,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh"
],
"updated": 1638347948032,
"fontSize": 16,
"fontFamily": 3,
"text": "execute rules",
"baseline": 15,
"textAlign": "left",
"verticalAlign": "top"
},
{
"type": "arrow",
"version": 1471,
"versionNonce": 1361321140,
"isDeleted": false,
"id": "sxEhnxlbT7ldlSsmHDUHp",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 484.18669893674246,
"y": 302.3424013553929,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 1.0484739253853945,
"height": 84.72775855671654,
"seed": 1818348300,
"groupIds": [],
"strokeSharpness": "round",
"boundElementIds": [],
"updated": 1638348083348,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"focus": 0.010768924644894236,
"gap": 14.650750964767894
},
"endBinding": {
"elementId": "8-XFSbd6Zw96EUSJbJXZv",
"focus": -0.051051952959743775,
"gap": 11.437713623046818
},
"lastCommittedPoint": null,
"startArrowhead": null,
"endArrowhead": "arrow",
"points": [
[
0,
0
],
[
1.0484739253853945,
84.72775855671654
]
]
},
{
"type": "text",
"version": 576,
"versionNonce": 2112088460,
"isDeleted": false,
"id": "E9Run6wCm2chQ6JHrmc_y",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 530.5612182617188,
"y": 318.60687255859375,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 122,
"height": 38,
"seed": 1836541708,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [
"sxEhnxlbT7ldlSsmHDUHp"
],
"updated": 1638348023735,
"fontSize": 16,
"fontFamily": 3,
"text": "send alert \nnotifications",
"baseline": 34,
"textAlign": "left",
"verticalAlign": "top"
},
{
"type": "text",
"version": 480,
"versionNonce": 1835119500,
"isDeleted": false,
"id": "ff5OkfgmkKLifS13_TFj3",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 291.37474060058594,
"y": 261.4861297607422,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 122,
"height": 19,
"seed": 264004620,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh"
],
"updated": 1638347948032,
"fontSize": 16,
"fontFamily": 3,
"text": "restore state",
"baseline": 15,
"textAlign": "left",
"verticalAlign": "top"
}
],
"appState": {
"gridSize": null,
"viewBackgroundColor": "#ffffff"
},
"files": {}
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 73 KiB

View File

@@ -0,0 +1,911 @@
{
"type": "excalidraw",
"version": 2,
"source": "https://excalidraw.com",
"elements": [
{
"type": "rectangle",
"version": 906,
"versionNonce": 448716468,
"isDeleted": false,
"id": "VgBUzo0blGR-Ijd2mQEEf",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 302.2676696777344,
"y": 275.59356689453125,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 123.7601318359375,
"height": 72.13211059570312,
"seed": 1194011660,
"groupIds": [
"iBaXgbpyifSwPplm_GO5b"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"miEbzHxOPXe4PEYvXiJp5",
"rcmiQfIWtfbTTlwxqr1sl",
"P-dpWlSTtnsux-zr5oqgF",
"oAToSPttH7aWoD_AqXGFX",
"Bpy5by47XGKB4yS99ZkuA",
"wRO0q9xKPHc8e8XPPsQWh",
"sxEhnxlbT7ldlSsmHDUHp",
"m9_BptFOFxbV2sS_xJDu2",
"fsGFp4NW4JlrCdF0HR3uA",
"OTKoeHmKtqCxFArDbY-sP"
],
"updated": 1638355221749
},
{
"type": "text",
"version": 762,
"versionNonce": 223660724,
"isDeleted": false,
"id": "e9TDm09y-GhPm84XWt0Jv",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 318.0538024902344,
"y": 296.6778106689453,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 94,
"height": 24,
"seed": 327273100,
"groupIds": [
"iBaXgbpyifSwPplm_GO5b"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"m9_BptFOFxbV2sS_xJDu2"
],
"updated": 1638355102070,
"fontSize": 20,
"fontFamily": 3,
"text": "vmalert1",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 2067,
"versionNonce": 817347852,
"isDeleted": false,
"id": "dd52BjHfPMPRji9Tws7U-",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 329.6426086425781,
"y": 90.3275146484375,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 269.336669921875,
"height": 44.74725341796875,
"seed": 1779959692,
"groupIds": [
"2Lijjn3PwPQW_8KrcDmdu"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"Bpy5by47XGKB4yS99ZkuA",
"m9_BptFOFxbV2sS_xJDu2",
"S_dOHQrhGmu8SFJzobJK7"
],
"updated": 1638355058507
},
{
"type": "text",
"version": 1717,
"versionNonce": 2040797452,
"isDeleted": false,
"id": "9TEzv0sVCHAkc46ou0oNF",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 349.01177978515625,
"y": 102.45901489257812,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 234,
"height": 24,
"seed": 1617178804,
"groupIds": [
"2Lijjn3PwPQW_8KrcDmdu"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"m9_BptFOFxbV2sS_xJDu2",
"S_dOHQrhGmu8SFJzobJK7"
],
"updated": 1638355040938,
"fontSize": 20,
"fontFamily": 3,
"text": "victoriametrics:8428",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 1553,
"versionNonce": 39381044,
"isDeleted": false,
"id": "8-XFSbd6Zw96EUSJbJXZv",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 173.67257690429688,
"y": 470.5100402832031,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 240.10644531249997,
"height": 44.74725341796875,
"seed": 99322124,
"groupIds": [
"6obQBPHIfExBKfejeLLVO"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"sxEhnxlbT7ldlSsmHDUHp",
"OTKoeHmKtqCxFArDbY-sP",
"XhPgLRBk-YhWAFcSQi9TJ",
"D6fkQH1E_MFbCuL697ArO"
],
"updated": 1638355221749
},
{
"type": "text",
"version": 1309,
"versionNonce": 1112872844,
"isDeleted": false,
"id": "GUs816aggGqUSdoEsSmea",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 189.667236328125,
"y": 482.59979248046875,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 211,
"height": 24,
"seed": 1194745268,
"groupIds": [
"6obQBPHIfExBKfejeLLVO"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638355040938,
"fontSize": 20,
"fontFamily": 3,
"text": "alertmanager1:9093",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "text",
"version": 835,
"versionNonce": 2036483596,
"isDeleted": false,
"id": "RbVSa4PnOgAMtzoKb-DhW",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 525.83837890625,
"y": 147.33470153808594,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 188,
"height": 95,
"seed": 1989838604,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638355040939,
"fontSize": 16,
"fontFamily": 3,
"text": "execute rules,\npersist alerts \nand recording rules,\nrestore state\n",
"baseline": 91,
"textAlign": "left",
"verticalAlign": "top"
},
{
"type": "text",
"version": 777,
"versionNonce": 990468492,
"isDeleted": false,
"id": "E9Run6wCm2chQ6JHrmc_y",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 617.0690307617188,
"y": 376.9822692871094,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 122,
"height": 38,
"seed": 1836541708,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [
"sxEhnxlbT7ldlSsmHDUHp"
],
"updated": 1638355227582,
"fontSize": 16,
"fontFamily": 3,
"text": "send alert \nnotifications",
"baseline": 34,
"textAlign": "left",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 1112,
"versionNonce": 999136652,
"isDeleted": false,
"id": "mIu-d0lmShCxzMLD5iA_p",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 492.29505920410156,
"y": 272.3052215576172,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 123.7601318359375,
"height": 72.13211059570312,
"seed": 756416780,
"groupIds": [
"jbBot-UNdMoWy2jPXC1u5"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"miEbzHxOPXe4PEYvXiJp5",
"rcmiQfIWtfbTTlwxqr1sl",
"P-dpWlSTtnsux-zr5oqgF",
"oAToSPttH7aWoD_AqXGFX",
"Bpy5by47XGKB4yS99ZkuA",
"wRO0q9xKPHc8e8XPPsQWh",
"sxEhnxlbT7ldlSsmHDUHp",
"S_dOHQrhGmu8SFJzobJK7",
"XhPgLRBk-YhWAFcSQi9TJ",
"Ar-hcDLlzVSoTPs2MaywO"
],
"updated": 1638355153683
},
{
"type": "text",
"version": 970,
"versionNonce": 430760500,
"isDeleted": false,
"id": "ZqIR6SaLNDQl8s0zZbdnE",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 508.08119201660156,
"y": 293.38946533203125,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 94,
"height": 24,
"seed": 1477404084,
"groupIds": [
"jbBot-UNdMoWy2jPXC1u5"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638355105020,
"fontSize": 20,
"fontFamily": 3,
"text": "vmalertN",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"id": "qZFezRGU4Chwxgvb2451t",
"type": "line",
"x": 449.93653869628906,
"y": 306.30010986328125,
"width": 19.48321533203125,
"height": 0.3865966796875,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "dotted",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 1833426828,
"version": 69,
"versionNonce": 871948300,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355040939,
"points": [
[
0,
0
],
[
19.48321533203125,
0.3865966796875
]
],
"lastCommittedPoint": null,
"startBinding": null,
"endBinding": null,
"startArrowhead": null,
"endArrowhead": null
},
{
"type": "rectangle",
"version": 1649,
"versionNonce": 93981708,
"isDeleted": false,
"id": "gNHiZJKo0ap69ALDobtZ-",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 518.5596466064453,
"y": 471.4539489746094,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 240.10644531249997,
"height": 44.74725341796875,
"seed": 454422412,
"groupIds": [
"fEAIeQ0DxLnI_rPlKPZqW"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"sxEhnxlbT7ldlSsmHDUHp",
"fsGFp4NW4JlrCdF0HR3uA",
"Ar-hcDLlzVSoTPs2MaywO",
"D6fkQH1E_MFbCuL697ArO"
],
"updated": 1638355153683
},
{
"type": "text",
"version": 1412,
"versionNonce": 75038348,
"isDeleted": false,
"id": "O-zgjZBvt4RI1PrkBNQnb",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 534.3148040771484,
"y": 483.543701171875,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 211,
"height": 24,
"seed": 1026268980,
"groupIds": [
"fEAIeQ0DxLnI_rPlKPZqW"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638355040939,
"fontSize": 20,
"fontFamily": 3,
"text": "alertmanagerN:9093",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"id": "m9_BptFOFxbV2sS_xJDu2",
"type": "arrow",
"x": 378.57185562792466,
"y": 262.1596984863281,
"width": 79.96146539019026,
"height": 111.53411865234375,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 141224884,
"version": 521,
"versionNonce": 283254540,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355102070,
"points": [
[
0,
0
],
[
79.96146539019026,
-111.53411865234375
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"focus": -0.23996018586441184,
"gap": 13.433868408203125
},
"endBinding": {
"elementId": "dd52BjHfPMPRji9Tws7U-",
"focus": -0.14207100087207922,
"gap": 15.550811767578125
},
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"id": "S_dOHQrhGmu8SFJzobJK7",
"type": "arrow",
"x": 558.1990515577129,
"y": 263.4271240234375,
"width": 88.18940317574186,
"height": 114.15780639648438,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 1004789812,
"version": 314,
"versionNonce": 1259171724,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355105019,
"points": [
[
0,
0
],
[
-88.18940317574186,
-114.15780639648438
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "mIu-d0lmShCxzMLD5iA_p",
"focus": 0.4315094049079832,
"gap": 8.878097534179688
},
"endBinding": {
"elementId": "dd52BjHfPMPRji9Tws7U-",
"focus": 0.14840834302306677,
"gap": 14.194549560546875
},
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"id": "fsGFp4NW4JlrCdF0HR3uA",
"type": "arrow",
"x": 356.6613630516658,
"y": 354.95416259765625,
"width": 278.74351860741064,
"height": 106.90020751953125,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 712384692,
"version": 334,
"versionNonce": 266493324,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355102070,
"points": [
[
0,
0
],
[
278.74351860741064,
106.90020751953125
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"focus": 0.7721210277890177,
"gap": 7.228485107421875
},
"endBinding": {
"elementId": "gNHiZJKo0ap69ALDobtZ-",
"focus": 0.4493598001028791,
"gap": 9.599578857421875
},
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"id": "OTKoeHmKtqCxFArDbY-sP",
"type": "arrow",
"x": 361.02563221226757,
"y": 356.2252197265625,
"width": 95.3594006000182,
"height": 105.52130126953125,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 1992252596,
"version": 466,
"versionNonce": 472553100,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355221749,
"points": [
[
0,
0
],
[
-95.3594006000182,
105.52130126953125
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"focus": -0.39325294425055324,
"gap": 8.499542236328125
},
"endBinding": {
"elementId": "8-XFSbd6Zw96EUSJbJXZv",
"focus": -0.400636314282374,
"gap": 8.763519287109375
},
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"id": "XhPgLRBk-YhWAFcSQi9TJ",
"type": "arrow",
"x": 561.7189961327852,
"y": 349.238037109375,
"width": 266.271510370216,
"height": 112.383544921875,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 588869260,
"version": 453,
"versionNonce": 584723212,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355150131,
"points": [
[
0,
0
],
[
-266.271510370216,
112.383544921875
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "mIu-d0lmShCxzMLD5iA_p",
"focus": -0.7084007026732048,
"gap": 4.8007049560546875
},
"endBinding": {
"elementId": "8-XFSbd6Zw96EUSJbJXZv",
"focus": -0.4180430111580123,
"gap": 8.888458251953125
},
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"id": "Ar-hcDLlzVSoTPs2MaywO",
"type": "arrow",
"x": 557.5999880535809,
"y": 352.71795654296875,
"width": 112.87813113656136,
"height": 106.3255615234375,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 1508782476,
"version": 331,
"versionNonce": 1137561908,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355153683,
"points": [
[
0,
0
],
[
112.87813113656136,
106.3255615234375
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "mIu-d0lmShCxzMLD5iA_p",
"focus": 0.4358123206211808,
"gap": 8.280624389648438
},
"endBinding": {
"elementId": "gNHiZJKo0ap69ALDobtZ-",
"focus": 0.4783744286829653,
"gap": 12.410430908203125
},
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"id": "WqHnv9_g3SdkLZuy4WSHa",
"type": "text",
"x": 186.5891876220703,
"y": 523.145263671875,
"width": 272,
"height": 19,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 1310147468,
"version": 281,
"versionNonce": 949157044,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355040939,
"text": "Alertmanagers in cluster mode",
"fontSize": 16,
"fontFamily": 3,
"textAlign": "left",
"verticalAlign": "top",
"baseline": 15
},
{
"id": "yA0kgvdF71wZbHJ7Cg2p8",
"type": "rectangle",
"x": 159.41685485839844,
"y": 440.7666015625,
"width": 625.9590148925781,
"height": 110.13494873046878,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "dotted",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 1465986996,
"version": 366,
"versionNonce": 196040844,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355040939
},
{
"id": "D6fkQH1E_MFbCuL697ArO",
"type": "arrow",
"x": 422.4853057861328,
"y": 494.8779132311229,
"width": 90.5517578125,
"height": 0.12427004064892344,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 2025208204,
"version": 316,
"versionNonce": 2144102156,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638355040939,
"points": [
[
0,
0
],
[
90.5517578125,
0.12427004064892344
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "8-XFSbd6Zw96EUSJbJXZv",
"focus": 0.08064204025505255,
"gap": 8.706283569335938
},
"endBinding": {
"elementId": "gNHiZJKo0ap69ALDobtZ-",
"focus": -0.05976220061952895,
"gap": 5.5225830078125
},
"startArrowhead": "arrow",
"endArrowhead": "arrow"
},
{
"type": "text",
"version": 463,
"versionNonce": 2118914484,
"isDeleted": false,
"id": "p4EDvSKPqZzfM_SReybeq",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 274.6086883544922,
"y": 56.77886962890625,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 394,
"height": 19,
"seed": 1214462348,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638355074872,
"fontSize": 16,
"fontFamily": 3,
"text": "VictoriaMetrics with deduplication enabled",
"baseline": 15,
"textAlign": "left",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 712,
"versionNonce": 737833612,
"isDeleted": false,
"id": "YfNFeOucMo8BJk2P2JRex",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "dotted",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 229.77923583984375,
"y": 227.84378051757812,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 448.5542297363281,
"height": 134.06329345703122,
"seed": 932207756,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638355133244
},
{
"type": "text",
"version": 427,
"versionNonce": 1458527796,
"isDeleted": false,
"id": "oldM7Q_aJtpWd2jXQ0iKf",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 238.9442901611328,
"y": 230.862548828125,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 225,
"height": 38,
"seed": 2131899572,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638355259358,
"fontSize": 16,
"fontFamily": 3,
"text": "vmalerts with identical \nconfigurations",
"baseline": 34,
"textAlign": "left",
"verticalAlign": "top"
}
],
"appState": {
"gridSize": null,
"viewBackgroundColor": "#ffffff"
},
"files": {}
}

BIN
app/vmalert/vmalert_ha.png Normal file

Binary file not shown.

After

Width:  |  Height:  |  Size: 151 KiB

View File

@@ -0,0 +1,915 @@
{
"type": "excalidraw",
"version": 2,
"source": "https://excalidraw.com",
"elements": [
{
"type": "rectangle",
"version": 791,
"versionNonce": 48874036,
"isDeleted": false,
"id": "VgBUzo0blGR-Ijd2mQEEf",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 289.6802978515625,
"y": 399.3895568847656,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 123.7601318359375,
"height": 72.13211059570312,
"seed": 1194011660,
"groupIds": [
"iBaXgbpyifSwPplm_GO5b"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"miEbzHxOPXe4PEYvXiJp5",
"rcmiQfIWtfbTTlwxqr1sl",
"P-dpWlSTtnsux-zr5oqgF",
"oAToSPttH7aWoD_AqXGFX",
"wRO0q9xKPHc8e8XPPsQWh",
"sxEhnxlbT7ldlSsmHDUHp",
"pD9DcILMxa6GaR1U5YyMO",
"HPEwr85wL4IedW0AgdArp",
"EyecK0YM9Cc8T6ju-nTOc"
],
"updated": 1638347812431
},
{
"type": "text",
"version": 658,
"versionNonce": 1653816076,
"isDeleted": false,
"id": "e9TDm09y-GhPm84XWt0Jv",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 311.22686767578125,
"y": 420.4738006591797,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 82,
"height": 24,
"seed": 327273100,
"groupIds": [
"iBaXgbpyifSwPplm_GO5b"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638347796775,
"fontSize": 20,
"fontFamily": 3,
"text": "vmalert",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "middle"
},
{
"type": "rectangle",
"version": 802,
"versionNonce": 995326644,
"isDeleted": false,
"id": "Sa4OBd1ZjD6itohm7Ll8z",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 603.05322265625,
"y": 228.65371704101562,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 171.99359130859375,
"height": 44.74725341796875,
"seed": 126267060,
"groupIds": [
"ek-pq3umtz1yN-J_-preq"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh",
"he-SpFjCxEQEWpWny2kKP",
"-pjrKo16rOsasM8viZPJ-",
"MGdu6GDIPNBAaEUr0Gt-a"
],
"updated": 1638347737174
},
{
"type": "text",
"version": 624,
"versionNonce": 707755700,
"isDeleted": false,
"id": "we766A079lfGYu2_aC4Pl",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 640.7635803222656,
"y": 241.02886962890625,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 94,
"height": 24,
"seed": 478660236,
"groupIds": [
"ek-pq3umtz1yN-J_-preq"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638347701075,
"fontSize": 20,
"fontFamily": 3,
"text": "vmselect",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "text",
"version": 802,
"versionNonce": 1974403340,
"isDeleted": false,
"id": "ia2QzZNl_tuvfY3ymLjyJ",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 401.241943359375,
"y": 342.4627990722656,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 178,
"height": 38,
"seed": 157304972,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh"
],
"updated": 1638347863144,
"fontSize": 16,
"fontFamily": 3,
"text": "execute aggregating\nrecording rules",
"baseline": 34,
"textAlign": "left",
"verticalAlign": "top"
},
{
"id": "jrFZeFNsxlss7IsEZpA-J",
"type": "rectangle",
"x": 594.1509857177734,
"y": 192.09194946289062,
"width": 397.1286010742188,
"height": 209.62742614746097,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "dotted",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 48379060,
"version": 665,
"versionNonce": 617456652,
"isDeleted": false,
"boundElementIds": [
"pD9DcILMxa6GaR1U5YyMO"
],
"updated": 1638347763716
},
{
"id": "6ibhLp94HJFIdfP5HCEv8",
"type": "text",
"x": 609.7205657958984,
"y": 199.81723022460938,
"width": 225,
"height": 19,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "dotted",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "sharp",
"seed": 1053164852,
"version": 256,
"versionNonce": 538595124,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638347761092,
"text": "VM cluster with raw data",
"fontSize": 16,
"fontFamily": 3,
"textAlign": "left",
"verticalAlign": "top",
"baseline": 15
},
{
"type": "rectangle",
"version": 914,
"versionNonce": 1576666164,
"isDeleted": false,
"id": "R5v-yZCVJ97BkJqr0Qb7H",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 807.5664215087891,
"y": 287.8628387451172,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 171.99359130859375,
"height": 44.74725341796875,
"seed": 1794212620,
"groupIds": [
"BJNOAY1MY3Evr9B3qQtHf"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh",
"he-SpFjCxEQEWpWny2kKP",
"-pjrKo16rOsasM8viZPJ-",
"bZXA8PH9gYu-clotqJ4f7",
"MGdu6GDIPNBAaEUr0Gt-a"
],
"updated": 1638347737174
},
{
"type": "text",
"version": 725,
"versionNonce": 267455500,
"isDeleted": false,
"id": "pWRC_smX7TuOI8_8UrA4H",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 840.0209197998047,
"y": 300.2379913330078,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 105,
"height": 24,
"seed": 421856180,
"groupIds": [
"BJNOAY1MY3Evr9B3qQtHf"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638347701076,
"fontSize": 20,
"fontFamily": 3,
"text": "vmstorage",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 843,
"versionNonce": 1244193972,
"isDeleted": false,
"id": "EqROOfYulSPsZm7ovxfQN",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 604.9091339111328,
"y": 345.2847442626953,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 171.99359130859375,
"height": 44.74725341796875,
"seed": 2043521972,
"groupIds": [
"ls6uq-W9bbVBM_UxAuyba"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh",
"bZXA8PH9gYu-clotqJ4f7"
],
"updated": 1638347718537
},
{
"type": "text",
"version": 676,
"versionNonce": 143370892,
"isDeleted": false,
"id": "ddQH1nnmT7HbKW7Xmv4zx",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 642.6199798583984,
"y": 357.90354919433594,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 94,
"height": 24,
"seed": 335223180,
"groupIds": [
"ls6uq-W9bbVBM_UxAuyba"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"bZXA8PH9gYu-clotqJ4f7"
],
"updated": 1638347701076,
"fontSize": 20,
"fontFamily": 3,
"text": "vminsert",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"id": "bZXA8PH9gYu-clotqJ4f7",
"type": "arrow",
"x": 785.2667708463345,
"y": 367.2342882272049,
"width": 99.87819315552736,
"height": 24.681162491468,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 1153029388,
"version": 420,
"versionNonce": 1726025228,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638347704644,
"points": [
[
0,
0
],
[
99.87819315552736,
-24.681162491468
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "EqROOfYulSPsZm7ovxfQN",
"focus": 0.5247890891198189,
"gap": 8.36404562660789
},
"endBinding": {
"elementId": "R5v-yZCVJ97BkJqr0Qb7H",
"focus": -0.6931056940383433,
"gap": 9.943033572650961
},
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"type": "arrow",
"version": 528,
"versionNonce": 1908259468,
"isDeleted": false,
"id": "MGdu6GDIPNBAaEUr0Gt-a",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 781.4456641707259,
"y": 248.777857603323,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 104.0940537386266,
"height": 27.60697400233846,
"seed": 1695061516,
"groupIds": [],
"strokeSharpness": "round",
"boundElementIds": [],
"updated": 1638347737174,
"startBinding": {
"elementId": "Sa4OBd1ZjD6itohm7Ll8z",
"focus": -0.5921495247360469,
"gap": 6.398850205882127
},
"endBinding": {
"elementId": "R5v-yZCVJ97BkJqr0Qb7H",
"focus": 0.7021471697457312,
"gap": 11.478007139455713
},
"lastCommittedPoint": null,
"startArrowhead": null,
"endArrowhead": "arrow",
"points": [
[
0,
0
],
[
104.0940537386266,
27.60697400233846
]
]
},
{
"type": "rectangle",
"version": 883,
"versionNonce": 845352076,
"isDeleted": false,
"id": "4pW8hvBu3bo1eMtFvg_gS",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 601.6520690917969,
"y": 473.7677536010742,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 171.99359130859375,
"height": 44.74725341796875,
"seed": 1678097076,
"groupIds": [
"3kSpFrIN3kg4jwjDKpNWw"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh",
"he-SpFjCxEQEWpWny2kKP",
"-pjrKo16rOsasM8viZPJ-",
"5W90eDBjtfZvkSuQTG0Iw"
],
"updated": 1638347777173
},
{
"type": "text",
"version": 703,
"versionNonce": 519371956,
"isDeleted": false,
"id": "w_i2hO06oLa0bWbyAfFzU",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 639.3624267578125,
"y": 486.14290618896484,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 94,
"height": 24,
"seed": 340753036,
"groupIds": [
"3kSpFrIN3kg4jwjDKpNWw"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638347776748,
"fontSize": 20,
"fontFamily": 3,
"text": "vmselect",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 745,
"versionNonce": 879581324,
"isDeleted": false,
"id": "U5U-67wL5fPwvzlxOAF5A",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "dotted",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 592.7498321533203,
"y": 437.2059860229492,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 397.1286010742188,
"height": 209.62742614746097,
"seed": 912540724,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [
"pD9DcILMxa6GaR1U5YyMO"
],
"updated": 1638347776748
},
{
"type": "text",
"version": 345,
"versionNonce": 1628116148,
"isDeleted": false,
"id": "0IGVrvICMVeZp2RCaqTeP",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "dotted",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 608.3194122314453,
"y": 444.93126678466797,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 291,
"height": 19,
"seed": 68700428,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638347784373,
"fontSize": 16,
"fontFamily": 3,
"text": "VM cluster with aggregated data",
"baseline": 15,
"textAlign": "left",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 996,
"versionNonce": 2112461580,
"isDeleted": false,
"id": "9QMf0HXPE1W4M9S-zMJVO",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 806.1652679443359,
"y": 532.9768753051758,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 171.99359130859375,
"height": 44.74725341796875,
"seed": 523607476,
"groupIds": [
"Eb2kWfz3ZWBu8cNul0h_c"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh",
"he-SpFjCxEQEWpWny2kKP",
"-pjrKo16rOsasM8viZPJ-",
"9WBH34em4CdU2OwzqYIl5",
"5W90eDBjtfZvkSuQTG0Iw"
],
"updated": 1638347777173
},
{
"type": "text",
"version": 804,
"versionNonce": 1660832052,
"isDeleted": false,
"id": "WOrOD5vn6EtMUQRJomt3-",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 838.6197662353516,
"y": 545.3520278930664,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 105,
"height": 24,
"seed": 1321420684,
"groupIds": [
"Eb2kWfz3ZWBu8cNul0h_c"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638347776748,
"fontSize": 20,
"fontFamily": 3,
"text": "vmstorage",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 925,
"versionNonce": 2052807604,
"isDeleted": false,
"id": "4dtJZpXEUxSK3biwFG7vd",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 603.5079803466797,
"y": 590.3987808227539,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 171.99359130859375,
"height": 44.74725341796875,
"seed": 1738524468,
"groupIds": [
"punXEDFtHkSpcd9seAkZj"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh",
"9WBH34em4CdU2OwzqYIl5",
"EyecK0YM9Cc8T6ju-nTOc"
],
"updated": 1638347812431
},
{
"type": "text",
"version": 756,
"versionNonce": 2040967820,
"isDeleted": false,
"id": "rAbyooo-X08-86qjoK0WR",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 641.2188262939453,
"y": 603.0175857543945,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 94,
"height": 24,
"seed": 948598284,
"groupIds": [
"punXEDFtHkSpcd9seAkZj"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"9WBH34em4CdU2OwzqYIl5"
],
"updated": 1638347776748,
"fontSize": 20,
"fontFamily": 3,
"text": "vminsert",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "arrow",
"version": 660,
"versionNonce": 1560678196,
"isDeleted": false,
"id": "9WBH34em4CdU2OwzqYIl5",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 783.8656172818814,
"y": 612.3483247872634,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 99.87819315552736,
"height": 24.681162491468,
"seed": 1141424308,
"groupIds": [],
"strokeSharpness": "round",
"boundElementIds": [],
"updated": 1638347777173,
"startBinding": {
"elementId": "4dtJZpXEUxSK3biwFG7vd",
"focus": 0.5247890891198189,
"gap": 8.364045626608004
},
"endBinding": {
"elementId": "9QMf0HXPE1W4M9S-zMJVO",
"focus": -0.6931056940383404,
"gap": 9.943033572650847
},
"lastCommittedPoint": null,
"startArrowhead": null,
"endArrowhead": "arrow",
"points": [
[
0,
0
],
[
99.87819315552736,
-24.681162491468
]
]
},
{
"type": "arrow",
"version": 768,
"versionNonce": 1653264948,
"isDeleted": false,
"id": "5W90eDBjtfZvkSuQTG0Iw",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 780.0445106062728,
"y": 493.8918941633816,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 104.0940537386266,
"height": 27.60697400233846,
"seed": 1852298380,
"groupIds": [],
"strokeSharpness": "round",
"boundElementIds": [],
"updated": 1638347777173,
"startBinding": {
"elementId": "4pW8hvBu3bo1eMtFvg_gS",
"focus": -0.5921495247360469,
"gap": 6.398850205882127
},
"endBinding": {
"elementId": "9QMf0HXPE1W4M9S-zMJVO",
"focus": 0.7021471697457312,
"gap": 11.478007139455713
},
"lastCommittedPoint": null,
"startArrowhead": null,
"endArrowhead": "arrow",
"points": [
[
0,
0
],
[
104.0940537386266,
27.60697400233846
]
]
},
{
"id": "HPEwr85wL4IedW0AgdArp",
"type": "arrow",
"x": 423.70701599121094,
"y": 428.34434509277344,
"width": 179.54901123046875,
"height": 182.9937744140625,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 389863732,
"version": 47,
"versionNonce": 1364399028,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638347805500,
"points": [
[
0,
0
],
[
179.54901123046875,
-182.9937744140625
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"focus": 0.6700023593531782,
"gap": 10.266586303710938
},
"endBinding": null,
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"id": "EyecK0YM9Cc8T6ju-nTOc",
"type": "arrow",
"x": 424.7585906982422,
"y": 441.12806701660156,
"width": 174.29449462890625,
"height": 170.56607055664062,
"angle": 0,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"groupIds": [],
"strokeSharpness": "round",
"seed": 981082124,
"version": 92,
"versionNonce": 1261546252,
"isDeleted": false,
"boundElementIds": null,
"updated": 1638347812431,
"points": [
[
0,
0
],
[
174.29449462890625,
170.56607055664062
]
],
"lastCommittedPoint": null,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"focus": -0.6826568395144794,
"gap": 11.318161010742188
},
"endBinding": {
"elementId": "4dtJZpXEUxSK3biwFG7vd",
"focus": -0.8207814548026305,
"gap": 4.45489501953125
},
"startArrowhead": null,
"endArrowhead": "arrow"
},
{
"type": "text",
"version": 775,
"versionNonce": 95530764,
"isDeleted": false,
"id": "o-yIJ_WZzVubhbVODvhcC",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 407.27012634277344,
"y": 489.33091735839844,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 141,
"height": 19,
"seed": 775116812,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [
"wRO0q9xKPHc8e8XPPsQWh"
],
"updated": 1638347824876,
"fontSize": 16,
"fontFamily": 3,
"text": "persist results",
"baseline": 15,
"textAlign": "left",
"verticalAlign": "top"
}
],
"appState": {
"gridSize": null,
"viewBackgroundColor": "#ffffff"
},
"files": {}
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 122 KiB

View File

@@ -0,0 +1,354 @@
{
"type": "excalidraw",
"version": 2,
"source": "https://excalidraw.com",
"elements": [
{
"type": "rectangle",
"version": 795,
"versionNonce": 1195460364,
"isDeleted": false,
"id": "VgBUzo0blGR-Ijd2mQEEf",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 422.3502197265625,
"y": 215.55953979492188,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 123.7601318359375,
"height": 72.13211059570312,
"seed": 1194011660,
"groupIds": [
"iBaXgbpyifSwPplm_GO5b"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"miEbzHxOPXe4PEYvXiJp5",
"rcmiQfIWtfbTTlwxqr1sl",
"P-dpWlSTtnsux-zr5oqgF",
"oAToSPttH7aWoD_AqXGFX",
"Bpy5by47XGKB4yS99ZkuA",
"wRO0q9xKPHc8e8XPPsQWh",
"sxEhnxlbT7ldlSsmHDUHp"
],
"updated": 1638348116633
},
{
"type": "text",
"version": 660,
"versionNonce": 1034125236,
"isDeleted": false,
"id": "e9TDm09y-GhPm84XWt0Jv",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 443.89678955078125,
"y": 236.64378356933594,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 82,
"height": 24,
"seed": 327273100,
"groupIds": [
"iBaXgbpyifSwPplm_GO5b"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638348116633,
"fontSize": 20,
"fontFamily": 3,
"text": "vmalert",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "middle"
},
{
"type": "rectangle",
"version": 1690,
"versionNonce": 236788620,
"isDeleted": false,
"id": "dd52BjHfPMPRji9Tws7U-",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 344.9837951660156,
"y": 64.64248657226562,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 269.336669921875,
"height": 44.74725341796875,
"seed": 1779959692,
"groupIds": [
"2Lijjn3PwPQW_8KrcDmdu"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"Bpy5by47XGKB4yS99ZkuA"
],
"updated": 1638348176044
},
{
"type": "text",
"version": 1331,
"versionNonce": 945335476,
"isDeleted": false,
"id": "9TEzv0sVCHAkc46ou0oNF",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 364.35296630859375,
"y": 76.77398681640625,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 234,
"height": 24,
"seed": 1617178804,
"groupIds": [
"2Lijjn3PwPQW_8KrcDmdu"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638348176045,
"fontSize": 20,
"fontFamily": 3,
"text": "victoriametrics:8428",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "rectangle",
"version": 1099,
"versionNonce": 671872012,
"isDeleted": false,
"id": "8-XFSbd6Zw96EUSJbJXZv",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 360.6495666503906,
"y": 382.2536315917969,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 240.10644531249997,
"height": 44.74725341796875,
"seed": 99322124,
"groupIds": [
"6obQBPHIfExBKfejeLLVO"
],
"strokeSharpness": "sharp",
"boundElementIds": [
"sxEhnxlbT7ldlSsmHDUHp"
],
"updated": 1638348116633
},
{
"type": "text",
"version": 865,
"versionNonce": 635839156,
"isDeleted": false,
"id": "GUs816aggGqUSdoEsSmea",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 382.64422607421875,
"y": 394.3433837890625,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 199,
"height": 24,
"seed": 1194745268,
"groupIds": [
"6obQBPHIfExBKfejeLLVO"
],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638348116633,
"fontSize": 20,
"fontFamily": 3,
"text": "alertmanager:9093",
"baseline": 19,
"textAlign": "center",
"verticalAlign": "top"
},
{
"type": "arrow",
"version": 2444,
"versionNonce": 361351692,
"isDeleted": false,
"id": "Bpy5by47XGKB4yS99ZkuA",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 486.0465703465111,
"y": 204.98379516601562,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 0.7962088410123442,
"height": 86.86798095703125,
"seed": 357577356,
"groupIds": [],
"strokeSharpness": "round",
"boundElementIds": [],
"updated": 1638348176045,
"startBinding": {
"elementId": "VgBUzo0blGR-Ijd2mQEEf",
"gap": 10.57574462890625,
"focus": 0.0344528515859526
},
"endBinding": {
"elementId": "dd52BjHfPMPRji9Tws7U-",
"gap": 8.72607421875,
"focus": -0.039393828258510157
},
"lastCommittedPoint": null,
"startArrowhead": null,
"endArrowhead": "arrow",
"points": [
[
0,
0
],
[
-0.7962088410123442,
-86.86798095703125
]
]
},
{
"type": "text",
"version": 640,
"versionNonce": 1892951180,
"isDeleted": false,
"id": "RbVSa4PnOgAMtzoKb-DhW",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 500.609619140625,
"y": 134.11544799804688,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 188,
"height": 95,
"seed": 1989838604,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [],
"updated": 1638348163424,
"fontSize": 16,
"fontFamily": 3,
"text": "execute rules,\npersist alerts \nand recording rules,\nrestore state\n",
"baseline": 91,
"textAlign": "left",
"verticalAlign": "top"
},
{
"type": "arrow",
"version": 1472,
"versionNonce": 768565516,
"isDeleted": false,
"id": "sxEhnxlbT7ldlSsmHDUHp",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 486.5245361328125,
"y": 300.5478957313172,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 0.3521007656264601,
"height": 70.26802223743277,
"seed": 1818348300,
"groupIds": [],
"strokeSharpness": "round",
"boundElementIds": [],
"updated": 1638348116633,
"startBinding": {
"elementId": "E9Run6wCm2chQ6JHrmc_y",
"focus": 1.1925203824459496,
"gap": 11.77154541015625
},
"endBinding": {
"elementId": "8-XFSbd6Zw96EUSJbJXZv",
"focus": 0.0441077573536454,
"gap": 11.437713623046875
},
"lastCommittedPoint": null,
"startArrowhead": null,
"endArrowhead": "arrow",
"points": [
[
0,
0
],
[
-0.3521007656264601,
70.26802223743277
]
]
},
{
"type": "text",
"version": 577,
"versionNonce": 1646003636,
"isDeleted": false,
"id": "E9Run6wCm2chQ6JHrmc_y",
"fillStyle": "hachure",
"strokeWidth": 1,
"strokeStyle": "solid",
"roughness": 0,
"opacity": 100,
"angle": 0,
"x": 498.29608154296875,
"y": 298.6573791503906,
"strokeColor": "#000000",
"backgroundColor": "transparent",
"width": 122,
"height": 38,
"seed": 1836541708,
"groupIds": [],
"strokeSharpness": "sharp",
"boundElementIds": [
"sxEhnxlbT7ldlSsmHDUHp"
],
"updated": 1638348116633,
"fontSize": 16,
"fontFamily": 3,
"text": "send alert \nnotifications",
"baseline": 34,
"textAlign": "left",
"verticalAlign": "top"
}
],
"appState": {
"gridSize": null,
"viewBackgroundColor": "#ffffff"
},
"files": {}
}

Binary file not shown.

After

Width:  |  Height:  |  Size: 62 KiB

View File

@@ -4,52 +4,80 @@ import (
"encoding/json"
"fmt"
"net/http"
"path"
"sort"
"strconv"
"strings"
"sync"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/tpl"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
)
var (
once = sync.Once{}
apiLinks [][2]string
navItems []tpl.NavItem
)
func initLinks() {
pathPrefix := httpserver.GetPathPrefix()
apiLinks = [][2]string{
{path.Join(pathPrefix, "api/v1/groups"), "list all loaded groups and rules"},
{path.Join(pathPrefix, "api/v1/alerts"), "list all active alerts"},
{path.Join(pathPrefix, "api/v1/groupID/alertID/status"), "get alert status by ID"},
{path.Join(pathPrefix, "flags"), "command-line flags"},
{path.Join(pathPrefix, "metrics"), "list of application metrics"},
{path.Join(pathPrefix, "-/reload"), "reload configuration"},
}
navItems = []tpl.NavItem{
{Name: "vmalert", Url: pathPrefix},
{Name: "Groups", Url: path.Join(pathPrefix, "groups")},
{Name: "Alerts", Url: path.Join(pathPrefix, "alerts")},
{Name: "Docs", Url: "https://docs.victoriametrics.com/vmalert.html"},
}
}
type requestHandler struct {
m *manager
}
var pathList = [][]string{
{"/api/v1/groups", "list all loaded groups and rules"},
{"/api/v1/alerts", "list all active alerts"},
{"/api/v1/groupID/alertID/status", "get alert status by ID"},
// /metrics is served by httpserver by default
{"/metrics", "list of application metrics"},
{"/-/reload", "reload configuration"},
}
func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
once.Do(func() {
initLinks()
})
switch r.URL.Path {
case "/":
for _, path := range pathList {
p, doc := path[0], path[1]
fmt.Fprintf(w, "<a href='%s'>%q</a> - %s<br/>", p, p, doc)
if r.Method != "GET" {
return false
}
WriteWelcome(w)
return true
case "/alerts":
WriteListAlerts(w, rh.groupAlerts())
return true
case "/groups":
WriteListGroups(w, rh.groups())
return true
case "/api/v1/groups":
data, err := rh.listGroups()
if err != nil {
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.Header().Set("Content-Type", "application/json; charset=utf-8")
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/api/v1/alerts":
data, err := rh.listAlerts()
if err != nil {
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.Header().Set("Content-Type", "application/json; charset=utf-8")
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/-/reload":
@@ -61,14 +89,26 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
if !strings.HasSuffix(r.URL.Path, "/status") {
return false
}
// /api/v1/<groupName>/<alertID>/status
data, err := rh.alert(r.URL.Path)
alert, err := rh.alertByPath(strings.TrimPrefix(r.URL.Path, "/api/v1/"))
if err != nil {
httpserver.Errorf(w, r, "error in %q: %s", r.URL.Path, err)
httpserver.Errorf(w, r, "%s", err)
return true
}
w.Header().Set("Content-Type", "application/json; charset=utf-8")
w.Write(data)
// /api/v1/<groupID>/<alertID>/status
if strings.HasPrefix(r.URL.Path, "/api/v1/") {
data, err := json.Marshal(alert)
if err != nil {
httpserver.Errorf(w, r, "failed to marshal alert: %s", err)
return true
}
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
}
// <groupID>/<alertID>/status
WriteAlert(w, alert)
return true
}
}
@@ -80,20 +120,25 @@ type listGroupsResponse struct {
Status string `json:"status"`
}
func (rh *requestHandler) listGroups() ([]byte, error) {
func (rh *requestHandler) groups() []APIGroup {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
lr := listGroupsResponse{Status: "success"}
var groups []APIGroup
for _, g := range rh.m.groups {
lr.Data.Groups = append(lr.Data.Groups, g.toAPI())
groups = append(groups, g.toAPI())
}
// sort list of alerts for deterministic output
sort.Slice(lr.Data.Groups, func(i, j int) bool {
return lr.Data.Groups[i].Name < lr.Data.Groups[j].Name
sort.Slice(groups, func(i, j int) bool {
return groups[i].Name < groups[j].Name
})
return groups
}
func (rh *requestHandler) listGroups() ([]byte, error) {
lr := listGroupsResponse{Status: "success"}
lr.Data.Groups = rh.groups()
b, err := json.Marshal(lr)
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{
@@ -111,6 +156,30 @@ type listAlertsResponse struct {
Status string `json:"status"`
}
func (rh *requestHandler) groupAlerts() []GroupAlerts {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
var groupAlerts []GroupAlerts
for _, g := range rh.m.groups {
var alerts []*APIAlert
for _, r := range g.Rules {
a, ok := r.(*AlertingRule)
if !ok {
continue
}
alerts = append(alerts, a.AlertsAPI()...)
}
if len(alerts) > 0 {
groupAlerts = append(groupAlerts, GroupAlerts{
Group: g.toAPI(),
Alerts: alerts,
})
}
}
return groupAlerts
}
func (rh *requestHandler) listAlerts() ([]byte, error) {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
@@ -141,18 +210,17 @@ func (rh *requestHandler) listAlerts() ([]byte, error) {
return b, nil
}
func (rh *requestHandler) alert(path string) ([]byte, error) {
func (rh *requestHandler) alertByPath(path string) (*APIAlert, error) {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
parts := strings.SplitN(strings.TrimPrefix(path, "/api/v1/"), "/", 3)
parts := strings.SplitN(strings.TrimLeft(path, "/"), "/", 3)
if len(parts) != 3 {
return nil, &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf(`path %q cointains /status suffix but doesn't match pattern "/group/alert/status"`, path),
Err: fmt.Errorf(`path %q cointains /status suffix but doesn't match pattern "/groupID/alertID/status"`, path),
StatusCode: http.StatusBadRequest,
}
}
groupID, err := uint64FromPath(parts[0])
if err != nil {
return nil, badRequest(fmt.Errorf(`cannot parse groupID: %w`, err))
@@ -165,7 +233,7 @@ func (rh *requestHandler) alert(path string) ([]byte, error) {
if err != nil {
return nil, errResponse(err, http.StatusNotFound)
}
return json.Marshal(resp)
return resp, nil
}
func uint64FromPath(path string) (uint64, error) {

305
app/vmalert/web.qtpl Normal file
View File

@@ -0,0 +1,305 @@
{% package main %}
{% import (
"time"
"sort"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/tpl"
) %}
{% func Welcome() %}
{%= tpl.Header("vmalert", navItems) %}
<p>
API:<br>
{% for _, p := range apiLinks %}
{%code
p, doc := p[0], p[1]
%}
<a href="{%s p %}">{%s p %}</a> - {%s doc %}<br/>
{% endfor %}
</p>
{%= tpl.Footer() %}
{% endfunc %}
{% func ListGroups(groups []APIGroup) %}
{%= tpl.Header("Groups", navItems) %}
{% if len(groups) > 0 %}
{%code
rOk := make(map[string]int)
rNotOk := make(map[string]int)
for _, g := range groups {
for _, r := range g.AlertingRules{
if r.LastError != "" {
rNotOk[g.Name]++
} else {
rOk[g.Name]++
}
}
for _, r := range g.RecordingRules{
if r.LastError != "" {
rNotOk[g.Name]++
} else {
rOk[g.Name]++
}
}
}
%}
<a class="btn btn-primary" role="button" onclick="collapseAll()">Collapse All</a>
<a class="btn btn-primary" role="button" onclick="expandAll()">Expand All</a>
{% for _, g := range groups %}
<div class="group-heading{% if rNotOk[g.Name] > 0 %} alert-danger{% endif %}" data-bs-target="rules-{%s g.ID %}">
<span class="anchor" id="group-{%s g.ID %}"></span>
<a href="#group-{%s g.ID %}">{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %} (every {%s g.Interval %})</a>
{% if rNotOk[g.Name] > 0 %}<span class="badge bg-danger" title="Number of rules with status Error">{%d rNotOk[g.Name] %}</span> {% endif %}
<span class="badge bg-success" title="Number of rules withs status Ok">{%d rOk[g.Name] %}</span>
<p class="fs-6 fw-lighter">{%s g.File %}</p>
{% if len(g.Params) > 0 %}
<div class="fs-6 fw-lighter">Extra params
{% for _, param := range g.Params %}
<span class="float-left badge bg-primary">{%s param %}</span>
{% endfor %}
</div>
{% endif %}
</div>
<div class="collapse" id="rules-{%s g.ID %}">
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
<th scope="col">Rule</th>
<th scope="col" title="Shows if rule's execution ended with error">Error</th>
<th scope="col" title="How many samples were produced by the rule">Samples</th>
<th scope="col" title="How many seconds ago rule was executed">Updated</th>
</tr>
</thead>
<tbody>
{% for _, ar := range g.AlertingRules %}
<tr{% if ar.LastError != "" %} class="alert-danger"{% endif %}>
<td>
<b>alert:</b> {%s ar.Name %} (for: {%v ar.For %})<br>
<code><pre>{%s ar.Expression %}</pre></code><br>
{% if len(ar.Labels) > 0 %} <b>Labels:</b>{% endif %}
{% for k, v := range ar.Labels %}
<span class="ms-1 badge bg-primary">{%s k %}={%s v %}</span>
{% endfor %}
</td>
<td><div class="error-cell">{%s ar.LastError %}</div></td>
<td>{%d ar.LastSamples %}</td>
<td>{%f.3 time.Since(ar.LastExec).Seconds() %}s ago</td>
</tr>
{% endfor %}
{% for _, rr := range g.RecordingRules %}
<tr>
<td>
<b>record:</b> {%s rr.Name %}<br>
<code><pre>{%s rr.Expression %}</pre></code>
{% if len(rr.Labels) > 0 %} <b>Labels:</b>{% endif %}
{% for k, v := range rr.Labels %}
<span class="ms-1 badge bg-primary">{%s k %}={%s v %}</span>
{% endfor %}
</td>
<td><div class="error-cell">{%s rr.LastError %}</div></td>
<td>{%d rr.LastSamples %}</td>
<td>{%f.3 time.Since(rr.LastExec).Seconds() %}s ago</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% endfor %}
{% else %}
<div>
<p>No items...</p>
</div>
{% endif %}
{%= tpl.Footer() %}
{% endfunc %}
{% func ListAlerts(groupAlerts []GroupAlerts) %}
{%= tpl.Header("Alerts", navItems) %}
{% if len(groupAlerts) > 0 %}
<a class="btn btn-primary" role="button" onclick="collapseAll()">Collapse All</a>
<a class="btn btn-primary" role="button" onclick="expandAll()">Expand All</a>
{% for _, ga := range groupAlerts %}
{%code g := ga.Group %}
<div class="group-heading alert-danger" data-bs-target="rules-{%s g.ID %}">
<span class="anchor" id="group-{%s g.ID %}"></span>
<a href="#group-{%s g.ID %}">{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %}</a>
<span class="badge bg-danger" title="Number of active alerts">{%d len(ga.Alerts) %}</span>
<br>
<p class="fs-6 fw-lighter">{%s g.File %}</p>
</div>
{%code
var keys []string
alertsByRule := make(map[string][]*APIAlert)
for _, alert := range ga.Alerts {
if len(alertsByRule[alert.RuleID]) < 1 {
keys = append(keys, alert.RuleID)
}
alertsByRule[alert.RuleID] = append(alertsByRule[alert.RuleID], alert)
}
sort.Strings(keys)
%}
<div class="collapse" id="rules-{%s g.ID %}">
{% for _, ruleID := range keys %}
{%code
defaultAR := alertsByRule[ruleID][0]
var labelKeys []string
for k := range defaultAR.Labels {
labelKeys = append(labelKeys, k)
}
sort.Strings(labelKeys)
%}
<br>
<b>alert:</b> {%s defaultAR.Name %} ({%d len(alertsByRule[ruleID]) %})
| <span><a target="_blank" href="{%s defaultAR.SourceLink %}">Source</a></span>
<br>
<b>expr:</b><code><pre>{%s defaultAR.Expression %}</pre></code>
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
<th scope="col">Labels</th>
<th scope="col">State</th>
<th scope="col">Active at</th>
<th scope="col">Value</th>
<th scope="col">Link</th>
</tr>
</thead>
<tbody>
{% for _, ar := range alertsByRule[ruleID] %}
<tr>
<td>
{% for _, k := range labelKeys %}
<span class="ms-1 badge bg-primary">{%s k %}={%s ar.Labels[k] %}</span>
{% endfor %}
</td>
<td>{%= badgeState(ar.State) %}</td>
<td>
{%s ar.ActiveAt.Format("2006-01-02T15:04:05Z07:00") %}
{% if ar.Restored %}{%= badgeRestored() %}{% endif %}
</td>
<td>{%s ar.Value %}</td>
<td>
<a href="/{%s g.ID %}/{%s ar.ID %}/status">Details</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
{% endfor %}
</div>
<br>
{% endfor %}
{% else %}
<div>
<p>No items...</p>
</div>
{% endif %}
{%= tpl.Footer() %}
{% endfunc %}
{% func Alert(alert *APIAlert) %}
{%= tpl.Header("", navItems) %}
{%code
var labelKeys []string
for k := range alert.Labels {
labelKeys = append(labelKeys, k)
}
sort.Strings(labelKeys)
var annotationKeys []string
for k := range alert.Annotations {
annotationKeys = append(annotationKeys, k)
}
sort.Strings(annotationKeys)
%}
<div class="display-6 pb-3 mb-3">{%s alert.Name %}<span class="ms-2 badge {% if alert.State=="firing" %}bg-danger{% else %} bg-warning text-dark{% endif %}">{%s alert.State %}</span></div>
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Active at
</div>
<div class="col">
{%s alert.ActiveAt.Format("2006-01-02T15:04:05Z07:00") %}
</div>
</div>
</div>
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Expr
</div>
<div class="col">
<code><pre>{%s alert.Expression %}</pre></code>
</div>
</div>
</div>
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Labels
</div>
<div class="col">
{% for _, k := range labelKeys %}
<span class="m-1 badge bg-primary">{%s k %}={%s alert.Labels[k] %}</span>
{% endfor %}
</div>
</div>
</div>
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Annotations
</div>
<div class="col">
{% for _, k := range annotationKeys %}
<b>{%s k %}:</b><br>
<p>{%s alert.Annotations[k] %}</p>
{% endfor %}
</div>
</div>
</div>
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Group
</div>
<div class="col">
<a target="_blank" href="/groups#group-{%s alert.GroupID %}">{%s alert.GroupID %}</a>
</div>
</div>
</div>
<div class="container border-bottom p-2">
<div class="row">
<div class="col-2">
Source link
</div>
<div class="col">
<a target="_blank" href="{%s alert.SourceLink %}">Link</a>
</div>
</div>
</div>
{%= tpl.Footer() %}
{% endfunc %}
{% func badgeState(state string) %}
{%code
badgeClass := "bg-warning text-dark"
if state == "firing" {
badgeClass = "bg-danger"
}
%}
<span class="badge {%s badgeClass %}">{%s state %}</span>
{% endfunc %}
{% func badgeRestored() %}
<span class="badge bg-warning text-dark" title="Alert state was restored after the service restart from remote storage">restored</span>
{% endfunc %}

1014
app/vmalert/web.qtpl.go Normal file

File diff suppressed because it is too large Load Diff

Some files were not shown because too many files have changed in this diff Show More