VictoriaMetrics

mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git synced 2026-06-28 21:18:23 +03:00

Author	SHA1	Message	Date
Hui Wang	3e834b5853	vmauth: add new counters to track the number of user request errors follow up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10177 Add `vmauth_user_request_backend_requests_total` and `vmauth_unauthorized_user_request_backend_requests_total` which track the number of user request errors, and aligned with `vmauth_user_requests_total`. The existing `vmauth_http_request_errors_total` currently only counts requests with `invalid_auth_token`. Once authorization has passed, any subsequent request errors are tracked under `xxx_user_request_backend_requests_total`.	2025-12-22 13:06:42 +01:00
Aliaksandr Valialkin	ebb5ccbfcf	deployment/docker/rules/alerts-health.yml: clarify the description of the TooManyTSIDMisses alert after the commit `30641b201b` It is expected that the number of TSIDs misses over the last 5 minutes is zero in steady state. If it is non-zero, then something wrong happens. That's why it is better to use increase() instead of rate() function for this alert.	2025-11-10 14:37:36 +01:00
Aliaksandr Valialkin	d7e4d8aa7a	deployment/docker/rules/alerts-health.yml: clarify the description for the TooManyTSIDMisses alert This alert is expected after unclean shutdown (OOM, power off, kill -9) of VictoriaMetrics. It should go away in a few minutes after the restart while VictoriaMetrics deletes metricIDs for the missing MetricID->TSID entries which were created for the newly registered time series just before unclean shutdown. It is OK to delete such metricIDs, since the corresponding time series will be re-registered again. See the commit `20812008a7` . Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3502	2025-11-10 14:31:30 +01:00
Andrii Chubatiuk	ea172f7bc7	{dashboards,rules}: update storage ETA calculations in both dashboards and rules (#9848) Currently the index file size is calculated as an average across all storages from all clusters; this change updates it to produce more valid calculations. The PR also [fixes a helm chart issue](https://github.com/VictoriaMetrics/helm-charts/issues/2474). ### Checklist The following checks are mandatory: - [ ] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist). - [ ] My change adheres to [VictoriaMetrics development goals](https://docs.victoriametrics.com/victoriametrics/goals/).	2025-10-17 12:50:56 +03:00
Andrii Chubatiuk	d8ec4894b5	deployment/rules: set proper job filters for rules (#9587 ) ### Describe Your Changes related issue https://github.com/VictoriaMetrics/helm-charts/issues/2350 ### Checklist The following checks are mandatory: - [ ] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist). - [ ] My change adheres to [VictoriaMetrics development goals](https://docs.victoriametrics.com/victoriametrics/goals/). (cherry picked from commit `7e05200c60`)	2025-08-21 15:37:24 +02:00
Corporte Gadfly	6494508f00	fix typo in sentence	2025-08-18 22:54:35 +02:00
Fred Navruzov	28f1986a0c	docs/vmanomaly: release v1.25.1 (#9496 ) ### Describe Your Changes Documentation updates for vmanomaly release v1.25.1 ### Checklist The following checks are mandatory: - [x] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/#pull-request-checklist). - [ ] My change adheres to [VictoriaMetrics development goals](https://docs.victoriametrics.com/victoriametrics/goals/).	2025-07-25 15:48:25 +02:00
Aliaksandr Valialkin	23222ecd69	deployment/docker: remove all the code related to VictoriaLogs, since it has been migrated to https://github.com/VictoriaMetrics/VictoriaLogs/	2025-07-07 03:43:19 +02:00
Hui Wang	77a754678a	alerts: fix the alerting rule `ScrapePoolHasNoTargets` (#9045 ) as it may cause false positive in [sharding mode](https://docs.victoriametrics.com/victoriametrics/vmagent/#scraping-big-number-of-targets) related https://github.com/VictoriaMetrics/helm-charts/issues/2200 (cherry picked from commit `309f1898b3`)	2025-05-29 11:52:02 +02:00
Hui Wang	87ba3d429a	alerts: improve disk full estimation (#8955 ) enhance alerting rule `DiskRunsOutOfSpaceIn3Days` and `NodeBecomesReadonlyIn3Days` to account for [deduplication](https://docs.victoriametrics.com/#deduplication) and [indexDB](https://docs.victoriametrics.com/#indexdb) when calculating disk consumption rate. And try bring the `Storage full ETA` panel back. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `454ad7a1b4`)	2025-05-16 16:08:45 +02:00
Roman Khavronenko	c10b6d4542	deployment: update vlogs demo installation (#8894) * move example alerts out of /rules folder, since this folder should contain only useful rules * expose ports for vlogs and vmetrics for local debug * add some comments explaining vmalert config ### Checklist The following checks are mandatory: - [ ] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `001dc7c985`)	2025-05-08 14:20:43 +02:00
Roman Khavronenko	451aa69164	deployment/rules: add alerting rule `ScrapePoolHasNoTargets` to vmagent (#8868 ) The new rule should notify user when there is a job with 0 configured or discovered targets, which is usually a sign of misconfiguration. ### Checklist The following checks are mandatory: - [ ] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/victoriametrics/contributing/). Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `b5c9284748`)	2025-05-05 15:57:10 +02:00
Aliaksandr Valialkin	9df1e8e71e	use new canonical urls to stream-aggregation docs: https://docs.victoriametrics.com/victoriametrics/stream-aggregation/ This avoids a redirect from the old link https://docs.victoriametrics.com/stream-aggregation/ to https://docs.victoriametrics.com/victoriametrics/stream-aggregation/ , and fixes `backwards` navigation for these links across VictoriaMetrics docs. This is a follow-up for `f152021521` See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8595#issuecomment-2831598274	2025-04-30 22:48:10 +02:00
Aliaksandr Valialkin	a9e637162e	use new canonical urls to troubleshooting docs: https://docs.victoriametrics.com/victoriametrics/troubleshooting/ This avoids a redirect from the old link https://docs.victoriametrics.com/troubleshooting/ to https://docs.victoriametrics.com/victoriametrics/troubleshooting/ , and fixes `backwards` navigation for these links across VictoriaMetrics docs. This is a follow-up for `f152021521` See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8595#issuecomment-2831598274	2025-04-30 17:32:22 +02:00
Aliaksandr Valialkin	7f1febcd4b	all: use new canonical urls to vmauth docs: https://docs.victoriametrics.com/victoriametrics/vmauth/ This avoids a redirect from the old link https://docs.victoriametrics.com/vmauth/ to https://docs.victoriametrics.com/victoriametrics/vmauth/ , and fixes `backwards` navigation for these links across VictoriaMetrics docs. This is a follow-up for `f152021521` See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8595#issuecomment-2831598274	2025-04-30 16:40:36 +02:00
Aliaksandr Valialkin	f740f1106d	all: use new canonical urls to vmalert docs: https://docs.victoriametrics.com/victoriametrics/vmalert/ This avoids a redirect from the old link https://docs.victoriametrics.com/vmalert/ to https://docs.victoriametrics.com/victoriametrics/vmalert/ , and fixes `backwards` navigation for these links across VictoriaMetrics docs. This is a follow-up for `f152021521` See https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8595#issuecomment-2831598274	2025-04-30 16:04:18 +02:00
Aliaksandr Valialkin	d756e83f80	all: consistently use Grafana dashboard links ending with dashboard ID The suffix after the ID may change in dashboard settings. If it changes, the link becomes broken (dubious decision at grafana.com/grafana/dashboards/ ). That's why it is better to drop all the suffixes and use links for Grafana dashboards ending with IDs. In this case they are automatically redirected to the url with the proper suffix. This is a follow-up for `3e4c38c56c` See also the previous commits `9c0863babc` and `0a5ffb3bc1`	2025-04-22 14:20:55 +02:00
Andrii Chubatiuk	57aadb94a2	deployment/docker: added victorialogs cluster docker compose setup (#8725 ) ### Describe Your Changes fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8694 additionally removed container_name, docker network, renamed all compose, config files for consistency ### Checklist The following checks are mandatory: - [ ] My change adheres to [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `f38736343d`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-04-18 14:33:31 +02:00
hagen1778	35b0233b5b	dashboards: rm ETA panel from single and cluster dashboards The panel was producing wrong predictions, as it is almost impossible to make precise predictions without running overly expensive queries. More details on why it is better to remove the panel than to fix it are here: https://github.com/VictoriaMetrics/VictoriaMetrics/pull/8492. This change also removes ETA panels from alerting rules annotations. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `ef16681dbf`)	2025-03-27 10:41:14 +01:00
Guillem Jover	1d8b7faf71	spelling and grammar fixes via codespell (#8497 ) ### Describe Your Changes Fix many spelling errors and some grammar, including misspellings in filenames. The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`. While this is a breaking change, this metric isn't used in alerts or dashboards. So it seems to have low impact on users. The change also deprecates `cspell` as it is much heavier and less usable. --------- Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com> Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com> (cherry picked from commit `76d205feae`) Signed-off-by: hagen1778 <roman@victoriametrics.com>	2025-03-17 16:38:11 +01:00
Roman Khavronenko	04a94793b7	deployment/rules: add alerting rule `TooHighQueryLoad` (#8365 ) TooHighQueryLoad should trigger when vmsingle or vmselect can't start processing read queries for last 15min. Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `8f87427c81`)	2025-02-25 09:30:21 +01:00
Artem Fetishev	1ea2586856	RemoteWriteConnectionIsSaturated alert: add another saturation cause to the alert description (#8195) ### Describe Your Changes Currently the alert description considers only one end of the connection (vmagent), while saturation can also be caused by slowness of the receiving components (vminsert, vmstorage). Update the alert description with a brief suggestion to also check the dashboards of these components. ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: Artem Fetishev <rtm@victoriametrics.com> Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `30af662d84`)	2025-02-01 22:31:56 +01:00
Mathias Palmersheim	fb76aad365	fixed #7804 Added NoSelfMonitoringMetrics rule (#7805 ) ### Describe Your Changes fixes #7804 by adding alert for missing uptime metric in vmanomaly ### Checklist The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-12-18 22:40:51 +01:00
f41gh7	78ad858ff7	app/{vminsert,vmagent}: drop time series on exceeding labels limits. Previously, time series with labels exceeding the configured limits were truncated and written to storage, potentially causing data inconsistency. This could lead to collisions between time series and make it difficult to identify the source due to truncated labels. This commit changes the behavior: * Such time series are now rejected outright. * Rejected time series are logged to stdout, and corresponding counters are incremented. * removes `vm_too_long_label_values_total`, `vm_too_long_label_names_total`, `vm_metrics_with_dropped_labels_total` metrics. * adds new values `[too_many_labels,too_long_label_name,too_long_label_value]` to `reason` label of the `vm_rows_ignored_total` metric name related issues: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6928 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7661	2024-12-10 22:15:38 +01:00
Andrii Chubatiuk	3be3705097	deployment/rules: updated sum expressions in alerts to be able to inject cluster labels in helm charts scripts (#7670) ### Describe Your Changes Many users run k8s-stack in multiple Kubernetes clusters, and configuring proper routing in Alertmanager requires supporting the `cluster` label in alerting rules. This is now implemented in helm-chart hack scripts, but it is tricky to determine whether the cluster label should be added when a function has no `by` expression. Updated existing alerts so that the cluster label can be injected later. Also take `storage.minFreeDiskSpaceBytes` into account in `DiskRunsOutOfSpace` alerts. ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Co-authored-by: Roman Khavronenko <roman@victoriametrics.com> (cherry picked from commit `fefa3e7936`)	2024-12-09 12:23:31 +01:00
Hui Wang	5f9db9a61f	alerts-vmalert: reserve rule name for description (#7659 )	2024-11-26 18:50:30 +01:00
Fred Navruzov	e53a69c27d	docs/vmanomaly: add self-monitoring section (#7558 ) - Added self-monitoring guide for `vmanomaly`. - Added cross-referencing on other pages. - Slight improvements in wording on related pages - Update references to v1.18.4 - [x] publish Grafana dashboard to https://grafana.com/orgs/victoriametrics/dashboards: https://grafana.com/grafana/dashboards/22337-victoriametrics-vmanomaly/ @AndrewChubatiuk , JFYI if it somehow impacts your work on supporting `vmanomaly` in operator. The following checks are mandatory: - [x] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).	2024-11-20 16:32:45 +01:00
Hui Wang	aa817602fa	add vlogs type of rule in example (#7548 )	2024-11-20 16:31:55 +01:00
Hui Wang	1bff6c1bbd	dashboards: add `file` label filter to vmalert dashboard panels (#7515 ) Previously, metrics from groups with the same name but in different files could be mixed in the results. e.g. the evaluation time [here](https://grafana.maas.victoriametrics.com/d/LzldHAVnz/victoriametrics-vmalert?orgId=1&var-ds=PE8D8DB4BEE4E4B22&var-job=All&var-instance=All&var-file=%2Fetc%2Fvmalert%2Fconfig%2Fvm-per-tenant-rulefiles-0%2Fmaas-tenant-1011-maas-1011-vm-health.yaml&var-group=All&var-topk=5&editPanel=23) is the total for multiple groups from different tenants.	2024-11-14 17:21:31 +01:00
Roman Khavronenko	d71c745d19	deployment/alerts: add RemoteWriteDroppingData to vmalert rules (#7393 ) ### Checklist The following checks are mandatory: - [ ] My change adheres [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/). --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `3f0e2ab3b2`)	2024-10-31 14:11:10 +01:00
hagen1778	0e6ed4171b	deployment/alerts: consistently update path to alerting rules Follow-up after `68bad22fd2` Signed-off-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `6494606924`)	2024-10-30 16:44:51 +01:00
Hui Wang	9616814728	vmalert: integrate with victorialogs (#7255 ) address https://github.com/VictoriaMetrics/VictoriaMetrics/issues/6706. See https://github.com/VictoriaMetrics/VictoriaMetrics/blob/vmalert-support-vlog-ds/docs/VictoriaLogs/vmalert.md. Related fix https://github.com/VictoriaMetrics/VictoriaMetrics/pull/7254. Note: in this pull request, vmalert doesn't support [backfilling](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/vmalert-support-vlog-ds/docs/VictoriaLogs/vmalert.md#rules-backfilling) for rules with a customized time filter. It might be added in the future, see [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7289) for details. Feature can be tested with image `victoriametrics/vmalert:heads-vmalert-support-vlog-ds-0-g420629c-scratch`. --------- Signed-off-by: hagen1778 <roman@victoriametrics.com> Co-authored-by: hagen1778 <roman@victoriametrics.com> (cherry picked from commit `68bad22fd2`)	2024-10-29 16:32:00 +01:00

32 Commits