Add a new `Kafka (Enterprise)` row to both vmagent dashboards:
- `dashboards/vmagent.json`
- `dashboards/vm/vmagent.json`
The row is placed before `Drilldown` and contains three Kafka-specific
panels:
- `Kafka bytes`
- `Kafka messages in/out`
- `Kafka and consumer errors`
The goal is to provide a compact Kafka-focused view for enterprise
vmagent deployments without duplicating the existing generic remote
write panels such as connection saturation and persistent queue size.
The new row helps distinguish:
- producer vs consumer throughput at the Kafka topic level
- message-rate shifts that may indicate smaller Kafka payloads and
higher per-message overhead
- producer-side Kafka errors vs consumer-side Kafka errors
Descriptions include links to the relevant Kafka documentation sections.
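A sketch of the kind of per-topic queries such panels rely on (the `vmagent_kafka_*` metric names below are hypothetical placeholders, not the exact Enterprise metric names):

```promql
# Producer vs consumer message throughput at the Kafka topic level.
sum(rate(vmagent_kafka_messages_produced_total[$__rate_interval])) by (topic)
sum(rate(vmagent_kafka_messages_consumed_total[$__rate_interval])) by (topic)

# Producer-side vs consumer-side Kafka errors.
sum(rate(vmagent_kafka_producer_errors_total[$__rate_interval])) by (topic)
sum(rate(vmagent_kafka_consumer_errors_total[$__rate_interval])) by (topic)
```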
PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10728
---------
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Metadata is enabled by default since v1.137.0, and the metadata volume
can be a big contributor to resource usage and network traffic.
vmagent dashboard:
1. `Troubleshooting` section: rename `Datapoints rate` panel to `Rows
rate` to include metadata rate;
2. `Ingestion` section: add metadata rate to existing `Rows rate` panel.
(The difference between this panel and the one above is that this panel
only contains data from write requests, while the panel above also
includes data from scraping; see the query sketch below.)
vmcluster dashboard:
1. `vminsert` section: add `Rows rate` panel
Didn’t see a good place for it in the vmsingle dashboard, since it
doesn’t have a dedicated insert section, and I don’t want to add it to
`overview` yet.
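A sketch of the kind of queries behind a combined `Rows rate` panel (the metadata metric name below is a placeholder, not necessarily the one used in the dashboards):

```promql
# Samples ingested per second, by ingestion protocol.
sum(rate(vm_rows_inserted_total[$__rate_interval])) by (type)

# Metadata entries ingested per second
# (hypothetical metric name, for illustration only).
sum(rate(vm_metadata_rows_inserted_total[$__rate_interval])) by (type)
```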
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10868
This commit adds new metrics `vmalert_remotewrite_queue_capacity` and
`vmalert_remotewrite_queue_size`. The size gauge is updated on each push,
so its update frequency depends on `-remoteWrite.concurrency` and
`-remoteWrite.flushInterval`.
It doesn't account for the pending data within each pusher's request, but it
should provide a general indication of queue usage.
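A sketch of how the new gauges can be combined to watch queue saturation (panel expressions may differ):

```promql
# Fraction of the vmalert remote-write queue currently in use;
# values close to 1 indicate the queue is about to overflow.
vmalert_remotewrite_queue_size / vmalert_remotewrite_queue_capacity
```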
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10765
Changes:
- Added the number of `pending alerts` and `firing alerts` (see the query sketch after this list)
- Improved `transformations` for the `FIRING over time by group and rules` panel
- Added sorting for the `FIRING over time by rule` panel
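Queries along these lines back the new panels (a sketch; the exact expressions and grouping labels in the dashboard may differ):

```promql
# Alerts currently in the firing state, per group.
sum(vmalert_alerts_firing) by (group)

# Alerts currently in the pending state, per group.
sum(vmalert_alerts_pending) by (group)
```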
Signed-off-by: sias32 <sias.32@yandex.ru>
Co-authored-by: Max Kotliar <mkotlyar@victoriametrics.com>
Before, by mistake, the datasource was referenced by input name instead
of variable name. For an unknown reason, it worked well in the local setup
and on the playground.
The fix is confirmed by users and keeps working in the local setup
and on the playground.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
The new Grafana dashboard uses the following APIs:
- /api/v1/status/tsdb
- /api/v1/status/metric_names_stats
It shows the list of metric names, their request counts, and the last time
they were "used". Clicking on a metric name allows exploring its
cardinality.
Based on https://github.com/VictoriaMetrics/VictoriaMetrics/issues/9832
-----------
The PR contains a few unrelated changes:
* rename of the folder for the Prometheus datasource to remove the duplicated
word
* fix for vmalert's access to the datasource, as before it wasn't able
to read/write properly
-------------
The dashboard screencast:
https://github.com/user-attachments/assets/01dda5d9-14e5-4f5a-b795-a838abec4f5e
---------
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Co-authored-by: Haley Wang <haley@victoriametrics.com>
* add a meaningful description, as it is required for publishing on grafana.com
* remove the dependency on `victoriametrics-metrics-datasource`, as it is not used
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Components like vmselect and vminsert rarely touch disk, so most of the
time their disk I/O values are 0. Filtering out 0 values makes the panel cleaner.
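One way to express such filtering at the query level (a sketch; the panel may instead hide empty series via display options):

```promql
# Disk write rate per instance; the trailing `> 0` drops series that are
# flat zero, e.g. vmselect/vminsert instances that don't touch disk.
sum(rate(process_io_storage_written_bytes_total[$__rate_interval])) by (instance) > 0
```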
Signed-off-by: hagen1778 <roman@victoriametrics.com>
### Describe Your Changes
Add a Source Code data link (click a bar or line in a graph to follow it) that
points directly to a source code file on GitHub. The `VictoriaMetrics -
cluster`, `VictoriaMetrics - single-node`, and `VictoriaMetrics -
vmagent` dashboards were updated. I did not add it to other panels since
they do not have a Drilldown section at all.
Also, fixed a misplaced Drilldown link in the `VictoriaMetrics -
single-node` dashboard.
The proxy service code is here:
https://github.com/VictoriaMetrics/location2source/
### Describe Your Changes
The new title better aligns with the code of
[writeconcurrencylimiter](d9dabea303/lib/writeconcurrencylimiter/concurrencylimiter.go (L140)),
the panel description and the metric used in the query.
Previously, the panel title suggested that it reflected only disk write
performance. During an incident investigation, this led to a wrong
assumption that the panel was unrelated to client-side performance.
In reality, the metric [includes the full write
path](98e320842c/lib/vminsertapi/server.go (L263)):
time spent reading data from the TCP connection, processing it, and
acknowledging the block. The updated title reflects this behavior more
accurately and reduces the risk of misinterpretation during incident
analysis.
This simplifies troubleshooting via the vm_log_messages_total metric
when logs are unavailable. Logs may be unavailable when the -loggerLevel command-line
flag is set to a value other than INFO. They may also be unavailable when clients
use the Monitoring of Monitoring service ( https://victoriametrics.com/products/mom/ ),
which provides metrics, but doesn't provide logs from VictoriaMetrics components
running on the client side.
Add an `is_printed` label to the `vm_log_messages_total` metric in order to detect whether
the given log message has been suppressed or printed.
See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10304
While at it, make the description of the TooManyLogs alert more readable;
the alert is based on the vm_log_messages_total metric.
Also restore the `level!="info"` filter instead of `level="error"`
in the query for this alerting rule, in order to be consistent with the queries
in the official dashboards for VictoriaMetrics components.
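For reference, the rule's query has roughly the following shape (a sketch; the grouping labels here are assumptions, not the exact rule definition):

```promql
# Rate of non-info log messages over the last 5 minutes; the restored
# level!="info" filter matches the official dashboards' queries.
sum(increase(vm_log_messages_total{level!="info"}[5m])) by (job, instance) > 0
```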
TODO: investigate the too-high warnings rate at https://github.com/VictoriaMetrics/VictoriaMetrics/pull/2760
and fix it at the source of these warnings instead of modifying the query
for the TooManyLogs alert.
### Describe Your Changes
The Backups errors panel uses a hard-coded rate interval. When looking over a
large period of time, the displayed value would likely stay low due to the
hard-coded interval, while in reality the number of errors is much larger.
This change addresses this by using the `$__rate_interval` variable in Grafana, so
the rate interval aligns with the date/time range selected in Grafana.
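A sketch of the updated expression shape (the metric name below is illustrative, not necessarily the one used by the panel):

```promql
# Backup errors rate with Grafana's $__rate_interval, so the lookbehind
# window scales with the selected date/time range.
sum(rate(vm_backup_errors_total[$__rate_interval])) by (instance)
```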
Add the `vm_rollup_result_cache_requests_total` metric, which tracks the
number of requests to the query rollup cache.
As described in
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10117, when
retrieving cached data from the rollup result cache, there can be mixed
`get()` and `getBig()` calls to the underlying fastcache. And it's
unpredictable how many times `getBig()` will call `get()`, so the
metrics from fastcache cannot be used to indicate query cache miss
ratio.
Exposing a new counter `vm_rollup_result_cache_requests_total` to track
the number of requests to the query rollup cache, together with the
existing `vm_rollup_result_cache_miss_total`, allows for monitoring the
rollup cache miss rate per query (or subquery), which is more
user-facing.
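With both counters available, the miss ratio can be computed as follows:

```promql
# Rollup result cache miss ratio: misses divided by total requests.
rate(vm_rollup_result_cache_miss_total[$__rate_interval])
/
rate(vm_rollup_result_cache_requests_total[$__rate_interval])
```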
fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10117
related to https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5056
follow up https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10177
Add `vmauth_user_request_backend_requests_total` and
`vmauth_unauthorized_user_request_backend_requests_total`, which track
the number of user request errors and are aligned with
`vmauth_user_requests_total`.
The existing `vmauth_http_request_errors_total` currently only counts
requests with `invalid_auth_token`. Once authorization has passed, any
subsequent request errors are tracked under
`xxx_user_request_backend_requests_total`.
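A sketch of how the new counters can be used (the `code` and `username` labels here are assumptions about the metrics' labeling, not confirmed by the source):

```promql
# Share of per-user backend requests that end in 5xx responses.
sum(rate(vmauth_user_request_backend_requests_total{code=~"5.."}[$__rate_interval])) by (username)
/
sum(rate(vmauth_user_request_backend_requests_total[$__rate_interval])) by (username)
```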
Right now we have two separate panels: RSS memory % usage and RSS
anonymous memory % usage. This makes trend comparison difficult because
one has to visually correlate two independent panels. Another problem
is that these panels don't show Go runtime allocations at all. The same
applies to memory allocated in C: there are allocations in C (zstd) one
should account for, but there isn't even a metric to expose them.
The commit adds a Memory usage breakdown panel to the Drilldown section. It
provides insight into the distribution of Go Stack, Go Heap, Go Heap Released,
Go Other, Mmap: VM Cache, and File cache memory.
It should help spot trend changes in memory by type or investigate
issues such as #10069 and #10028 more easily.
Panel info:
This panel shows memory usage by category.
How to use:
- Start from the high-level RSS panel.
- Identify an instance with unexpected or abnormal memory growth.
- Filter to that instance to inspect the detailed breakdown here.
Interpretation:
- A steadily rising Go Heap usually indicates a memory leak. Collect
pprof memory profile.
- A growing Go Stack commonly points to a goroutine leak.
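Most of the categories map to standard Go runtime and process metrics; a rough sketch of the series behind them (the exact panel expressions, especially for the Mmap: VM Cache and File cache components, may differ):

```promql
# Go Stack: memory currently in use for goroutine stacks.
go_memstats_stack_inuse_bytes

# Go Heap: heap memory currently in use.
go_memstats_heap_inuse_bytes

# Go Heap Released: heap memory returned to the OS.
go_memstats_heap_released_bytes

# Anonymous RSS, for comparison with the high-level RSS panel.
process_resident_memory_anon_bytes
```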
<img width="1508" height="628" alt="Screenshot 2025-12-08 at 13 18 44"
src="https://github.com/user-attachments/assets/0e794324-e86d-468e-b926-8bb11f5a2043"
/>
<img width="1503" height="674" alt="Screenshot 2025-12-08 at 13 19 34"
src="https://github.com/user-attachments/assets/62fc3fff-33b3-4dfe-ad3f-ad0526a8a606"
/>
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/10139
Add ad-hoc filters to query stats and operator dashboards.
These filters are useful for exploring non-uniform metric sets
without distinct job/instance filters.