From 5114522186a9e70a0a896a9416827c980c1d2cf8 Mon Sep 17 00:00:00 2001 From: Pablo Fernandez <46322567+TomFern@users.noreply.github.com> Date: Tue, 12 May 2026 17:29:51 +0100 Subject: [PATCH] grammar and proofread pass --- docs/victoriametrics/vmagent.md | 477 ++++++++++++++++---------------- 1 file changed, 237 insertions(+), 240 deletions(-) diff --git a/docs/victoriametrics/vmagent.md b/docs/victoriametrics/vmagent.md index 2970795f8b..511c98dd35 100644 --- a/docs/victoriametrics/vmagent.md +++ b/docs/victoriametrics/vmagent.md @@ -16,8 +16,8 @@ aliases: `vmagent` is a tiny agent that helps you collect metrics from various sources, [relabel and filter the collected metrics](https://docs.victoriametrics.com/victoriametrics/relabeling/) and store them in [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics) -or any other storage systems via Prometheus `remote_write` protocol -or via [VictoriaMetrics `remote_write` protocol](#victoriametrics-remote-write-protocol). +or any other storage systems via the Prometheus `remote_write` protocol +or via the [VictoriaMetrics `remote_write` protocol](#victoriametrics-remote-write-protocol). See [Quick Start](#quick-start) for details. @@ -27,8 +27,8 @@ See [Quick Start](#quick-start) for details. ## Motivation While VictoriaMetrics provides an efficient solution to store and observe metrics, our users needed something fast -and RAM friendly to scrape metrics from Prometheus-compatible exporters into VictoriaMetrics. -Also, we found that our user's infrastructure are like snowflakes in that no two are alike. Therefore, we decided to add more flexibility +and RAM-friendly to scrape metrics from Prometheus-compatible exporters into VictoriaMetrics. +Also, we found that our users' infrastructures are like snowflakes in that no two are alike. 
Therefore, we decided to add more flexibility to `vmagent` such as the ability to [accept metrics via popular push protocols](#how-to-push-data-to-vmagent) and to [discover Prometheus-compatible targets and scrape metrics from them](#how-to-collect-metrics-in-prometheus-format). @@ -37,20 +37,20 @@ and to [discover Prometheus-compatible targets and scrape metrics from them](#ho * Can be used as a drop-in replacement for Prometheus for discovering and scraping targets such as [node_exporter](https://github.com/prometheus/node_exporter). Note that single-node VictoriaMetrics can also discover and scrape Prometheus-compatible targets in the same way `vmagent` does - see [these docs](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-scrape-prometheus-exporters-such-as-node-exporter). -* Can add, remove and modify labels (aka tags) via Prometheus relabeling and filter data before sending it to remote storage. See [these docs](https://docs.victoriametrics.com/victoriametrics/relabeling/) for details. -* Can accept data via all the ingestion protocols supported by VictoriaMetrics - see [these docs](#how-to-push-data-to-vmagent). -* Can aggregate incoming samples by time and by labels before sending them to remote storage - see [these docs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/). -* Can replicate collected metrics simultaneously to multiple Prometheus-compatible remote storage systems - see [these docs](#replication-and-high-availability). +* Can add, remove, and modify labels (aka tags) via Prometheus relabeling and filter data before sending it to remote storage. See [these docs](https://docs.victoriametrics.com/victoriametrics/relabeling/) for details. +* Can accept data via all the ingestion protocols supported by VictoriaMetrics. See [these docs](#how-to-push-data-to-vmagent). +* Can aggregate incoming samples by time and by labels before sending them to remote storage. 
See [these docs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/). +* Can replicate collected metrics simultaneously to multiple Prometheus-compatible remote storage systems. See [these docs](#replication-and-high-availability). * Can save egress network bandwidth usage costs when [VictoriaMetrics remote write protocol](#victoriametrics-remote-write-protocol) - is used for sending the data to VictoriaMetrics. + is used to send data to VictoriaMetrics. * Works smoothly in environments with unstable connections to remote storage. If the remote storage is unavailable, the collected metrics are buffered at `-remoteWrite.tmpDataPath`. The buffered metrics are sent to remote storage as soon as the connection to the remote storage is repaired. The maximum disk usage for the buffer can be limited with `-remoteWrite.maxDiskUsagePerURL`. -* Uses much lower amounts of RAM, CPU, disk IO and network bandwidth than Prometheus. The RAM usage and CPU usage can be reduced further +* Uses much lower amounts of RAM, CPU, disk IO, and network bandwidth than Prometheus. The RAM usage and CPU usage can be reduced further if needed according to [these docs](#performance-optimizations). -* Scrape targets can be spread among multiple `vmagent` instances when large number of targets must be scraped. See [these docs](#scraping-big-number-of-targets). +* Scrape targets can be spread among multiple `vmagent` instances when a large number of targets must be scraped. See [these docs](#scraping-big-number-of-targets). * Can load scrape configs from multiple files. See [these docs](#loading-scrape-configs-from-multiple-files). -* Can efficiently scrape targets that expose millions of time series such as [/federate endpoint in Prometheus](https://prometheus.io/docs/prometheus/latest/federation/). +* Can efficiently scrape targets that expose millions of time series, such as the [/federate endpoint in Prometheus](https://prometheus.io/docs/prometheus/latest/federation/). 
See [these docs](#stream-parsing-mode). * Can deal with [high cardinality](https://docs.victoriametrics.com/victoriametrics/faq/#what-is-high-cardinality) and [high churn rate](https://docs.victoriametrics.com/victoriametrics/faq/#what-is-high-churn-rate) issues by limiting the number of unique time series at scrape time @@ -61,16 +61,16 @@ and to [discover Prometheus-compatible targets and scrape metrics from them](#ho ## Quick Start -Please download `vmutils-*` archive from [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest) ( -`vmagent` is also available in docker images [Docker Hub](https://hub.docker.com/r/victoriametrics/vmagent/tags) and [Quay](https://quay.io/repository/victoriametrics/vmagent?tab=tags)), -unpack it and pass the following flags to the `vmagent` binary in order to start scraping Prometheus-compatible targets +Download the `vmutils-*` archive from the [releases page](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest) ( +`vmagent` is also available as Docker images on [Docker Hub](https://hub.docker.com/r/victoriametrics/vmagent/tags) and [Quay](https://quay.io/repository/victoriametrics/vmagent?tab=tags)), +unpack it, and pass the following flags to the `vmagent` binary in order to start scraping Prometheus-compatible targets and sending the data to the Prometheus-compatible remote storage: -* `-promscrape.config` with the path to [Prometheus config file](https://docs.victoriametrics.com/victoriametrics/sd_configs/) (usually located at `/etc/prometheus/prometheus.yml`). The path can point either to local file or to http url. See [scrape config examples](https://docs.victoriametrics.com/victoriametrics/scrape_config_examples/). 
- `vmagent` doesn't support some sections of Prometheus config file, so you may need either to delete these sections or +* `-promscrape.config` with the path to the [Prometheus config file](https://docs.victoriametrics.com/victoriametrics/sd_configs/) (usually located at `/etc/prometheus/prometheus.yml`). + The path can point either to a local file or to an HTTP URL. See [scrape config examples](https://docs.victoriametrics.com/victoriametrics/scrape_config_examples/). + `vmagent` doesn't support some sections of the Prometheus config file, so you may need either to delete these sections or to run `vmagent` with `-promscrape.config.strictParse=false` command-line flag. - In this case `vmagent` ignores unsupported sections. See [the list of unsupported sections](#unsupported-prometheus-config-sections). + In this case, `vmagent` ignores unsupported sections. See [the list of unsupported sections](#unsupported-prometheus-config-sections). * `-remoteWrite.url` of a Prometheus-compatible remote storage endpoint (e.g., VictoriaMetrics) to send data to. The `-remoteWrite.url` may refer to a [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) address. See [these docs](#srv-urls) for details. @@ -81,7 +81,7 @@ to [single-node VictoriaMetrics](https://docs.victoriametrics.com/victoriametric /path/to/vmagent -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write ``` -See [these docs](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format) if you need to write data to [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/). +See [these docs](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format) if you need to write data to a [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/). 
Example command for scraping Prometheus targets and writing the data to single-node VictoriaMetrics: @@ -92,9 +92,9 @@ Example command for scraping Prometheus targets and writing the data to single-n See [how to scrape Prometheus-compatible targets](#how-to-collect-metrics-in-prometheus-format) for more details. If you use single-node VictoriaMetrics, then you can discover and scrape Prometheus-compatible targets directly from VictoriaMetrics -without the need to use `vmagent` - see [these docs](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-scrape-prometheus-exporters-such-as-node-exporter). +without the need to use `vmagent`. See [these docs](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-scrape-prometheus-exporters-such-as-node-exporter). -`vmagent` can save network bandwidth usage costs under high load when [VictoriaMetrics remote write protocol is used](#victoriametrics-remote-write-protocol). +`vmagent` can reduce network bandwidth costs under high load when the [VictoriaMetrics remote write protocol is used](#victoriametrics-remote-write-protocol). Common `vmagent` issues are covered in the [troubleshooting docs](#troubleshooting). @@ -108,24 +108,24 @@ Pass `-help` to `vmagent` in order to see [the full list of supported command-li `vmagent` can run and collect metrics in IoT environments and industrial networks with unreliable or scheduled connections to remote storage. It buffers the collected data in local files until the connection to remote storage becomes available and then sends the buffered -data to the remote storage. It re-tries sending the data to remote storage until errors are resolved. +data to the remote storage. It retries sending the data to remote storage until errors are resolved. The maximum on-disk size for the buffered metrics can be limited with `-remoteWrite.maxDiskUsagePerURL`. 
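The buffering behavior described above is controlled entirely by command-line flags. A minimal sketch (the paths and the size limit are illustrative placeholders, not recommendations):

```sh
/path/to/vmagent \
  -promscrape.config=/etc/prometheus/prometheus.yml \
  -remoteWrite.url=https://victoria-metrics-host:8428/api/v1/write \
  -remoteWrite.tmpDataPath=/var/lib/vmagent-buffer \
  -remoteWrite.maxDiskUsagePerURL=10GB
```

With flags like these, samples that cannot be delivered are spooled under the `-remoteWrite.tmpDataPath` directory, and the on-disk backlog per remote storage URL is capped at roughly the `-remoteWrite.maxDiskUsagePerURL` value; once the cap is reached, the oldest buffered data may be dropped.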
-`vmagent` works on several common architectures used in IoT environments - 32-bit arm, 64-bit arm, ppc64, 386, amd64. +`vmagent` supports several common architectures used in IoT environments: 32-bit ARM, 64-bit ARM, PPC64, 386, and AMD64. -`vmagent` can save on network bandwidth usage costs by using [VictoriaMetrics remote write protocol](#victoriametrics-remote-write-protocol). +`vmagent` can reduce network bandwidth costs by using the [VictoriaMetrics remote write protocol](#victoriametrics-remote-write-protocol). See [how to optimize index size at VictoriaMetrics for IoT and industrial monitoring](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#index-tuning-for-low-churn-rate). ### Drop-in replacement for Prometheus -If you use Prometheus only for scraping metrics from various targets and forwarding these metrics to remote storage -then `vmagent` can replace Prometheus. Typically, `vmagent` requires lower amounts of RAM, CPU and network bandwidth compared with Prometheus. +If you use Prometheus only for scraping metrics from various targets and forwarding these metrics to remote storage, +then `vmagent` can replace Prometheus. Typically, `vmagent` requires less RAM, CPU, and network bandwidth than Prometheus. See [these docs](#how-to-collect-metrics-in-prometheus-format) for details. ### Statsd alternative -`vmagent` can be used as an alternative to [statsd](https://github.com/statsd/statsd) +`vmagent` can be used as an alternative to [StatsD](https://github.com/statsd/statsd) when [stream aggregation](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/) is enabled. See [these docs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#statsd-alternative) for details. 
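As an illustration of the StatsD-style use case, a minimal stream aggregation config could be passed to `vmagent` via the `-streamAggr.config` flag. The metric name and interval below are hypothetical:

```yaml
# Hypothetical stream aggregation config: collapse the matched counter samples
# into a single aggregated output per minute, similar to a StatsD flush interval.
- match: 'app_requests_total'
  interval: 1m
  outputs: [total]
```

See the stream aggregation docs linked above for the full list of supported `outputs` and matching options.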
@@ -133,14 +133,14 @@ See [these docs](https://docs.victoriametrics.com/victoriametrics/stream-aggrega `vmagent` can accept metrics in [various popular data ingestion protocols](#how-to-push-data-to-vmagent), apply [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) to the accepted metrics (for example, change metric names/labels or drop unneeded metrics) and then forward the relabeled metrics -to other remote storage systems, that support Prometheus `remote_write` protocol (including other `vmagent` instances). +to other remote storage systems that support the Prometheus `remote_write` protocol (including other `vmagent` instances). ### Replication and high availability `vmagent` replicates the collected metrics among multiple remote storage instances configured via `-remoteWrite.url` args. -If a single remote storage instance is temporarily out of service, then the collected data remains available in the other remote storage instances. -`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again, -and then it sends the buffered data to the remote storage in order to prevent data gaps. +If a single remote storage instance is temporarily unavailable, the collected data remains available on the other remote storage instances. +`vmagent` buffers the collected data in files at `-remoteWrite.tmpDataPath` until the remote storage becomes available again. +Then it sends the buffered data to the remote storage in order to prevent data gaps. [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/) already supports replication, so there is no need to specify multiple `-remoteWrite.url` flags when writing data to the same cluster. 
@@ -148,42 +148,42 @@ See [these docs](https://docs.victoriametrics.com/victoriametrics/cluster-victor ### Relabeling and filtering -`vmagent` can add, remove or update labels on the collected data before sending it to the remote storage. +`vmagent` can add, remove, or update labels on the collected data before sending it to the remote storage. It can filter scrape targets or remove unwanted samples via Prometheus-like relabeling. -Please see [Relabeling cookbook](https://docs.victoriametrics.com/victoriametrics/relabeling/) for details. +Please see the [Relabeling cookbook](https://docs.victoriametrics.com/victoriametrics/relabeling/) for details. ### Sharding among remote storages -By default `vmagent` replicates data to remote storage systems via the `-remoteWrite.url` command-line flag. +By default, `vmagent` replicates data to remote storage systems via the `-remoteWrite.url` command-line flag. If the `-remoteWrite.shardByURL` command-line flag is set, then `vmagent` spreads the outgoing [time series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series) evenly among all the remote storage systems listed in `-remoteWrite.url`. It is possible to replicate samples among remote storage systems by passing `-remoteWrite.shardByURLReplicas=N` -command-line flag to `vmagent` additionally to `-remoteWrite.shardByURL` command-line flag. +to `vmagent` in addition to the `-remoteWrite.shardByURL` command-line flag. This instructs `vmagent` to write every outgoing sample to `N` number of distinct remote storage systems listed in `-remoteWrite.url` in addition to sharding. -Samples for the same time series are routed to the same remote storage system if `-remoteWrite.shardByURL` flag is specified. +Samples for the same time series are routed to the same remote storage system if the `-remoteWrite.shardByURL` flag is specified. 
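As a sketch, sharding with one extra replica of redundancy might look like this (the hostnames are placeholders):

```sh
/path/to/vmagent \
  -remoteWrite.shardByURL \
  -remoteWrite.shardByURLReplicas=2 \
  -remoteWrite.url=http://vmstorage-1:8428/api/v1/write \
  -remoteWrite.url=http://vmstorage-2:8428/api/v1/write \
  -remoteWrite.url=http://vmstorage-3:8428/api/v1/write
```

With this configuration, every outgoing sample should be written to 2 of the 3 configured endpoints, and samples for the same time series should consistently land on the same pair.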
This allows building scalable data processing pipelines when a single remote storage cannot keep up with the data ingestion workload. For example, this allows horizontal scaling with [stream aggregation](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/) by routing outgoing samples for the same time series like [counter](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#counter) and [histogram](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#histogram) types from top-level `vmagent` instances to the same second-level `vmagent` instance, so they are aggregated properly. -If `-remoteWrite.shardByURL` command-line flag is set, then all the metric labels are used for even sharding +If the `-remoteWrite.shardByURL` command-line flag is set, then all the metric labels are used for even sharding among remote storage systems specified in `-remoteWrite.url`. > The `-remoteWrite.shardByURL` may not work as expected when [SRV URLs](https://docs.victoriametrics.com/victoriametrics/vmagent/#srv-urls) are in use. > -> An SRV record might resolve to multiple addresses, one address is chosen **randomly** for all subsequent logic, including sharding. -> It will make sharding inconsistent. Samples of the same time series always go to the same **remote write URL**/**SRV record**, but they may reach different addresses randomly based on the DNS resolution. +> An SRV record might resolve to multiple addresses; one address is chosen **randomly** for all subsequent logic, including sharding. +> This makes sharding inconsistent. Samples from the same time series always go to the same **remote write URL**/**SRV record**, but they may reach different addresses at random due to DNS resolution. > > For example, if you set `-remoteWrite.url=srv+foo` and it's resolved to three addresses (`192.168.1.1`, `192.168.1.2`, `192.168.1.3`), > vmagent will only choose **one** randomly every time it (re-)creates the connection. 
In contrast, specifying the addresses manually (`-remoteWrite.url=192.168.1.1 -remoteWrite.url=192.168.1.2 -remoteWrite.url=192.168.1.3`) will shard samples across all three URLs. -Sometimes it may be needed to use only a particular set of labels for sharding. For example, it may be necessary to route all the metrics with the same `instance` label -to the same `-remoteWrite.url`. In this case you can specify comma-separated list of these labels in the `-remoteWrite.shardByURL.labels` +Sometimes, it may be necessary to use only a particular set of labels for sharding. For example, you may want to route all the metrics with the same `instance` label +to the same `-remoteWrite.url`. In this case, you can specify a comma-separated list of these labels in the `-remoteWrite.shardByURL.labels` command-line flag. For example, `-remoteWrite.shardByURL.labels=instance,__name__` would shard metrics with the same name and `instance` label to the same `-remoteWrite.url`. @@ -192,19 +192,19 @@ For example, if all the [raw samples](https://docs.victoriametrics.com/victoriam except for the labels `instance` and `pod` must be routed to the same backend. In this case the list of ignored labels must be passed to `-remoteWrite.shardByURL.ignoreLabels` command-line flag: `-remoteWrite.shardByURL.ignoreLabels=instance,pod`. -See also [how to scrape large number of targets](#scraping-big-number-of-targets). +See also [how to scrape a large number of targets](#scraping-big-number-of-targets). ### Splitting data streams among multiple systems `vmagent` supports splitting the collected data between multiple destinations with the help of `-remoteWrite.urlRelabelConfig`, it is applied independently for each configured `-remoteWrite.url` destination. For example, it is possible to replicate or split -data among long-term remote storage, short-term remote storage and a real-time analytical system [built on top of Kafka](https://github.com/Telefonica/prometheus-kafka-adapter). 
+data among long-term remote storage, short-term remote storage, and a real-time analytical system [built on top of Kafka](https://github.com/Telefonica/prometheus-kafka-adapter). Note that each destination can receive its own subset of the collected data due to per-destination relabeling via `-remoteWrite.urlRelabelConfig`. For example, let's assume that all the metrics scraped or received by `vmagent` have the label `env` with the value of `dev` or `prod`. To route metrics with the label `env=dev` to `dev` and metrics with the label `env=prod` to `prod` apply the following config: -1. Create a relabeling config file `relabelDev.yml` to drop all metrics that don't have label `env=dev`: +1. Create a relabeling config file `relabelDev.yml` to drop all metrics that don't have the label `env=dev`: ```yaml - action: keep @@ -212,7 +212,7 @@ To route metrics with the label `env=dev` to `dev` and metrics with the label `e regex: "dev" ``` -2. Create a relabeling config file `relabelProd.yml` to drop all metrics that don't have label `env=prod`: +2. Create a relabeling config file `relabelProd.yml` to drop all metrics that don't have the label `env=prod`: ```yaml - action: keep @@ -229,7 +229,7 @@ To route metrics with the label `env=dev` to `dev` and metrics with the label `e -remoteWrite.url=http:// -remoteWrite.urlRelabelConfig=relabelProd.yml ``` -With this configuration `vmagent` will only forward metrics with the label `env=dev` to `http://` and +With this configuration, `vmagent` will only forward metrics with the label `env=dev` to `http://` and metrics with the label `env=prod` to `http://`. Please note, the order of flags is important: the first `-remoteWrite.urlRelabelConfig` will be applied to the @@ -238,21 +238,21 @@ first `-remoteWrite.url`, and so on. ### Prometheus remote_write proxy `vmagent` can be used as a proxy for Prometheus data sent via Prometheus `remote_write` protocol. 
It can accept data via the `remote_write` API -at the`/api/v1/write` endpoint. It then applies relabeling and filtering then proxies it to another `remote_write` system. +at the `/api/v1/write` endpoint. It then applies relabeling and filtering, and proxies the data to another `remote_write` system. `vmagent` can also be configured to encrypt the incoming `remote_write` requests with `-tls*` command-line flags. -Basic Auth can be enabled for the incoming `remote_write` requests with `-httpAuth.*` command-line flags. +Basic Auth can be enabled for incoming `remote_write` requests using the `-httpAuth.*` command-line flags. ### remote_write for clustered version While `vmagent` can accept data in several supported protocols (OpenTSDB, Influx, Prometheus, Graphite) and scrape data from various targets, -writes are always performed in Prometheus remote_write protocol. Therefore, for the [clustered version](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/), +writes are always performed using the Prometheus remote_write protocol. Therefore, for the [clustered version](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/), the `-remoteWrite.url` command-line flag should be configured as `://:8480/insert//prometheus/api/v1/write` according to [these docs](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format). There is also support for multitenant writes. See [these docs](#multitenancy). ### Flexible deduplication -[Deduplication at stream aggregation](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#deduplication) allows setting up arbitrary complex de-duplication schemes +[Deduplication at stream aggregation](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#deduplication) allows setting up arbitrary complex deduplication schemes for the collected samples. 
Examples: * The following command instructs `vmagent` to send only the last sample for each @@ -271,7 +271,7 @@ for the collected samples. Examples: ### Life of a sample -vmagent supports limiting, relabeling, deduplication and stream aggregation for all metric samples, scraped or pushed. +vmagent supports limiting, relabeling, deduplication, and stream aggregation for all metric samples, scraped or pushed. The received data is then forwarded to the specified `-remoteWrite.url` destinations. The pipeline is as follows: ```mermaid @@ -311,8 +311,8 @@ in addition to the pull-based Prometheus-compatible targets' scraping: * DataDog "submit metrics" API. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/datadog/). * InfluxDB line protocol via `http://:8429/write`. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/influxdb/). -* Graphite plaintext protocol if `-graphiteListenAddr` command-line flag is set. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/graphite/#ingesting). -* OpenTelemetry http API. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/opentelemetry/). +* Graphite plaintext protocol if the `-graphiteListenAddr` command-line flag is set. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/graphite/#ingesting). +* OpenTelemetry HTTP API. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/opentelemetry/). * NewRelic API. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/newrelic/#sending-data-from-agent). * OpenTSDB telnet and http protocols if `-opentsdbListenAddr` command-line flag is set. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/opentsdb/). * Zabbix Connector streaming protocol. See [these docs](https://docs.victoriametrics.com/victoriametrics/integrations/zabbixconnector/#send-data-from-zabbix-connector). 
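For instance, a single sample could be pushed to the InfluxDB line protocol endpoint with a plain `curl` call; the measurement name, tag, and host below are made up for illustration:

```sh
curl -d 'cpu_usage,host=web-1 value=42' http://vmagent-host:8429/write
```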
@@ -324,14 +324,14 @@ in addition to the pull-based Prometheus-compatible targets' scraping: ## How to collect metrics in Prometheus format -Specify the path to `prometheus.yml` file via `-promscrape.config` command-line flag. `vmagent` takes into account the following +Specify the path to the `prometheus.yml` file via the `-promscrape.config` command-line flag. `vmagent` takes into account the following sections from [Prometheus config file](https://prometheus.io/docs/prometheus/latest/configuration/configuration/): * `global` * `scrape_configs` All other sections are ignored, including the [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write) section. -Use `-remoteWrite.*` command-line flag instead for configuring remote write settings. See [the list of unsupported config sections](#unsupported-prometheus-config-sections). +Use the `-remoteWrite.*` command-line flags instead for configuring remote write settings. See [the list of unsupported config sections](#unsupported-prometheus-config-sections). The file pointed to by `-promscrape.config` may contain `%{ENV_VAR}` placeholders that are substituted by the corresponding `ENV_VAR` environment variable values. @@ -356,21 +356,21 @@ scrape_configs: ``` * `disable_compression: true` for disabling response compression on a per-job basis. By default, `vmagent` requests compressed responses - from scrape targets for saving network bandwidth. + from scrape targets to save network bandwidth. * `disable_keepalive: true` for disabling [HTTP keep-alive connections](https://en.wikipedia.org/wiki/HTTP_persistent_connection) - on a per-job basis. By default, `vmagent` uses keep-alive connections to scrape targets for reducing overhead on connection re-establishing. + on a per-job basis. By default, `vmagent` uses keep-alive connections when scraping targets. This reduces overhead by eliminating the need to repeatedly reestablish connections. 
* `series_limit: N` for limiting the number of unique time series a single scrape target can expose. See [these docs](#cardinality-limiter). -* `stream_parse: true` for scraping targets in a streaming manner. This may be useful when targets export large number of metrics. See [these docs](#stream-parsing-mode). -* `scrape_align_interval: duration` for aligning scrapes to the given interval instead of using random offset +* `stream_parse: true` for scraping targets in a streaming manner. This may be useful when targets export a large number of metrics. See [these docs](#stream-parsing-mode). +* `scrape_align_interval: duration` for aligning scrapes to the given interval instead of using a random offset in the range `[0 ... scrape_interval]` for scraping each target. The random offset helps to spread scrapes evenly in time. -* `scrape_offset: duration` for specifying the exact offset for scraping instead of using random offset in the range `[0 ... scrape_interval]`. +* `scrape_offset: duration` for specifying the exact offset for scraping instead of using a random offset in the range `[0 ... scrape_interval]`. See [scrape_configs docs](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs) for more details on all the supported options. ### Loading scrape configs from multiple files `vmagent` supports loading [scrape configs](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs) from multiple files specified -in the `scrape_config_files` section of `-promscrape.config` file. For example, the following `-promscrape.config` instructs `vmagent` +in the `scrape_config_files` section of the `-promscrape.config` file. 
For example, the following `-promscrape.config` instructs `vmagent` to load scrape configs from all the `*.yml` files under the `configs` directory, from the local file `single_scrape_config.yml` and from the URL `https://config-server/scrape_config.yml`: @@ -381,7 +381,7 @@ scrape_config_files: - https://config-server/scrape_config.yml ``` -Every referred file can contain arbitrary number of [supported scrape configs](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs). +Every referred file can contain an arbitrary number of [supported scrape configs](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs). There is no need to specify a top-level `scrape_configs` section in these files. For example: ```yaml @@ -393,31 +393,31 @@ There is no need to specify a top-level `scrape_configs` section in these files. - role: pod ``` -`vmagent` is able to dynamically reload these files - see [these docs](#configuration-update). +`vmagent` can dynamically reload these files. See [these docs](#configuration-update). ### Unsupported Prometheus config sections -`vmagent` doesn't support the following sections in Prometheus config file passed to `-promscrape.config` command-line flag: +`vmagent` doesn't support the following sections in the Prometheus config file passed with the `-promscrape.config` command-line flag: * [remote_write](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write). This section is substituted with various `-remoteWrite*` command-line flags. See [the full list of flags](#advanced-usage). The `remote_write` section isn't supported in order to reduce possible confusion when `vmagent` is used for accepting incoming metrics via [supported push protocols](#how-to-push-data-to-vmagent). - In this case the `-promscrape.config` file isn't needed. -* `remote_read`. This section isn't supported at all, since `vmagent` doesn't provide Prometheus querying API. 
- It is expected that the querying API is provided by the remote storage specified via `-remoteWrite.url` such as VictoriaMetrics. + In this case, the `-promscrape.config` file isn't needed. +* `remote_read`. This section isn't supported at all, since `vmagent` doesn't provide the Prometheus querying API. + Use the querying API of the remote storage specified via `-remoteWrite.url` instead, such as VictoriaMetrics. See [Prometheus querying API docs for VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#prometheus-querying-api-usage). * `rule_files` and `alerting`. These sections are supported by [vmalert](https://docs.victoriametrics.com/victoriametrics/vmalert/). The list of supported service discovery types is available in [how-to-collect-metrics-in-prometheus-format](#how-to-collect-metrics-in-prometheus-format). -Additionally, `vmagent` doesn't support `refresh_interval` option at service discovery sections. -This option is substituted with `-promscrape.*CheckInterval` command-line flags, that are specific per each service discovery type. +Additionally, `vmagent` doesn't support the `refresh_interval` option in the service discovery sections. +This option is replaced by the `-promscrape.*CheckInterval` command-line flags, which are specific to each service discovery type. See [the full list of command-line flags for vmagent](#advanced-usage). ## Configuration update -`vmagent` should be restarted in order to update config options set via command-line args.
+`vmagent` supports multiple approaches for reloading configs from updated config files, such as `-promscrape.config`, `-remoteWrite.relabelConfig`, `-remoteWrite.urlRelabelConfig`, `-streamAggr.config` and `-remoteWrite.streamAggr.config`: @@ -427,61 +427,60 @@ and `-remoteWrite.streamAggr.config`: kill -SIGHUP `pidof vmagent` ``` -* Sending HTTP request to `http://vmagent:8429/-/reload` endpoint. This endpoint can be protected with `-reloadAuthKey` command-line flag. +* Sending an HTTP request to the `http://vmagent:8429/-/reload` endpoint. This endpoint can be protected with the `-reloadAuthKey` command-line flag. -There is also `-promscrape.configCheckInterval` command-line flag, that can be used for automatic reloading configs from updated `-promscrape.config` file. +There is also the `-promscrape.configCheckInterval` command-line flag, which can be used to automatically reload configs from the updated `-promscrape.config` file. ## SRV URLs If `vmagent` encounters URLs with `srv+` prefix in hostname (such as `http://srv+some-addr/some/path`), then it resolves `some-addr` [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) -record into TCP address with hostname and TCP port, and then uses the resulting url when it needs connecting to it. +record into a TCP address with hostname and TCP port, and then uses the resulting URL when it needs to connect to it. SRV URLs are supported in the following places: * In `-remoteWrite.url` command-line flag. For example, if `victoria-metrics` [DNS SRV](https://en.wikipedia.org/wiki/SRV_record) record contains `victoria-metrics-host:8428` TCP address, then `-remoteWrite.url=http://srv+victoria-metrics/api/v1/write` is automatically resolved into `-remoteWrite.url=http://victoria-metrics-host:8428/api/v1/write`. If the DNS SRV record is resolved into multiple TCP addresses, then `vmagent` - uses randomly chosen address per each connection it establishes to the remote storage.
+ uses a randomly chosen address for each connection it establishes to the remote storage. -* In scrape target addresses aka `__address__` label - see [these docs](https://docs.victoriametrics.com/victoriametrics/relabeling/#how-to-modify-scrape-urls-in-targets) for details. +* In scrape target addresses aka `__address__` label. See [these docs](https://docs.victoriametrics.com/victoriametrics/relabeling/#how-to-modify-scrape-urls-in-targets) for details. -* In urls used for [service discovery](https://docs.victoriametrics.com/victoriametrics/sd_configs/). +* In URLs used for [service discovery](https://docs.victoriametrics.com/victoriametrics/sd_configs/). -SRV urls are useful when HTTP services run on different TCP ports or when they can change TCP ports over time (for instance, after the restart). +SRV URLs are useful when HTTP services run on different TCP ports or when their TCP ports can change over time (for instance, after a restart). ## VictoriaMetrics remote write protocol -`vmagent` supports sending data to the configured `-remoteWrite.url` either via Prometheus remote write protocol -or via VictoriaMetrics remote write protocol. +`vmagent` supports sending data to the configured `-remoteWrite.url` either via the Prometheus or VictoriaMetrics remote write protocols. -VictoriaMetrics remote write protocol provides the following benefits comparing to Prometheus remote write protocol: +The VictoriaMetrics remote write protocol provides the following benefits over the Prometheus remote write protocol: * Reduced network bandwidth usage by 2x-5x. This allows saving network bandwidth usage costs when `vmagent` and - the configured remote storage systems are located in different datacenters, availability zones or regions. + the configured remote storage systems are located in different datacenters, availability zones, or regions. * Reduced disk read/write IO and disk space usage at `vmagent` when the remote storage is temporarily unavailable.
- In this case `vmagent` buffers the incoming data to disk using the VictoriaMetrics remote write format. - This reduces disk read/write IO and disk space usage by 2x-5x compared to Prometheus remote write format. + In this case, `vmagent` buffers incoming data to disk using the VictoriaMetrics remote write format. + This reduces disk read/write IO and disk space usage by 2x-5x compared to the Prometheus remote write format. > See blogpost [Save network costs with VictoriaMetrics remote write protocol](https://victoriametrics.com/blog/victoriametrics-remote-write/). `vmagent` uses VictoriaMetrics remote write protocol by default {{% available_from "v1.116.0" %}} when it sends data to VictoriaMetrics components such as other `vmagent` instances, [single-node VictoriaMetrics](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) -or `vminsert` at [cluster version](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/). If needed, It can automatically downgrade to a Prometheus protocol at runtime. +or `vminsert` at [cluster version](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/). If needed, it can automatically downgrade to the Prometheus protocol at runtime. It is possible to force switch to VictoriaMetrics remote write protocol by specifying `-remoteWrite.forceVMProto` command-line flag for the corresponding `-remoteWrite.url`. -It is possible to tune the compression level for VictoriaMetrics remote write protocol with `-remoteWrite.vmProtoCompressLevel` command-line flag. -Bigger values reduce network usage at the cost of higher CPU usage. Negative values reduce CPU usage at the cost of higher network usage. -The default value for the compression level is `0`, the minimum value is `-22` and the maximum value is `22`. The default value works optimally -in most cases, so it isn't recommended changing it.
+It is possible to tune the compression level for VictoriaMetrics remote write protocol with the `-remoteWrite.vmProtoCompressLevel` command-line flag. +Bigger values reduce network usage at the cost of higher CPU usage. Negative values reduce CPU usage but increase network usage. +The default value for the compression level is `0`, the minimum value is `-22`, and the maximum value is `22`. The default value works optimally +in most cases, so it isn't recommended to change it. -`vmagent` automatically switches to Prometheus remote write protocol when it sends data to old versions of VictoriaMetrics components +`vmagent` automatically switches to the Prometheus remote write protocol when it sends data to old versions of VictoriaMetrics components or to other Prometheus-compatible remote storage systems. It is possible to force switch to Prometheus remote write protocol by specifying `-remoteWrite.forcePromProto` command-line flag for the corresponding `-remoteWrite.url`. ## Multitenancy -By default `vmagent` collects the data without [tenant](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy) identifiers +By default, `vmagent` collects the data without [tenant](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy) identifiers and routes it to the remote storage specified via `-remoteWrite.url` command-line flag. The `-remoteWrite.url` can point to `/insert/<tenant_id>/prometheus/api/v1/write` path at `vminsert` according to [these docs](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format). ```mermaid @@ -490,11 +489,11 @@ flowchart LR B["requests_total{instance=bar}"] <--> |scrape| V V --> |"/insert/#60;tenant_id#62;/#60;suffix#62;"| C[vminsert] ``` -In this case, all the metrics written to `/insert/<tenant_id>/prometheus/api/v1/write` will belong to specified `<tenant_id>` tenant.
+In this case, all the metrics written to `/insert/<tenant_id>/prometheus/api/v1/write` will belong to the specified `<tenant_id>` tenant. ### Multitenancy via labels -vmagent can write data to multiple distinct tenants if `-remoteWrite.url` points to [multitenant url at VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels) +vmagent can write data to multiple distinct tenants if `-remoteWrite.url` points to [multitenant URL at VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels) and tenant is specified via [multitenancy labels](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels): ```mermaid flowchart LR @@ -518,7 +517,7 @@ scrape_configs: vmagent can get tenant identifier from `__tenant_id__` label at target discovery phase. It implicitly converts `__tenant_id__` label into `vm_account_id` and `vm_project_id` labels and attaches it to the scraped metrics and metrics metadata. -For example, the following relabeling rule instructs sending metrics to `10:5` tenant defined in the `prometheus.io/tenant_id: 10:5` annotation of Kubernetes pod deployment: +For example, the following relabeling rule instructs sending metrics to the `10:5` tenant defined in the `prometheus.io/tenant_id: 10:5` annotation of the Kubernetes pod deployment: ```yaml scrape_configs: - kubernetes_sd_configs: @@ -533,7 +532,7 @@ or forwarded metrics.
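For illustration, a minimal relabeling rule that forwards a pod's `prometheus.io/tenant_id` annotation to the `__tenant_id__` label might look like the following sketch. The `job_name` value is hypothetical, and the meta label name follows the standard Kubernetes service discovery convention for pod annotations:

```yaml
scrape_configs:
- job_name: k8s-pods  # hypothetical job name
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  # Copy the prometheus.io/tenant_id pod annotation into the special
  # __tenant_id__ label; vmagent converts it into the vm_account_id
  # and vm_project_id labels attached to the scraped metrics.
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_tenant_id]
    target_label: __tenant_id__
```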
### Multitenancy via path -vmagent can write data to multiple distinct tenants if `-remoteWrite.url` points to [multitenant url at VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels), +vmagent can write data to multiple distinct tenants if `-remoteWrite.url` points to [multitenant URL at VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels), tenant is specified in the [write path](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format) and `-enableMultitenantHandlers` command-line flag is set: ```mermaid flowchart LR @@ -542,14 +541,14 @@ flowchart LR V --> |"/insert/multitenant/#60;suffix#62;"| C[vminsert] ``` -In this configuration, vmagent accepts writes via the same multitenant endpoints (`/insert//`) [as vminsert](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format). -For all received data, vmagent will automatically convert tenant identifiers from the URL to `vm_account_id` and `vm_project_id` labels and sets tenant info in metadata. +In this configuration, vmagent accepts writes via the same multitenant endpoints (`/insert//`) [as vminsert does](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format). +For all received data, vmagent will automatically convert tenant identifiers from the URL to `vm_account_id` and `vm_project_id` labels and set tenant info in metadata. These tenant labels are added before applying [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) specified via `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig` command-line flags. 
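To illustrate the write path described above, assuming `vmagent` listens on `localhost:8429` (a hypothetical address) and was started with `-enableMultitenantHandlers`, a sample in Prometheus text format could be pushed to the `42:0` tenant as a rough sketch:

```sh
# Hypothetical address; vmagent must run with -enableMultitenantHandlers.
# The 42:0 path segment selects accountID 42, projectID 0.
curl -X POST 'http://localhost:8429/insert/42:0/prometheus/api/v1/import/prometheus' \
  -d 'requests_total{instance="foo"} 123'
```

vmagent would then attach the `vm_account_id="42"` and `vm_project_id="0"` labels before applying any configured relabeling.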
### Multitenancy via headers -vmagent can write data to multiple distinct tenants if `-remoteWrite.url` points to [multitenant url at VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels), +vmagent can write data to multiple distinct tenants if `-remoteWrite.url` points to [multitenant URL at VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-labels), tenant is specified [via headers](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#multitenancy-via-headers) {{% available_from "v1.143.0" %}}, both `-enableMultitenantHandlers` and `-enableMultitenancyViaHeaders` command-line flags are set: ```mermaid flowchart LR @@ -558,10 +557,10 @@ flowchart LR V --> |"/insert/multitenant/#60;suffix#62;"| C[vminsert] ``` -In this configuration, vmagent accepts writes via the same simplified multitenant endpoints (`/insert/`) [as vminsert](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format). -The tenant information is extracted from the `AccountID` and `ProjectID` HTTP headers, that are expected to be attached to all the incoming requests. If headers are missing, then tenant is set to `0:0` as default. +In this configuration, vmagent accepts writes via the same simplified multitenant endpoints (`/insert/`) [as vminsert does](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/#url-format). +The tenant information is extracted from the `AccountID` and `ProjectID` HTTP headers, which are expected to be included in all incoming requests. If headers are missing, then the tenant is set to `0:0` as the default. -For all received data, vmagent will automatically convert tenant identifiers from the headers to `vm_account_id` and `vm_project_id` labels and sets tenant info in metadata. 
+For all received data, vmagent will automatically convert tenant identifiers from the headers to `vm_account_id` and `vm_project_id` labels and set tenant info in metadata. These tenant labels are added before applying [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) specified via `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig` command-line flags. @@ -572,7 +571,7 @@ forwarded requests via `headers` param in the config file. Extra labels can be added to metrics collected by `vmagent` via the following mechanisms: -* The `global -> external_labels` section in `-promscrape.config` file. These labels are added only to metrics scraped from targets configured +* The `global -> external_labels` section in the `-promscrape.config` file. These labels are added only to metrics scraped from targets configured in the `-promscrape.config` file. They aren't added to metrics collected via other [data ingestion protocols](#how-to-push-data-to-vmagent). * The `-remoteWrite.label` command-line flag. These labels are added **to all the collected metrics** before sending them **to all configured `-remoteWrite.url`**. For example, the following command starts `vmagent`, which adds `{datacenter="foobar"}` label to all the metrics pushed @@ -582,7 +581,7 @@ Extra labels can be added to metrics collected by `vmagent` via the following me /path/to/vmagent -remoteWrite.label=datacenter=foobar ... ``` -* Via relabeling. Relabeling can be applied globally and per each configured `-remoteWrite.url` destination. See [Relabeling Cookbook](https://docs.victoriametrics.com/victoriametrics/relabeling/). +* Via relabeling. Relabeling can be applied globally and for each configured `-remoteWrite.url` destination. See [Relabeling Cookbook](https://docs.victoriametrics.com/victoriametrics/relabeling/). 
* Add `extra_label` GET param to `-remoteWrite.url` address (only works when sending data to VictoriaMetrics components): ```sh @@ -591,10 +590,10 @@ Extra labels can be added to metrics collected by `vmagent` via the following me ## Automatically generated metrics -`vmagent` automatically generates the following metrics per each scrape of every [Prometheus-compatible target](#how-to-collect-metrics-in-prometheus-format) -and attaches `instance`, `job` and other target-specific labels to these metrics: +`vmagent` automatically generates the following metrics for each scrape of every [Prometheus-compatible target](#how-to-collect-metrics-in-prometheus-format) +and attaches `instance`, `job`, and other target-specific labels to these metrics: -* `up` - this metric exposes `1` value on successful scrape and `0` value on unsuccessful scrape. This allows monitoring +* `up` - this metric exposes a `1` value on successful scrape and a `0` value on unsuccessful scrape. This allows monitoring failing scrapes with the following [MetricsQL query](https://docs.victoriametrics.com/victoriametrics/metricsql/): ```metricsql @@ -612,14 +611,14 @@ and attaches `instance`, `job` and other target-specific labels to these metrics * `scrape_timeout_seconds` - the configured timeout for the current scrape target (aka `scrape_timeout`). This allows detecting targets with scrape durations close to the configured scrape timeout. For example, the following [MetricsQL query](https://docs.victoriametrics.com/victoriametrics/metricsql/) returns targets (identified by `instance` label), - which take more than 80% of the configured `scrape_timeout` during scrapes: + which take more than 80% of the configured `scrape_timeout` during scrapes: ```metricsql scrape_duration_seconds / scrape_timeout_seconds > 0.8 ``` -* `scrape_response_size_bytes` - response size in bytes for the given target.
This allows to monitor amount of data scraped - and to adjust [`max_scrape_size` option](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs) for scraped targets. +* `scrape_response_size_bytes` - response size in bytes for the given target. This allows monitoring the amount of data scraped + and adjusting the [`max_scrape_size` option](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs) for scraped targets. For example, the following [MetricsQL query](https://docs.victoriametrics.com/victoriametrics/metricsql/) returns targets with scrape response bigger than `10MiB`: ```metricsql @@ -636,10 +635,9 @@ and attaches `instance`, `job` and other target-specific labels to these metrics ``` * `scrape_samples_limit` - the configured limit on the number of [samples](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples) the given target can expose. - The limit can be set via `sample_limit` option at [scrape_configs](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs). - This metric is exposed only if the `sample_limit` is set. This allows detecting targets, - which expose too many metrics compared to the configured `sample_limit`. For example, the following query - returns targets (identified by `instance` label), which expose more than 80% metrics compared to the configured `sample_limit`: + The limit can be set via the `sample_limit` option at [scrape_configs](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs). + This metric is exposed only if the `sample_limit` is set. +This allows detecting targets that expose too many metrics compared to the configured `sample_limit`. For example, the query below returns targets (identified by the `instance` label) exceeding 80% of the configured `sample_limit`:
```metricsql scrape_samples_scraped / scrape_samples_limit > 0.8 ``` @@ -657,12 +655,12 @@ and attaches `instance`, `job` and other target-specific labels to these metrics * `scrape_labels_limit` - the configured limit on the number of [labels](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#labels) the given target can expose per [sample](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#raw-samples). - The limit can be set via `label_limit` option at [scrape_configs](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs). + The limit can be set via the `label_limit` option at [scrape_configs](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs). This metric is exposed only if the `label_limit` is set. * `scrape_series_added` - **an approximate** number of new [series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series) the given target generates during the current scrape. This metric allows detecting targets (identified by `instance` label), - which lead to [high churn rate](https://docs.victoriametrics.com/victoriametrics/faq/#what-is-high-churn-rate). + which lead to [high churn rate](https://docs.victoriametrics.com/victoriametrics/faq/#what-is-high-churn-rate). For example, the following [MetricsQL query](https://docs.victoriametrics.com/victoriametrics/metricsql/) returns targets, which generate more than 1000 new series during the last hour: ```metricsql @@ -679,46 +677,46 @@ and attaches `instance`, `job` and other target-specific labels to these metrics This metric is exposed only if the series limit is set. * `scrape_series_current` - the number of unique [series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series) the given target exposed so far. - This metric is exposed only if the series limit is set according to [these docs](#cardinality-limiter). - This metric allows alerting when the number of exposed series by the given target reaches the limit.
- For example, the following query would alert when the target exposes more than 90% of unique series compared to the configured limit. + This metric is exposed only when the series limit is set as described in [these docs](#cardinality-limiter). + This metric allows alerting when the number of exposed series for the given target reaches the limit. + For example, the following query would trigger an alert when the number of unique series exposed by the target exceeds 90% of the configured limit. ```metricsql scrape_series_current / scrape_series_limit > 0.9 ``` * `scrape_series_limit_samples_dropped` - exposes the number of dropped samples during the scrape because of the exceeded limit - on the number of unique [series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series). This metric is exposed only if the series limit is set according to [these docs](#cardinality-limiter). - This metric allows alerting when scraped samples are dropped because of the exceeded limit. - For example, the following query alerts when at least a single sample is dropped because of the exceeded limit during the last hour: + on the number of unique [series](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#time-series). This metric is exposed only when the series limit is set as described in [these docs](#cardinality-limiter). + This metric allows alerting when scraped samples are dropped due to exceeding the limit. + For example, the following query alerts when at least a single sample is dropped because the limit has been exceeded during the last hour: ```metricsql sum_over_time(scrape_series_limit_samples_dropped[1h]) > 0 ``` -If the target exports metrics with names clashing with the automatically generated metric names, then `vmagent` automatically -adds `exported_` prefix to these metric names, so they don't clash with automatically generated metric names.
+If the target exports metrics with names that clash with the automatically generated metric names, then `vmagent` automatically +adds the `exported_` prefix to these metric names, so they don't clash with automatically generated metric names. -Relabeling defined in `relabel_configs` or `metric_relabel_configs` of scrape config isn't applied to automatically -generated metrics. But they still can be relabeled via `-remoteWrite.relabelConfig` before sending metrics to remote address. +Relabeling defined in `relabel_configs` or `metric_relabel_configs` of the scrape config isn't applied to automatically +generated metrics. But they can still be relabeled via `-remoteWrite.relabelConfig` before sending metrics to the remote address. ## Prometheus staleness markers `vmagent` sends [Prometheus staleness markers](https://www.robustperception.io/staleness-and-promql) to `-remoteWrite.url` in the following cases: * If they are passed to `vmagent` via [Prometheus remote_write protocol](#prometheus-remote_write-proxy). -* If the metric disappears from the list of scraped metrics, then stale marker is sent to this particular metric. +* If the metric disappears from the list of scraped metrics, then a stale marker is sent for this particular metric. * If the scrape target becomes temporarily unavailable, then stale markers are sent for all the metrics scraped from this target. * If the scrape target is removed from the list of targets, then stale markers are sent for all the metrics scraped from this target. -Prometheus staleness markers' tracking needs additional memory, since it must store the previous response body per each scrape target +Tracking Prometheus staleness markers needs additional memory, since `vmagent` must store the previous response body for each scrape target in order to compare it to the current response body. The memory usage may be reduced by disabling staleness tracking in the following ways: * By passing `-promscrape.noStaleMarkers` command-line flag to `vmagent`.
This disables staleness tracking across all the targets. * By specifying `no_stale_markers: true` option in the [scrape_config](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs) for the corresponding target. -When staleness tracking is disabled, then `vmagent` doesn't track the number of new time series per each scrape, -e.g. it sets `scrape_series_added` metric to zero. See [these docs](#automatically-generated-metrics) for details. +When staleness tracking is disabled, `vmagent` doesn't track the number of new time series for each scrape, +e.g., it sets the `scrape_series_added` metric to zero. See [these docs](#automatically-generated-metrics) for details. ## Metric metadata @@ -737,9 +735,9 @@ This reduces network traffic and resource usage when metadata is not required. By default, `vmagent` parses the full response from the scrape target, applies [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) and then pushes the resulting metrics to the configured `-remoteWrite.url` in one go. This mode works great for the majority of cases -when the scrape target exposes small number of metrics (e.g. less than 10K). But this mode may take large amounts of memory +when the scrape target exposes a small number of metrics (e.g., less than 10K). But this mode may take large amounts of memory when the scrape target exposes a large number of metrics (for example, when `vmagent` scrapes [`kube-state-metrics`](https://github.com/kubernetes/kube-state-metrics) -in large Kubernetes cluster). It is recommended to enable stream parsing for such targets. +in a large Kubernetes cluster). It is recommended to enable stream parsing for such targets. When stream parsing is enabled, `vmagent` processes the response from the scrape target in chunks. This saves memory when scraping targets that expose millions of metrics.
@@ -747,14 +745,14 @@ Stream parsing is automatically enabled for scrape targets returning response bo the `-promscrape.minResponseSizeForStreamParse` command-line flag value. Additionally, stream parsing can be explicitly enabled in the following places: -* Via `-promscrape.streamParse` command-line flag. In this case all the scrape targets defined +* Via `-promscrape.streamParse` command-line flag. In this case, all the scrape targets defined in the file pointed by `-promscrape.config` are scraped in stream parsing mode. -* Via `stream_parse: true` option at `scrape_configs` section. In this case all the scrape targets defined +* Via `stream_parse: true` option at `scrape_configs` section. In this case, all the scrape targets defined in this section are scraped in stream parsing mode. * Via `__stream_parse__=true` label, which can be set via [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) at `relabel_configs` section. - In this case stream parsing mode is enabled for the corresponding scrape targets. + In this case, stream parsing mode is enabled for the corresponding scrape targets. Typical use case: to set the label via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/) - for targets exposing large number of metrics. + for targets exposing a large number of metrics. Examples: @@ -779,8 +777,8 @@ as soon as it is parsed in stream parsing mode. ## Scraping big number of targets A single `vmagent` instance can scrape tens of thousands of scrape targets. Sometimes this isn't enough due to limitations on CPU, network, RAM, etc. -In this case scrape targets can be split among multiple `vmagent` instances (aka `vmagent` horizontal scaling, sharding and clustering). -The number of `vmagent` instances in the cluster must be passed to `-promscrape.cluster.membersCount` command-line flag. 
+In this case, scrape targets can be split among multiple `vmagent` instances (aka `vmagent` horizontal scaling, sharding, and clustering). +The number of `vmagent` instances in the cluster must be passed to the `-promscrape.cluster.membersCount` command-line flag. Each `vmagent` instance in the cluster must use identical `-promscrape.config` files with distinct `-promscrape.cluster.memberNum` values in the range `0 ... N-1`, where `N` is the number of `vmagent` instances in the cluster specified via `-promscrape.cluster.membersCount`. For example, the following commands spread scrape targets among a cluster of two `vmagent` instances: @@ -795,7 +793,7 @@ The pod name must end with a number in the range `0 ... promscrape.cluster.membe By default, each scrape target is scraped only by a single `vmagent` instance in the cluster. If there is a need for replicating scrape targets among multiple `vmagent` instances, then `-promscrape.cluster.replicationFactor` command-line flag must be set to the desired number of replicas. For example, the following commands -start a cluster of three `vmagent` instances, where each target is scraped by two `vmagent` instances: +start a cluster of three `vmagent` instances, where two `vmagent` instances scrape each target: ```sh /path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=0 -promscrape.config=/path/to/config.yml ... @@ -803,24 +801,24 @@ start a cluster of three `vmagent` instances, where each target is scraped by tw /path/to/vmagent -promscrape.cluster.membersCount=3 -promscrape.cluster.replicationFactor=2 -promscrape.cluster.memberNum=2 -promscrape.config=/path/to/config.yml ... ``` -Every `vmagent` in the cluster exposes all the discovered targets at `http://vmagent:8429/service-discovery` page. -Each discovered target on this page contains its status (`UP`, `DOWN` or `DROPPED` with the reason why the target has been dropped). 
+Every `vmagent` in the cluster exposes all the discovered targets on the `http://vmagent:8429/service-discovery` page. +Each discovered target on this page contains its status (`UP`, `DOWN`, or `DROPPED` with the reason why the target has been dropped). If the target is dropped because of sharding to other `vmagent` instances in the cluster, then the status column contains `-promscrape.cluster.memberNum` values for `vmagent` instances where the given target is scraped. -The `/service-discovery` page provides links to the corresponding `vmagent` instances if `-promscrape.cluster.memberURLTemplate` command-line flag is set. +The `/service-discovery` page provides links to the corresponding `vmagent` instances if the `-promscrape.cluster.memberURLTemplate` command-line flag is set. Every occurrence of `%d` inside the `-promscrape.cluster.memberURLTemplate` is substituted with the `-promscrape.cluster.memberNum` for the corresponding `vmagent` instance. For example, `-promscrape.cluster.memberURLTemplate='http://vmagent-instance-%d:8429/targets'` -generates `http://vmagent-instance-42:8429/targets` url for `vmagent` instance, which runs with `-promscrape.cluster.memberNum=42`. +generates the `http://vmagent-instance-42:8429/targets` URL for the `vmagent` instance that runs with `-promscrape.cluster.memberNum=42`. Note that `vmagent` shows up to `-promscrape.maxDroppedTargets` dropped targets on the `/service-discovery` page. Increase the `-promscrape.maxDroppedTargets` command-line flag value if the `/service-discovery` page misses some dropped targets. -If each target is scraped by multiple `vmagent` instances, then data deduplication must be enabled at remote storage pointed by `-remoteWrite.url`. +If multiple `vmagent` instances scrape the same target, data deduplication must be enabled for the remote storage specified by `-remoteWrite.url`. The `-dedup.minScrapeInterval` must be set to the `scrape_interval` configured at `-promscrape.config`.
 See [these docs](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#deduplication) for details.
 
-The `-promscrape.cluster.memberLabel` command-line flag allows specifying a name for `member num` label to add to all the scraped metrics.
+The `-promscrape.cluster.memberLabel` command-line flag allows specifying a name for the `member num` label to add to all the scraped metrics.
 The value of the `member num` label is set to `-promscrape.cluster.memberNum`. For example, the following config instructs adding `vmagent_instance="0"` label
 to all the metrics scraped by the given `vmagent` instance:
 
@@ -835,19 +833,19 @@ See also [how to shard data among multiple remote storage systems](#sharding-amo
 
 It is possible to run multiple **identically configured** `vmagent` instances or `vmagent` [clusters](#scraping-big-number-of-targets),
 so they [scrape](#how-to-collect-metrics-in-prometheus-format) the same set of targets and push the collected data to the same set of VictoriaMetrics remote storage systems.
-Two **identically configured** vmagent instances or clusters is usually called an HA pair.
+Two **identically configured** `vmagent` instances or clusters are usually called an HA pair.
 
 When running HA pairs, [deduplication](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#deduplication) must be configured
-on the VictoriaMetrics side in order to de-duplicate received samples.
+on the VictoriaMetrics side to deduplicate received samples.
 See [these docs](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#deduplication) for details.
 
-It is also recommended to pass different values to `-promscrape.cluster.name` command-line flag per `vmagent`
+It is also recommended to pass different values to the `-promscrape.cluster.name` command-line flag per `vmagent`
 instance or per `vmagent` cluster in a HA setup. This is needed for proper data de-duplication.
 See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/2679) for details.
 
 ## Scraping targets via a proxy
 
-`vmagent` supports scraping targets via http, https and socks5 proxies. Proxy address must be specified in `proxy_url` option. For example, the following scrape config instructs
+`vmagent` supports scraping targets via HTTP, HTTPS, and SOCKS5 proxies. The proxy address must be specified in the `proxy_url` option. For example, the following scrape config instructs
 target scraping via https proxy at `https://proxy-addr:1234`:
 
 ```yaml
@@ -863,7 +861,7 @@ Proxy can be configured with the following optional settings:
 * `proxy_bearer_token` and `proxy_bearer_token_file` for Bearer token authorization
 * `proxy_oauth2` for OAuth2 config. See [these docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#oauth2).
 * `proxy_tls_config` for TLS config. See [these docs](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#tls_config).
-* `proxy_headers` for passing additional HTTP headers in requests to proxy.
+* `proxy_headers` for passing additional HTTP headers in requests to the proxy.
 
 For example:
 
@@ -887,7 +885,7 @@ scrape_configs:
 ## On-disk persistence
 
 `vmagent` stores pending data that cannot be sent to the configured remote storage systems in a timely manner.
-By default, `vmagent` writes all the pending data to folder configured via `-remoteWrite.tmpDataPath` cmd-line flag
+By default, `vmagent` writes all the pending data to the folder configured via `-remoteWrite.tmpDataPath` command-line flag
 until this data is sent to the configured `-remoteWrite.url` systems or until the folder becomes full.
 The maximum data size that can be saved to `-remoteWrite.tmpDataPath`
 per every configured `-remoteWrite.url` can be limited via `-remoteWrite.maxDiskUsagePerURL` command-line flag.
When this limit is reached, `vmagent` drops the oldest
@@ -905,7 +903,7 @@ Each remote write URL corresponds to a folder similar to `1_B9EB7BE220B91E9D`. It's generated based
 on the following information:
 
-1. The **sequence order** of the remote write URL cmd-line flags, starting from **1**.
+1. The **sequence order** of the remote write URL command-line flags, starting from **1**.
 2. The **hash result** of the remote write URL itself, excluding query parameters and fragments.
 
 For example, for the remote write configs:
@@ -929,25 +927,25 @@ vmagent will generate the following persistent queue folders:
 
 There are cases when it is better to disable on-disk persistence for pending data on the `vmagent` side:
 
 * When the persistent disk performance isn't enough for the given data processing rate.
-* When it is better to buffer pending data at the client side instead of buffering it at `vmagent` side in the `-remoteWrite.tmpDataPath` folder.
+* When clients can buffer pending data more efficiently on their side than on the `vmagent` side (in the `-remoteWrite.tmpDataPath` folder).
 * When the data is already buffered at [Kafka](https://docs.victoriametrics.com/victoriametrics/integrations/kafka/#reading-metrics) or [Google PubSub](https://docs.victoriametrics.com/victoriametrics/integrations/pubsub/#reading-metrics).
-* When it is better to drop pending data instead of buffering it.
+* When dropping pending data is better than buffering it.
 
-In this case `-remoteWrite.disableOnDiskQueue` command-line flag can be passed to `vmagent` per each configured `-remoteWrite.url`.
+In this case, the `-remoteWrite.disableOnDiskQueue` command-line flag can be passed to `vmagent` for each configured `-remoteWrite.url`.
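+As a sketch (flag names are taken from the description above; the storage address is a placeholder, so verify both against your `vmagent` version), such a setup could look like:
+
+```sh
+# Hypothetical example: no on-disk buffering for the single configured remote storage;
+# samples are dropped instead of being buffered when the storage cannot keep up.
+/path/to/vmagent \
+  -remoteWrite.url=http://victoria-metrics:8428/api/v1/write \
+  -remoteWrite.disableOnDiskQueue \
+  -remoteWrite.dropSamplesOnOverload
+```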
`vmagent` works in the following way if the corresponding remote storage system at `-remoteWrite.url` cannot keep up with the data ingestion rate and the `-remoteWrite.disableOnDiskQueue` command-line flag is set: * It returns `429 Too Many Requests` HTTP error to clients, which send data to `vmagent` via [supported HTTP endpoints](#how-to-push-data-to-vmagent). - If `-remoteWrite.dropSamplesOnOverload` command-line flag is set or if multiple `-remoteWrite.url` command-line flags are set, - then the ingested samples are silently dropped instead of returning the error to clients. + If the `-remoteWrite.dropSamplesOnOverload` command-line flag is set or if multiple `-remoteWrite.url` command-line flags are set, + then the ingested samples are silently dropped instead of returning an error to clients. * It suspends consuming data from [Kafka](https://docs.victoriametrics.com/victoriametrics/integrations/kafka/#reading-metrics) or [Google PubSub](https://docs.victoriametrics.com/victoriametrics/integrations/pubsub/) until the remote storage becomes available. - If `-remoteWrite.dropSamplesOnOverload` command-line flag is set or if multiple `-remoteWrite.disableOnDiskQueue` command-line flags are set + If the `-remoteWrite.dropSamplesOnOverload` command-line flag is set or if multiple `-remoteWrite.disableOnDiskQueue` command-line flags are set for different `-remoteWrite.url` options, then the fetched samples are silently dropped instead of suspending data consumption from Kafka or Google PubSub. * It drops samples pushed to `vmagent` via non-HTTP protocols and logs the error. Pass `-remoteWrite.dropSamplesOnOverload` command-line flag in order to suppress error messages in this case. * It drops samples [scraped from Prometheus-compatible targets](#how-to-collect-metrics-in-prometheus-format), because it is better from operations perspective to drop samples instead of blocking the scrape process. 
-* It drops [stream aggregation](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/) output samples, because it is better from operations perspective +* It drops [stream aggregation](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/) output samples, because it is better from an operations perspective to drop output samples instead of blocking the stream aggregation process. The number of dropped samples because of overloaded remote storage can be [monitored](#monitoring) via `vmagent_remotewrite_samples_dropped_total` metric. @@ -960,7 +958,7 @@ on spiky workloads, since `vmagent` may buffer more data in memory before return if `-remoteWrite.disableOnDiskQueue` command-line flag is specified. It may also read buffered data from `-remoteWrite.tmpDataPath` on startup. -When `-remoteWrite.disableOnDiskQueue` command-line flag is set, `vmagent` may send the same samples multiple times to the configured remote storage +When the `-remoteWrite.disableOnDiskQueue` command-line flag is set, `vmagent` may send the same samples multiple times to the configured remote storage if it cannot keep up with the data ingestion rate. In this case [deduplication](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#deduplication) must be enabled on all the configured remote storage systems. @@ -970,25 +968,25 @@ By default, `vmagent` doesn't limit the number of time series each scrape target The limit can be enforced in the following places: * Via `-promscrape.seriesLimitPerTarget` command-line flag. This limit is applied individually - to all the scrape targets defined in the file pointed by `-promscrape.config`. + to all the scrape targets defined in the file pointed to by `-promscrape.config`. * Via `series_limit` config option at [scrape_config](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs) section. 
The `series_limit` allows overriding the `-promscrape.seriesLimitPerTarget` on a per-`scrape_config` basis.
-  If `series_limit` is set to `0` or to negative value, then it isn't applied to the given `scrape_config`,
-  even if `-promscrape.seriesLimitPerTarget` command-line flag is set.
+  If `series_limit` is set to `0` or to a negative value, then it isn't applied to the given `scrape_config`,
+  even if the `-promscrape.seriesLimitPerTarget` command-line flag is set.
 * Via `__series_limit__` label, which can be set with [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/) at `relabel_configs` section.
   The `__series_limit__` allows overriding the `series_limit` on a per-target basis.
-  If `__series_limit__` is set to `0` or to negative value, then it isn't applied to the given target.
+  If `__series_limit__` is set to `0` or to a negative value, then it isn't applied to the given target.
   Typical use case: to set the limit via [Kubernetes annotations](https://kubernetes.io/docs/concepts/overview/working-with-objects/annotations/)
-  for targets, which may expose too high number of time series.
+  for targets that may expose too many time series.
 
-Scraped metrics are dropped for time series exceeding the given limit on the time window of 24h.
+Scraped metrics are dropped for time series that exceed the given limit within a 24-hour window.
 `vmagent` creates the following additional per-target metrics for targets with non-zero series limit:
 
 * `scrape_series_limit_samples_dropped` - the number of dropped samples during the scrape when the unique series limit is exceeded.
 * `scrape_series_limit` - the series limit for the given target.
 * `scrape_series_current` - the current number of series for the given target.
 
-These metrics are automatically sent to the configured `-remoteWrite.url` alongside with the scraped per-target metrics.
+These metrics are automatically sent to the configured `-remoteWrite.url` alongside the scraped per-target metrics.
These metrics allow building the following alerting rules:
@@ -1005,51 +1003,50 @@ The limit can be enforced by setting the following command-line flags:
 
 * `-remoteWrite.maxDailySeries` - limits the number of unique time series `vmagent` can write to remote storage systems during the last day. Useful for limiting daily churn rate.
 
-It is possible to use `-1` as a value for these flags{{% available_from "v1.140.0" %}} in order to enable series tracking but set limit to maximum possible value.
-This is useful in order to estimate the number of unique series which is written to remote storage systems without enforcing limits.
+It is possible to use `-1` as a value for these flags{{% available_from "v1.140.0" %}} in order to enable series tracking while setting the limit to the maximum possible value.
+This is useful for estimating the number of unique series written to remote storage systems without enforcing limits.
 
-Both limits can be set simultaneously. If any of these limits is reached, then samples for new time series are dropped instead of sending
-them to remote storage systems. A sample of dropped series is put in the log with `WARNING` level.
+Both limits can be set simultaneously. If either limit is reached, samples for new time series are dropped instead of being sent to remote storage systems. A sample of dropped series is logged with a `WARNING` level message.
 
 `vmagent` exposes the following metrics at `http://vmagent:8429/metrics` page (see [monitoring docs](#monitoring) for details):
 
-* `vmagent_hourly_series_limit_rows_dropped_total` - the number of metrics dropped due to exceeded hourly limit on the number of unique time series.
+* `vmagent_hourly_series_limit_rows_dropped_total` - the number of metrics dropped due to exceeding the hourly limit on the number of unique time series.
 * `vmagent_hourly_series_limit_max_series` - the hourly series limit set via `-remoteWrite.maxHourlySeries`.
* `vmagent_hourly_series_limit_current_series` - the current number of unique series registered during the last hour.
-* `vmagent_daily_series_limit_rows_dropped_total` - the number of metrics dropped due to exceeded daily limit on the number of unique time series.
+* `vmagent_daily_series_limit_rows_dropped_total` - the number of metrics dropped due to exceeding the daily limit on the number of unique time series.
 * `vmagent_daily_series_limit_max_series` - the daily series limit set via `-remoteWrite.maxDailySeries`.
 * `vmagent_daily_series_limit_current_series` - the current number of unique series registered during the last day.
 
-These limits are approximate, so `vmagent` can underflow/overflow the limit by a small percentage (usually less than 1%).
+These limits are approximate, so `vmagent` can underflow or overflow them by a small percentage (usually less than 1%).
 
 See also [cardinality explorer docs](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#cardinality-explorer).
 
 ## Monitoring
 
 `vmagent` exports various metrics in Prometheus exposition format at `http://vmagent-host:8429/metrics` page.
-We recommend setting up regular scraping of this page either through `vmagent` itself or by Prometheus-compatible scraper,
+We recommend setting up regular scraping of this page either through `vmagent` itself or by a Prometheus-compatible scraper,
 so that the exported metrics may be analyzed later.
 
 If you use Google Cloud Managed Prometheus for scraping metrics from VictoriaMetrics components, then pass `-metrics.exposeMetadata`
-command-line to them, so they add `TYPE` and `HELP` comments per each exposed metric at `/metrics` page.
+command-line flag to them, so they add `TYPE` and `HELP` comments for each exposed metric on the `/metrics` page.
 See [these docs](https://cloud.google.com/stackdriver/docs/managed-prometheus/troubleshooting#missing-metric-type) for details.
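+As a minimal sketch (the `vmagent-host:8429` address is a placeholder), scraping `vmagent`'s own metrics page could be configured like this:
+
+```yaml
+scrape_configs:
+  - job_name: vmagent
+    static_configs:
+      # vmagent serves its own metrics at /metrics (the default metrics path)
+      - targets: ["vmagent-host:8429"]
+```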
-Use official [Grafana dashboard](https://grafana.com/grafana/dashboards/12683) for `vmagent` state overview.
+Use the official [Grafana dashboard](https://grafana.com/grafana/dashboards/12683) for the `vmagent` state overview.
 Graphs on this dashboard contain useful hints - hover the `i` icon at the top left corner of each graph in order to read it.
-If you have suggestions for improvements or have found a bug - please open an issue on [github](https://github.com/VictoriaMetrics/VictoriaMetrics/issues)
-or add a review to the dashboard.
+If you have suggestions for improvements or have found a bug, please open an issue on [GitHub](https://github.com/VictoriaMetrics/VictoriaMetrics/issues)
+or add a review to the dashboard.
 
 `vmagent` also exports the status for various targets at the following pages:
 
-* `http://vmagent-host:8429/targets`. This pages shows the current status for every active target.
-* `http://vmagent-host:8429/service-discovery`. This pages shows the list of discovered targets with the discovered `__meta_*` labels
+* `http://vmagent-host:8429/targets`. This page shows the current status for every active target.
+* `http://vmagent-host:8429/service-discovery`. This page shows the list of discovered targets with the discovered `__meta_*` labels
   according to [these docs](https://docs.victoriametrics.com/victoriametrics/sd_configs/).
-  This page may help debugging target [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/).
-* `http://vmagent-host:8429/api/v1/targets`. This handler returns JSON response
+  This page may help with debugging target [relabeling](https://docs.victoriametrics.com/victoriametrics/relabeling/).
+* `http://vmagent-host:8429/api/v1/targets`. This handler returns a JSON response
   compatible with [the corresponding page from Prometheus API](https://prometheus.io/docs/prometheus/latest/querying/api/#targets).
-* `http://vmagent-host:8429/ready`. 
This handler returns http 200 status code when `vmagent` finishes
+* `http://vmagent-host:8429/ready`. This handler returns an HTTP 200 status code when `vmagent` finishes
   its initialization for all the [service_discovery configs](https://docs.victoriametrics.com/victoriametrics/sd_configs/).
-  It may be useful to perform `vmagent` rolling update without any scrape loss.
+  This is useful for performing a `vmagent` rolling update without any scrape loss.
 
 ## Troubleshooting
 
@@ -1061,20 +1058,20 @@
 * If `vmagent` uses too much RAM or CPU, then follow [these recommendations](#performance-optimizations).
 
 * When `vmagent` scrapes many unreliable targets, it can flood the error log with scrape errors. It is recommended to investigate and fix these errors.
-  If it is unfeasible to fix all the reported errors, then they can be suppressed by passing `-promscrape.suppressScrapeErrors` command-line flag to `vmagent`.
-  The most recent scrape error per each target can be observed at `http://vmagent-host:8429/targets` and `http://vmagent-host:8429/api/v1/targets`.
+  If it is infeasible to fix all the reported errors, then they can be suppressed by passing the `-promscrape.suppressScrapeErrors` command-line flag to `vmagent`.
+  The most recent scrape error for each target can be observed at `http://vmagent-host:8429/targets` and `http://vmagent-host:8429/api/v1/targets`.
 
-* The `http://vmagent-host:8429/config` page shows current active `-promscrape.config` configuration.
-  Access to endpoint can be protected via `-configAuthKey` command-line flag.
+* The `http://vmagent-host:8429/config` page shows the current active `-promscrape.config` configuration.
+  Access to the endpoint can be protected via the `-configAuthKey` command-line flag.
* Pages `http://vmagent-host:8429/remotewrite-relabel-config` and `http://vmagent-host:8429/remotewrite-url-relabel-config`
   {{% available_from "v1.129.0" %}} show current active `-remoteWrite.relabelConfig` and `-remoteWrite.urlRelabelConfig` configuration
-  correspondingly. Access to endpoints can be protected via `-configAuthKey` command-line flag.
+  respectively. Access to endpoints can be protected via the `-configAuthKey` command-line flag.
 
 * The `http://vmagent-host:8429/service-discovery` page could be useful for debugging the relabeling process for scrape targets.
   This page contains original labels for targets dropped during relabeling.
   By default, the `-promscrape.maxDroppedTargets` targets are shown here. If your setup drops more targets during relabeling,
-  then increase `-promscrape.maxDroppedTargets` command-line flag value to see all the dropped targets.
+  then increase the `-promscrape.maxDroppedTargets` command-line flag value to see all the dropped targets.
   Note that tracking each dropped target requires up to 10Kb of RAM. Therefore, big values for `-promscrape.maxDroppedTargets`
   may result in increased memory usage if a large number of scrape targets are dropped during relabeling.
@@ -1087,26 +1084,26 @@
   Therefore, it starts dropping the buffered data if the on-disk buffer size exceeds `-remoteWrite.maxDiskUsagePerURL`.
 
 * `vmagent` drops data blocks if remote storage replies with `400 Bad Request` or `409 Conflict` HTTP responses.
-  The number of dropped blocks can be monitored via `vmagent_remotewrite_packets_dropped_total` metric exported at [/metrics page](#monitoring).
+  The number of dropped blocks can be monitored via the `vmagent_remotewrite_packets_dropped_total` metric, which is exported on the [/metrics page](#monitoring).
 
 * Use `-remoteWrite.queues=1` when `-remoteWrite.url` points to remote storage, which doesn't accept out-of-order samples (aka data backfilling).
-  Such storage systems include Prometheus, Mimir, Cortex and Thanos, which typically emit `out of order sample` errors.
+  Such storage systems include Prometheus, Mimir, Cortex, and Thanos, which typically emit `out of order sample` errors.
   The best solution is to use remote storage with [backfilling support](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#backfilling) such as VictoriaMetrics.
 
 * `vmagent` buffers scraped data at the `-remoteWrite.tmpDataPath` directory until it is sent to `-remoteWrite.url`.
-  The directory can grow large when remote storage is unavailable for extended periods of time and if the maximum directory size isn't limited
-  with `-remoteWrite.maxDiskUsagePerURL` command-line flag.
-  If you don't want to send all the buffered data from the directory to remote storage then simply stop `vmagent` and delete the directory.
+  The directory can grow large when remote storage is unavailable for extended periods of time and the maximum directory size isn't limited
+  with the `-remoteWrite.maxDiskUsagePerURL` command-line flag.
+  If you don't want to send all the buffered data from the directory to remote storage, then simply stop `vmagent` and delete the directory.
 
 * If `vmagent` runs on a host with slow persistent storage, which cannot keep up with the volume of processed samples, then it is possible to disable
-  the persistent storage with `-remoteWrite.disableOnDiskQueue` command-line flag. See [these docs](#disabling-on-disk-persistence) for more details.
+  the persistent storage with the `-remoteWrite.disableOnDiskQueue` command-line flag. See [these docs](#disabling-on-disk-persistence) for more details.
 
-* By default `vmagent` masks `-remoteWrite.url` with `secret-url` values in logs and at `/metrics` page because
-  the url may contain sensitive information such as auth tokens or passwords.
-  Pass `-remoteWrite.showURL` command-line flag when starting `vmagent` in order to see all the valid urls.
+* By default, `vmagent` masks `-remoteWrite.url` with `secret-url` values in logs and at the `/metrics` page because
+  the URL may contain sensitive information such as auth tokens or passwords.
+  Pass the `-remoteWrite.showURL` command-line flag when starting `vmagent` in order to see all the valid URLs.
 
-* By default `vmagent` evenly spreads scrape load in time. If a particular scrape target must be scraped at the beginning of some interval,
-  then `scrape_align_interval` option must be used. For example, the following config aligns hourly scrapes to the beginning of hour:
+* By default, `vmagent` evenly spreads scrape load in time. If a particular scrape target must be scraped at the beginning of some interval,
+  then the `scrape_align_interval` option must be used. For example, the following config aligns hourly scrapes to the beginning of the hour:
 
 ```yaml
 scrape_configs:
@@ -1115,8 +1112,8 @@ or add a review to the dashboard.
     scrape_align_interval: 1h
 ```
 
-* By default `vmagent` evenly spreads scrape load in time. If a particular scrape target must be scraped at specific offset, then `scrape_offset` option must be used.
-  For example, the following config instructs `vmagent` to scrape the target at 10 seconds of every minute:
+* By default, `vmagent` evenly spreads scrape load in time. If a particular scrape target needs scraping at a specific offset, use the `scrape_offset` option.
+  For example, the following config instructs `vmagent` to scrape the target 10 seconds after the start of every 1-minute interval:
See the available options below if you prefer fixing the root cause of the error: - The following relabeling rule may be added to `relabel_configs` section in order to filter out pods with unneeded ports: + The following relabeling rule may be added to the `relabel_configs` section in order to filter out pods with unneeded ports: ```yaml - action: keep_if_equal source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_port, __meta_kubernetes_pod_container_port_number] ``` - The following relabeling rule may be added to `relabel_configs` section in order to filter out init container pods: + The following relabeling rule may be added to the `relabel_configs` section in order to filter out init container pods: ```yaml - action: drop @@ -1154,12 +1151,12 @@ See also: `vmagent` buffers collected metrics on disk at the directory specified via `-remoteWrite.tmpDataPath` command-line flag until the metrics are sent to remote storage configured via `-remoteWrite.url` command-line flag. The `-remoteWrite.tmpDataPath` directory can grow large when remote storage is unavailable for extended -periods of time and if the maximum directory size isn't limited with `-remoteWrite.maxDiskUsagePerURL` command-line flag. +periods of time and if the maximum directory size isn't limited with the `-remoteWrite.maxDiskUsagePerURL` command-line flag. -To estimate the allocated disk size for persistent queue, or to estimate `-remoteWrite.maxDiskUsagePerURL` command-line flag value, +To estimate the allocated disk size for the persistent queue, or to estimate the `-remoteWrite.maxDiskUsagePerURL` command-line flag value, take into account the following attributes: -1. The **size in bytes** of data stream sent by vmagent: +1. 
The **size in bytes** of the data stream sent by vmagent: Run the query `sum(rate(vmagent_remotewrite_bytes_sent_total[1h])) by(instance,url)` in [vmui](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#vmui) or Grafana to get the amount of bytes sent by each vmagent instance per second. @@ -1179,40 +1176,40 @@ Additional notes: 1. Re-evaluate the estimation each time when: * there is an increase in the vmagent's workload * there is a change in [relabeling rules](https://docs.victoriametrics.com/victoriametrics/relabeling/) which could increase the amount metrics to send - * there is a change in number of configured `-remoteWrite.url` addresses + * there is a change in the number of configured `-remoteWrite.url` addresses 1. The minimum disk size to allocate for the persistent queue is 500Mi per each `-remoteWrite.url`. -1. On-disk persistent queue can be disabled if needed. See [these docs](https://docs.victoriametrics.com/victoriametrics/vmagent/#disabling-on-disk-persistence). +1. The on-disk persistent queue can be disabled if needed. See [these docs](https://docs.victoriametrics.com/victoriametrics/vmagent/#disabling-on-disk-persistence). ## Security See general recommendations regarding [security](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#security). vmagent's `/remotewrite-relabel-config` and `/remotewrite-url-relabel-config` endpoints {{% available_from "v1.129.0" %}} -can be protected via `-configAuthKey` command-line flag. +can be protected via the `-configAuthKey` command-line flag. ### mTLS protection -By default `vmagent` accepts http requests at port `8429` (this port can be changed via `-httpListenAddr` command-line flags), -it is expected to run in an isolated trusted network. 
-[Enterprise version of vmagent](https://docs.victoriametrics.com/victoriametrics/enterprise/) supports the ability to accept [mTLS](https://en.wikipedia.org/wiki/Mutual_authentication)
+By default, `vmagent` accepts HTTP requests at port `8429` (this port can be changed via the `-httpListenAddr` command-line flag).
+It is expected that `vmagent` runs in an isolated, trusted network.
+The [Enterprise version of vmagent](https://docs.victoriametrics.com/victoriametrics/enterprise/) supports the ability to accept [mTLS](https://en.wikipedia.org/wiki/Mutual_authentication)
 requests at this port, by specifying `-tls` and `-mtls` command-line flags. For example, the following command runs `vmagent`, which accepts only mTLS requests at port `8429`:
 
 ```sh
 ./vmagent -tls -mtls -remoteWrite.url=...
 ```
 
-By default, the system-wide [TLS Root CA](https://en.wikipedia.org/wiki/Root_certificate) is used for verifying client certificates if `-mtls` command-line flag is specified.
-It is possible to specify custom TLS Root CA via `-mtlsCAFile` command-line flag.
+By default, the system-wide [TLS Root CA](https://en.wikipedia.org/wiki/Root_certificate) is used for verifying client certificates if the `-mtls` command-line flag is specified.
+It is possible to specify a custom TLS Root CA via the `-mtlsCAFile` command-line flag.
 
 ## Performance optimizations
 
-`vmagent` is optimized for low CPU usage and low RAM usage without the need to tune any configs. Sometimes it is needed to optimize the CPU / RAM usage of `vmagent` even more.
+`vmagent` is optimized for low CPU and RAM usage, with no need to tune any configs. Sometimes, however, it is necessary to further optimize the CPU/RAM usage of `vmagent`.
 For example, if `vmagent` needs to scrape thousands of targets in resource-constrained environments, then the following options may help to reduce CPU and RAM usage:
 
-* Set [GOGC](https://pkg.go.dev/runtime#hdr-Environment_Variables) environment variable to `100`.
This reduces CPU usage at the cost of higher RAM usage. +* Set the [GOGC](https://pkg.go.dev/runtime#hdr-Environment_Variables) environment variable to `100`. This reduces CPU usage at the cost of higher RAM usage. -* Set [GOMAXPROCS](https://pkg.go.dev/runtime#hdr-Environment_Variables) environment variable to the value slightly bigger than the number of CPU cores used by `vmagent`. - Another option is to set CPU limit in Kubernetes / Docker to the integer value bigger than the number of CPU cores used by `vmagent`. +* Set [GOMAXPROCS](https://pkg.go.dev/runtime#hdr-Environment_Variables) environment variable to a value slightly bigger than the number of CPU cores used by `vmagent`. + Another option is to set the CPU limit in Kubernetes / Docker to an integer value bigger than the number of CPU cores used by `vmagent`. This reduces RAM and CPU usage when `vmagent` runs in an environment with a large number of available CPU cores. Note that it may be necessary to increase the `-remoteWrite.queues` command-line flag to a larger value if `GOMAXPROCS` is set to too small of a value, since by default `-remoteWrite.queues` is proportional to `GOMAXPROCS`. @@ -1229,22 +1226,22 @@ For example, if `vmagent` needs to scrape thousands of targets in resource-const between `vmagent` and scrape targets. * Disable tracking of original labels for the discovered targets via `-promscrape.dropOriginalLabels` command-line flag. This helps to reduce RAM usage when `vmagent` - discovers a large number of scrape targets and the majority of these targets are [dropped](https://docs.victoriametrics.com/victoriametrics/relabeling/#how-to-drop-discovered-targets). - This is a typical case when `vmagent` discovers Kubernetes targets. The downside of using `-promscrape.dropOriginalLabels` command-line flag + discovers a large number of scrape targets, and the majority of these targets are [dropped](https://docs.victoriametrics.com/victoriametrics/relabeling/#how-to-drop-discovered-targets). 
+  This is a typical case when `vmagent` discovers Kubernetes targets. The downside of using the `-promscrape.dropOriginalLabels` command-line flag
   is the reduced [debuggability](https://docs.victoriametrics.com/victoriametrics/relabeling/#relabel-debugging) for improperly configured per-target relabeling.
 
 * Disable [staleness markers](https://docs.victoriametrics.com/victoriametrics/vmagent/#prometheus-staleness-markers) via `-promscrape.noStaleMarkers` command-line flag
   or via `no_stale_markers: true` option in the [scrape_config](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs). This reduces RAM and CPU usage.
-  Note that disabling staleness markers may result in unexpected query results when scrape targets are frequently rotated (this is a typical case in Kubernetes).
+  Note that disabling staleness markers may yield unexpected query results when scrape targets are frequently rotated (which is typical in Kubernetes).
 
-* Set `-memory.allowedBytes` command-line flag to a value close to the actual memory usage of `vmagent`. Another option is to set memory limit in Kubernetes / Docker
+* Set the `-memory.allowedBytes` command-line flag to a value close to the actual memory usage of `vmagent`. Another option is to set the memory limit in Kubernetes / Docker
   to the value 50% larger than the actual memory usage of `vmagent`.
   This should reduce memory usage spikes for `vmagent` running in the environment with a large amount of available memory and
-  when the remote storage cannot keep up with the data ingestion rate. Increasing `-remoteWrite.queues` command-line flag value may help in this case too.
+  when the remote storage cannot keep up with the data ingestion rate. Increasing the `-remoteWrite.queues` command-line flag value may help in this case, too.
 
-* In extreme cases it may be useful to set `-promscrape.disableKeepAlive` command-line flag in order to save RAM on HTTP keep-alive connections to thousands of scrape targets.
+* In extreme cases, it may be useful to set the `-promscrape.disableKeepAlive` command-line flag in order to save RAM on HTTP keep-alive connections to thousands of scrape targets. * Increase the `scrape_interval` option in the `global` section of the `-promscrape.config` and/or at every [scrape_config](https://docs.victoriametrics.com/victoriametrics/sd_configs/#scrape_configs) - to reduce CPU usage. For example, increasing the `scrape_interval` from `10s` to `30s` across all the targets decreases CPU usage at `vmagent` by up to 3x. + to reduce CPU usage. For example, increasing the `scrape_interval` from `10s` to `30s` across all targets reduces CPU usage on `vmagent` by up to 3x. Example command, which runs `vmagent` in an optimized mode: @@ -1256,7 +1253,7 @@ GOGC=100 GOMAXPROCS=1 ./vmagent -promscrape.disableCompression -promscrape.dropO We recommend using the [official binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest) - `vmagent` is located in the `vmutils-...` archives. -It may be needed to build `vmagent` from source code when developing or testing new feature or bugfix. +It may be necessary to build `vmagent` from source code when developing or testing a new feature or bug fix. ### Development build @@ -1266,9 +1263,9 @@ It may be needed to build `vmagent` from source code when developing or testing ### Production build -1. [Install docker](https://docs.docker.com/install/). +1. [Install Docker](https://docs.docker.com/install/). 1. Run `make vmagent-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics). - It builds `vmagent-prod` binary and puts it into the `bin` folder. + It builds the `vmagent-prod` binary and puts it into the `bin` folder. ### Building docker images @@ -1276,8 +1273,8 @@ Run `make package-vmagent`. 
It builds the `victoriametrics/vmagent:<PKG_TAG>` docker image.
 `<PKG_TAG>` is an auto-generated image tag, which depends on source code in [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
 The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-vmagent`.
 
-The base docker image is [alpine](https://hub.docker.com/_/alpine) but it is possible to use any other base image
-by setting it via `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of [scratch](https://hub.docker.com/_/scratch) image:
+The base Docker image is [alpine](https://hub.docker.com/_/alpine), but it is possible to use any other base image
+by setting it via the `<ROOT_IMAGE>` environment variable. For example, the following command builds the image on top of the [scratch](https://hub.docker.com/_/scratch) image:
 
 ```sh
 ROOT_IMAGE=scratch make package-vmagent
 ```
 
@@ -1291,13 +1288,13 @@ ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://b
 
 1. [Install Go](https://golang.org/doc/install).
 1. Run `make vmagent-linux-arm` or `make vmagent-linux-arm64` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics)
-   It builds `vmagent-linux-arm` or `vmagent-linux-arm64` binary respectively and puts it into the `bin` folder.
+   It builds the `vmagent-linux-arm` or `vmagent-linux-arm64` binary, respectively, and puts it into the `bin` folder.
 
 ### Production ARM build
 
-1. [Install docker](https://docs.docker.com/install/).
+1. [Install Docker](https://docs.docker.com/install/).
 1. Run `make vmagent-linux-arm-prod` or `make vmagent-linux-arm64-prod` from the root folder of [the repository](https://github.com/VictoriaMetrics/VictoriaMetrics).
-   It builds `vmagent-linux-arm-prod` or `vmagent-linux-arm64-prod` binary respectively and puts it into the `bin` folder.
+   It builds the `vmagent-linux-arm-prod` or `vmagent-linux-arm64-prod` binary, respectively, and puts it into the `bin` folder.
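 The production ARM build steps above can be sketched end-to-end. This is a minimal sketch, not part of the official docs: it assumes Docker and `make` are installed, targets `linux/arm64`, and only uses the make targets and `bin` output folder described above.
 
 ```sh
 # Clone the repository and build the production ARM64 binary (requires Docker).
 git clone https://github.com/VictoriaMetrics/VictoriaMetrics
 cd VictoriaMetrics
 make vmagent-linux-arm64-prod
 
 # The resulting binary is placed into the `bin` folder.
 ls bin/vmagent-linux-arm64-prod
 ```
 
 For 32-bit ARM, substitute `make vmagent-linux-arm-prod` and look for `bin/vmagent-linux-arm-prod` instead.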
## Profiling @@ -1315,15 +1312,15 @@ curl http://0.0.0.0:8429/debug/pprof/heap > mem.pprof curl http://0.0.0.0:8429/debug/pprof/profile > cpu.pprof ``` -The command for collecting CPU profile waits for 30 seconds before returning. +The command for collecting the CPU profile waits for 30 seconds before returning. The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof). -It is safe sharing the collected profiles from security point of view, since they do not contain sensitive information. +It is safe to share the collected profiles from a security perspective, since they do not contain sensitive information. ## Advanced usage -`vmagent` can be fine-tuned with various command-line flags. Run `./vmagent -help` in order to see the full list of these flags with their descriptions and default value. +`vmagent` can be fine-tuned with various command-line flags. Run `./vmagent -help` in order to see the full list of these flags with their descriptions and default values. ### Common flags These flags are available in both VictoriaMetrics OSS and VictoriaMetrics Enterprise. @@ -1335,7 +1332,7 @@ These flags are available only in [VictoriaMetrics enterprise](https://docs.vict --- -Section below contains backward-compatible anchors for links that were moved or renamed. +The section below contains backward-compatible anchors for links that were moved or renamed. ###### Relabeling