VictoriaMetrics

VictoriaMetrics is a fast, cost-effective, and scalable solution for monitoring and managing time series data. It delivers high performance and reliability, making it an ideal choice for businesses of all sizes.

Case studies: Grammarly, Roblox, Wix, Spotify, and others.
Available: Binary releases, Docker images on Docker Hub and Quay, Source code.
Deployment types: Single-node version and Cluster version under Apache License 2.0.
Getting started: Read key concepts and follow the quick start guide.
Community: Slack (join via Slack Inviter), X (Twitter), YouTube. See full list here.
Changelog: The project evolves fast - check the CHANGELOG and How to upgrade.
Enterprise support: Contact us for commercial support with additional enterprise features.
Enterprise releases: Enterprise and long-term support releases (LTS) are publicly available and can be evaluated for free using a free trial license.
Security: We achieved security certifications for Database Software Development and Software-Based Monitoring Services.

Prominent features

VictoriaMetrics has the following prominent features:

It can be used as long-term storage for Prometheus. See these docs for details.
It can be used as a drop-in replacement for Prometheus in Grafana, because it supports the Prometheus querying API.
It can be used as a drop-in replacement for Graphite in Grafana, because it supports the Graphite API. VictoriaMetrics allows reducing infrastructure costs by more than 10x compared to Graphite - see this case study.
It is easy to set up and operate:
- VictoriaMetrics consists of a single small executable without external dependencies.
- All the configuration is done via explicit command-line flags with reasonable defaults.
- All the data is stored in a single directory specified by the -storageDataPath command-line flag.
- Easy and fast backups from instant snapshots can be done with vmbackup / vmrestore tools. See this article for more details.
It implements a PromQL-like query language - MetricsQL, which provides improved functionality on top of PromQL.
It provides a global query view. Multiple Prometheus instances or any other data sources may ingest data into VictoriaMetrics. Later this data may be queried via a single query.
It provides high performance and good vertical and horizontal scalability for both data ingestion and data querying. It outperforms InfluxDB and TimescaleDB by up to 20x.
It uses 10x less RAM than InfluxDB and up to 7x less RAM than Prometheus, Thanos or Cortex when dealing with millions of unique time series (aka high cardinality).
It is optimized for time series with high churn rate.
It provides high data compression: up to 70x more data points may be stored in limited storage compared with TimescaleDB according to these benchmarks, and up to 7x less storage space is required compared to Prometheus, Thanos or Cortex according to this benchmark.
It is optimized for storage with high-latency IO and low IOPS (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc). See disk IO graphs from these benchmarks.
A single-node VictoriaMetrics may substitute moderately sized clusters built with competing solutions such as Thanos, M3DB, Cortex, InfluxDB or TimescaleDB. See vertical scalability benchmarks, comparing Thanos to VictoriaMetrics cluster and Remote Write Storage Wars talk from PromCon 2019.
It protects the storage from data corruption on unclean shutdown (i.e. OOM, hardware reset or kill -9) thanks to the storage architecture.
It supports metrics scraping, ingestion and backfilling via the following protocols:
- Metrics scraping from Prometheus exporters.
- Prometheus remote write API.
- Prometheus exposition format.
- InfluxDB line protocol over HTTP, TCP and UDP.
- Graphite plaintext protocol with tags.
- OpenTSDB put message.
- HTTP OpenTSDB /api/put requests.
- JSON line format.
- Arbitrary CSV data.
- Native binary format.
- DataDog agent or DogStatsD.
- NewRelic infrastructure agent.
- OpenTelemetry metrics format.
- Zabbix Connector streaming format.
It supports powerful stream aggregation, which can be used as a statsd alternative.
It supports metrics relabeling.
It can deal with high cardinality issues and high churn rate issues via series limiter.
It works well for large numbers of time series with both high churn rate (APM, Kubernetes) and low churn rate (IoT sensors, connected cars, industrial telemetry, financial data - see these docs), plus various Enterprise workloads.
It has an open source cluster version.
It can store data on NFS-based storages such as Amazon EFS and Google Filestore.

See case studies for VictoriaMetrics and various Articles about VictoriaMetrics.

Components

The VictoriaMetrics ecosystem contains the following components in addition to single-node VictoriaMetrics:

vmagent - lightweight agent for receiving metrics via pull-based and push-based protocols, transforming and sending them to the configured Prometheus-compatible remote storage systems such as VictoriaMetrics.
vmalert - a service for processing Prometheus-compatible alerting and recording rules.
vmalert-tool - a tool for validating alerting and recording rules.
vmauth - authorization proxy and load balancer optimized for VictoriaMetrics products.
vmgateway - authorization proxy with per-tenant rate limiting capabilities.
vmctl - a tool for migrating and copying data between different storage systems for metrics.
vmbackup, vmrestore and vmbackupmanager - tools for creating backups and restoring from backups for VictoriaMetrics data.
vminsert, vmselect and vmstorage - components of VictoriaMetrics cluster.
VictoriaLogs - user-friendly cost-efficient database for logs.

Operation

Install

To quickly try VictoriaMetrics, just download the VictoriaMetrics executable or docker image from Docker Hub or Quay and start it with the desired command-line flags. See also QuickStart guide for additional information.

VictoriaMetrics can also be installed via these installation methods:

How to start VictoriaMetrics

The following command-line flags are used the most:

-storageDataPath - VictoriaMetrics stores all the data in this directory. The default path is victoria-metrics-data in the current working directory.
-retentionPeriod - retention for stored data. Older data is automatically deleted. Default retention is 1 month (31 days). The minimum retention period is 24h or 1d. See these docs for more details.

Other flags have good enough default values, so set them only if you really need to. Pass -help to see all the available flags with description and default values.

The following docs may be useful during initial VictoriaMetrics setup:

VictoriaMetrics accepts Prometheus querying API requests on port 8428 by default.

It is recommended to set up monitoring for VictoriaMetrics.

Environment variables

All the VictoriaMetrics components allow referring to environment variables in yaml configuration files (such as -promscrape.config) and in command-line flags via %{ENV_VAR} syntax. For example, -metricsAuthKey=%{METRICS_AUTH_KEY} is automatically expanded to -metricsAuthKey=top-secret if the METRICS_AUTH_KEY=top-secret environment variable exists at VictoriaMetrics startup. This expansion is performed by VictoriaMetrics itself.

VictoriaMetrics recursively expands %{ENV_VAR} references in environment variables on startup. For example, FOO=%{BAR} environment variable is expanded to FOO=abc if BAR=a%{BAZ} and BAZ=bc environment variables exist.
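
The recursive expansion described above can be illustrated with a minimal Python sketch. This is not the VictoriaMetrics implementation: the function name expand_env, the depth limit, and the choice to leave unknown placeholders untouched are assumptions made for illustration only.

```python
import re

# Matches %{NAME} placeholders as described above.
_PLACEHOLDER = re.compile(r"%\{([^}]+)\}")

def expand_env(value: str, env: dict, max_depth: int = 10) -> str:
    """Recursively replace %{NAME} with env[NAME].

    Assumption of this sketch: unknown names are left as-is.
    """
    for _ in range(max_depth):
        expanded = _PLACEHOLDER.sub(
            lambda m: env.get(m.group(1), m.group(0)), value
        )
        if expanded == value:  # nothing left to expand
            return expanded
        value = expanded
    raise ValueError("too many nested %{...} references (possible cycle)")

env = {"BAR": "a%{BAZ}", "BAZ": "bc", "METRICS_AUTH_KEY": "top-secret"}
print(expand_env("%{BAR}", env))  # abc
print(expand_env("-metricsAuthKey=%{METRICS_AUTH_KEY}", env))  # -metricsAuthKey=top-secret
```

The loop re-scans the string until it stops changing, which is what makes the FOO=%{BAR}, BAR=a%{BAZ}, BAZ=bc chain resolve to abc.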

Additionally, all the VictoriaMetrics components allow setting flag values via environment variables according to these rules:

The -envflag.enable flag must be set.
Each . char in flag name must be substituted with _ (for example -insert.maxQueueDuration <duration> will translate to insert_maxQueueDuration=<duration>).
Repeated flags can be replaced by an environment variable with comma separated values for the repeated flags. For example -storageNode <nodeA> -storageNode <nodeB> command-line flags can be set as storageNode=<nodeA>,<nodeB> environment variable.
Environment var prefix can be set via -envflag.prefix flag. For instance, if -envflag.prefix=VM_, then env vars must be prepended with VM_.
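
As a rough illustration of the naming rules above, the following Python sketch derives the environment variable name and value for a flag. The helper name flag_to_env is hypothetical, and the sketch only mirrors the rules listed in this section; the -envflag.enable flag must still be set for VictoriaMetrics to read such variables.

```python
def flag_to_env(flag: str, values: list, prefix: str = "") -> tuple:
    """Return (env_name, env_value) for a command-line flag.

    - the leading dash is dropped and each '.' becomes '_'
    - repeated flag values are joined with commas
    - prefix corresponds to the -envflag.prefix value, if any
    """
    name = prefix + flag.lstrip("-").replace(".", "_")
    return name, ",".join(values)

print(flag_to_env("-insert.maxQueueDuration", ["1m"]))
# ('insert_maxQueueDuration', '1m')
print(flag_to_env("-storageNode", ["nodeA", "nodeB"], prefix="VM_"))
# ('VM_storageNode', 'nodeA,nodeB')
```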

Setting up service

Read instructions on how to set up VictoriaMetrics as a service for your OS. See also ansible playbooks.

Running as Windows service

In order to run VictoriaMetrics as a Windows service it is required to create a service configuration for WinSW and then install it as a service according to the following guide:

Create a service configuration:

<service>
  <id>VictoriaMetrics</id>
  <name>VictoriaMetrics</name>
  <description>VictoriaMetrics</description>
  <executable>%BASE%\victoria-metrics-windows-amd64-prod.exe</executable>

  <onfailure action="restart" delay="10 sec"/>
  <onfailure action="restart" delay="20 sec"/>

  <resetfailure>1 hour</resetfailure>

  <arguments>-envflag.enable</arguments>

  <priority>Normal</priority>

  <stoptimeout>15 sec</stoptimeout>

  <stopparentprocessfirst>true</stopparentprocessfirst>
  <startmode>Automatic</startmode>
  <waithint>15 sec</waithint>
  <sleeptime>1 sec</sleeptime>

  <logpath>%BASE%\logs</logpath>
  <log mode="roll">
    <sizeThreshold>10240</sizeThreshold>
    <keepFiles>8</keepFiles>
  </log>

  <env name="loggerFormat" value="json" />
  <env name="loggerOutput" value="stderr" />
  <env name="promscrape_config" value="C:\Program Files\victoria-metrics\promscrape.yml" />

</service>

Install WinSW by following this documentation.
Install VictoriaMetrics as a service by running the following from elevated PowerShell:
```
winsw install VictoriaMetrics.xml
Get-Service VictoriaMetrics | Start-Service
```

See this issue for more details.

Start with docker-compose

Docker-compose helps to spin up VictoriaMetrics, vmagent and Grafana with one command.

How to upgrade VictoriaMetrics

VictoriaMetrics is developed at a fast pace, so it is recommended to periodically check the CHANGELOG page and perform regular upgrades.

It is safe to upgrade VictoriaMetrics to new versions unless release notes say otherwise. It is safe to skip multiple versions during the upgrade unless release notes say otherwise. It is recommended to perform regular upgrades to the latest version, since it may contain important bug fixes, performance optimizations or new features.

It is also safe to downgrade to older versions unless release notes say otherwise.

The following steps must be performed during the upgrade / downgrade procedure:

Send SIGINT signal to VictoriaMetrics process in order to gracefully stop it. See how to send signals to processes.
Wait until the process stops. This can take a few seconds.
Start the upgraded VictoriaMetrics.

Prometheus doesn't drop data during VictoriaMetrics restart. See this article for details. The same applies also to vmagent.

If you'd prefer not to manage upgrades yourself, VictoriaMetrics Cloud performs version upgrades automatically during maintenance windows with no action required on your part. See the VictoriaMetrics Cloud documentation to get started.

vmui

VictoriaMetrics provides UI for query troubleshooting and exploration. The UI is available at http://victoriametrics:8428/vmui (or at http://<vmselect>:8481/select/<accountID>/vmui/ in cluster version of VictoriaMetrics).

See VMUI at VictoriaMetrics playground.

VMUI provides the following features:

Query tab for ad-hoc queries in MetricsQL, supporting time series, tables and histogram representation
Raw Query tab (available from v1.107.0) for viewing raw samples. Helps in debugging unexpected query results.
Explore:
- Metrics explorer - automatically builds graphs for selected metrics;
- Cardinality explorer - stats about existing metrics in TSDB;
- Top queries - shows most frequently executed queries;
- Active queries - shows currently executed queries;
Tools:
- Trace analyzer - explore query traces loaded from JSON;
- Query analyzer - explore query results and traces loaded from JSON. See Export query button below;
- WITH expressions playground - test how WITH expressions work;
- Metric relabel debugger - debug relabeling rules.
- Downsampling filters debugger (available from v1.105.0) - debug downsampling configs.
- Retention filters debugger (available from v1.105.0) - debug retention filter configs.
Alerting (available from v1.125.0) for displaying groups and rules from the vmalert service. The tab is available only if VictoriaMetrics single-node or vmselect is configured with the -vmalert.proxyURL command-line flag.

Querying:

Enter the MetricsQL query in Query field and hit Enter. Multi-line queries can be entered by pressing Shift-Enter.

VMUI provides auto-completion for MetricsQL functions, metric names, label names and label values. The auto-completion can be enabled by checking the Autocomplete toggle. When the auto-completion is disabled, it can still be triggered for the current cursor position by pressing ctrl+space.

To correlate between multiple queries on the same graph click Add Query button and enter an additional query. Results for all the queries are displayed simultaneously on the same graph.

Results of a particular query can be hidden by clicking the eye icon on the right side of the input field. Clicking on the eye icon while holding the ctrl key hides results of all other queries.

VMUI automatically adjusts the interval between datapoints on the graph depending on the horizontal resolution and on the selected time range. The step value can be customized by changing Step value in the top-right corner.

Clicking on a line on the graph pins the tooltip. Multiple tooltips can be pinned. Press the x icon to unpin a tooltip.

Query history can be navigated by holding Ctrl (or Cmd on MacOS) and pressing up or down arrows on the keyboard while the cursor is located in the query input field.

VMUI automatically switches from graph view to heatmap view when the query returns histogram buckets (both Prometheus histograms and VictoriaMetrics histograms are supported). Try, for example, this query. To disable heatmap view press on settings icon in the top-right corner of graph area and disable Histogram mode toggle.

Time range:

The time range for graphs can be adjusted in multiple ways:

Click on time picker in the top-right corner to select a relative (Last N minutes) or absolute time range (specify From and To);
Zoom in on the graph by click-and-drag motion over the graph area;
When hovering cursor over the graph area, hold ctrl (or cmd on MacOS) and scroll up or down to zoom out or zoom in;
When hovering cursor over the graph area, hold ctrl (or cmd on MacOS) and drag the graph to the left / right to move the displayed time range into the future / past.

Legend:

Legend is displayed below the graph area. Clicking on item in legend hides all other items from displaying. Clicking on the item while holding the ctrl key hides only this item.

Clicking on the label-value pair in item automatically copies it into buffer, so it can be pasted later.

There are additional visualization settings in the top-right corner of the legend view: switching to table view, hiding common labels, etc.

Troubleshooting:

When querying backfilled data or during query troubleshooting, it may be useful to disable the response cache by clicking the Disable cache checkbox.

Query can be traced by clicking on Trace query toggle below query input area and executing query again. Once trace is generated, click on it to expand for more details.

The query and its trace can be exported by clicking on the debug icon in the top-right corner of the trace block. The exported file can be loaded again in VMUI on the Tools=>Query Analyzer page.

Raw query page allows displaying raw, unmodified data. It can be useful for seeing the actual scrape interval or detecting sample duplicates.

Top queries

VMUI provides a top queries tab, which can help determine the following query types:

the most frequently executed queries;
queries with the biggest average execution duration;
queries that took the most total time for execution;
queries with the highest memory usage.

This information is obtained from the /api/v1/status/top_queries HTTP endpoint.

Active queries

VMUI provides an active queries tab, which shows currently executing queries. It provides the following information for each query:

The query itself, together with the time range and step args passed to /api/v1/query_range.
The duration of the query execution.
The address of the client that initiated the query execution.

This information is obtained from the /api/v1/status/active_queries HTTP endpoint.

Metrics explorer

VMUI provides an ability to explore metrics exported by a particular job / instance in the following way:

Open the vmui at http://victoriametrics:8428/vmui/.
Click the Explore Prometheus metrics tab.
Select the job you want to explore.
Optionally select the instance for the selected job to explore.
Select metrics you want to explore and compare.

It is possible to change the selected time range for the graphs in the top right corner.

Cardinality explorer

VictoriaMetrics provides an ability to explore time series cardinality at Explore cardinality tab in vmui:

Metric names with the highest number of series.
Labels with the highest number of series.
Values with the highest number of series for the selected label (aka focusLabel).
label=value pairs with the highest number of series.
Labels with the highest number of unique values.
Read usage statistics of metric names, based on metric name usage tracker. Shows the number of times the metric name was queried (Requests count), and the last time (Last request) when it was queried.

Note that cluster version of VictoriaMetrics may show lower than expected number of unique label values for labels with small number of unique values. This is because of implementation limits.

By default, cardinality explorer analyzes time series for the current date. A different day can be selected in the top-right corner. By default, all the time series for the selected date are analyzed. To narrow down the analysis, specify a series selector.

Cardinality explorer is built on top of /api/v1/status/tsdb.

Resources:

Cardinality explorer statistic inaccuracy

In cluster version of VictoriaMetrics each vmstorage tracks the stored time series individually. vmselect requests stats via /api/v1/status/tsdb API from each vmstorage node and merges the results by summing per-series stats. This may lead to inflated values when samples for the same time series are spread across multiple vmstorage nodes due to replication or rerouting.
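
The inflation described above can be shown with a small Python sketch. The node names, series names and placement are hypothetical; the sketch only contrasts summing per-node counts with counting each series once.

```python
# Hypothetical placement of three series on three vmstorage nodes,
# where replication stores every series on two nodes.
nodes = {
    "vmstorage-1": {'cpu{host="a"}', 'cpu{host="b"}'},
    "vmstorage-2": {'cpu{host="b"}', 'cpu{host="c"}'},
    "vmstorage-3": {'cpu{host="c"}', 'cpu{host="a"}'},
}

def summed_series_count(nodes: dict) -> int:
    """Sum per-node counts, as when per-node stats are merged by summing."""
    return sum(len(series) for series in nodes.values())

def unique_series_count(nodes: dict) -> int:
    """Count each series once, regardless of how many nodes store it."""
    return len(set().union(*nodes.values()))

print(summed_series_count(nodes))  # 6
print(unique_series_count(nodes))  # 3
```

With every series stored twice, the summed value is twice the number of unique series, which is the kind of overestimate the cardinality explorer may show.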

How to apply new config to VictoriaMetrics

VictoriaMetrics is configured via command-line flags, so it must be restarted when new command-line flags should be applied:

Send SIGINT signal to VictoriaMetrics process in order to gracefully stop it.
Wait until the process stops. This can take a few seconds.
Start VictoriaMetrics with the new command-line flags.

Prometheus doesn't drop data during VictoriaMetrics restart. See this article for details. The same applies also to vmagent.

How to scrape Prometheus exporters such as node-exporter

VictoriaMetrics can be used as drop-in replacement for Prometheus for scraping targets configured in prometheus.yml config file according to the specification. Just set -promscrape.config command-line flag to the path to prometheus.yml config - and VictoriaMetrics should start scraping the configured targets. If the provided configuration file contains unsupported options, then either delete them from the file or just pass -promscrape.config.strictParse=false command-line flag to VictoriaMetrics, so it will ignore unsupported options.

The file pointed to by -promscrape.config may contain %{ENV_VAR} placeholders, which are substituted by the corresponding ENV_VAR environment variable values.

Prometheus querying API usage

VictoriaMetrics supports the following handlers from Prometheus querying API:

/api/v1/query
/api/v1/query_range
/api/v1/series
/api/v1/labels
/api/v1/label/.../values
/api/v1/status/tsdb. See these docs for details.
/api/v1/targets - see these docs for more details.
/api/v1/metadata - see these docs for more details.
/federate - see these docs for more details.

These handlers can be queried from Prometheus-compatible clients such as Grafana or curl. All the Prometheus querying API handlers can be prepended with /prometheus prefix. For example, both /prometheus/api/v1/query and /api/v1/query should work.

Prometheus querying API enhancements

VictoriaMetrics accepts optional extra_label=<label_name>=<label_value> query arg, which can be used for enforcing additional label filters for queries. For example, /api/v1/query_range?extra_label=user_id=123&extra_label=group_id=456&query=<query> would automatically add {user_id="123",group_id="456"} label filters to the given <query>. This functionality can be used for limiting the scope of time series visible to the given tenant. It is expected that the extra_label query args are automatically set by auth proxy sitting in front of VictoriaMetrics. See vmauth and vmgateway as examples of such proxies.

VictoriaMetrics accepts optional extra_filters[]=series_selector query arg, which can be used for enforcing arbitrary label filters for queries. For example, /api/v1/query_range?extra_filters[]={env=~"prod|staging",user="xyz"}&query=<query> would automatically add {env=~"prod|staging",user="xyz"} label filters to the given <query>. This functionality can be used for limiting the scope of time series visible to the given tenant. It is expected that the extra_filters[] query args are automatically set by auth proxy sitting in front of VictoriaMetrics. See vmauth and vmgateway as examples of such proxies.
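
As a sketch of how an auth proxy might attach these args, the following Python snippet builds a URL-encoded query string containing both extra_label and extra_filters[] entries. The function name enforced_query_args and the sample query up are assumptions for illustration; only the arg names come from the text above.

```python
from urllib.parse import parse_qs, urlencode

def enforced_query_args(query, extra_labels=None, extra_filters=None):
    """Build a query string with extra_label and extra_filters[] args."""
    args = [("query", query)]
    for name, value in (extra_labels or {}).items():
        # Each label filter is passed as extra_label=<name>=<value>.
        args.append(("extra_label", f"{name}={value}"))
    for selector in extra_filters or []:
        args.append(("extra_filters[]", selector))
    return urlencode(args)  # percent-encodes braces, quotes, '=' and '|'

qs = enforced_query_args(
    "up",
    extra_labels={"user_id": "123", "group_id": "456"},
    extra_filters=['{env=~"prod|staging",user="xyz"}'],
)
print("/api/v1/query_range?" + qs)
print(parse_qs(qs)["extra_label"])  # ['user_id=123', 'group_id=456']
```

Encoding the selector matters in practice, since characters such as {, " and | are not safe to send raw in a URL.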

VictoriaMetrics accepts multiple formats for time, start and end query args - see these docs.

VictoriaMetrics accepts round_digits query arg for /api/v1/query and /api/v1/query_range handlers. It can be used for rounding response values to the given number of digits after the decimal point. For example, /api/v1/query?query=avg_over_time(temperature[1h])&round_digits=2 would round response values to up to two digits after the decimal point.

VictoriaMetrics accepts limit query arg for /api/v1/labels and /api/v1/label/<labelName>/values handlers for limiting the number of returned entries. For example, the query to /api/v1/labels?limit=5 returns a sample of up to 5 unique labels, while ignoring the rest of labels. If the provided limit value exceeds the corresponding -search.maxTagKeys / -search.maxTagValues command-line flag values, then limits specified in the command-line flags are used.

By default, VictoriaMetrics returns time series for the last day starting at 00:00 UTC from /api/v1/series, /api/v1/labels and /api/v1/label/<labelName>/values, while the Prometheus API defaults to all time. Explicitly set start and end to select the desired time range. VictoriaMetrics rounds the specified start..end time range to day granularity because of performance optimization concerns. If you need the exact set of label names and label values on the given time range, then send queries to /api/v1/query or to /api/v1/query_range.

VictoriaMetrics accepts limit query arg at /api/v1/series for limiting the number of returned entries. For example, the query to /api/v1/series?limit=5 returns a sample of up to 5 series, while ignoring the rest of series. If the provided limit value exceeds the corresponding -search.maxSeries command-line flag values, then limits specified in the command-line flags are used.

VictoriaMetrics returns an extra object stats in JSON response for /api/v1/query and /api/v1/query_range APIs. This object contains two fields: executionTimeMsec with number of milliseconds the request took and seriesFetched with number of series that were fetched from database before filtering. The seriesFetched field is effectively used by vmalert for detecting misconfigured rule expressions. Please note, seriesFetched provides approximate number of series, it is not recommended to rely on it in tests.
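
A client could read the stats object roughly as follows. This is a minimal sketch: the sample response body is hypothetical and trimmed, and the helper name query_stats is not part of any VictoriaMetrics client. The values are passed through int() so that the sketch works whether the fields arrive as JSON numbers or as numeric strings.

```python
import json

def query_stats(body: str) -> tuple:
    """Return (series_fetched, execution_time_msec) from a query response body."""
    stats = json.loads(body).get("stats", {})
    # int() accepts both JSON numbers and numeric strings.
    return int(stats.get("seriesFetched", 0)), int(stats.get("executionTimeMsec", 0))

# Hypothetical response body, trimmed to the relevant fields.
sample = (
    '{"status":"success","data":{"resultType":"vector","result":[]},'
    '"stats":{"seriesFetched":"0","executionTimeMsec":3}}'
)
fetched, msec = query_stats(sample)
if fetched == 0:
    # An empty result with zero fetched series may point to a selector
    # that matches nothing, for example a misspelled metric name.
    print(f"no series fetched in {msec} ms; check the expression")
```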

Additionally, VictoriaMetrics provides the following handlers:

/vmui - Basic Web UI. See these docs.
/api/v1/series/count - returns the total number of time series in the database. Some notes:
- the handler scans all IndexDBs entirely, so it can be slow if the database contains tens of millions of time series;
- it can return an inflated value if the same time series is stored in more than one IndexDB;
- the handler may count deleted time series in addition to normal time series due to internal implementation restrictions.
/api/v1/status/active_queries - returns the list of currently running queries. This list is also available at active queries page at VMUI.
/api/v1/status/top_queries - returns the following query lists:
- the most frequently executed queries - topByCount
- queries with the biggest average execution duration - topByAvgDuration
- queries that took the most time for execution - topBySumDuration
The number of returned queries can be limited via topN query arg. Old queries can be filtered out with maxLifetime query arg. For example, request to /api/v1/status/top_queries?topN=5&maxLifetime=30s would return up to 5 queries per list, which were executed during the last 30 seconds. VictoriaMetrics tracks the last -search.queryStats.lastQueriesCount queries with durations at least -search.queryStats.minQueryDuration.

See also top queries page at VMUI.
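
The topN and maxLifetime args described above can be assembled as in the following Python sketch. The helper names are hypothetical; the list names are the ones given in this section, and the structure of individual entries is deliberately not assumed.

```python
from urllib.parse import urlencode

# List names returned by /api/v1/status/top_queries, as listed above.
TOP_LISTS = ("topByCount", "topByAvgDuration", "topBySumDuration")

def top_queries_url(base_url: str, top_n: int, max_lifetime: str) -> str:
    """Build the top_queries URL with topN and maxLifetime query args."""
    args = urlencode({"topN": top_n, "maxLifetime": max_lifetime})
    return f"{base_url.rstrip('/')}/api/v1/status/top_queries?{args}"

def split_top_lists(response: dict) -> dict:
    """Pick the documented lists out of a decoded JSON response."""
    return {name: response.get(name, []) for name in TOP_LISTS}

print(top_queries_url("http://victoriametrics:8428", 5, "30s"))
# http://victoriametrics:8428/api/v1/status/top_queries?topN=5&maxLifetime=30s
```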

Timestamp formats

VictoriaMetrics accepts the following formats for time, start and end query args in query APIs and in export APIs.

Unix timestamps in seconds with optional milliseconds after the point. For example, 1562529662.678.
Unix timestamps in milliseconds. For example, 1562529662678.
Unix timestamps in microseconds. For example, 1562529662678901.
Unix timestamps in nanoseconds. For example, 1562529662678901234.
RFC3339. For example, 2022-03-29T01:02:03Z or 2022-03-29T01:02:03+02:30.
Partial RFC3339. Examples: 2022, 2022-03, 2022-03-29, 2022-03-29T01, 2022-03-29T01:02, 2022-03-29T01:02:03. The partial RFC3339 time is in the local timezone of the host where VictoriaMetrics runs. It is possible to specify the needed timezone by adding a Z (UTC), +hh:mm or -hh:mm suffix to the partial time. For example, 2022-03-01Z corresponds to the given date in the UTC timezone, while 2022-03-01+06:30 corresponds to the 2022-03-01 date in the +06:30 timezone.
Relative duration compared to the current time. For example, 1h5m, -1h5m or now-1h5m means one hour and five minutes ago, while now means now.
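
To show how the Unix timestamp variants above can coexist, here is a minimal Python sketch that normalizes them to milliseconds. It distinguishes seconds, milliseconds, microseconds and nanoseconds by magnitude; that heuristic and its 2**32 cut-off are assumptions of this sketch, not a description of how VictoriaMetrics parses timestamps. Partial RFC3339 values and relative durations are not covered.

```python
from datetime import datetime
from decimal import Decimal

# Assumed cut-off: values below 2**32 are treated as seconds.
_SECONDS_LIMIT = Decimal(2**32)

def unix_to_millis(value: str) -> int:
    """Normalize a Unix timestamp in s, ms, us or ns to whole milliseconds."""
    d = Decimal(value)
    if d >= _SECONDS_LIMIT * 10**6:    # nanoseconds
        d /= 10**6
    elif d >= _SECONDS_LIMIT * 10**3:  # microseconds
        d /= 10**3
    elif d < _SECONDS_LIMIT:           # seconds, possibly fractional
        d *= 1000
    return int(d)                      # otherwise already milliseconds

def rfc3339_to_millis(value: str) -> int:
    """Convert a full RFC3339 timestamp to Unix milliseconds."""
    # Older Python versions reject a trailing "Z" in fromisoformat().
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return int(datetime.fromisoformat(value).timestamp() * 1000)

for ts in ("1562529662.678", "1562529662678", "1562529662678901", "1562529662678901234"):
    print(unix_to_millis(ts))  # 1562529662678 each time
```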

How to build from sources

We recommend using either binary releases or docker images (Docker Hub and Quay) instead of building VictoriaMetrics from sources. Building from sources is reasonable when developing additional features specific to your needs or when testing bugfixes.

Development build

Install Go.
Run make victoria-metrics from the root folder of the repository. It builds victoria-metrics binary and puts it into the bin folder.

Production build

Install docker.
Run make victoria-metrics-prod from the root folder of the repository. It builds victoria-metrics-prod binary and puts it into the bin folder.

ARM build

ARM build may run on Raspberry Pi or on energy-efficient ARM servers.

Development ARM build

Install Go.
Run make victoria-metrics-linux-arm or make victoria-metrics-linux-arm64 from the root folder of the repository. It builds victoria-metrics-linux-arm or victoria-metrics-linux-arm64 binary respectively and puts it into the bin folder.

Production ARM build

Install docker.
Run make victoria-metrics-linux-arm-prod or make victoria-metrics-linux-arm64-prod from the root folder of the repository. It builds victoria-metrics-linux-arm-prod or victoria-metrics-linux-arm64-prod binary respectively and puts it into the bin folder.

Pure Go build (CGO_ENABLED=0)

Pure Go mode builds only Go code without cgo dependencies.

Install Go.
Run make victoria-metrics-pure from the root folder of the repository. It builds victoria-metrics-pure binary and puts it into the bin folder.

Building docker images

Run make package-victoria-metrics. It builds victoriametrics/victoria-metrics:<PKG_TAG> docker image locally. <PKG_TAG> is auto-generated image tag, which depends on source code in the repository. The <PKG_TAG> may be manually set via PKG_TAG=foobar make package-victoria-metrics.

The base docker image is alpine but it is possible to use any other base image by setting it via <ROOT_IMAGE> environment variable. For example, the following command builds the image on top of scratch image:

ROOT_IMAGE=scratch make package-victoria-metrics

Building VictoriaMetrics with Podman

VictoriaMetrics can be built with Podman in either rootful or rootless mode.

When building via rootful Podman, simply add DOCKER=podman to the relevant make command line. To build via rootless Podman, add DOCKER=podman DOCKER_RUN="podman run --userns=keep-id" to the make command line.

For example: make victoria-metrics-pure DOCKER=podman DOCKER_RUN="podman run --userns=keep-id"

Note that production builds are not supported via Podman because Podman does not support buildx.

How to work with snapshots

Create snapshot

Send a request to http://<victoriametrics-addr>:8428/snapshot/create endpoint in order to create an instant snapshot. The page returns the following JSON response on successful creation of snapshot:

{"status":"ok","snapshot":"<snapshot-name>"}

Snapshots are created under <-storageDataPath>/snapshots directory, where <-storageDataPath> is the corresponding command-line flag value. Snapshots can be archived to backup storage at any time with vmbackup.

Snapshots consist of a mix of hard-links and soft-links to various files and directories inside -storageDataPath. See this article for more details. This adds some restrictions on what can be done with the contents of <-storageDataPath>/snapshots directory:

Do not delete subdirectories inside <-storageDataPath>/snapshots with rm or similar commands, since this will leave some snapshot data undeleted. Prefer using the /snapshot/delete API for deleting snapshot. See below for more details about this API.
Do not copy subdirectories inside <-storageDataPath>/snapshots with cp, rsync or similar commands, since there is a high chance that these commands won't copy some data stored in the snapshot. Prefer using vmbackup for making copies of snapshot data.

Delete snapshot

Send a query to http://<victoriametrics-addr>:8428/snapshot/delete?snapshot=<snapshot-name> in order to delete the snapshot with <snapshot-name> name.

Navigate to http://<victoriametrics-addr>:8428/snapshot/delete_all in order to delete all the snapshots.
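
A script that drives the create and delete endpoints needs to extract the snapshot name from the creation response and pass it to the delete endpoint. The following Python sketch shows that step only, without sending any requests. The helper names and the sample snapshot name are hypothetical.

```python
import json
from urllib.parse import urlencode

def snapshot_name(create_response_body: str) -> str:
    """Extract the snapshot name from the /snapshot/create JSON response."""
    payload = json.loads(create_response_body)
    if payload.get("status") != "ok":
        raise RuntimeError(f"snapshot creation failed: {payload}")
    return payload["snapshot"]

def delete_snapshot_url(addr: str, name: str) -> str:
    """Build the /snapshot/delete URL for the given snapshot name."""
    return f"http://{addr}:8428/snapshot/delete?" + urlencode({"snapshot": name})

# Hypothetical response body with a made-up snapshot name.
body = '{"status":"ok","snapshot":"example-snapshot"}'
print(delete_snapshot_url("localhost", snapshot_name(body)))
# http://localhost:8428/snapshot/delete?snapshot=example-snapshot
```

Checking the status field before using the name avoids issuing a delete request with a missing or empty snapshot name.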

How to restore from a snapshot

Stop VictoriaMetrics with kill -INT.
Restore snapshot contents from backup with vmrestore to the directory pointed by -storageDataPath.
Start VictoriaMetrics.

Snapshot troubleshooting

A snapshot doesn't occupy disk space just after its creation thanks to the approach used. Old snapshots may start occupying additional disk space if they refer to old parts, which were already deleted during background merge. That's why it is recommended to delete old snapshots once they are no longer needed in order to free up the disk space they use. This can be done either manually or automatically if the -snapshotsMaxAge command-line flag is set. Make sure that the backup process has enough time to complete when setting the -snapshotsMaxAge command-line flag.

VictoriaMetrics exposes the current number of available snapshots via vm_snapshots metric at /metrics page.

How to delete time series

Send a request to http://<victoriametrics-addr>:8428/api/v1/admin/tsdb/delete_series?match[]=<timeseries_selector_for_delete>, where <timeseries_selector_for_delete> may contain any time series selector for metrics to delete. Delete API doesn't support the deletion of specific time ranges, the series can only be deleted completely. Storage space for the deleted time series isn't freed instantly - it is freed during subsequent background merges of data files.

Note that background merges may never occur for data from previous months, so storage space won't be freed for historical data. In this case a forced merge may help free up storage space.

It is recommended to verify which metrics will be deleted by calling http://<victoriametrics-addr>:8428/api/v1/series?match[]=<timeseries_selector_for_delete> before actually deleting the metrics. By default, this query only scans series from the past 5 minutes, so you may need to adjust start and end to a suitable range to get matches. Also, if the number of returned time series is rather big, you will need to set the -search.maxDeleteSeries flag (see Resource usage limits).

The /api/v1/admin/tsdb/delete_series handler may be protected with authKey if the -deleteAuthKey command-line flag is set. Note that the handler accepts any HTTP method, so sending a GET request to /api/v1/admin/tsdb/delete_series will result in deletion of time series.
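The verify-then-delete flow described above could be scripted along these lines. This is a minimal Python sketch: the function names and the `vm.example:8428` address mentioned in the comment are illustrative assumptions, and only the two endpoint paths and the `match[]`, `start` and `end` args come from the text above.

```python
from urllib.parse import urlencode


def series_preview_url(addr: str, selector: str, start: str, end: str) -> str:
    """URL that lists the series matched by `selector` on [start, end]."""
    query = urlencode({"match[]": selector, "start": start, "end": end})
    return f"http://{addr}/api/v1/series?{query}"


def delete_series_url(addr: str, selector: str) -> str:
    """URL that deletes every series matched by `selector`."""
    query = urlencode({"match[]": selector})
    return f"http://{addr}/api/v1/admin/tsdb/delete_series?{query}"


# Hypothetical usage: review the output of the first URL before requesting the second.
# print(series_preview_url("vm.example:8428", '{job="test"}', "1654543486", "1654543548"))
# print(delete_series_url("vm.example:8428", '{job="test"}'))
```

Building the URLs with `urlencode` keeps braces, quotes and brackets in the selector correctly escaped. If -deleteAuthKey is set, the authKey has to be supplied with the delete request as well.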

The delete API is intended mainly for the following cases:

One-off deleting of accidentally written invalid (or undesired) time series.
One-off deleting of user data due to GDPR.

Using the delete API is not recommended in the following cases, since it brings a non-zero overhead:

Regular cleanups for unneeded data. Just prevent writing unneeded data into VictoriaMetrics. This can be done with relabeling. See this article for details.
Reducing disk space usage by deleting unneeded time series. This doesn't work as expected, since the deleted time series occupy disk space until the next merge operation, which may never occur when deleting very old data. Forced merge may be used for freeing up disk space occupied by old data. Note that VictoriaMetrics doesn't delete entries from IndexDB for the deleted time series. IndexDB is cleaned up along with the corresponding data partition once it falls outside the -retentionPeriod.

It's better to use the -retentionPeriod command-line flag for efficient pruning of old data.

Forced merge

VictoriaMetrics performs data compactions in background in order to keep good performance characteristics when accepting new data. These compactions (merges) are performed independently on per-month partitions. This means that compactions are stopped for per-month partitions if no new data is ingested into these partitions. Sometimes it is necessary to trigger compactions for old partitions, for instance in order to free up disk space occupied by deleted time series. In this case forced compaction may be initiated on the specified per-month partition by sending a request to /internal/force_merge?partition_prefix=YYYY_MM, where YYYY_MM is the per-month partition name. For example, http://victoriametrics:8428/internal/force_merge?partition_prefix=2020_08 would initiate a forced merge for the August 2020 partition. The call to /internal/force_merge returns immediately, while the corresponding forced merge continues running in background.

Forced merges may require additional CPU, disk IO and storage space resources. It is unnecessary to run forced merge under normal conditions, since VictoriaMetrics automatically performs optimal merges in background when new data is ingested into it.
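As a small illustration of the YYYY_MM partition naming, the forced-merge URL for a given month could be derived from a date as follows. This is a sketch only; the helper name is hypothetical and the host name is the one used in the example above.

```python
from datetime import date


def force_merge_url(addr: str, day: date) -> str:
    """Build the forced-merge URL for the per-month partition that contains `day`."""
    partition = f"{day.year:04d}_{day.month:02d}"  # YYYY_MM
    return f"http://{addr}/internal/force_merge?partition_prefix={partition}"


print(force_merge_url("victoriametrics:8428", date(2020, 8, 15)))
```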

How to export time series

VictoriaMetrics provides the following handlers for exporting data:

/api/v1/export for exporting data in JSON line format. See these docs for details.
/api/v1/export/csv for exporting data in CSV. See these docs for details.
/api/v1/export/native for exporting data in native binary format. This is the most efficient format for data export. See these docs for details.

How to export data in JSON line format

Send a request to http://<victoriametrics-addr>:8428/api/v1/export?match[]=<timeseries_selector_for_export>, where <timeseries_selector_for_export> may contain any time series selector for metrics to export. Use {__name__!=""} selector for fetching all the time series.

The response would contain all the data for the selected time series in JSON line format - see these docs for details on this format.

Each JSON line contains samples for a single time series. An example output:

{"metric":{"__name__":"up","job":"node_exporter","instance":"localhost:9100"},"values":[0,0,0],"timestamps":[1549891472010,1549891487724,1549891503438]}
{"metric":{"__name__":"up","job":"prometheus","instance":"localhost:9090"},"values":[1,1,1],"timestamps":[1549891461511,1549891476511,1549891491511]}
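Output in this shape can be consumed line by line. The following Python sketch (the function name is an assumption, not part of any VictoriaMetrics client) pairs each timestamp with the value at the same position and yields one sample at a time:

```python
import json
from typing import Dict, Iterator, Tuple


def iter_samples(jsonl: str) -> Iterator[Tuple[Dict[str, str], int, float]]:
    """Yield (labels, timestamp_ms, value) for every sample in exported JSON lines."""
    for line in jsonl.splitlines():
        if not line.strip():
            continue  # tolerate blank lines between series
        series = json.loads(line)
        # Values and timestamps are parallel arrays of equal length.
        for ts, value in zip(series["timestamps"], series["values"]):
            yield series["metric"], ts, value
```

Because each line is parsed independently, the same loop works when samples for one time series are spread across several lines.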

Optional start and end args may be added to the request in order to limit the time frame for the exported data. See allowed formats for these args.

For example:

curl http://<victoriametrics-addr>:8428/api/v1/export -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
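When the time range comes from application code, one way to avoid timezone ambiguity is to send Unix seconds computed from timezone-aware values. A minimal Python sketch, with hypothetical helper names:

```python
from datetime import datetime, timezone


def unix_seconds(dt: datetime) -> int:
    """Convert a timezone-aware datetime to Unix seconds for the start/end args."""
    if dt.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime to avoid ambiguity")
    return int(dt.timestamp())


def time_range_args(start: datetime, end: datetime) -> dict:
    """Return start/end query args as Unix-second strings."""
    return {"start": str(unix_seconds(start)), "end": str(unix_seconds(end))}


print(time_range_args(
    datetime(2022, 6, 6, 19, 25, 48, tzinfo=timezone.utc),
    datetime(2022, 6, 6, 19, 29, 7, tzinfo=timezone.utc),
))
```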

Optional max_rows_per_line arg may be added to the request for limiting the maximum number of rows exported per each JSON line. Optional reduce_mem_usage=1 arg may be added to the request for reducing memory usage when exporting a big number of time series. In this case the output may contain multiple lines with samples for the same time series.

Pass the Accept-Encoding: gzip HTTP header in the request to /api/v1/export in order to reduce network bandwidth when exporting big amounts of time series data. This enables gzip compression for the exported data. Example for exporting gzipped data:

curl -H 'Accept-Encoding: gzip' http://localhost:8428/api/v1/export -d 'match[]={__name__!=""}' > data.jsonl.gz
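Since curl is not asked to decompress the body here, the saved file is expected to hold gzip-compressed bytes. A sketch of reading such a file back in Python follows; the function name is an assumption, and the whole payload is decompressed in memory, which only suits moderately sized exports.

```python
import gzip
import json
from typing import List


def parse_gzipped_jsonl(data: bytes) -> List[dict]:
    """Decompress gzip bytes and parse each non-empty line as a JSON object."""
    text = gzip.decompress(data).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


# Hypothetical usage with the file written by the curl command above:
# with open("data.jsonl.gz", "rb") as f:
#     for series in parse_gzipped_jsonl(f.read()):
#         print(series["metric"], len(series["values"]))
```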

The maximum duration for each request to /api/v1/export is limited by -search.maxExportDuration command-line flag.

Exported data can be imported via POST'ing it to /api/v1/import.

By default, data exported via /api/v1/export is deduplicated according to -dedup.minScrapeInterval setting. Pass GET param reduce_mem_usage=1 in export request to disable deduplication for recently written data. After background merges deduplication becomes permanent.

How to export CSV data

Send a request to http://<victoriametrics-addr>:8428/api/v1/export/csv?format=<format>&match=<timeseries_selector_for_export>, where:

<format> must contain comma-delimited label names for the exported CSV. The following special label names are supported:
- __name__ - metric name
- __value__ - sample value
- __timestamp__:<ts_format> - sample timestamp. <ts_format> can have the following values:
  - unix_s - unix seconds
  - unix_ms - unix milliseconds
  - unix_ns - unix nanoseconds
  - rfc3339 - RFC3339 time (in the timezone of the server)
  - custom:<layout> - custom layout for time that is supported by time.Format function from Go.
<timeseries_selector_for_export> may contain any time series selector for metrics to export.

Optional start and end args may be added to the request in order to limit the time frame for the exported data. See allowed formats for these args.

For example:

curl http://<victoriametrics-addr>:8428/api/v1/export/csv -d 'format=<format>' -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export/csv -d 'format=<format>' -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'
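To make the `<format>` argument concrete, the sketch below assembles a query string from a list of label names plus the special `__name__`, `__value__` and `__timestamp__:<ts_format>` columns. The function name and the column order are choices made for this example only.

```python
from urllib.parse import urlencode

TS_FORMATS = {"unix_s", "unix_ms", "unix_ns", "rfc3339"}


def csv_export_query(labels, selector, ts_format="unix_s"):
    """Build the query string for /api/v1/export/csv.

    Ordinary labels come first, followed by metric name, value and timestamp.
    """
    if ts_format not in TS_FORMATS and not ts_format.startswith("custom:"):
        raise ValueError(f"unsupported timestamp format: {ts_format}")
    columns = list(labels) + ["__name__", "__value__", f"__timestamp__:{ts_format}"]
    return urlencode({"format": ",".join(columns), "match[]": selector})


print(csv_export_query(["job", "instance"], "up"))
```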

The exported CSV data can be imported to VictoriaMetrics via /api/v1/import/csv. The first line of the file is a header row derived from the format parameter.

Deduplication is applied to the data exported in CSV by default. It is possible to export raw data without deduplication by passing reduce_mem_usage=1 query arg to /api/v1/export/csv.

How to export data in native format

Send a request to http://<victoriametrics-addr>:8428/api/v1/export/native?match[]=<timeseries_selector_for_export>, where <timeseries_selector_for_export> may contain any time series selector for metrics to export. Use {__name__=~".*"} selector for fetching all the time series.

On large databases you may hit the limit on the number of time series that can be exported. In this case you need to adjust the -search.maxExportSeries command-line flag:

# count unique time series in database
wget -O- -q 'http://your_victoriametrics_instance:8428/api/v1/series/count' | jq '.data[0]'

# relaunch VictoriaMetrics with -search.maxExportSeries set higher than the value returned by the previous command
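The same calculation can be done programmatically. The Python sketch below reads the count from the first element of `data`, as the jq expression above does, and adds some headroom; the 20% default and the function name are assumptions for illustration, not recommendations from VictoriaMetrics.

```python
import json


def suggested_max_export_series(series_count_response: str, headroom_percent: int = 20) -> int:
    """Derive a -search.maxExportSeries value from an /api/v1/series/count response."""
    count = int(json.loads(series_count_response)["data"][0])
    # Integer arithmetic keeps the result exact; +1 guarantees it exceeds the count.
    return count * (100 + headroom_percent) // 100 + 1
```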

Optional start and end args may be added to the request in order to limit the time frame for the exported data. See allowed formats for these args.

For example:

curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/api/v1/export/native -d 'match[]=<timeseries_selector_for_export>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'

The exported data can be imported to VictoriaMetrics via /api/v1/import/native. The native export format may change in an incompatible way between VictoriaMetrics releases, so data exported from release X can fail to be imported into VictoriaMetrics release Y.

Deduplication isn't applied to the data exported in native format. It is expected that deduplication is performed during data import.

How to import time series data

VictoriaMetrics can discover and scrape metrics from Prometheus-compatible targets (aka "pull" protocol) - see these docs. Additionally, VictoriaMetrics can accept metrics via the following popular data ingestion protocols (aka "push" protocols):

Prometheus remote_write API. See these docs for details.
DataDog submit metrics API. See these docs for details.
InfluxDB line protocol. See these docs for details.
Graphite plaintext protocol. See these docs for details.
OpenTelemetry http API. See these docs for details.
OpenTSDB telnet put protocol. See these docs for details.
OpenTSDB http /api/put protocol. See these docs for details.
/api/v1/import for importing data obtained from /api/v1/export. See these docs for details.
/api/v1/import/native for importing data obtained from /api/v1/export/native. See these docs for details.
/api/v1/import/csv for importing arbitrary CSV data. See these docs for details.
/api/v1/import/prometheus for importing data in Prometheus exposition format and in Pushgateway format. See these docs for details.

Please note that most of the ingestion APIs (except Prometheus remote_write API, OpenTelemetry and Influx Line Protocol) are optimized for performance and process data in a streaming fashion. This means that a client can transfer an unlimited amount of data through the open connection. Because of this, import APIs may not return parsing errors to the client, since the data stream is expected not to be interrupted. Instead, look for parsing errors on the server side (VictoriaMetrics single-node or vminsert) or check for changes in the vm_rows_invalid_total metric (exported by the server side).
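One way to watch vm_rows_invalid_total is to read the /metrics page before and after an import and compare the totals. The Python sketch below sums a metric across all of its label sets; it assumes each line has the form `name{labels} value` without a trailing timestamp, and the function name is hypothetical.

```python
def sum_metric(metrics_text: str, name: str) -> float:
    """Sum all samples of `name` (with any labels) in Prometheus text exposition."""
    total = 0.0
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and HELP/TYPE comments
        # The metric name ends at '{' (labels) or at whitespace (no labels).
        metric = line.split("{", 1)[0].split()[0]
        if metric == name:
            total += float(line.rsplit(None, 1)[-1])
    return total
```

A growing difference between two readings suggests that some of the submitted rows could not be parsed.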

How to import data in JSON line format

VictoriaMetrics accepts metrics data in JSON line format at /api/v1/import endpoint. See these docs for details on this format.

Example for importing data obtained via /api/v1/export:

# Export the data from <source-victoriametrics>:
curl http://source-victoriametrics:8428/api/v1/export -d 'match={__name__!=""}' > exported_data.jsonl

# Import the data to <destination-victoriametrics>:
curl -X POST http://destination-victoriametrics:8428/api/v1/import -T exported_data.jsonl

Pass Content-Encoding: gzip HTTP request header to /api/v1/import for importing gzipped data:

# Export gzipped data from <source-victoriametrics>:
curl -H 'Accept-Encoding: gzip' http://source-victoriametrics:8428/api/v1/export -d 'match={__name__!=""}' > exported_data.jsonl.gz

# Import gzipped data to <destination-victoriametrics>:
curl -X POST -H 'Content-Encoding: gzip' http://destination-victoriametrics:8428/api/v1/import -T exported_data.jsonl.gz

Extra labels may be added to all the imported time series by passing extra_label=name=value query args. For example, /api/v1/import?extra_label=foo=bar would add "foo":"bar" label to all the imported time series.

Note that it may be necessary to flush the response cache after importing historical data. See these docs for details.

VictoriaMetrics parses input JSON lines one-by-one. It loads the whole JSON line in memory, then parses it and then saves the parsed samples into persistent storage. This means that VictoriaMetrics can occupy big amounts of RAM when importing too long JSON lines. The solution is to split too long JSON lines into shorter lines. It is OK if samples for a single time series are split among multiple JSON lines. JSON line length can be limited via max_rows_per_line query arg when exporting via /api/v1/export.

The maximum JSON line length, which can be parsed by VictoriaMetrics, is limited by -import.maxLineLen command-line flag value.
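If the data was exported without max_rows_per_line, overly long lines can also be split before import. The following Python sketch shows one way to do that; the function name is an assumption.

```python
import json
from typing import List


def split_json_line(line: str, max_rows: int) -> List[str]:
    """Split one JSON line into lines holding at most `max_rows` samples each."""
    if max_rows < 1:
        raise ValueError("max_rows must be positive")
    series = json.loads(line)
    values, timestamps = series["values"], series["timestamps"]
    out = []
    for i in range(0, len(values), max_rows):
        chunk = {
            "metric": series["metric"],  # labels are repeated on every line
            "values": values[i:i + max_rows],
            "timestamps": timestamps[i:i + max_rows],
        }
        out.append(json.dumps(chunk, separators=(",", ":")))
    return out
```

Each resulting line carries the full label set, so the lines can be imported in any order or in separate requests.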

How to import data in native format

The specification of the VictoriaMetrics native format may still change and is not formally documented yet, so we currently do not recommend that external clients pack their own metrics into native format files.

However, if you have a native format file obtained via /api/v1/export/native, this is the most efficient protocol for importing data.

# Export the data from <source-victoriametrics>:
curl http://source-victoriametrics:8428/api/v1/export/native -d 'match={__name__!=""}' > exported_data.bin

# Import the data to <destination-victoriametrics>:
curl -X POST http://destination-victoriametrics:8428/api/v1/import/native -T exported_data.bin

Extra labels may be added to all the imported time series by passing extra_label=name=value query args. For example, /api/v1/import/native?extra_label=foo=bar would add "foo":"bar" label to all the imported time series.

Note that it may be necessary to flush the response cache after importing historical data. See these docs for details.

How to import CSV data

Arbitrary CSV data can be imported via /api/v1/import/csv. The CSV data is imported according to the provided format query arg. The format query arg must contain comma-separated list of parsing rules for CSV fields. Each rule consists of three parts delimited by a colon:

<column_pos>:<type>:<context>

<column_pos> is the position of the CSV column (field). Column numbering starts from 1. The order of parsing rules may be arbitrary.
<type> describes the column type. Supported types are:
- metric - the corresponding CSV column at <column_pos> contains metric value, which must be integer or floating-point number. The metric name is read from the <context>. CSV line must have at least a single metric field. Multiple metric fields per CSV line is OK.
- label - the corresponding CSV column at <column_pos> contains label value. The label name is read from the <context>. CSV line may have arbitrary number of label fields. All these labels are attached to all the configured metrics.
- time - the corresponding CSV column at <column_pos> contains metric time. CSV line may contain either one or zero columns with time. If CSV line has no time, then the current time is used. The time is applied to all the configured metrics. The format of the time is configured via <context>. Supported time formats are:
  - unix_s - unix timestamp in seconds.
  - unix_ms - unix timestamp in milliseconds.
  - unix_ns - unix timestamp in nanoseconds. Note that VictoriaMetrics rounds the timestamp to milliseconds.
  - rfc3339 - timestamp in RFC3339 format, i.e. 2006-01-02T15:04:05Z.
  - custom:<layout> - custom layout for the timestamp. The <layout> may contain arbitrary time layout according to time.Parse rules in Go.

The first row is treated as a header but can be skipped if any time or metric column contains a non-numeric value.

Each request to /api/v1/import/csv may contain arbitrary number of CSV lines.

Example for importing CSV data via /api/v1/import/csv:

# Import via POST data:
curl -d "GOOG,1.23,4.56,NYSE" 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market'
curl -d "MSFT,3.21,1.67,NASDAQ" 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market'

# Import via file upload:
curl -X POST 'http://localhost:8428/api/v1/import/csv?format=2:metric:ask,3:metric:bid,1:label:ticker,4:label:market' -T exported_data.csv

After that the data may be read via /api/v1/export endpoint:

curl -G 'http://localhost:8428/api/v1/export' -d 'match[]={ticker!=""}'

The following response should be returned:

{"metric":{"__name__":"bid","market":"NASDAQ","ticker":"MSFT"},"values":[1.67],"timestamps":[1583865146520]}
{"metric":{"__name__":"bid","market":"NYSE","ticker":"GOOG"},"values":[4.56],"timestamps":[1583865146495]}
{"metric":{"__name__":"ask","market":"NASDAQ","ticker":"MSFT"},"values":[3.21],"timestamps":[1583865146520]}
{"metric":{"__name__":"ask","market":"NYSE","ticker":"GOOG"},"values":[1.23],"timestamps":[1583865146495]}
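The mapping from the format rules to this response can be reproduced offline. The Python sketch below applies `<column_pos>:<type>:<context>` rules to a single CSV line; it is a simplified model rather than the VictoriaMetrics parser, it does not interpret time values, and it would mis-split a custom time layout that contains a comma.

```python
import csv
import io


def parse_csv_line(line: str, fmt: str):
    """Apply '<column_pos>:<type>:<context>' rules to one CSV line.

    Returns (labels, metrics, time_rule); time_rule is (raw_value, time_format)
    or None when the format has no time column.
    """
    fields = next(csv.reader(io.StringIO(line)))
    labels, metrics, time_rule = {}, {}, None
    for rule in fmt.split(","):
        pos, kind, context = rule.split(":", 2)
        value = fields[int(pos) - 1]  # column numbering starts from 1
        if kind == "metric":
            metrics[context] = float(value)
        elif kind == "label":
            labels[context] = value
        elif kind == "time":
            time_rule = (value, context)
        else:
            raise ValueError(f"unknown column type: {kind}")
    if not metrics:
        raise ValueError("at least one metric column is required")
    return labels, metrics, time_rule


print(parse_csv_line("GOOG,1.23,4.56,NYSE",
                     "2:metric:ask,3:metric:bid,1:label:ticker,4:label:market"))
```

Every metric in the result shares the same labels, which is why both `ask` and `bid` carry the `ticker` and `market` labels in the response.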

Extra labels may be added to all the imported lines by passing extra_label=name=value query args. For example, /api/v1/import/csv?extra_label=foo=bar would add "foo":"bar" label to all the imported lines.

Note that it may be necessary to flush the response cache after importing historical data. See these docs for details.

How to import data in Prometheus exposition format

VictoriaMetrics accepts data in Prometheus exposition format, in OpenMetrics format and in Pushgateway format via /api/v1/import/prometheus path.

For example, the following command imports a single line in Prometheus exposition format into VictoriaMetrics:

curl -d 'foo{bar="baz"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus'

The following command may be used for verifying the imported data:

curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__=~"foo"}'

It should return something like the following:

{"metric":{"__name__":"foo","bar":"baz"},"values":[123],"timestamps":[1594370496905]}

The following command imports a single metric via Pushgateway format with {job="my_app",instance="host123"} labels:

curl -d 'metric{label="abc"} 123' -X POST 'http://localhost:8428/api/v1/import/prometheus/metrics/job/my_app/instance/host123'

Pass Content-Encoding: gzip HTTP request header to /api/v1/import/prometheus for importing gzipped data:

# Import gzipped data to <destination-victoriametrics>:
curl -X POST -H 'Content-Encoding: gzip' http://destination-victoriametrics:8428/api/v1/import/prometheus -T prometheus_data.gz

Extra labels may be added to all the imported metrics either via Pushgateway format or by passing extra_label=name=value query args. For example, /api/v1/import/prometheus?extra_label=foo=bar would add {foo="bar"} label to all the imported metrics.

If timestamp is missing in <metric> <value> <timestamp> Prometheus exposition format line, then the current timestamp is used during data ingestion. It can be overridden by passing unix timestamp in milliseconds via timestamp query arg. For example, /api/v1/import/prometheus?timestamp=1594370496905.
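For clients that generate such lines themselves, a small formatter may help. This Python sketch emits `<metric> <value>` with an optional millisecond timestamp and escapes backslashes, double quotes and newlines in label values, as the Prometheus text format requires; the function name is hypothetical.

```python
def exposition_line(name, labels, value, timestamp_ms=None):
    """Format one sample as a Prometheus text exposition line."""
    def escape(v):
        return v.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")

    label_str = ",".join(f'{k}="{escape(v)}"' for k, v in sorted(labels.items()))
    line = f"{name}{{{label_str}}}" if label_str else name
    line += f" {value}"
    if timestamp_ms is not None:
        line += f" {timestamp_ms}"  # omitted timestamps are filled in at ingestion
    return line


print(exposition_line("foo", {"bar": "baz"}, 123))
print(exposition_line("foo", {"bar": "baz"}, 123, 1594370496905))
```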

VictoriaMetrics accepts arbitrary number of lines in a single request to /api/v1/import/prometheus, i.e. it supports data streaming.

Note that it may be necessary to flush the response cache after importing historical data. See these docs for details.

VictoriaMetrics also may scrape Prometheus targets - see these docs.

JSON line format

VictoriaMetrics accepts data in JSON line format at /api/v1/import and exports data in this format at /api/v1/export.

The format follows the JSON streaming concept, i.e. each line contains a JSON object with metrics data in the following format:

{
  // metric contains metric name plus labels for a particular time series
  "metric":{
    "__name__": "metric_name",  // <- this is metric name

    // Other labels for the time series

    "label1": "value1",
    "label2": "value2",
    ...
    "labelN": "valueN"
  },

  // values contains raw sample values for the given time series
  "values": [1, 2.345, -678],

  // timestamps contains raw sample UNIX timestamps in milliseconds for the given time series
  // every timestamp is associated with the value at the corresponding position
  "timestamps": [1549891472010,1549891487724,1549891503438]
}

Note that every JSON object must be written in a single line, i.e. all the newline chars must be removed from it. The /api/v1/import handler doesn't accept JSON lines longer than the value passed to the -import.maxLineLen command-line flag (by default this is 10MB).

It is recommended to pass 1K-10K samples per line for achieving the maximum data ingestion performance at /api/v1/import. Too long JSON lines may increase RAM usage at the VictoriaMetrics side.

/api/v1/export handler accepts max_rows_per_line query arg, which allows limiting the number of samples per each exported line.

It is OK to split raw samples for the same time series across multiple lines.

The number of lines in the request to /api/v1/import can be arbitrary - they are imported in streaming manner.
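A producer of this format mainly has to keep `values` and `timestamps` aligned and emit each object on one line. A minimal Python sketch, with a hypothetical function name:

```python
import json


def to_json_line(metric: dict, values: list, timestamps_ms: list) -> str:
    """Serialize one time series as a single JSON line for /api/v1/import."""
    if len(values) != len(timestamps_ms):
        raise ValueError("values and timestamps must have the same length")
    obj = {"metric": metric, "values": values, "timestamps": timestamps_ms}
    # Compact separators keep the line short; json.dumps escapes any newline
    # inside label values, so the result never spans multiple lines.
    return json.dumps(obj, separators=(",", ":"))


print(to_json_line({"__name__": "up", "job": "x"}, [1, 0], [1549891472010, 1549891487724]))
```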

Relabeling

VictoriaMetrics supports Prometheus-compatible relabeling for all the ingested metrics if the -relabelConfig command-line flag points to a file containing a list of relabel_config entries. The -relabelConfig flag can also point to an http or https URL. For example, -relabelConfig=https://config-server/relabel_config.yml.

The -relabelConfig files can contain special placeholders in the form %{ENV_VAR}, which are replaced by the corresponding environment variable values.

Example contents for -relabelConfig file:

# Add {cluster="dev"} label.
- target_label: cluster
  replacement: dev

# Drop the metric (or scrape target) with `{__meta_kubernetes_pod_container_init="true"}` label.
- action: drop
  source_labels: [__meta_kubernetes_pod_container_init]
  regex: true

VictoriaMetrics provides additional relabeling features such as Graphite-style relabeling. See these docs for more details.

The relabeling can be debugged at http://victoriametrics:8428/metric-relabel-debug page or at our public demo playground. See these docs for more details.

Federation

VictoriaMetrics exports Prometheus-compatible federation data at http://<victoriametrics-addr>:8428/federate?match[]=<timeseries_selector_for_federation>.

Optional start and end args may be added to the request in order to scrape the last point for each selected time series on the [start ... end] interval. See allowed formats for these args.

For example:

curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_federation>' -d 'start=1654543486' -d 'end=1654543486'
curl http://<victoriametrics-addr>:8428/federate -d 'match[]=<timeseries_selector_for_federation>' -d 'start=2022-06-06T19:25:48' -d 'end=2022-06-06T19:29:07'

By default, the last point on the interval [now - max_lookback ... now] is scraped for each time series. The default value for max_lookback is 5m (5 minutes), but it can be overridden with max_lookback query arg. For instance, /federate?match[]=up&max_lookback=1h would return last points on the [now - 1h ... now] interval. This may be useful for time series federation with scrape intervals exceeding 5m.
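The effect of max_lookback can be modelled with a few lines of Python. The sketch below is a conceptual model of the behavior described above, not VictoriaMetrics code; the function name and the data layout (series name mapped to `(timestamp_ms, value)` pairs) are assumptions.

```python
def last_points(series: dict, now_ms: int, max_lookback_ms: int = 5 * 60 * 1000) -> dict:
    """Pick the last sample per series on [now - max_lookback ... now].

    Series without samples on the interval are omitted from the result.
    """
    start = now_ms - max_lookback_ms
    result = {}
    for name, samples in series.items():
        in_window = [s for s in samples if start <= s[0] <= now_ms]
        if in_window:
            result[name] = max(in_window, key=lambda s: s[0])
    return result


data = {"fast": [(540_000, 1.0), (590_000, 2.0)], "slow": [(100_000, 7.0)]}
print(last_points(data, now_ms=600_000))                             # default 5m lookback
print(last_points(data, now_ms=600_000, max_lookback_ms=3_600_000))  # 1h lookback
```

With the default window the rarely scraped `slow` series is missing, while the wider window picks it up, which is the situation max_lookback is meant to address.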

Capacity planning

VictoriaMetrics uses lower amounts of CPU, RAM and storage space on production workloads compared to competing solutions (Prometheus, Thanos, Cortex, TimescaleDB, InfluxDB, QuestDB, M3DB) according to our case studies.

VictoriaMetrics capacity scales linearly with the available resources. The needed amounts of CPU and RAM depend heavily on the workload - the number of active time series, series churn rate, query types, query qps, etc. It is recommended to set up a test VictoriaMetrics for your production workload and iteratively scale CPU and RAM resources until it becomes stable according to troubleshooting docs. A single-node VictoriaMetrics works perfectly with the following production workload according to our case studies:

Ingestion rate: 1.5+ million samples per second
Active time series: 50+ million
Total time series: 5+ billion
Time series churn rate: 150+ million of new series per day
Total number of samples: 10+ trillion
Queries: 200+ qps
Query latency (99th percentile): 1 second

The needed storage space for the given retention (the retention is set via -retentionPeriod command-line flag) can be extrapolated from disk space usage in a test run. For example, if -storageDataPath directory size becomes 10GB after a day-long test run on a production workload, then it will need at least 10GB*100=1TB of disk space for -retentionPeriod=100d (100-days retention period).

It is recommended to leave the following amounts of spare resources:

50% of free RAM for reducing the probability of OOM (out of memory) crashes. Leaving less than 50% of RAM free may cause cache evictions, excessive I/O and overall slowdown (see #9895-comment for more details).
50% of spare CPU for reducing the probability of slowdowns during temporary spikes in workload.
At least 20% of free storage space at the directory pointed by -storageDataPath command-line flag. See also -storage.minFreeDiskSpaceBytes command-line flag description.
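The storage extrapolation and the free-space recommendation can be combined into one calculation. This Python sketch assumes linear growth, as the example above does, and treats the 20% figure as a fraction of the total disk size; the function name is hypothetical.

```python
def required_disk_gb(daily_growth_gb: float, retention_days: int, free_fraction: float = 0.2) -> float:
    """Extrapolate disk size from one day of growth, keeping `free_fraction` unused."""
    if not 0 <= free_fraction < 1:
        raise ValueError("free_fraction must be in [0, 1)")
    data_gb = daily_growth_gb * retention_days  # e.g. 10GB * 100 days = 1000GB
    return data_gb / (1 - free_fraction)


print(required_disk_gb(10, 100, free_fraction=0.0))  # data only
print(required_disk_gb(10, 100))                     # data plus 20% free space
```

For the 10GB-per-day example this comes to roughly 1.25TB of disk once the spare space is included.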

Resource usage limits

By default, VictoriaMetrics is tuned for an optimal resource usage under typical workloads. Some workloads may need fine-grained resource usage limits. In these cases the following command-line flags may be useful:

-maxIngestionRate limits samples/second ingested. This may be useful when CPU resources are limited or overloaded.
-memory.allowedPercent and -memory.allowedBytes limit the amounts of memory, which may be used for various internal caches at VictoriaMetrics. Note that VictoriaMetrics may use more memory, since these flags don't limit additional memory, which may be needed on a per-query basis.
-search.maxMemoryPerQuery limits the amounts of memory, which can be used for processing a single query. Queries, which need more memory, are rejected. Heavy queries, which select big number of time series, may exceed the per-query memory limit by a small percent. The total memory limit for concurrently executed queries can be estimated as -search.maxMemoryPerQuery multiplied by -search.maxConcurrentRequests.
-search.maxUniqueTimeseries limits the number of unique time series a single query can find and process. By default, VictoriaMetrics calculates the limit automatically based on the available memory and the maximum number of concurrent requests it can process (see -search.maxConcurrentRequests). VictoriaMetrics keeps in memory some metainformation about the time series located by each query and spends some CPU time for processing the found time series. This means that the maximum memory usage and CPU usage a single query can use is proportional to -search.maxUniqueTimeseries.
-search.maxQueryDuration limits the duration of a single query. If the query takes longer than the given duration, then it is canceled. This allows saving CPU and RAM when executing unexpected heavy queries. The limit can be overridden to a smaller value by passing timeout GET parameter.
-search.maxConcurrentRequests limits the number of concurrent requests VictoriaMetrics can process. A bigger number of concurrent requests usually means bigger memory usage. For example, if a single query needs 100 MiB of additional memory during its execution, then 100 concurrent queries may need 100 * 100 MiB = 10,000 MiB (roughly 10 GiB) of additional memory. So it is better to limit the number of concurrent queries, while pausing additional incoming queries if the concurrency limit is reached. VictoriaMetrics provides -search.maxQueueDuration command-line flag for limiting the max wait time for paused queries. See also -search.maxMemoryPerQuery command-line flag.
-search.maxQueueDuration limits the maximum duration queries may wait for execution when -search.maxConcurrentRequests concurrent queries are executed.
-search.ignoreExtraFiltersAtLabelsAPI enables ignoring of match[], extra_filters[] and extra_label query args at /api/v1/labels and /api/v1/label/.../values. This may be useful for reducing the load on VictoriaMetrics if the provided extra filters match too many time series. The downside is that the endpoints can return labels and series, which do not match the provided extra filters.
-search.maxSamplesPerSeries limits the number of raw samples the query can process per each time series. VictoriaMetrics sequentially processes raw samples per each found time series during the query. It unpacks raw samples on the selected time range per each time series into memory and then applies the given rollup function. The -search.maxSamplesPerSeries command-line flag allows limiting memory usage in the case when the query is executed on a time range, which contains hundreds of millions of raw samples per each located time series.
-search.maxSamplesPerQuery limits the number of raw samples a single query can process. This allows limiting CPU usage for heavy queries.
-search.maxResponseSeries limits the number of time series a single query can return from /api/v1/query and /api/v1/query_range.
-search.maxPointsPerTimeseries limits the number of calculated points, which can be returned per each matching time series from range query.
-search.maxPointsSubqueryPerTimeseries limits the number of calculated points, which can be generated per each matching time series during subquery evaluation.
-search.maxSeriesPerAggrFunc limits the number of time series, which can be generated by MetricsQL aggregate functions in a single query.
-search.maxSeries limits the number of time series, which may be returned from /api/v1/series. This endpoint is used mostly by Grafana for auto-completion of metric names, label names and label values. Queries to this endpoint may take big amounts of CPU time and memory when the database contains big number of unique time series because of high churn rate. In this case it might be useful to set the -search.maxSeries to quite low value in order to limit CPU and memory usage. See also -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries.
-search.maxDeleteSeries limits the number of unique time series that can be deleted by a single /api/v1/admin/tsdb/delete_series call. The duration is limited via the -search.maxDeleteDuration flag (available from v1.110.0). Deleting too many time series may require a big amount of CPU and memory, and this limit guards against unplanned resource usage spikes. See also the How to delete time series section to learn about different ways of deleting series.
-search.maxTSDBStatusTopNSeries at vmselect limits the number of unique time series that can be queried with topN argument by a single /api/v1/status/tsdb?topN=N call.
-search.maxTagKeys limits the number of items that may be returned from /api/v1/labels. This endpoint is used mostly by Grafana for auto-completion of label names. Queries to this endpoint may take large amounts of CPU time and memory when the database contains a large number of unique time series because of a high churn rate. In this case it might be useful to set -search.maxTagKeys to a fairly low value in order to limit CPU and memory usage. See also -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries.
-search.maxTagValues limits the number of items that may be returned from /api/v1/label/.../values. This endpoint is used mostly by Grafana for auto-completion of label values. Queries to this endpoint may take large amounts of CPU time and memory when the database contains a large number of unique time series because of a high churn rate. In this case it might be useful to set -search.maxTagValues to a fairly low value in order to limit CPU and memory usage. See also -search.maxLabelsAPIDuration and -search.maxLabelsAPISeries.
-search.maxLabelsAPISeries limits the number of time series that can be scanned when performing /api/v1/labels or /api/v1/label/.../values requests. These endpoints are used mostly by Grafana for auto-completion of label names and label values. Queries to these endpoints may take large amounts of CPU time and memory when the database contains a large number of unique time series because of a high churn rate. In this case it might be useful to set -search.maxLabelsAPISeries to a fairly low value in order to limit CPU and memory usage. See also -search.maxLabelsAPIDuration and -search.ignoreExtraFiltersAtLabelsAPI.
-search.maxLabelsAPIDuration limits the duration for requests to /api/v1/labels, /api/v1/label/.../values or /api/v1/series. The limit can be overridden to a smaller value by passing the timeout GET parameter. These endpoints are used mostly by Grafana for auto-completion of label names and label values. Queries to these endpoints may take large amounts of CPU time and memory when the database contains a large number of unique time series because of a high churn rate. In this case it might be useful to set -search.maxLabelsAPIDuration to a fairly low value in order to limit CPU and memory usage. See also -search.maxLabelsAPISeries and -search.ignoreExtraFiltersAtLabelsAPI.
-search.maxTagValueSuffixesPerSearch limits the number of entries, which may be returned from /metrics/find endpoint. See Graphite Metrics API usage docs.
-search.maxFederateSeries limits the maximum number of time series that can be returned via the /federate API. The duration of /federate queries is limited via the -search.maxQueryDuration flag. This option allows limiting memory usage.
-search.maxExportSeries limits the maximum number of time series that can be returned from /api/v1/export* APIs. The duration of export queries is limited via the -search.maxExportDuration flag. This option allows limiting memory usage.
-search.maxTSDBStatusSeries limits the maximum number of time series that can be processed during a call to /api/v1/status/tsdb. The duration of status queries is limited via the -search.maxStatusRequestDuration flag. This option allows limiting memory usage.

High availability

The general approach for achieving high availability is the following:

Run two identically configured VictoriaMetrics instances in distinct datacenters (availability zones).
Store the collected data simultaneously into these instances via vmagent or Prometheus.
Query the first VictoriaMetrics instance and fail over to the second instance when the first instance becomes temporarily unavailable. This can be done via vmauth according to these docs.

Such a setup guarantees that the collected data isn't lost when one of the VictoriaMetrics instances becomes unavailable. The collected data continues to be written to the available VictoriaMetrics instance, so it remains available for querying. Both vmagent and Prometheus buffer the collected data locally if they cannot send it to the configured remote storage, so the buffered data is written to the temporarily unavailable VictoriaMetrics instance after it becomes available again.
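The query-side failover is normally handled by vmauth. Purely as an illustration of the idea, the following minimal Python sketch performs the same failover on the client side. The function names, the `fetch` parameter and the host names are hypothetical; the only VictoriaMetrics-specific detail is the Prometheus-compatible /api/v1/query endpoint.

```python
import json
import urllib.error
import urllib.parse
import urllib.request
from typing import Callable, Sequence


def http_get_json(url: str, timeout: float = 5.0) -> dict:
    """Fetch the URL and decode the JSON response body."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def query_with_failover(
    base_urls: Sequence[str],
    promql: str,
    fetch: Callable[[str], dict] = http_get_json,
) -> dict:
    """Query the instances in order and return the first successful response."""
    last_error = None
    for base in base_urls:
        url = f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": promql})
        try:
            return fetch(url)
        except (urllib.error.URLError, OSError, ValueError) as err:
            # The instance is unreachable or returned a broken response:
            # remember the error and try the next instance.
            last_error = err
    raise RuntimeError("all the instances are unavailable") from last_error
```

For example, `query_with_failover(["http://vm-az1:8428", "http://vm-az2:8428"], "up")` would query vm-az1 first and fall back to vm-az2 only if the first request fails.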

If you use vmagent for storing the data into VictoriaMetrics, then it can be configured with multiple -remoteWrite.url command-line flags, where every flag points to the VictoriaMetrics instance in a particular availability zone, in order to replicate the collected data to all the VictoriaMetrics instances. For example, the following command instructs vmagent to replicate data to vm-az1 and vm-az2 instances of VictoriaMetrics:

/path/to/vmagent \
  -remoteWrite.url=http://<vm-az1>:8428/api/v1/write \
  -remoteWrite.url=http://<vm-az2>:8428/api/v1/write

If you use Prometheus for collecting and writing the data to VictoriaMetrics, then the following remote_write section in Prometheus config can be used for replicating the collected data to vm-az1 and vm-az2 VictoriaMetrics instances:

remote_write:
  - url: http://<vm-az1>:8428/api/v1/write
  - url: http://<vm-az2>:8428/api/v1/write

It is recommended to use vmagent instead of Prometheus for highly loaded setups, since it uses lower amounts of RAM, CPU and network bandwidth than Prometheus.

If you use identically configured vmagent instances for collecting the same data and sending it to VictoriaMetrics, then do not forget to enable deduplication on the VictoriaMetrics side.

See victoria-metrics-distributed chart for an example.

Deduplication

VictoriaMetrics leaves a single raw sample with the biggest timestamp for each time series per each -dedup.minScrapeInterval discrete interval if -dedup.minScrapeInterval is set to positive duration. For example, -dedup.minScrapeInterval=60s would leave a single raw sample with the biggest timestamp per each discrete 60s interval. This aligns with the staleness rules in Prometheus.

If multiple raw samples have the same timestamp on the given -dedup.minScrapeInterval discrete interval, then the sample with the biggest value is kept. Numerical values are preferred over stale markers.
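The rules above can be sketched in a few lines of Python. This is an illustrative model and not the actual VictoriaMetrics implementation: the function names are hypothetical, a stale marker is represented by `None`, and the discrete intervals are assumed to be aligned to multiples of the interval (the exact boundary handling in VictoriaMetrics may differ).

```python
from typing import Optional

# (timestamp in milliseconds, value); None stands in for a stale marker (assumption).
Sample = tuple[int, Optional[float]]


def _is_better(candidate: Sample, current: Sample) -> bool:
    """Return True if candidate must replace current within the same interval."""
    cand_ts, cand_val = candidate
    cur_ts, cur_val = current
    if cand_ts != cur_ts:
        return cand_ts > cur_ts      # the biggest timestamp wins
    if cur_val is None:
        return cand_val is not None  # numerical values are preferred over stale markers
    if cand_val is None:
        return False
    return cand_val > cur_val        # the biggest value wins on equal timestamps


def deduplicate(samples: list[Sample], interval_ms: int) -> list[Sample]:
    """Leave a single sample per discrete interval for one time series."""
    if interval_ms <= 0:
        return sorted(samples, key=lambda s: s[0])  # deduplication is disabled
    best: dict[int, Sample] = {}
    for sample in samples:
        bucket = sample[0] // interval_ms  # assumed interval alignment
        current = best.get(bucket)
        if current is None or _is_better(sample, current):
            best[bucket] = sample
    return [best[bucket] for bucket in sorted(best)]
```

The function handles a single time series; since only samples with identical labels are deduplicated, it would be applied per series.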

Please note, labels of raw samples should be identical in order to be deduplicated. For example, this is why HA pair of vmagents needs to be identically configured.

The -dedup.minScrapeInterval=D is equivalent to -downsampling.period=0s:D in downsampling. It is also safe to use deduplication and downsampling simultaneously.

The recommended value for -dedup.minScrapeInterval is the scrape_interval value from Prometheus configs. It is recommended to have a single scrape_interval across all the scrape targets. See this article for details.

The de-duplication reduces disk space usage if multiple identically configured vmagent or Prometheus instances in HA pair write data to the same VictoriaMetrics instance. These vmagent or Prometheus instances must have identical external_labels section in their configs, so they write data to the same time series. See also how to set up multiple vmagent instances for scraping the same targets. Note that de-duplication doesn't reduce the indexdb size - see why IndexDB size is so large?.

It is recommended to pass different -promscrape.cluster.name values to each distinct HA pair of vmagent instances, so the de-duplication consistently leaves samples for one vmagent instance and removes duplicate samples from other vmagent instances. See these docs for details.

VictoriaMetrics stores all the ingested samples to disk even if -dedup.minScrapeInterval command-line flag is set. The ingested samples are de-duplicated during background merges and during query execution. VictoriaMetrics also supports de-duplication during data ingestion before the data is stored to disk, via -streamAggr.dedupInterval command-line flag - see these docs.

Metrics Metadata

Single-node VictoriaMetrics can store metric metadata (TYPE, HELP, UNIT) {{% available_from "v1.130.0" %}}. Metadata ingestion and querying are enabled by default{{% available_from "v1.137.0" %}}. To disable them, set -enableMetadata=false.

The metadata is cached in-memory in a ring buffer and can use up to 1% of available memory by default (see -storage.maxMetadataStorageSize cmd-line flag). When in-memory size is exceeded, the least updated entries are dropped first. Entries that weren't updated for 1h are cleaned up automatically.

The following expression helps to detect when more than 90% of the metadata cache capacity is used: vm_metrics_metadata_storage_size_bytes / vm_metrics_metadata_storage_max_size_bytes > 0.9. Set up monitoring and the recommended alerting rules to get notified about cache capacity issues.

Metadata is ingested independently from metrics, so a metric can exist without metadata, and vice versa. Metadata is expected to be ephemeral and constantly updated on ingestion. For this reason, metadata cache isn't persisted during restarts.

Metadata can be queried via the /api/v1/metadata endpoint, which provides a response compatible with the Prometheus metadata API. See /api/v1/metadata example.

Storage

VictoriaMetrics buffers the ingested data in memory for up to a second. Then the buffered data is written to in-memory parts, which can be searched during queries. The in-memory parts are periodically persisted to disk, so they could survive unclean shutdown such as out of memory crash, hardware power loss or SIGKILL signal. The interval for flushing the in-memory data to disk can be configured with the -inmemoryDataFlushInterval command-line flag (note that too short flush interval may significantly increase disk IO).

In-memory parts are persisted to disk into part directories under the <-storageDataPath>/data/small/YYYY_MM/ folder, where YYYY_MM is the month partition for the stored data. For example, 2022_11 is the partition for parts with raw samples from November 2022. Each partition directory contains parts.json file with the actual list of parts in the partition.

Every part directory contains metadata.json file with the following fields:

RowsCount - the number of raw samples stored in the part
BlocksCount - the number of blocks stored in the part (see details about blocks below)
MinTimestamp and MaxTimestamp - minimum and maximum timestamps across raw samples stored in the part
MinDedupInterval - the deduplication interval applied to the given part.
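As a small illustration, the fields listed above can be read with any JSON parser. The following Python sketch summarizes a part from the contents of its metadata.json file. The function name and the derived values are hypothetical, and the sketch assumes the file is a flat JSON object with numeric values for the fields it reads.

```python
import json


def summarize_part(metadata_json: str) -> dict:
    """Derive a few summary numbers from the contents of a part's metadata.json."""
    meta = json.loads(metadata_json)
    rows = meta["RowsCount"]
    blocks = meta["BlocksCount"]
    return {
        "rows": rows,
        "blocks": blocks,
        # Average number of raw samples per block; 0.0 for an empty part.
        "avg_rows_per_block": rows / blocks if blocks else 0.0,
        # Time range covered by the part, in the units used by the timestamps.
        "timestamp_span": meta["MaxTimestamp"] - meta["MinTimestamp"],
    }
```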

Each part consists of blocks sorted by internal time series id (aka TSID). Each block contains up to 8K raw samples, which belong to a single time series. Raw samples in each block are sorted by timestamp. Blocks for the same time series are sorted by the timestamp of the first sample. Timestamps and values for all the blocks are stored in compressed form in separate files under part directory - timestamps.bin and values.bin.
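The block layout described above can be modelled with the following Python sketch. It is a simplified illustration rather than the on-disk format: the function name is hypothetical, "8K" is assumed to mean 8192 samples, and compression is not modelled.

```python
MAX_SAMPLES_PER_BLOCK = 8192  # assumption: "8K" is read as 8192


def build_blocks(series: dict[int, list[tuple[int, float]]]) -> list[tuple[int, list[tuple[int, float]]]]:
    """Split per-series samples into blocks ordered by (TSID, first timestamp).

    series maps a TSID to its (timestamp, value) samples in arbitrary order.
    """
    blocks = []
    for tsid in sorted(series):           # blocks are sorted by TSID
        samples = sorted(series[tsid])    # samples in a block are sorted by timestamp
        for start in range(0, len(samples), MAX_SAMPLES_PER_BLOCK):
            # Consecutive chunks of a sorted series are ordered by first timestamp.
            blocks.append((tsid, samples[start:start + MAX_SAMPLES_PER_BLOCK]))
    return blocks
```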

The part directory also contains index.bin and metaindex.bin files - these files contain index for fast block lookups, which belong to the given TSID and cover the given time range.

Parts are periodically merged into bigger parts in background. The background merge provides the following benefits:

keeping the number of data files under control, so they don't exceed limits on open files
improved data compression, since bigger parts are usually compressed better than smaller parts
improved query speed, since queries over smaller number of parts are executed faster
various background maintenance tasks such as de-duplication, downsampling and freeing up disk space for the deleted time series are performed during the merge

Newly added parts either successfully appear in the storage or fail to appear. The newly added part is atomically registered in the parts.json file under the corresponding partition after it is fully written and fsynced to the storage. Thanks to this algorithm, storage never contains partially created parts, even if hardware power off occurs in the middle of writing the part to disk - such incompletely written parts are automatically deleted on the next VictoriaMetrics start. The same applies to merge process — parts are either fully merged into a new part or fail to merge, leaving the source parts untouched.
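The general write-fsync-register pattern behind this guarantee can be sketched in Python as follows. This is a generic illustration of the technique, not the VictoriaMetrics code: the function names are hypothetical, parts.json is modelled as a plain JSON list of part names, and fsync of the parent directories is omitted for brevity.

```python
import json
import os


def write_file_durably(path: str, data: bytes) -> None:
    """Write the file and make sure its contents reach the storage."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())


def register_part(partition_dir: str, part_name: str, payload: bytes) -> list[str]:
    """Fully write a part first and only then make it visible via parts.json."""
    part_dir = os.path.join(partition_dir, part_name)
    os.makedirs(part_dir)
    write_file_durably(os.path.join(part_dir, "values.bin"), payload)

    parts_path = os.path.join(partition_dir, "parts.json")
    names: list[str] = []
    if os.path.exists(parts_path):
        with open(parts_path) as f:
            names = json.load(f)
    names.append(part_name)

    # Write the new list to a temporary file and atomically swap it in.
    # A crash before os.replace leaves the old parts.json intact, so the
    # unregistered part directory can be removed on the next start.
    tmp_path = parts_path + ".tmp"
    write_file_durably(tmp_path, json.dumps(names).encode())
    os.replace(tmp_path, parts_path)
    return names
```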

Hardware issues may cause data already stored on disk to become corrupted, regardless of the VictoriaMetrics process. VictoriaMetrics can detect corruption during reading, decompressing, decoding or sanity checking of the data blocks. The process intentionally panics when this happens, so a human operator can detect the corruption as fast as possible.

VictoriaMetrics cannot fix the corrupted data parts on its own. Data parts that fail to load on startup or during reads need to be deleted or restored from backups. It is recommended to perform regular backups.

VictoriaMetrics doesn't use checksums for stored data blocks. See why in this GitHub Issue.

VictoriaMetrics does not merge parts if their combined size exceeds the available free disk space. This behavior protects against potential "out of disk space" errors during merges. If there is not enough free disk space to perform merges, the number of parts may increase significantly over time. This increases query overhead, because VictoriaMetrics must read data from a larger number of parts for each request.
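The free-space rule above amounts to a simple comparison. The following Python sketch shows it under stated assumptions: the function and its parameters are hypothetical, and the real merge scheduler may take additional factors into account.

```python
import shutil
from typing import Optional, Sequence


def can_merge(part_sizes: Sequence[int], data_path: str = ".", free_bytes: Optional[int] = None) -> bool:
    """Allow the merge only if the combined size of the parts fits into the free disk space."""
    if free_bytes is None:
        # Free space on the filesystem that holds data_path.
        free_bytes = shutil.disk_usage(data_path).free
    return sum(part_sizes) <= free_bytes
```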

It is recommended to keep at least 20% of disk space free in the directory specified by the -storageDataPath command-line flag.

Information about merging process is available in the dashboard for single-node VictoriaMetrics and the dashboard for VictoriaMetrics cluster. See more details in monitoring docs.

See this article for more details.

See also how to work with snapshots and IndexDB.

IndexDB

VictoriaMetrics identifies time series by TSID (time series ID) and stores raw samples sorted by TSID (see Storage). Thus, the TSID is a primary index and could be used for searching and retrieving raw samples. However, the TSID is never exposed to the clients, i.e. it is for internal use only.

Instead, VictoriaMetrics maintains an inverted index (known as indexDB) that enables searching the raw samples by metric name, label name, and label value by mapping these values to the corresponding TSIDs. Every data partition has its own indexDB.

VictoriaMetrics uses two types of inverted indexes:

Global index. Searches using this index are performed across the entire partition time range.
Per-day index. This index stores mappings similar to ones in global index but also includes the date in each mapping. This speeds up data retrieval for queries within a shorter time range (which is often just the last day).

When the search query is executed, VictoriaMetrics decides which index to use based on the time range of the query:

Per-day index is used if the search time range is less than the partition time range.
Global index is used for search queries with a time range that matches or exceeds the partition time range.

Mappings are added to the indexes during the data ingestion:

In global index each mapping is created only once per partition.
In the per-day index each mapping is created for each unique date that has been seen in the samples for the corresponding time series.
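To get a feeling for how the two indexes grow, the following Python sketch counts the mappings that would be created for a stream of samples. It is a simplified model with hypothetical names: one mapping per series per monthly partition for the global index and one mapping per series per date for the per-day index. The real indexes store several kinds of entries per series.

```python
from datetime import datetime, timezone
from typing import Iterable


def count_index_mappings(samples: Iterable[tuple[str, int]]) -> tuple[int, int]:
    """Return (global_mappings, per_day_mappings) for (series, unix_seconds) samples."""
    global_entries = set()
    per_day_entries = set()
    for series, ts in samples:
        date = datetime.fromtimestamp(ts, tz=timezone.utc).date()
        # Global index: created only once per series per monthly partition.
        global_entries.add((date.year, date.month, series))
        # Per-day index: created for every date seen for the series.
        per_day_entries.add((date, series))
    return len(global_entries), len(per_day_entries)
```

In this model, two series with hourly samples over three days of the same month produce 2 global mappings and 6 per-day mappings.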

Since indexDB is a part of a partition, it is dropped along with the partition once it falls outside the retention period.

Index tuning for low churn rate

By default, VictoriaMetrics uses the following indexes for data retrieval: global and per-day. Both store the same data and on query time VictoriaMetrics can choose between indexes for optimal performance. See IndexDB for details.

If your use case involves high cardinality with high churn rate then this default setting should be ideal for you.

A prominent example is Kubernetes. Services in k8s expose a large number of series with a short lifetime, significantly increasing the churn rate. The per-day index speeds up data retrieval in this case.

But if your use case assumes low or no churn rate, then you might benefit from disabling the per-day index by setting the flag -disablePerDayIndex{{% available_from "v1.112.0" %}}. This will improve the time series ingestion speed and decrease disk space usage, since no time or disk space is spent maintaining the per-day index.

Example use cases:

Historical weather data, such as ERA5. It consists of millions of time series whose hourly values span tens of years. The time series set never changes. If the per-day index is disabled, then once the first hour of data is ingested, the entire time series set is written into the global index and subsequent portions of data do not result in index updates. But if the per-day index is enabled, the same set of time series is written to the per-day index for every day of data.
IoT: a huge set of sensors exports time series with the sensor ID used as a metric label value. Since sensor additions or removals happen infrequently, the time series churn rate will be low. With the per-day index disabled, the entire time series set will be registered in global index during the initial data ingestion and the global index will receive small updates when a sensor is added or removed.

What to expect:

Prefer setting this flag on fresh installations.
Disabling the per-day index on installations with historical data is OK.
Re-enabling the per-day index on installations with historical data will make that historical data unsearchable.

Retention

Retention is configured with the -retentionPeriod command-line flag, which takes a number followed by a time unit character - h(ours), d(ays), w(eeks), M(onths), y(ears). If the time unit is not specified, a month (31 days) is assumed. For instance, -retentionPeriod=3 means that the data will be stored for 3 months (93 days) and then deleted. The default retention period is one month: 1M (31 days). The minimum retention period is 24h or 1d.
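The unit rules above can be illustrated with a small parser. This Python sketch is not the parser used by VictoriaMetrics: the function name is hypothetical, a month is taken as 31 days as stated above, and a year is assumed to be 365 days.

```python
from datetime import timedelta

# Hours per unit; "M" is 31 days as documented, "y" as 365 days is an assumption.
_UNIT_HOURS = {"h": 1, "d": 24, "w": 7 * 24, "M": 31 * 24, "y": 365 * 24}


def parse_retention(value: str) -> timedelta:
    """Convert a -retentionPeriod style value such as '3', '2w' or '1y' to a duration."""
    unit = "M"  # a month is assumed when the unit is missing
    number = value
    if value and value[-1] in _UNIT_HOURS:
        unit = value[-1]
        number = value[:-1]
    period = timedelta(hours=float(number) * _UNIT_HOURS[unit])
    if period < timedelta(days=1):
        raise ValueError("the minimum retention period is 24h")
    return period
```

For instance, `parse_retention("3")` yields 93 days, matching the -retentionPeriod=3 example above.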

Data is split in per-month partitions inside <-storageDataPath>/data/{small,big} folders. Data partitions outside the configured retention are deleted on the first day of the new month. Each partition consists of one or more data parts. Data parts outside the configured retention are eventually deleted during background merge. The time range covered by data part is not limited by retention period unit. One data part can cover hours or days of data. Hence, a data part can be deleted only when fully outside the configured retention. See more about partitions and parts in the Storage section.

The maximum disk space usage for a given -retentionPeriod is going to be (-retentionPeriod + 1) months. For example, if -retentionPeriod is set to 1, data for January is deleted on March 1st.

It is safe to extend -retentionPeriod on existing data. If -retentionPeriod is set to a lower value than before, then data outside the configured period will be eventually deleted.

VictoriaMetrics does not support indefinite retention, but you can specify an arbitrarily high duration, e.g. -retentionPeriod=100y.

By default, VictoriaMetrics doesn't accept samples with timestamps bigger than now+2d, i.e. more than 2 days in the future. If you need to accept samples with bigger timestamps, then specify the desired "future retention" via the -futureRetention command-line flag. This flag accepts values starting from 2d.

For example, the following command starts VictoriaMetrics, which accepts samples with timestamps up to a year in the future:

/path/to/victoria-metrics -futureRetention=1y

Multiple retentions

Distinct retentions for distinct time series can be configured via retention filters in VictoriaMetrics Enterprise.

Community version of VictoriaMetrics supports only a single retention, which can be configured via -retentionPeriod command-line flag. If you need multiple retentions in community version of VictoriaMetrics, then you may start multiple VictoriaMetrics instances with distinct values for the following flags:

-retentionPeriod
-storageDataPath, so the data for each retention period is saved in a separate directory
-httpListenAddr, so clients may reach VictoriaMetrics instance with proper retention

Then set up vmauth in front of VictoriaMetrics instances, so it could route requests from particular user to VictoriaMetrics with the desired retention.

Similar scheme can be applied for multiple tenants in VictoriaMetrics cluster. See these docs for multi-retention setup details.

Retention filters

Enterprise version of VictoriaMetrics supports retention filters, which allow configuring multiple retentions for distinct sets of time series matching the configured series filters via the -retentionFilter command-line flag. This flag accepts filter:duration options, where filter must be a valid series filter, while duration must contain a valid retention for time series matching the given filter. The duration of the -retentionFilter must be lower than or equal to the -retentionPeriod flag value. If a series doesn't match any configured -retentionFilter, then the retention configured via the -retentionPeriod command-line flag is applied to it. If a series matches multiple configured retention filters, then the smallest retention is applied.

For example, the following config sets 3 days retention for time series with team="juniors" label, 30 days retention for time series with env="dev" or env="staging" label and 1 year retention for the remaining time series:

-retentionFilter='{team="juniors"}:3d' -retentionFilter='{env=~"dev|staging"}:30d' -retentionPeriod=1y
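The selection rule (the smallest retention among the matching filters, or -retentionPeriod when nothing matches) can be sketched in Python. The names below are hypothetical, series filters are modelled as plain predicate functions instead of real series selectors, and 1y is assumed to be 365 days.

```python
import re
from datetime import timedelta
from typing import Callable, Mapping, Sequence

Labels = Mapping[str, str]
RetentionFilter = tuple[Callable[[Labels], bool], timedelta]


def effective_retention(labels: Labels, filters: Sequence[RetentionFilter], default: timedelta) -> timedelta:
    """Return the smallest retention among the matching filters, or the default."""
    matching = [duration for matches, duration in filters if matches(labels)]
    return min(matching, default=default)


# Models the example above: {team="juniors"}:3d, {env=~"dev|staging"}:30d, -retentionPeriod=1y.
FILTERS: list[RetentionFilter] = [
    (lambda labels: labels.get("team") == "juniors", timedelta(days=3)),
    # Regex matchers are fully anchored, hence fullmatch.
    (lambda labels: re.fullmatch("dev|staging", labels.get("env", "")) is not None, timedelta(days=30)),
]
DEFAULT_RETENTION = timedelta(days=365)
```

A series with both team="juniors" and env="dev" labels matches two filters, so the smaller 3-day retention wins.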

There are two gauge metrics to monitor the retention filters process:

vm_retention_filters_partitions_scheduled shows the total number of partitions scheduled for retention filters
vm_retention_filters_partitions_scheduled_size_bytes shows the total size of scheduled partitions.

Additionally, a log message with the filter expression and the partition name is written to the log on the start and completion of the operation.

Important notes:

The data outside the configured retention isn't deleted instantly - it is deleted eventually during background merges.
The -retentionFilter doesn't remove old data from IndexDB until the configured -retentionPeriod. So the IndexDB size can grow big under high churn rate even for small retentions configured via -retentionFilter.

Retention filters configuration can be tested in the enterprise version of vmui on the page Tools.Retention filters debug. It is safe to update -retentionFilter during VictoriaMetrics restarts - the updated retention filters are applied eventually to historical data.

It's expected that resource usage will temporarily increase when -retentionFilter is applied. This is because additional operations are required to read the data, filter and apply retention to partitions, which will cost extra CPU and memory.

See how to configure multiple retentions in VictoriaMetrics cluster.

Downsampling

VictoriaMetrics Enterprise supports multi-level downsampling via -downsampling.period=offset:interval command-line flag. This command-line flag instructs leaving the last sample per each interval for time series samples older than the offset. The offset must be a multiple of interval. For example, -downsampling.period=30d:5m instructs leaving the last sample per each 5-minute interval for samples older than 30 days, while the rest of samples are dropped.

The -downsampling.period command-line flag can be specified multiple times in order to apply different downsampling levels for different time ranges (aka multi-level downsampling). For example, -downsampling.period=30d:5m,180d:1h instructs leaving the last sample per each 5-minute interval for samples older than 30 days, while leaving the last sample per each 1-hour interval for samples older than 180 days.
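The choice of the downsampling level by sample age can be sketched in Python as follows. The function name is hypothetical and the sketch covers only the level selection; it does not perform the downsampling itself.

```python
from datetime import timedelta
from typing import Optional, Sequence


def pick_interval(age: timedelta, levels: Sequence[tuple[timedelta, timedelta]]) -> Optional[timedelta]:
    """Return the interval of the biggest offset the sample is older than.

    levels holds (offset, interval) pairs; None means the sample isn't downsampled.
    """
    chosen = None
    for offset, interval in sorted(levels):  # ascending offsets
        if age > offset:
            chosen = interval
    return chosen


# Models -downsampling.period=30d:5m,180d:1h from the example above.
LEVELS = [
    (timedelta(days=30), timedelta(minutes=5)),
    (timedelta(days=180), timedelta(hours=1)),
]
```

With these levels, a 40-day-old sample falls under the 5-minute level, a 200-day-old sample falls under the 1-hour level and a 10-day-old sample is left untouched.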

VictoriaMetrics supports{{% available_from "v1.100.0" %}} configuring independent downsampling per different sets of time series via -downsampling.period=filter:offset:interval syntax. In this case the given offset:interval downsampling is applied only to time series matching the given filter. The filter can contain an arbitrary series filter. For example, -downsampling.period='{__name__=~"(node|process)_.*"}:1d:1m' instructs VictoriaMetrics to downsample samples older than one day with a one-minute interval only for time series with names starting with node_ or process_ prefixes. The downsampling for other time series can be configured independently via additional -downsampling.period command-line flags. Downsampling configuration can be tested in the enterprise version of vmui on the page Tools.Downsampling filters debug.

If the time series doesn't match any filter, then it isn't downsampled. If the time series matches multiple filters, then the downsampling for the first matching filter is applied. For example, -downsampling.period='{env="prod"}:1d:30s,{__name__=~"node_.*"}:1d:5m' downsamples samples older than one day with a 30-second interval across all the time series with the env="prod" label, even if their names start with the node_ prefix. All the other time series with names starting with the node_ prefix are downsampled with a 5-minute interval.
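Note that the rule here is "first matching filter wins", unlike retention filters, where the smallest retention wins. A minimal Python sketch of the rule, with hypothetical names and with series filters modelled as predicate functions:

```python
import re
from datetime import timedelta
from typing import Callable, Mapping, Optional, Sequence

Labels = Mapping[str, str]
Rule = tuple[Callable[[Labels], bool], timedelta, timedelta]  # (filter, offset, interval)


def downsampling_for(labels: Labels, rules: Sequence[Rule]) -> Optional[tuple[timedelta, timedelta]]:
    """Return (offset, interval) of the first matching rule, or None if nothing matches."""
    for matches, offset, interval in rules:
        if matches(labels):
            return offset, interval
    return None


# Models -downsampling.period='{env="prod"}:1d:30s,{__name__=~"node_.*"}:1d:5m'.
RULES: list[Rule] = [
    (lambda labels: labels.get("env") == "prod", timedelta(days=1), timedelta(seconds=30)),
    (lambda labels: re.fullmatch("node_.*", labels.get("__name__", "")) is not None,
     timedelta(days=1), timedelta(minutes=5)),
]
```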

If downsampling shouldn't be applied to some time series matching the given filter, then pass -downsampling.period=filter:0s:0s command-line flag to VictoriaMetrics. For example, if series with env="prod" label shouldn't be downsampled, then pass -downsampling.period='{env="prod"}:0s:0s' command-line flag in front of other -downsampling.period flags. But -downsampling.period=0s:interval or -downsampling.period=filter:0s:0s cannot be used with deduplication simultaneously as they could conflict.

Downsampling is applied independently to each time series and leaves a single raw sample with the biggest timestamp on the configured interval, in the same way as deduplication does. It works best for counters and histograms, as their values are always increasing. Downsampling gauges and summaries loses some changes within the downsampling interval, since only the last sample on the given interval is left and the rest of the samples are dropped.

You can use recording rules or streaming aggregation to apply custom aggregation functions, like min/max/avg etc., in order to make gauges more resilient to downsampling.

Downsampling can reduce disk space usage and improve query performance if it is applied to time series with a large number of samples per series. Downsampling doesn't improve query performance and doesn't reduce disk space usage if the database contains a large number of time series with a small number of samples per series, since downsampling doesn't reduce the number of time series. So there is little sense in applying downsampling to time series with a high churn rate. In this case the majority of query time is spent on searching for the matching time series instead of processing the found samples. See Why IndexDB size is so large?.

Downsampling is performed during background merges. It cannot be performed if there is not enough free disk space or if vmstorage is in read-only mode.

Downsampling period changes /api/v1/export API output. During query requests, if export start period is not specified and reduce_mem_usage param is omitted, the biggest downsampling.period is applied. As an example, export request /api/v1/export?match[]=series with -downsampling.period=30d:1h,180d:24h will return samples downsampled with 24h interval.

It's expected that resource usage will temporarily increase when downsampling with filters is applied. This is because additional operations are required to read historical data, downsample, and persist it back, which will cost extra CPU and memory.

Please note that intervals of -downsampling.period for a single filter must be multiples of each other. If deduplication is enabled, the value of the -dedup.minScrapeInterval command-line flag must also be a multiple of the -downsampling.period intervals. This is required to ensure consistency of deduplication and downsampling results.

It is safe to update -downsampling.period during VictoriaMetrics restarts - the updated downsampling configuration will be applied eventually to historical data during background merges.

See how to configure downsampling in VictoriaMetrics cluster.

Multi-tenancy

Single-node VictoriaMetrics doesn't support multi-tenancy. Use the cluster version instead.

Scalability and cluster version

Though single-node VictoriaMetrics cannot scale to multiple nodes, it is optimized for resource usage - storage size / bandwidth / IOPS, RAM, CPU. This means that a single-node VictoriaMetrics may scale vertically and substitute a moderately sized cluster built with competing solutions such as Thanos, Uber M3, InfluxDB or TimescaleDB. See vertical scalability benchmarks.

So try single-node VictoriaMetrics at first and then switch to the cluster version if you still need horizontally scalable long-term remote storage for really large Prometheus deployments. Contact us for enterprise support.

Alerting

It is recommended to use vmalert for alerting.

Additionally, alerting can be set up with the following tools:

With Prometheus - see the corresponding docs.
With Promxy - see the corresponding docs.
With Grafana - see the corresponding docs.

Security

Supported Versions

The following versions of VictoriaMetrics receive regular security fixes:

Version	Supported
Latest release	✅
LTS releases	✅
other releases	❌

Software Bill of Materials (SBOM)

Every VictoriaMetrics container{{% available_from "v1.137.0" %}} image published to Docker Hub and Quay.io includes an SPDX SBOM attestation generated automatically by BuildKit during docker buildx build.

To inspect the SBOM for an image:

docker buildx imagetools inspect \
  docker.io/victoriametrics/victoria-metrics:latest \
  --format "{{ json .SBOM }}"

To scan an image using its SBOM attestation with Trivy:

trivy image --sbom-sources oci \
  docker.io/victoriametrics/victoria-metrics:latest

Reporting a Vulnerability

Please report any security issues to security@victoriametrics.com

General security recommendations:

All the VictoriaMetrics components must run in protected private networks without direct access from untrusted networks such as Internet. The exception is vmauth and vmgateway, which are intended for serving public requests and performing authorization with TLS termination.
All the requests from untrusted networks to VictoriaMetrics components must go through auth proxy such as vmauth or vmgateway. The proxy must be set up with proper authentication and authorization.
Prefer using lists of allowed API endpoints, while disallowing access to other endpoints when configuring vmauth in front of VictoriaMetrics components.
Set reasonable Strict-Transport-Security header value to all the components to mitigate MitM attacks, for example: max-age=31536000; includeSubDomains. See -http.header.hsts flag.
Set reasonable Content-Security-Policy header value to mitigate XSS attacks. See -http.header.csp flag.
Set reasonable X-Frame-Options header value to mitigate clickjacking attacks, for example DENY. See -http.header.frameOptions flag.

VictoriaMetrics provides the following security-related command-line flags:

-tls, -tlsCertFile and -tlsKeyFile for switching from HTTP to HTTPS at -httpListenAddr (TCP port 8428 is used by default). Enterprise version of VictoriaMetrics supports automatic issuing of TLS certificates. See these docs.
-mtls and -mtlsCAFile for enabling mTLS for requests to -httpListenAddr. See these docs.
-httpAuth.username and -httpAuth.password for protecting all the HTTP endpoints with HTTP Basic Authentication.
-deleteAuthKey for protecting /api/v1/admin/tsdb/delete_series endpoint. See how to delete time series.
-snapshotAuthKey for protecting /snapshot* endpoints. See how to work with snapshots.
-forceFlushAuthKey for protecting /internal/force_flush endpoint. See these docs.
-forceMergeAuthKey for protecting /internal/force_merge endpoint. See force merge docs.
-search.resetCacheAuthKey for protecting /internal/resetRollupResultCache endpoint. See backfilling for more details.
-reloadAuthKey for protecting /-/reload endpoint, which is used for force reloading of -promscrape.config.
-configAuthKey for protecting /config endpoint, since it may contain sensitive information such as passwords.
-flagsAuthKey for protecting /flags endpoint.
-pprofAuthKey for protecting /debug/pprof/* endpoints, which can be used for profiling.
-metricNamesStatsResetAuthKey for protecting /api/v1/admin/status/metric_names_stats/reset endpoint, used for Metric Names Tracker.
-denyQueryTracing for disallowing query tracing.
-http.header.hsts, -http.header.csp, and -http.header.frameOptions for serving Strict-Transport-Security, Content-Security-Policy and X-Frame-Options HTTP response headers.

Explicitly set internal network interface for TCP and UDP ports for data ingestion with Graphite and OpenTSDB formats. For example, substitute -graphiteListenAddr=:2003 with -graphiteListenAddr=<internal_iface_ip>:2003. This protects from unexpected requests from untrusted network interfaces.

CVE handling policy

Source code: Go dependencies are scanned by govulncheck in CI. All vulnerabilities must be fixed before next scheduled release and backported to LTS releases.

Docker images: CVE findings in Alpine base image pose minimal risk since VictoriaMetrics binaries are statically compiled with no OS dependencies. When detected, only the Alpine base tag is updated. Releases proceed as planned even if upstream fixes are not yet available. For maximum security, hardened scratch-based images are also provided. All images are continuously scanned by Docker Hub and verified before release using grype.

mTLS protection

By default VictoriaMetrics accepts HTTP requests at port 8428 (this port can be changed via the -httpListenAddr command-line flag). Enterprise version of VictoriaMetrics supports the ability to accept mTLS requests at this port by specifying the -tls and -mtls command-line flags. For example, the following command runs VictoriaMetrics, which accepts only mTLS requests at port 8428:

./victoria-metrics -tls -mtls

By default, system-wide TLS Root CA is used for verifying client certificates if -mtls command-line flag is specified. It is possible to specify custom TLS Root CA via -mtlsCAFile command-line flag.

Automatic issuing of TLS certificates

All the VictoriaMetrics Enterprise components support automatic issuing of TLS certificates for public HTTPS server running at -httpListenAddr via Let's Encrypt service. The following command-line flags must be set in order to enable automatic issuing of TLS certificates:

-httpListenAddr must be set for listening TCP port 443. For example, -httpListenAddr=:443. This port must be accessible by the Let's Encrypt service.
-tls must be set in order to accept HTTPS requests at -httpListenAddr. Note that -tlsCertFile and -tlsKeyFile aren't needed when automatic TLS certificate issuing is enabled.
-tlsAutocertHosts must be set to comma-separated list of hosts, which can be reached via -httpListenAddr. TLS certificates are automatically issued for these hosts.
-tlsAutocertEmail must be set to contact email for the issued TLS certificates.
-tlsAutocertCacheDir may be set to the directory path for persisting the issued TLS certificates between VictoriaMetrics restarts. If this flag isn't set, then TLS certificates are re-issued on every restart.

This functionality can be evaluated for free according to these docs.

Tuning

There is no need to tune VictoriaMetrics - it uses reasonable defaults for command-line flags, which are automatically adjusted for the available CPU and RAM resources.
There is no need to tune the operating system - VictoriaMetrics is optimized for default OS settings. The only exception is increasing the limit on the number of open files in the OS. This recommendation isn't specific to VictoriaMetrics; it applies to any service that handles many HTTP connections and stores data on disk.
VictoriaMetrics is a write-heavy application and its performance depends on disk performance. So be careful with other applications or utilities (like fstrim) which could exhaust disk resources.
The recommended filesystem is ext4, the recommended persistent storage is persistent HDD-based disk on GCP, since it is protected from hardware failures via internal replication and it can be resized on the fly. If you plan to store more than 1TB of data on ext4 partition, then the following options are recommended to pass to mkfs.ext4:

mkfs.ext4 ... -O 64bit,huge_file,extent -T huge

Monitoring

VictoriaMetrics exports internal metrics in Prometheus exposition format at /metrics page. These metrics can be scraped via vmagent or any other Prometheus-compatible scraper.

Single-node VictoriaMetrics can self-scrape its metrics when -selfScrapeInterval command-line flag is set to duration greater than 0. For example, -selfScrapeInterval=10s scrapes /metrics page every 10 seconds.

See the list of official Grafana dashboards for VictoriaMetrics components.

Please follow the monitoring recommendations below:

Prefer giving distinct scrape job names per each component type. I.e. vmagent and vmalert should have corresponding job names.
Never use load balancer address for scraping metrics. All the monitored components should be scraped directly by their address.
Set up recommended alerts via vmalert or via Prometheus.
See currently running queries and their execution times at active queries page.
See queries that take the most time to execute at top queries page.

VictoriaMetrics Cloud provides built-in monitoring dashboards and automatic alerts when resource consumption is high or configured limits are approached, so you get notified before issues impact your workload. See the VictoriaMetrics Cloud documentation to get started.

VictoriaMetrics components do not expose metadata TYPE and HELP fields on /metrics page. Services like Google Cloud Managed Prometheus could require metadata to be present for scraping. In this case, pass the -metrics.exposeMetadata command-line flag to them. See these docs for details.

TSDB stats

VictoriaMetrics returns TSDB stats at /api/v1/status/tsdb page in the way similar to Prometheus - see these Prometheus docs. VictoriaMetrics accepts the following optional query args at /api/v1/status/tsdb page:

topN=N where N is the number of top entries to return in the response. By default, top 10 entries are returned.
date=YYYY-MM-DD where YYYY-MM-DD is the date for collecting the stats. By default, the stats are collected for the current day.
focusLabel=LABEL_NAME returns label values with the highest number of time series for the given LABEL_NAME in the seriesCountByFocusLabelValue list.
match[]=SELECTOR where SELECTOR is an arbitrary time series selector for series to take into account during stats calculation. By default all the series are taken into account.
extra_label=LABEL=VALUE. See these docs for more details.
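The query args above can be combined in a single request. The following Python sketch only builds the request URL; it doesn't send anything. The helper name tsdb_status_url, the base address http://localhost:8428 and the example argument values are assumptions made for illustration.

```python
from urllib.parse import urlencode


def tsdb_status_url(base_url, top_n=None, date=None, focus_label=None, match=None):
    """Build a /api/v1/status/tsdb URL from the optional query args."""
    params = []
    if top_n is not None:
        params.append(("topN", str(top_n)))
    if date is not None:
        params.append(("date", date))  # expected in YYYY-MM-DD form
    if focus_label is not None:
        params.append(("focusLabel", focus_label))
    if match is not None:
        params.append(("match[]", match))  # a time series selector
    url = base_url.rstrip("/") + "/api/v1/status/tsdb"
    # urlencode percent-encodes characters such as [ ] { } = and "
    return url + ("?" + urlencode(params) if params else "")


# Hypothetical values, chosen only to show the encoding.
url = tsdb_status_url(
    "http://localhost:8428",
    top_n=5,
    date="2024-01-31",
    focus_label="instance",
    match='{job="node_exporter"}',
)
print(url)
```

The selector is percent-encoded, so it can be passed safely to curl or any HTTP client.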

VictoriaMetrics provides UI on top of /api/v1/status/tsdb - see cardinality explorer docs.

VictoriaMetrics enhances Prometheus stats with requestsCount and lastRequestTimestamp for seriesCountByMetricName. These stats are added if tracking of metric names stats is configured.

Track ingested metrics usage

VictoriaMetrics can track statistics of fetched metric names during querying {{% available_from "v1.113.0" %}}. It tracks only metric names, as the number of names is usually limited (thousands) compared to time series (millions or billions). This feature is enabled by default and can be disabled via the -storage.trackMetricNamesStats=false flag on a single-node VictoriaMetrics or vmstorage.

During querying, VictoriaMetrics tracks how many times the requested metric name was fetched from the database and when it was last fetched. This makes it possible to identify metric names that were never queried or, for metrics queried only occasionally, to see when the last query happened.

The usage stats for a metric won't update in these two cases:

Querying a metric with non-matching filters. For example, querying for vm_log_messages_total{level!="info"} won't update usage stats for vm_log_messages_total if there is no {level!="info"} series yet.
The query response is fully cached in the rollup result cache.

To get metric names usage statistics, use the /prometheus/api/v1/status/metric_names_stats API endpoint for a single-node VictoriaMetrics (or at http://<vmselect>:8481/select/<accountID>/prometheus/api/v1/status/metric_names_stats in cluster version of VictoriaMetrics). It accepts the following query parameters:

limit - integer value to limit the number of metric names in response. By default, API returns 1000 records.
le - less than or equal, is an integer threshold for filtering metric names by their usage count in queries. For example, with ?le=1 API returns metric names that were queried <=1 times.
match_pattern - a regex pattern to match metric names. For example, ?match_pattern=vm_ will match any metric names with vm_ pattern, like vm_http_requests, max_vm_memory_available.

The API endpoint returns the following JSON response:

{
  "status": "success",
  "statsCollectedSince": 1737534094,
  "statsCollectedRecordsTotal": 2,
  "records": [
    {
      "metricName": "node_disk_writes_completed_total",
      "queryRequestsCount": 50,
      "lastRequestTimestamp": 1737534262
    },
    {
      "metricName": "node_network_transmit_errs_total",
      "queryRequestsCount": 100,
      "lastRequestTimestamp": 1737534262
    }
  ]
}

statsCollectedSince is the timestamp when the tracker was enabled (or last reset, see below);
statsCollectedRecordsTotal is the total number of metric names the tracker contains;
records:
- metricName is the metric name;
- queryRequestsCount is a cumulative counter of how many times the metric was fetched. If metric name foo has 10 time series, then one read query foo increments the counter by 10;
- lastRequestTimestamp is the timestamp when this statistic was last updated.
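As an illustration of how this response could be consumed, the Python sketch below parses the sample JSON shown above and lists the metric names whose queryRequestsCount doesn't exceed a threshold. The helper name rarely_queried is an assumption; the same filtering can be done server-side with the le query parameter.

```python
import json

# Sample response copied from the example above.
SAMPLE = """
{
  "status": "success",
  "statsCollectedSince": 1737534094,
  "statsCollectedRecordsTotal": 2,
  "records": [
    {"metricName": "node_disk_writes_completed_total",
     "queryRequestsCount": 50, "lastRequestTimestamp": 1737534262},
    {"metricName": "node_network_transmit_errs_total",
     "queryRequestsCount": 100, "lastRequestTimestamp": 1737534262}
  ]
}
"""


def rarely_queried(response, max_count):
    """Return metric names fetched at most max_count times, least used first."""
    records = response.get("records", [])
    selected = [r for r in records if r["queryRequestsCount"] <= max_count]
    selected.sort(key=lambda r: (r["queryRequestsCount"], r["metricName"]))
    return [r["metricName"] for r in selected]


response = json.loads(SAMPLE)
print(rarely_queried(response, 50))
```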

VictoriaMetrics tracks metric names query statistics for /api/v1/query, /api/v1/query_range, /render, /federate and /api/v1/export API calls.

VictoriaMetrics stores tracked metric names in memory and saves the state to disk in the <-storageDataPath>/cache folder during restarts. The size of the in-memory state is limited to 1% of the available memory by default. This limit can be adjusted using the -storage.cacheSizeMetricNamesStats flag.

When the maximum state capacity is reached, VictoriaMetrics will stop tracking stats for newly registered time series. However, read request statistics for already tracked time series will continue to work as expected.

VictoriaMetrics exposes the following metrics for the metric name tracker:

vm_cache_size_bytes{type="storage/metricNamesStatsTracker"}
vm_cache_size{type="storage/metricNamesStatsTracker"}
vm_cache_size_max_bytes{type="storage/metricNamesStatsTracker"}

An alerting rule with query vm_cache_size_bytes{type="storage/metricNamesStatsTracker"} / vm_cache_size_max_bytes{type="storage/metricNamesStatsTracker"} > 0.9 can be used to notify the user of cache utilization exceeding 90%.

The metric name tracker state can be reset via the API endpoint /api/v1/admin/status/metric_names_stats/reset for a single-node VictoriaMetrics (or at http://<vmselect>:8481/admin/api/v1/admin/status/metric_names_stats/reset in cluster version of VictoriaMetrics) or via cache removal procedure. This reset state endpoint can be protected via -metricNamesStatsResetAuthKey cmd-line flag. See Security for details.

Query tracing

VictoriaMetrics supports query tracing, which can be used for determining bottlenecks during query processing. This is similar to EXPLAIN ANALYZE in PostgreSQL.

Query tracing can be enabled for a specific query by passing trace=1 query arg. In this case VictoriaMetrics puts query trace into trace field in the output JSON.

For example, the following command:

curl http://localhost:8428/api/v1/query_range -d 'query=2*rand()' -d 'start=-1h' -d 'step=1m' -d 'trace=1' | jq '.trace'

would return the following trace:

{
  "duration_msec": 0.099,
  "message": "/api/v1/query_range: start=1654034340000, end=1654037880000, step=60000, query=\"2*rand()\": series=1",
  "children": [
    {
      "duration_msec": 0.034,
      "message": "eval: query=2 * rand(), timeRange=[1654034340000..1654037880000], step=60000, mayCache=true: series=1, points=60, pointsPerSeries=60",
      "children": [
        {
          "duration_msec": 0.032,
          "message": "binary op \"*\": series=1",
          "children": [
            {
              "duration_msec": 0.009,
              "message": "eval: query=2, timeRange=[1654034340000..1654037880000], step=60000, mayCache=true: series=1, points=60, pointsPerSeries=60"
            },
            {
              "duration_msec": 0.017,
              "message": "eval: query=rand(), timeRange=[1654034340000..1654037880000], step=60000, mayCache=true: series=1, points=60, pointsPerSeries=60",
              "children": [
                {
                  "duration_msec": 0.015,
                  "message": "transform rand(): series=1"
                }
              ]
            }
          ]
        }
      ]
    },
    {
      "duration_msec": 0.004,
      "message": "sort series by metric name and labels"
    },
    {
      "duration_msec": 0.044,
      "message": "generate /api/v1/query_range response for series=1, points=60"
    }
  ]
}

All the durations and timestamps in traces are in milliseconds.
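A trace is a tree of nodes with duration_msec, message and optional children fields, so it can be processed with a short recursive walk. The Python sketch below lists the slowest leaf steps of a trace. The helper names and the shortened sample trace (messages abbreviated from the example above) are assumptions made for illustration.

```python
def iter_leaves(node):
    """Yield every trace node that has no children."""
    children = node.get("children", [])
    if not children:
        yield node
        return
    for child in children:
        yield from iter_leaves(child)


def slowest_leaves(trace, limit=3):
    """Return (duration_msec, message) for the slowest leaf steps."""
    leaves = sorted(iter_leaves(trace), key=lambda n: n["duration_msec"], reverse=True)
    return [(n["duration_msec"], n["message"]) for n in leaves[:limit]]


# Shortened version of the trace above; messages are abbreviated.
trace = {
    "duration_msec": 0.099,
    "message": "/api/v1/query_range",
    "children": [
        {
            "duration_msec": 0.034,
            "message": "eval: query=2 * rand()",
            "children": [
                {"duration_msec": 0.009, "message": "eval: query=2"},
                {"duration_msec": 0.015, "message": "transform rand()"},
            ],
        },
        {"duration_msec": 0.004, "message": "sort series by metric name and labels"},
        {"duration_msec": 0.044, "message": "generate /api/v1/query_range response"},
    ],
}

for duration, message in slowest_leaves(trace, limit=2):
    print(f"{duration:.3f} ms  {message}")
```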

Query tracing is allowed by default. It can be denied by passing -denyQueryTracing command-line flag to VictoriaMetrics.

VMUI provides a UI:

for query tracing - just click Trace query checkbox and re-run the query in order to investigate its trace.
for exploring custom trace - go to the tab Trace analyzer and upload or paste JSON with trace information.

Cardinality limiter

By default, VictoriaMetrics doesn't limit the number of stored time series. The limit can be enforced by setting the following command-line flags:

-storage.maxHourlySeries - limits the number of time series that can be added during the last hour. Useful for limiting the number of active time series.
-storage.maxDailySeries - limits the number of time series that can be added during the last day. Useful for limiting daily churn rate.

Both limits can be set simultaneously. If any of these limits is reached, then incoming samples for new time series are dropped. A sample of dropped series is put in the log with WARNING level.
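The drop behavior can be pictured with a toy model. The Python sketch below is not the VictoriaMetrics implementation: it uses an exact set, never resets the tracked series after an hour or a day, and the class name SeriesLimiter is hypothetical. It only shows that samples for already tracked series keep being accepted, while samples for new series are dropped once the limit is reached.

```python
class SeriesLimiter:
    """Toy exact limiter: admits at most max_series distinct series."""

    def __init__(self, max_series):
        self.max_series = max_series
        self.seen = set()
        self.rows_dropped = 0

    def add(self, series):
        """Return True if the sample is accepted, False if it is dropped."""
        if series in self.seen:
            return True  # already tracked series are always accepted
        if len(self.seen) >= self.max_series:
            self.rows_dropped += 1  # new series over the limit: drop the sample
            return False
        self.seen.add(series)
        return True


limiter = SeriesLimiter(max_series=2)
results = [limiter.add(s) for s in ["a", "b", "a", "c", "b", "c"]]
print(results, limiter.rows_dropped)
```

In this run the series a and b fill the limit, so both samples for c are dropped while later samples for a and b are still accepted.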

It is possible to use -1 as a value for these flags{{% available_from "v1.140.0" %}} in order to enable series tracking while setting the limit to the maximum possible value. This is useful for estimating the number of unique series written to single-node VictoriaMetrics without enforcing limits.

The exceeded limits can be monitored with the following metrics:

vm_hourly_series_limit_rows_dropped_total - the number of metrics dropped due to exceeded hourly limit on the number of unique time series.
vm_hourly_series_limit_max_series - the hourly series limit set via -storage.maxHourlySeries command-line flag.
vm_hourly_series_limit_current_series - the current number of unique series during the last hour. The following query can be useful for alerting when the number of unique series during the last hour exceeds 90% of the -storage.maxHourlySeries:
```
vm_hourly_series_limit_current_series / vm_hourly_series_limit_max_series > 0.9
```
vm_daily_series_limit_rows_dropped_total - the number of metrics dropped due to exceeded daily limit on the number of unique time series.
vm_daily_series_limit_max_series - the daily series limit set via -storage.maxDailySeries command-line flag.
vm_daily_series_limit_current_series - the current number of unique series during the last day. The following query can be useful for alerting when the number of unique series during the last day exceeds 90% of the -storage.maxDailySeries:
```
vm_daily_series_limit_current_series / vm_daily_series_limit_max_series > 0.9
```

These limits are approximate, so VictoriaMetrics can underflow/overflow the limit by a small percentage (usually less than 1%).
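For a quick manual check outside of an alerting system, the same ratio can be computed from the text served at the /metrics page. The Python sketch below is a minimal parser that handles only simple name value lines without spaces inside labels; the helper names and the sample values are assumptions.

```python
def parse_metrics(text):
    """Parse simple 'name value' lines of the Prometheus text exposition format."""
    values = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or fields[0].startswith("#"):
            continue  # skip blank lines and comments
        values[fields[0]] = float(fields[1])
    return values


def limit_utilization(values, period):
    """Return current/max series ratio for period 'hourly' or 'daily'."""
    current = values[f"vm_{period}_series_limit_current_series"]
    maximum = values[f"vm_{period}_series_limit_max_series"]
    return current / maximum


# Hypothetical sample of the relevant lines from a /metrics page.
sample = """
vm_hourly_series_limit_max_series 1000
vm_hourly_series_limit_current_series 950
vm_daily_series_limit_max_series 5000
vm_daily_series_limit_current_series 1000
"""

values = parse_metrics(sample)
for period in ("hourly", "daily"):
    ratio = limit_utilization(values, period)
    print(period, f"{ratio:.0%}", "ALERT" if ratio > 0.9 else "ok")
```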

See also more advanced cardinality limiter in vmagent and cardinality explorer docs.

Troubleshooting

It is recommended to use default command-line flag values (i.e. don't set them explicitly) until the need of tweaking these flag values arises.
It is recommended to inspect logs during troubleshooting, since they may contain useful information.
It is recommended to upgrade to the latest available release from this page, since the encountered issue could be already fixed there.
It is recommended to have at least 50% of spare resources for CPU, disk IO and RAM, so VictoriaMetrics could handle short spikes in the workload without performance issues.
VictoriaMetrics requires free disk space for merging data files to bigger ones. It may slow down when there is not enough free space left. So make sure -storageDataPath directory has at least 20% of free space. The remaining amount of free space can be monitored via vm_free_disk_space_bytes metric. The total size of data stored on the disk can be monitored via sum of vm_data_size_bytes metrics.
If you run VictoriaMetrics on a host with 16 or more CPU cores, then it may be needed to tune the -search.maxWorkersPerQuery command-line flag in order to improve query performance. If VictoriaMetrics serves big number of concurrent select queries, then try reducing the value for this flag. If VictoriaMetrics serves heavy queries, which select >10K of time series and/or process >100M of raw samples per query, then try setting the value for this flag to the number of available CPU cores.
VictoriaMetrics buffers incoming data in memory for up to a few seconds before flushing it to persistent storage. This may lead to the following "issues":
- Data becomes available for querying in a few seconds after inserting. It is possible to flush in-memory buffers to searchable parts by requesting /internal/force_flush http handler. This handler is mostly needed for testing and debugging purposes.
- The last few seconds of inserted data may be lost on unclean shutdown (i.e. OOM, kill -9 or hardware reset). The -inmemoryDataFlushInterval command-line flag allows controlling the frequency of in-memory data flush to persistent storage. See storage docs and this article for more details.
If VictoriaMetrics works slowly and eats more than a CPU core per 100K ingested data points per second, then it is likely you have too many active time series for the current amount of RAM. VictoriaMetrics exposes vm_slow_* metrics such as vm_slow_row_inserts_total and vm_slow_metric_name_loads_total, which could be used as an indicator of low amounts of RAM. It is recommended to increase the amount of RAM on the node with VictoriaMetrics in order to improve ingestion and query performance in this case.
If the order of labels for the same metrics can change over time (e.g. if metric{k1="v1",k2="v2"} may become metric{k2="v2",k1="v1"}), then it is recommended to run VictoriaMetrics with -sortLabels command-line flag in order to reduce memory usage and CPU usage.
VictoriaMetrics prioritizes data ingestion over data querying. So if it doesn't have enough resources for data ingestion, then data querying may slow down significantly.
If VictoriaMetrics doesn't work because certain parts are corrupted due to disk errors, then just remove directories with broken parts. It is safe to remove subdirectories under <-storageDataPath>/data/{big,small}/YYYY_MM directories when VictoriaMetrics isn't running. This recovers VictoriaMetrics at the cost of losing the data stored in the deleted broken parts. The names of broken parts should be present in the error message. If the error message is truncated and doesn't contain all the information, try increasing the -loggerMaxArgLen command-line flag to a higher value to avoid truncation.
If you see gaps on the graphs, try resetting the cache by sending request to /internal/resetRollupResultCache. If this removes gaps on the graphs, then it is likely data with timestamps older than -search.cacheTimestampOffset is ingested into VictoriaMetrics. Make sure that data sources have synchronized time with VictoriaMetrics.

If the gaps are related to irregular intervals between samples, then try adjusting -search.minStalenessInterval command-line flag to value close to the maximum interval between samples.
If you are switching from InfluxDB or TimescaleDB, then it may be needed to set -search.setLookbackToStep command-line flag. This suppresses default gap filling algorithm used by VictoriaMetrics - by default it assumes each time series is continuous instead of discrete, so it fills gaps between real samples with regular intervals.
Metrics and labels leading to high cardinality or high churn rate can be determined via cardinality explorer and via /api/v1/status/tsdb endpoint.
New time series can be logged if -logNewSeries command-line flag is passed to VictoriaMetrics or temporarily enabled via /internal/log_new_series API call. /internal/log_new_series API accepts query parameter seconds, with default value of 60, which defines a duration for logging newly created series.
VictoriaMetrics limits the number of labels per each series, label name length and label value length via -maxLabelsPerTimeseries, -maxLabelNameLen and -maxLabelValueLen command-line flags respectively. Series that exceed the limits are ignored on ingestion. This prevents ingestion of malformed series. It is recommended to monitor vm_rows_ignored_total metric and VictoriaMetrics logs in order to determine whether limits must be adjusted for your workload. Alternatively, you can use relabeling to change metric target labels.
If you store Graphite metrics like foo.bar.baz in VictoriaMetrics, then {__graphite__="foo.*.baz"} filter can be used for selecting such metrics. See these docs for details. You can also query Graphite metrics with Graphite querying API.
VictoriaMetrics ignores NaN values during data ingestion.
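To see why label order matters for the -sortLabels recommendation above, consider any lookup keyed by the raw text of a series: the same series written with a different label order yields a different key. The Python sketch below only illustrates that effect and says nothing about VictoriaMetrics internals; both helper names are hypothetical.

```python
def raw_series_key(name, labels):
    """Build a series key that keeps labels in the order they arrived."""
    pairs = ",".join(f'{k}="{v}"' for k, v in labels)
    return name + "{" + pairs + "}"


def canonical_series_key(name, labels):
    """Build a series key that doesn't depend on the order of labels."""
    return raw_series_key(name, sorted(labels))


first = [("k1", "v1"), ("k2", "v2")]
second = [("k2", "v2"), ("k1", "v1")]  # same labels, different order

print(raw_series_key("metric", first))
print(raw_series_key("metric", second))
print(canonical_series_key("metric", second))
```

The two raw keys differ, while the canonical keys are identical, so a consumer keyed by sorted labels sees a single series.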

Push metrics

All the VictoriaMetrics components support pushing their metrics exposed at /metrics page to remote storage in Prometheus text exposition format. This functionality may be used instead of classic Prometheus-like metrics scraping if VictoriaMetrics components are located in isolated networks, so they cannot be scraped by local vmagent.

The following command-line flags are related to pushing metrics from VictoriaMetrics components:

-pushmetrics.url - the url to push metrics to. For example, -pushmetrics.url=http://victoria-metrics:8428/api/v1/import/prometheus instructs to push internal metrics to /api/v1/import/prometheus endpoint according to these docs. The -pushmetrics.url can be specified multiple times. In this case metrics are pushed to all the specified urls. The url can contain basic auth params in the form http://user:pass@hostname/api/v1/import/prometheus. Metrics are pushed to the provided -pushmetrics.url in a compressed form with Content-Encoding: gzip request header. This allows reducing the required network bandwidth for metrics push. The compression can be disabled by passing -pushmetrics.disableCompression command-line flag.
-pushmetrics.extraLabel - labels to add to all the metrics before sending them to every -pushmetrics.url. Each label must be specified in the format label="value". It is OK to specify multiple -pushmetrics.extraLabel command-line flags. In this case all the specified labels are added to all the metrics before sending them to all the configured -pushmetrics.url addresses.
-pushmetrics.interval - the interval between pushes. By default it is set to 10 seconds.
-pushmetrics.header - an optional HTTP header to send to every -pushmetrics.url. For example, -pushmetrics.header='Authorization: Basic foo' instructs to send Authorization: Basic foo HTTP header with every request to every -pushmetrics.url. It is possible to set multiple -pushmetrics.header command-line flags for sending multiple different HTTP headers to -pushmetrics.url.

For example, the following command instructs VictoriaMetrics to push metrics from /metrics page to https://maas.victoriametrics.com/api/v1/import/prometheus with user:pass Basic auth. The instance="foobar" and job="vm" labels are added to all the metrics before sending them to the remote storage:

/path/to/victoria-metrics \
  -pushmetrics.url=https://user:pass@maas.victoriametrics.com/api/v1/import/prometheus \
  -pushmetrics.extraLabel='instance="foobar"' \
  -pushmetrics.extraLabel='job="vm"'
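For anyone writing a receiver or a test double for the push endpoint, the request described above can be approximated as follows. This Python sketch builds the headers and body only; it isn't taken from the VictoriaMetrics code, and the helper name build_push_request and the demo_metric sample are assumptions.

```python
import gzip


def build_push_request(metrics_text, compress=True):
    """Return (headers, body) for pushing metrics in Prometheus text format."""
    body = metrics_text.encode("utf-8")
    headers = {}
    if compress:
        # Mirrors the default behavior described above: gzip-compressed body.
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    return headers, body


# Hypothetical metric line with the extra labels from the example above.
payload = 'demo_metric{instance="foobar",job="vm"} 1\n'
headers, body = build_push_request(payload)
print(headers, len(body), "bytes")

# A receiver would undo the compression before parsing the lines.
assert gzip.decompress(body).decode("utf-8") == payload
```

Passing compress=False corresponds to the -pushmetrics.disableCompression case: the body is sent as plain text without the Content-Encoding header.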

Caches

Cache removal

VictoriaMetrics uses various internal caches. These caches are stored to <-storageDataPath>/cache directory during graceful shutdown (e.g. when VictoriaMetrics is stopped by sending SIGINT signal). The caches are read on the next VictoriaMetrics startup. Sometimes it is needed to remove such caches on the next startup. This can be done in the following ways:

By manually removing the <-storageDataPath>/cache directory when VictoriaMetrics is stopped.
By placing reset_cache_on_startup file inside the <-storageDataPath>/cache directory before the restart of VictoriaMetrics. In this case VictoriaMetrics will automatically remove all the caches on the next start. See this issue for details.

It is also possible to remove the rollup result cache on startup by passing -search.resetRollupResultCacheOnStartup command-line flag to VictoriaMetrics.
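The marker-file approach can be scripted. The Python sketch below creates the reset_cache_on_startup file under the cache directory of a given storage path; the helper name request_cache_reset is an assumption, and the demo runs against a temporary directory rather than a real -storageDataPath.

```python
import tempfile
from pathlib import Path


def request_cache_reset(storage_data_path):
    """Create the reset_cache_on_startup marker inside <storage_data_path>/cache."""
    cache_dir = Path(storage_data_path) / "cache"
    cache_dir.mkdir(parents=True, exist_ok=True)
    marker = cache_dir / "reset_cache_on_startup"
    marker.touch()  # the file content doesn't matter, only its presence
    return marker


# Demo against a throwaway directory standing in for -storageDataPath.
with tempfile.TemporaryDirectory() as storage_data_path:
    marker = request_cache_reset(storage_data_path)
    print(marker.relative_to(storage_data_path), marker.is_file())
```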

Rollup result cache

VictoriaMetrics caches query responses by default and utilizes the cache for future queries when possible. This improves performance for repeated queries to /api/v1/query and /api/v1/query_range with the increasing time, start and end query args.

For range queries, the cache can be used for queries with the same expression and step. For instant queries, the cache can be used for queries with the same expression that use a lookbehind window larger than -search.minWindowForInstantRollupOptimization and specific functions such as xx_over_time, increase, rate. (For rate, the cached result may be inaccurate in edge cases; see this issue for details.)

This cache may work incorrectly when ingesting historical data into VictoriaMetrics. See these docs for details.

The rollup cache can be disabled either globally by running VictoriaMetrics with -search.disableCache command-line flag or on a per-query basis by passing nocache=1 query arg to /api/v1/query and /api/v1/query_range.

Cache tuning

VictoriaMetrics uses various in-memory caches for faster data ingestion and query performance. The following metrics for each type of cache are exported at /metrics page:

vm_cache_size_bytes - the actual cache size
vm_cache_size_max_bytes - cache size limit
vm_cache_requests_total - the number of requests to the cache
vm_cache_misses_total - the number of cache misses
vm_cache_entries - the number of entries in the cache

Both Grafana VictoriaMetrics - single-node and VictoriaMetrics - cluster dashboards contain Troubleshooting section where the cache metrics are visualized. The Cache usage % panel shows the percentage of used cache size from the allowed size by type. If the percentage is below 100%, then no further tuning is needed. The Cache miss ratio panel shows the percentage of reads for which no value was found in the cache. If the cache utilization is 100% and there are cache misses, then the cache is either not accepting new entries or evicting existing ones. Its size may need to be increased.

Please note, default cache sizes were carefully adjusted according to the most practical scenarios and workloads. Change the defaults only if you understand the implications and vmstorage has enough free memory to accommodate new cache sizes.
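The two panels boil down to simple ratios over the metrics listed above. The Python sketch below shows the arithmetic for one cache type; the helper name cache_report and the input numbers are assumptions, and for the counters it is more meaningful to pass increases over a recent time window than lifetime totals.

```python
def cache_report(size_bytes, size_max_bytes, requests_total, misses_total):
    """Return (usage, miss_ratio, may_need_bigger_cache) for one cache type."""
    usage = size_bytes / size_max_bytes if size_max_bytes else 0.0
    miss_ratio = misses_total / requests_total if requests_total else 0.0
    # Rule of thumb from the text: a full cache that still misses may be too small.
    may_need_bigger_cache = usage >= 1.0 and misses_total > 0
    return usage, miss_ratio, may_need_bigger_cache


# Hypothetical readings: a half-full cache and a full one.
print(cache_report(512, 1024, 1000, 100))
print(cache_report(1024, 1024, 1000, 250))
```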

To override the default values see command-line flags with -storage.cacheSize prefix. See the full description of command-line flags.

Data migration

From VictoriaMetrics

The simplest way to migrate data from one single-node (source) to another (destination), or from one vmstorage node to another is to do the following:

Stop the VictoriaMetrics (source) with kill -INT;
Copy (via rsync or any other tool) the entire folder specified via -storageDataPath from the source node to an empty folder at the destination node.
Once the copy is done, stop the VictoriaMetrics (destination) with kill -INT and verify that its -storageDataPath points to the copied folder from step 2;
Start the VictoriaMetrics (destination). The copied data should be now available.

Things to consider when copying data:

Data formats between single-node and vmstorage node aren't compatible and can't be copied.
Copying a data folder means complete replacement of the previous data on destination VictoriaMetrics.
Data can't be mixed: make sure that the destination folder is empty before copying.
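The copy step and the empty-destination rule can be combined in a small guard. The Python sketch below is one possible way to do it with the standard library and is not a replacement for rsync on large datasets; the helper name copy_storage is an assumption, and the demo uses temporary folders instead of real -storageDataPath directories. Both VictoriaMetrics instances are assumed to be stopped.

```python
import shutil
import tempfile
from pathlib import Path


def copy_storage(src, dst):
    """Copy a stopped node's data folder into an empty destination folder."""
    src, dst = Path(src), Path(dst)
    if not src.is_dir():
        raise FileNotFoundError(f"source folder not found: {src}")
    if dst.exists() and any(dst.iterdir()):
        # Data can't be mixed, so refuse to copy into a non-empty folder.
        raise ValueError(f"destination folder is not empty: {dst}")
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst


# Demo with throwaway folders standing in for the two -storageDataPath dirs.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "source"
    (source / "data").mkdir(parents=True)
    (source / "data" / "part.bin").write_bytes(b"demo")
    destination = copy_storage(source, Path(tmp) / "destination")
    print(sorted(p.name for p in destination.rglob("*")))
```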

For scenarios like single-to-cluster, cluster-to-single, re-sharding or migrating only a fraction of data: see how to migrate data from VictoriaMetrics via vmctl.

From other systems

Use vmctl to migrate data from other systems to VictoriaMetrics.

Backfilling

VictoriaMetrics accepts out-of-order historical data via any supported ingestion method without limitations. Just make sure that the backfilled data is within the configured retention period.

See how to backfill recording rules via vmalert.

It is recommended to disable the query cache with -search.disableCache command-line flag when writing historical data with timestamps from the past, since the cache assumes that the data is written with the current timestamps. The query cache can be enabled again after the backfilling is complete.

An alternative solution is to query /internal/resetRollupResultCache after the backfilling is complete. This will reset the query cache, which could contain incomplete data cached during the backfilling.

Yet another solution is to increase -search.cacheTimestampOffset flag value to disable caching for data with timestamps close to the current time. Single-node VictoriaMetrics automatically resets response cache when samples with timestamps older than now - search.cacheTimestampOffset are ingested to it.

Data updates

VictoriaMetrics doesn't support updating already existing sample values to new ones. It stores all the ingested data points for the same time series with identical timestamps. While it is possible to substitute old time series with new ones by removing the old time series and then writing the new ones, this approach should be used only for one-off updates. It shouldn't be used for frequent updates because of the non-zero overhead related to data removal.

Replication

Single-node VictoriaMetrics doesn't support application-level replication. Use cluster version instead. See these docs for details.

Storage-level replication may be offloaded to durable persistent storage such as Google Cloud disks.

See also high availability docs and backup docs.

Backups

For backup configuration and setup, please refer to vmbackup documentation.

vmalert

A single-node VictoriaMetrics is capable of proxying requests to vmalert when -vmalert.proxyURL flag is set. Use this feature for the following cases:

for proxying requests from Grafana Alerting UI;
for accessing vmalert's UI through single-node VictoriaMetrics Web interface.

For accessing vmalert's UI through single-node VictoriaMetrics configure -vmalert.proxyURL flag and visit http://<victoriametrics-addr>:8428/vmalert/ link.

Benchmarks

Note that vendors (including VictoriaMetrics) are often biased when doing such tests. E.g. they try highlighting the best parts of their product, while highlighting the worst parts of competing products. So we encourage users and all independent third parties to conduct their own benchmarks for the products they are evaluating in production and to publish the results.

As a reference, please see benchmarks conducted by VictoriaMetrics team. Please also see the helm chart for running ingestion benchmarks based on node_exporter metrics.

Profiling

VictoriaMetrics provides handlers for collecting the following Go profiles:

Memory profile. It can be collected with the following command (replace 0.0.0.0 with hostname if needed):

curl http://0.0.0.0:8428/debug/pprof/heap > mem.pprof

CPU profile. It can be collected with the following command (replace 0.0.0.0 with hostname if needed):

curl http://0.0.0.0:8428/debug/pprof/profile > cpu.pprof

The command for collecting CPU profile waits for 30 seconds before returning.

The collected profiles may be analyzed with go tool pprof. It is safe to share the collected profiles from a security point of view, since they do not contain sensitive information.

Third-party contributions

Prometheus -> VictoriaMetrics exporter #1
Prometheus -> VictoriaMetrics exporter #2
Prometheus Oauth proxy - see this article for details.

Contacts

Community and contributions

Feel free to ask any questions regarding VictoriaMetrics.

If you like VictoriaMetrics and want to contribute, then please read these docs.

Reporting bugs

Report bugs and propose new features in our GitHub Issues.

Documentation

VictoriaMetrics documentation is available at https://docs.victoriametrics.com/victoriametrics/. It is built from *.md files located in docs folder and gets automatically updated once changes are merged to master branch. To update the documentation follow the steps below:

Fork VictoriaMetrics repo and apply changes to the docs:
- To update the main page modify this file.
- To update other pages, apply changes to the corresponding file in docs folder.
If your changes contain an image then see images in documentation.
Create a pull request with proposed changes and wait for it to be merged. See pull request checklist.

Requirements for changes to docs:

Keep backward compatibility of existing links. Avoid changing anchors or deleting pages as they could have been used or posted in other docs, GitHub issues, Stack Overflow answers, etc.
Keep docs clear, concise and simple. Try using as simple wording as possible, without sacrificing clarity.
Keep docs consistent. When modifying existing docs, verify that other places referencing to this doc are still relevant.
Prefer improving the existing docs instead of adding new ones.
Use absolute links. This simplifies moving docs between different files.

Periodically run make spellcheck. This command detects spelling errors in the docs/ folder. Please fix the spelling errors it finds and commit the fixes in a separate commit.

Images in documentation

Please keep the image size and the number of images per page low. Keep the docs page as lightweight as possible.

Image files must be placed in the same folder as the doc itself and must have the same prefix as the doc filename. For example, all the images for docs/foo/bar.md should have filenames starting with docs/foo/bar. This simplifies lifetime management of the images:

when the corresponding doc is removed, then it is clear how to remove the associated images
when the corresponding doc is renamed, then it is clear how to rename the associated images.
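The naming rule above can be checked mechanically before opening a pull request. The following is a minimal sketch, not a tool shipped with the repository. The function name image_matches_doc and the example image filename are hypothetical.

```shell
# image_matches_doc succeeds when the image path starts with the doc path
# stripped of its ".md" suffix, as required by the naming rule.
image_matches_doc() {
  doc="$1"
  image="$2"
  prefix="${doc%.md}"
  case "$image" in
    "$prefix"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, image_matches_doc docs/foo/bar.md docs/foo/bar-dashboard.webp succeeds, while the same check against docs/foo/dashboard.webp fails because the image does not carry the docs/foo/bar prefix.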

If the page needs many images, consider using the web-optimized image format webp. When adding a new doc with many images, use the webp format right away. Alternatively, use the Makefile command below to automatically convert existing images in the docs folder to webp format:

make docs-images-to-webp

Once conversion is done, update the path to images in your docs and verify everything is correct.
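Updating the image paths after conversion can be partially scripted. The sketch below assumes that images are referenced with Markdown image or link syntax ending in .png, .jpg, or .jpeg, and that the conversion keeps the base filename and only changes the extension. Both points are assumptions to verify against the actual result. The function name rewrite_image_links is hypothetical.

```shell
# rewrite_image_links reads Markdown on stdin and prints it with
# "(...png)", "(...jpg)" and "(...jpeg)" link targets rewritten to ".webp".
rewrite_image_links() {
  sed -E 's/\.(png|jpe?g)\)/.webp)/g'
}
```

For example, rewrite_image_links < docs/foo/bar.md > /tmp/bar.md writes an updated copy that can be reviewed with diff before it replaces the original file.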

VictoriaMetrics Logo

The zip archive contains three folders with different image orientations (main color and inverted version).

Files included in each folder:

2 JPEG Preview files
2 PNG Preview files with transparent background
2 EPS Adobe Illustrator EPS10 files

Logo Usage Guidelines

Font used

Lato Black
Lato Regular

Color Palette

HEX #110f0f
HEX #ffffff

We kindly ask

Do not use any font other than the suggested ones.
Keep enough clear space around the logo.
Do not change spacing, alignment, or relative locations of the design elements.
Do not change the proportions for any of the design elements or the design itself. You may resize as needed but must retain all proportions.

List of command-line flags

Pass -help to VictoriaMetrics to see the list of supported command-line flags with their descriptions.
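Since the full -help output is long, it can be convenient to extract the entry for a single flag. The following is a minimal sketch, assuming the help text lists each flag on its own line followed by one description line. The function name flag_help and the binary path used in the usage note are hypothetical.

```shell
# flag_help reads help text on stdin and prints the line that declares the
# given flag together with the line that follows it (the description).
flag_help() {
  grep -A 1 -E -- "^[[:space:]]*-$1([[:space:]]|\$)"
}
```

For example, /path/to/victoria-metrics -help 2>&1 | flag_help storageDataPath prints only the entry for the -storageDataPath flag. The 2>&1 redirection is there in case the help text is written to stderr.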

Common flags

These flags are available in both VictoriaMetrics OSS and VictoriaMetrics Enterprise. {{% content "victoria_metrics_common_flags.md" %}}

Enterprise flags

These flags are available only in VictoriaMetrics Enterprise. {{% content "victoria_metrics_enterprise_flags.md" %}}

The section below contains backward-compatible anchors for links that were moved or renamed.

Sending data via OpenTelemetry

See OpenTelemetry integration for protocol details, metric naming and histogram conversion.
See OpenTelemetry Collector for collector configuration.

How to send data from NewRelic agent

Moved to integrations/newrelic.

Graphite API usage

Moved to integrations/graphite/#graphite-api-usage.

Graphite Render API usage

Moved to integrations/graphite/#render-api.

Graphite Metrics API usage

Moved to integrations/graphite/#metrics-api.

Graphite Tags API usage

Moved to integrations/graphite/#tags-api.

Integrations

Moved to integrations.

Playgrounds

The VictoriaMetrics playgrounds have been moved to Playgrounds.