Compare commits

5 Commits

Author SHA1 Message Date
hagen1778
60d8c35274 update wording
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-05-25 13:16:12 +02:00
hagen1778
b0a6e5c78b fix diagram typo
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-05-25 13:11:28 +02:00
Pablo (Tomas) Fernandez
ec5fc77d25 Apply suggestions from code review
Co-authored-by: cubic-dev-ai[bot] <191113872+cubic-dev-ai[bot]@users.noreply.github.com>
Co-authored-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
Signed-off-by: Pablo (Tomas) Fernandez <46322567+TomFern@users.noreply.github.com>
2026-05-25 11:54:16 +01:00
hagen1778
decfc0feff wording
Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-05-25 12:04:35 +02:00
hagen1778
0e9e60ceba docs: add HA section to stream aggregation
Adds guidance on how to build horizontally scalable pipeline for
stream aggregation.

Signed-off-by: hagen1778 <roman@victoriametrics.com>
2026-05-25 12:03:49 +02:00

@@ -26,6 +26,7 @@ Stream aggregation has the following features:
and/or scraped from [Prometheus-compatible targets](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#how-to-scrape-prometheus-exporters-such-as-node-exporter)
- It can filter out raw samples matched by aggregation rules, so raw data will never reach the remote destination. See `-streamAggr.keepInput` and `-streamAggr.dropInput` in [aggregation config](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/configuration/);
- It allows building [flexible processing pipelines](#routing);
- It is [horizontally scalable](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#scaling-aggregation-horizontally).
# Limitations
@@ -598,6 +599,45 @@ Below is an example of an `aggr.yaml` configuration that drops the `replica` and
keep_metric_names: true
```
## Scaling aggregation horizontally
Aggregation output is only correct when all contributing samples are processed by the same aggregator instance.
To scale aggregation horizontally, always shard the input samples in a deterministic way. This can be achieved by
building a two-layer topology of vmagents, where the first layer is responsible for sharding and the second layer is responsible for aggregating:
```mermaid
flowchart LR
V1[vmagent-shard-1] -- requests_total{env=test, pod=foo} --> SV1[vmagent-aggr-1]
V1[vmagent-shard-1] -- requests_total{env=prod, pod=bar} --> SV2[vmagent-aggr-2]
V2[vmagent-shard-2] -- requests_total{env=prod, pod=baz} --> SV2[vmagent-aggr-2]
SV1 -- requests_total:5m_without_pod_total{env=test} --> x(( ))
SV2 -- requests_total:5m_without_pod_total{env=prod} --> y(( ))
style x fill:none,stroke:none
style y fill:none,stroke:none
```
The sharding layer of vmagents can be configured via the `-remoteWrite.shardByURL.labels` or `-remoteWrite.shardByURL.ignoreLabels`
command-line flags. See how to [shard data across remote write destinations](https://docs.victoriametrics.com/victoriametrics/vmagent/#sharding-among-remote-storages) for more details.
The following requirements must be met for sharded aggregation to work correctly:
- All sharding vmagents should have the same deterministic sharding configuration.
- The sharding configuration must align with the `by` and `without` lists:
  - If you aggregate `by: env`, make sure that the `env` label is listed in the routing key of the sharding agents: `-remoteWrite.shardByURL.labels=env`.
    This ensures that all samples for the same `env` are aggregated together and produce complete output.
  - If you aggregate `without: pod`, make sure that the `pod` label is excluded from the routing key of the sharding agents: `-remoteWrite.shardByURL.ignoreLabels=pod`.
    This ensures that `requests_total{env=test, pod=foo}` and `requests_total{env=test, pod=bar}` are routed to the same aggregator
    and are aggregated together. See also [this comment](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5938#issuecomment-2018470324).
- Aggregating vmagents should not produce collisions: the aggregation output must be unique across all the aggregating agents.
For example, `requests_total:5m_without_env_pod_total` produced by both `vmagent-aggr-1` and `vmagent-aggr-2` will collide
unless the output carries labels that uniquely identify each aggregator. These labels should either be preserved by the sharding and aggregation configuration,
or be enforced on the output via `-remoteWrite.label`. See [these docs](https://docs.victoriametrics.com/victoriametrics/stream-aggregation/#cluster-mode) for more details.
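
The alignment between the routing key and a `without: pod` aggregation can be sketched in a few lines of Go. This is only an illustration of the idea, not vmagent's actual sharding code: the `shardIndex` helper is hypothetical, and FNV-1a from the Go standard library is an assumed stand-in for whatever hash vmagent uses internally. The routing key is built from every label except the ignored ones, so series that differ only in `pod` always produce the same key.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
	"strings"
)

// shardIndex returns the index of the aggregator that should receive a series.
// The routing key is built from all labels except those listed in ignore,
// so series that differ only in ignored labels always map to the same index.
// NOTE: hypothetical helper for illustration; this is not vmagent's real hashing.
func shardIndex(labels map[string]string, ignore []string, shards int) int {
	skip := make(map[string]bool, len(ignore))
	for _, name := range ignore {
		skip[name] = true
	}
	pairs := make([]string, 0, len(labels))
	for name, value := range labels {
		if !skip[name] {
			pairs = append(pairs, name+"="+value)
		}
	}
	// Map iteration order is random in Go, so sort to keep the key deterministic.
	sort.Strings(pairs)
	h := fnv.New32a()
	h.Write([]byte(strings.Join(pairs, ",")))
	return int(h.Sum32() % uint32(shards))
}

func main() {
	foo := map[string]string{"__name__": "requests_total", "env": "test", "pod": "foo"}
	bar := map[string]string{"__name__": "requests_total", "env": "test", "pod": "bar"}
	ignore := []string{"pod"} // mirrors -remoteWrite.shardByURL.ignoreLabels=pod

	// Both series share the key "__name__=requests_total,env=test".
	fmt.Println(shardIndex(foo, ignore, 2) == shardIndex(bar, ignore, 2)) // prints "true"
}
```

If `pod` were left in the key, the two series could hash to different aggregators, and each aggregator would emit a partial `requests_total:5m_without_pod_total{env=test}`.
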
> Never shard histograms by `le` (or `vmrange` in case of VM histograms) label. A histogram is a logical group of series differing
only in the bucket label. All of those buckets must land on the same aggregator at the same time so it can produce a
coherent bucket set. See more about [aggregating histograms](https://docs.victoriametrics.com/stream-aggregation/#aggregating-histograms).
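
The histogram rule can be sketched in the same spirit. The Go snippet below is a hedged illustration only: `shardByLabels` and `bucketShards` are hypothetical helpers, the metric name `request_duration_seconds_bucket` is made up for the example, and FNV-1a is an assumed stand-in for vmagent's real hash. It counts how many aggregators receive at least one bucket of a single histogram, depending on which labels form the routing key.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"strings"
)

// shardByLabels builds the routing key from the listed labels only
// (in the order given) and maps it to one of the aggregators.
// NOTE: hypothetical helper; not vmagent's actual implementation.
func shardByLabels(series map[string]string, keyLabels []string, shards int) int {
	parts := make([]string, 0, len(keyLabels))
	for _, name := range keyLabels {
		parts = append(parts, name+"="+series[name])
	}
	h := fnv.New32a()
	h.Write([]byte(strings.Join(parts, ",")))
	return int(h.Sum32() % uint32(shards))
}

// bucketShards returns the set of shards that receive at least one
// bucket of a single histogram whose buckets differ only in `le`.
func bucketShards(les []string, keyLabels []string, shards int) map[int]bool {
	used := make(map[int]bool)
	for _, le := range les {
		series := map[string]string{
			"__name__": "request_duration_seconds_bucket", // illustrative metric name
			"env":      "prod",
			"le":       le,
		}
		used[shardByLabels(series, keyLabels, shards)] = true
	}
	return used
}

func main() {
	les := []string{"0.1", "0.5", "1", "+Inf"}

	// Key without `le`: every bucket has the same key, so exactly one shard is used.
	fmt.Println(len(bucketShards(les, []string{"env"}, 4)))

	// Key with `le`: each bucket has its own key, so the buckets may be
	// spread over several shards and no aggregator sees the full set.
	fmt.Println(len(bucketShards(les, []string{"env", "le"}, 4)))
}
```

The first call always reports a single shard because all four buckets share the key `env=prod`. The second result depends on the hash, which is exactly the problem: bucket placement is no longer tied to the histogram as a whole.
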
See also [why you shouldn't put an aggregator behind a load balancer](https://docs.victoriametrics.com/stream-aggregation/#put-aggregator-behind-load-balancer).
# Troubleshooting
- [Unexpected spikes for `total` or `increase` outputs](#staleness).