Compare commits

...

3 Commits

Author SHA1 Message Date
Hui Wang
955bd6981e Merge branch 'master' into correct-sa-shard-doc 2026-06-25 20:13:46 +08:00
Fred Navruzov
50a827256a docs/vmanomaly: fill in missing args and links (post v1.29.7 update) (#11165)
Addition of missing links/args and slight refactor of changelog notes
for clarity (post v1.29.7 update)

Follow-up on e30e8be1f4
2026-06-25 10:27:18 +03:00
Haley Wang
73ba62c741 doc: correct sharding configuration guidance for stream aggregation 2026-06-25 11:38:56 +08:00
3 changed files with 22 additions and 9 deletions

View File

@@ -17,11 +17,11 @@ Please find the changelog for VictoriaMetrics Anomaly Detection below.
## v1.29.7
Released: 2026-06-25
- UI: updated [vmanomaly UI](https://docs.victoriametrics.com/anomaly-detection/ui/) from [v1.7.1](https://docs.victoriametrics.com/anomaly-detection/ui/#v171) to [v1.7.2](https://docs.victoriametrics.com/anomaly-detection/ui/#v172), see respective [release notes](https://docs.victoriametrics.com/anomaly-detection/ui/#v172) for details.
- UI: updated [vmanomaly UI](https://docs.victoriametrics.com/anomaly-detection/ui/) from [v1.7.1](https://docs.victoriametrics.com/anomaly-detection/ui/#v171) to [v1.7.2](https://docs.victoriametrics.com/anomaly-detection/ui/#v172), see respective [release notes](https://docs.victoriametrics.com/anomaly-detection/ui/#v172) for details. Notable mentions include `api/v1/server/model` endpoint for accessing production models config and queries from UI, manually or through [AI assistant](https://docs.victoriametrics.com/anomaly-detection/ui/#ai-assistance).
- IMPROVEMENT: Increased high-cardinality inference scaling by optionally scattering periodic infer jobs to reduce contention on shared resources (e.g. datasource, CPU, RAM) when `settings.n_workers > 1` and `scheduler.infer_every` is smaller than the total time to fetch and process all queries. This is controlled by new `scatter_infer_jobs` boolean argument of [Periodic Scheduler](https://docs.victoriametrics.com/anomaly-detection/components/scheduler/#parameters-1) (default: `false`).
- IMPROVEMENT: Optimized internal batching for reader post-fetch series processing, exposing reader processing queue depth, and clarifying inference skip logs after data fetch timeouts.
- IMPROVEMENT: Optimized internal batching for reader post-fetch series processing, exposing reader processing queue depth (`vmanomaly_reader_processing_tasks_queued` [metric](https://docs.victoriametrics.com/anomaly-detection/components/monitoring/#reader-behaviour-metrics)), and clarifying inference skip logs after data fetch timeouts. See `series_processing_batch_size` argument of [VmReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#vm-reader) and [VLogsReader](https://docs.victoriametrics.com/anomaly-detection/components/reader/#victorialogs-reader) for details.
- IMPROVEMENT: Refined `VmReader` and `VLogsReader` logging after datasource request failures by suppressing the follow-up generic "No data" or "No unseen data" warning for failed fetches. Failed requests now keep the original datasource error while empty successful responses still emit the no-data warning.

View File

@@ -893,6 +893,19 @@ If a path to a CA bundle file (like `ca.crt`), it will verify the certificate us
(Optional) Password for authentication. If set, it will be used to authenticate the request.
</td>
</tr>
<tr>
<td>
<span style="white-space: nowrap;">`series_processing_batch_size`</span>
</td>
<td>
`8`
</td>
<td>
Optional argument {{% available_from "v1.29.7" anomaly %}}, allows specifying the number of time series to process together while preparing data for fit or infer stages. Defaults to `8`. Suggested values are 4-16 for high-cardinality queries.
</td>
</tr>
</tbody>
</table>
@@ -911,6 +924,7 @@ reader:
# tenant_id: '0:0' # for cluster version only
sampling_period: '1m'
max_points_per_query: 10000
series_processing_batch_size: 8
data_range: [0, 'inf'] # reader-level
offset: '0s' # reader-level
timeout: '30s'

View File

@@ -624,13 +624,12 @@ command line flags. See how to [shard data across remote write destinations](htt
The following requirements must be met for sharded aggregation to work correctly:
- All sharding vmagents should have the same deterministic sharding configuration.
- The sharding configuration must align with the `by` and `without` lists:
- Labels listed in `by` setting should be a subset of shard's routing key `-remoteWrite.shardByURL.labels`.
With `-remoteWrite.shardByURL.labels=env,job` aggregator's `by` should include `by: env`, `by: job` or both: `by: [env, job]`.
This makes sure that all the samples for the same `env` and `job` are aggregated together and produce the complete output.
- Labels listed in `without` setting should be a superset of shard's routing key `--remoteWrite.shardByURL.ignoreLabels`.
With `-remoteWrite.shardByURL.ignoreLabels=env,job` aggegator's `without` should include at least both labels `without: [env,job]`.
This makes sure that `requests_total{env=test, job=foo}` and `requests_total{env=prod, job=foo}` are routed to the same aggregator
and are aggregated together. See also [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/5938#issuecomment-2018470324).
- Labels configured in `-remoteWrite.shardByURL.labels` must be a subset of the labels listed in `by`.
For example, if the aggregation config specifies `by: [env, job]`, then `-remoteWrite.shardByURL.labels` may include `env`, `job`, or both.
This ensures that all samples contributing to the same aggregation result are routed to the same aggregator instance and aggregated together to produce a complete output.
- Labels configured in `-remoteWrite.shardByURL.ignoreLabels` must be a superset of the labels listed in `without`.
For example, if the aggregation config specifies `without: [env, pod]`, then `-remoteWrite.shardByURL.ignoreLabels` must include at least `env` and `pod`.
This ensures that labels removed during aggregation are not used for shard routing.
- Aggregating vmagents should not produce collisions: the aggregation output should be unique across all the sharded agents.
For example, `requests_total:5m_without_env_pod_total` produced by both `vmagent-aggr-1` and `vmagent-aggr-2` will collide
unless they have labels uniquely identifying them. These labels should be either preserved during sharding and aggregation config,