app/vmalert: align group evaluation time with the eval_offset option

Align group evaluation time with the `eval_offset` option to allow users
to manage group execution more effectively by understanding the exact
time each group will be scheduled, particularly in cases of spreading
rule execution within a window, chaining groups, or debugging data delay
issue.

If the group evaluation takes less than the group interval, but the
initial evaluation combined with the additional restore operation
exceeds the group interval, the evaluation time will be gradually
corrected in subsequent evaluations, as the interval ticker schedule
remains unchanged.

For groups without `eval_offset`, this change also ensures that all
evaluations follow the interval. Previously, the gap between the first
and second evaluations was larger than the interval. And the
`eval_delay` continues to help prevent partial responses.

fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10772.
This commit is contained in:
Hui Wang
2026-04-10 14:31:36 +08:00
committed by f41gh7
parent 7653b89442
commit faf4fd240c
3 changed files with 15 additions and 10 deletions

View File

@@ -381,7 +381,9 @@ func (g *Group) Start(ctx context.Context, rw remotewrite.RWClient, rr datasourc
if len(g.Rules) < 1 {
g.metrics.iterationDuration.UpdateDuration(start)
g.mu.Lock()
g.LastEvaluation = start
g.mu.Unlock()
return ts
}
@@ -395,7 +397,9 @@ func (g *Group) Start(ctx context.Context, rw remotewrite.RWClient, rr datasourc
}
}
g.metrics.iterationDuration.UpdateDuration(start)
g.mu.Lock()
g.LastEvaluation = start
g.mu.Unlock()
return ts
}
@@ -405,11 +409,11 @@ func (g *Group) Start(ctx context.Context, rw remotewrite.RWClient, rr datasourc
g.mu.Unlock()
defer g.evalCancel()
realEvalTS := eval(evalCtx, evalTS)
t := time.NewTicker(g.Interval)
defer t.Stop()
realEvalTS := eval(evalCtx, evalTS)
// restore the rules state after the first evaluation
// so only active alerts can be restored.
if rr != nil {

View File

@@ -44,6 +44,7 @@ See also [LTS releases](https://docs.victoriametrics.com/victoriametrics/lts-rel
* FEATURE: [vmui](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/#vmui): CSV export on the `Raw Query` tab now includes all labels from the executed query. VMUI no longer prepends a header row, as it is now provided by the backend. See [#10667](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10667) and [#10666](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10666). Thanks to @lawrence3699 for the contribution.
* FEATURE: [vmsingle](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): add header row to `/api/v1/export/csv` output and auto-detect header rows during import via `/api/v1/import/csv`. See [#10666](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10666). Thanks to @andriibeee for the contribution.
* FEATURE: all VictoriaMetrics components: expose operating system name and release version as metric `vm_os_info`. See [#10481](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10481).
* FEATURE: [vmalert](https://docs.victoriametrics.com/victoriametrics/vmalert/): align group evaluation time with the `eval_offset` option to help manage group execution more effectively. See [#10772](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10772).
* BUGFIX: [vmbackup](https://docs.victoriametrics.com/vmbackup/), [vmbackupmanager](https://docs.victoriametrics.com/victoriametrics/vmbackupmanager/): retry the requests that failed with unexpected EOF due to unstable network to S3 service. See [#10699](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10699).
* BUGFIX: All VictoriaMetrics components: Fix an issue where `unsupported` metric metadata type was exposed for summaries and quantiles if a summary wasn't updated within a certain time window. See [metrics#120](https://github.com/VictoriaMetrics/metrics/issues/120) and [metrics#121](https://github.com/VictoriaMetrics/metrics/pull/121).

View File

@@ -1236,8 +1236,8 @@ For example:
```yaml
groups:
- name: BaseGroup
interval: 1m
eval_offset: 10s
interval: 5m
eval_offset: 1m
rules:
- record: http_server_request_duration_seconds:sum_rate:5m:http_get
expr: |
@@ -1258,8 +1258,8 @@ groups:
)
)
- name: TopGroup
interval: 1m
eval_offset: 40s
interval: 5m
eval_offset: 3m
rules:
- record: http_server_request_duration_seconds:sum_rate:5m:merged
expr: |
@@ -1271,20 +1271,20 @@ groups:
This configuration ensures that rules in `BaseGroup` are executed at(assuming vmalert starts at `12:00:00`):
```
[12:00:10, 12:01:10, 12:02:10, 12:03:10...]
[12:01:00, 12:06:00, 12:11:00, 12:16:00...]
```
while rules in group `TopGroup` are executed at:
```
[12:00:40, 12:01:40, 12:02:40, 12:03:40...]
[12:03:00, 12:08:00, 12:13:00, 12:18:00...]
```
As a result, `TopGroup` always gets the latest results of `BaseGroup`.
As a result, `TopGroup` can consistently obtain the latest results from `BaseGroup` if `BaseGroup` completes its evaluation and uploads its results to the datasource within 2 minutes.
By default, the `eval_offset` values should be at least 30 seconds apart to accommodate the
`-search.latencyOffset(default 30s)` command-line flag at vmselect or VictoriaMetrics single-node.
The minimum `eval_offset` gap can be adjusted accordingly with `-search.latencyOffset`.
The minimum `eval_offset` gap should be adjusted according to the sum of the execution duration of `BaseGroup` and `-search.latencyOffset`.
### Notifier configuration file