mirror of https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2026-07-02 15:15:04 +03:00
Compare commits: VMSelectCo ... vmauth-inv (13 Commits)
| Author | SHA1 | Date | |
|---|---|---|---|
| | d33bda1b13 | | |
| | 38c1b65b64 | | |
| | 75aea9f609 | | |
| | 7a4d4bc6d4 | | |
| | af4ae4d458 | | |
| | b96f63b588 | | |
| | 6b980bdb6f | | |
| | 222bfb0f0e | | |
| | 13036b9297 | | |
| | 359c634157 | | |
| | 290029897b | | |
| | 9a150c309c | | |
| | 2a40a40e9e | | |
@@ -1687,10 +1687,6 @@ func assertInstantValues(tss []*timeseries) {

var memoryIntensiveQueries = metrics.NewCounter(`vm_memory_intensive_queries_total`)

var _ = metrics.NewGauge(`vm_max_memory_per_query`, func() float64 {
	return float64(maxMemoryPerQuery.N)
})

func evalRollupFuncWithMetricExpr(qt *querytracer.Tracer, ec *EvalConfig, funcName string, rf rollupFunc,
	expr metricsql.Expr, me *metricsql.MetricExpr, iafc *incrementalAggrFuncContext, windowExpr *metricsql.DurationExpr,
) ([]*timeseries, error) {
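The gauge in the hunk above is registered with a callback, so its value is computed when the metric is read rather than stored at registration time. A minimal stdlib-only sketch of that pattern follows. `callbackGauge`, `render`, and the limit values are hypothetical stand-ins for illustration, not the API of the `metrics` package used in the diff.

```go
package main

import (
	"fmt"
	"strconv"
)

// callbackGauge is a hypothetical stand-in for a gauge whose value
// is computed by a callback each time the metric is rendered.
type callbackGauge struct {
	name string
	get  func() float64
}

// render returns one line in Prometheus text exposition format: "<name> <value>".
func (g callbackGauge) render() string {
	return g.name + " " + strconv.FormatFloat(g.get(), 'f', -1, 64)
}

func main() {
	// limitBytes is an assumed value standing in for a configured per-query memory limit.
	var limitBytes int64 = 1 << 30
	g := callbackGauge{
		name: "vm_max_memory_per_query",
		get:  func() float64 { return float64(limitBytes) },
	}
	fmt.Println(g.render())

	// The callback reads the variable on every render, so a later change is visible.
	limitBytes = 2 << 30
	fmt.Println(g.render())
}
```

Because the callback is evaluated on each read, the exposed value tracks the current limit without any explicit update call.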
@@ -223,16 +223,4 @@ groups:
            Unexpected TSID misses for \"{{ $labels.job }}\" ({{ $labels.instance }}) for the last 15 minutes.
            If this happens after unclean shutdown of VictoriaMetrics process (via \"kill -9\", OOM or power off),
            then this is OK - the alert must go away in a few minutes after the restart.
            Otherwise this may point to the corruption of index data.

      - alert: VMSelectConcurrentQueriesExceedMemoryLimit
        expr: (vm_max_memory_per_query * on(job, instance) vm_concurrent_select_capacity) > on(job, instance) vm_available_memory_bytes
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "vmselect ({{ $labels.instance }}) concurrent query memory may exceed pod limit"
          description: "Current concurrent queries ({{ $value | humanize1024 }} combined max memory) exceed
            the available memory on instance {{ $labels.instance }}.
            This may result in OOM kills. Consider reducing -maxConcurrentRequests,
            lowering -maxMemoryPerQuery, or scaling up pod memory limits."
            Otherwise this may point to the corruption of index data.
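The `VMSelectConcurrentQueriesExceedMemoryLimit` expression multiplies the per-query memory limit by the concurrent query capacity and compares the product with available memory. Because the comparison acts as a filter, the left-hand product is what surfaces as `$value`. The arithmetic can be sketched in Go as below; the function name and the sample figures are hypothetical.

```go
package main

import "fmt"

// exceedsAvailableMemory mirrors the alert expression:
// max memory per query * concurrent capacity > available memory.
func exceedsAvailableMemory(maxMemoryPerQuery, concurrentCapacity, availableMemory float64) bool {
	return maxMemoryPerQuery*concurrentCapacity > availableMemory
}

func main() {
	const gib = 1 << 30

	// Hypothetical instance: 2 GiB per query, 16 concurrent queries, 24 GiB available.
	// Worst case is 32 GiB, which is above 24 GiB.
	fmt.Println(exceedsAvailableMemory(2*gib, 16, 24*gib)) // prints true

	// Lowering the per-query limit to 1 GiB brings the worst case down to 16 GiB.
	fmt.Println(exceedsAvailableMemory(1*gib, 16, 24*gib)) // prints false
}
```

The second call corresponds to one of the remediations named in the alert description: lowering the per-query limit until the worst case fits into available memory.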
@@ -56,3 +56,20 @@ groups:
          summary: "Too many errors served for user {{ $labels.username }} (instance {{ $labels.instance }})"
          description: "Requests from user {{ $labels.username }} are receiving errors.
            Please check the vmauth logs to verify that the configuration is correct and clients are sending valid requests."
      - alert: InvalidAuthTokenRequestErrors
        expr: sum(increase(vmauth_http_request_errors_total{reason="invalid_auth_token"}[5m])) without (instance, reason) > 0
        for: 15m
        labels:
          severity: warning
        annotations:
          dashboard: "{{ $externalURL }}/d/nbuo5Mr4k?viewPanel=16&var-job={{ $labels.job }}"
          summary: "vmauth {{ $labels.job }} is receiving many requests with invalid auth tokens"
          description: |
            vmauth {{ $labels.job }} received {{ $value }} requests with invalid auth tokens in the last 5 minutes.
            This may indicate:
            - credentials have been updated on vmauth but not on clients
            - client misconfiguration or use of an expired token
            - a brute-force attack.

            Check vmauth metrics for longevity and scale of the issue.
            Check access log for detailed information: https://docs.victoriametrics.com/victoriametrics/vmauth/#access-log
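The `InvalidAuthTokenRequestErrors` expression sums the per-series increase while dropping the `instance` and `reason` labels, so all instances of one vmauth job collapse into a single series before the `> 0` check. A rough Go sketch of that `sum(...) without (...)` grouping follows; the `sample` type, `sumWithout`, and the input values are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sample is one series value with its labels (hypothetical input shape).
type sample struct {
	labels map[string]string
	value  float64
}

// sumWithout groups samples by all labels except the dropped ones
// and sums the values within each group.
func sumWithout(samples []sample, drop ...string) map[string]float64 {
	dropped := make(map[string]bool, len(drop))
	for _, d := range drop {
		dropped[d] = true
	}
	out := make(map[string]float64)
	for _, s := range samples {
		keys := make([]string, 0, len(s.labels))
		for k := range s.labels {
			if !dropped[k] {
				keys = append(keys, k)
			}
		}
		sort.Strings(keys) // stable group key regardless of map iteration order
		parts := make([]string, 0, len(keys))
		for _, k := range keys {
			parts = append(parts, k+"="+s.labels[k])
		}
		out[strings.Join(parts, ",")] += s.value
	}
	return out
}

func main() {
	// Assumed 5-minute increases reported by two instances of the same job.
	samples := []sample{
		{map[string]string{"job": "vmauth", "instance": "a:8427", "reason": "invalid_auth_token"}, 3},
		{map[string]string{"job": "vmauth", "instance": "b:8427", "reason": "invalid_auth_token"}, 2},
	}
	for group, total := range sumWithout(samples, "instance", "reason") {
		fmt.Printf("{%s} %v matches=%v\n", group, total, total > 0)
	}
}
```

Only `job` survives the aggregation here, which is why the rule's annotations reference `{{ $labels.job }}` and not `{{ $labels.instance }}`.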
@@ -28,12 +28,11 @@ See also [LTS releases](https://docs.victoriametrics.com/victoriametrics/lts-rel

* SECURITY: upgrade base docker image (Alpine) from 3.23.4 to 3.24.1. See [Alpine 3.24.1 release notes](https://www.alpinelinux.org/posts/Alpine-3.24.1-released.html).

* FEATURE: `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): expose `vm_max_memory_per_query` metric reflecting the `-search.maxMemoryPerQuery` limit. Create `VMSelectConcurrentQueriesExceedMemoryLimit` alert to warn when OOMs are possible due to misconfiguration of `-search.maxMemoryPerQuery` and max concurrent queries.

* FEATURE: [vmauth](https://docs.victoriametrics.com/victoriametrics/vmauth/): add `default_vm_access_claim` field into `jwt` section of auth config. It could be used at [JWT claim placeholders](https://docs.victoriametrics.com/victoriametrics/vmauth/#jwt-claim-based-request-templating), if `JWT` token doesn't have `vm_access` claim. See [#11054](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/11054).
* FEATURE: [vmagent](https://docs.victoriametrics.com/victoriametrics/vmagent/): reduces CPU usage by 10% at [sharding among remote storages](https://docs.victoriametrics.com/victoriametrics/vmagent/#sharding-among-remote-storages). See [#11113](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/11113). Thanks to @bennf for contribution.
* FEATURE: [vmsingle](https://docs.victoriametrics.com/victoriametrics/single-server-victoriametrics/) and `vmselect` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): add `optimize_repeated_binary_op_subexprs=1` query arg to [/api/v1/query_range](https://docs.victoriametrics.com/victoriametrics/keyconcepts/#range-query) for executing binary operator sides sequentially when they share the same optimized aggregate rollup result expression. This allows the second side to reuse rollup result cache populated by the first side. See [#10575](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10575).
* FEATURE: [vmauth](https://docs.victoriametrics.com/victoriametrics/vmauth/): prevent possible password brute-force attacks with an artificial 2-3 second delay as recommended by [OWASP](https://owasp.org/Top10/2025/A07_2025-Authentication_Failures). See [#11180](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/11180).
* FEATURE: [alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/rules): add `InvalidAuthTokenRequestErrors` alerting rule to [vmauth alerts](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/rules/alerts-vmauth.yml). The new rule notifies when vmauth receives requests with invalid or missing auth tokens, which may indicate a client misconfiguration, expired token use, or brute-force attack. See [#11180](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/11180).

* BUGFIX: all VictoriaMetrics components: cancel in-flight HTTP requests shortly before `-http.maxGracefulShutdownDuration` elapses during graceful shutdown, so they can drain and the shutdown completes cleanly within that window instead of timing out and exiting via `logger.Fatalf` -> `os.Exit`. This prevents skipping the storage flush and losing in-memory data when long-lived requests are in flight (such as VictoriaLogs live tailing). See [#1502](https://github.com/VictoriaMetrics/VictoriaLogs/issues/1502).
* BUGFIX: `vminsert` in [VictoriaMetrics cluster](https://docs.victoriametrics.com/victoriametrics/cluster-victoriametrics/): fixes unexpected rare rerouting. See [#11162](https://github.com/VictoriaMetrics/VictoriaMetrics/pull/11162).
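The changelog entry for `optimize_repeated_binary_op_subexprs=1` describes a query arg on `/api/v1/query_range`. A minimal Go sketch of building such a request URL is shown below. The base address (a single-node instance on `localhost:8428`), the query, and the time range are assumptions chosen only for illustration, and the source does not confirm whether this particular query qualifies for the optimization.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildQueryRangeURL returns a /api/v1/query_range URL with the
// optimize_repeated_binary_op_subexprs=1 query arg set.
func buildQueryRangeURL(base, query, start, end, step string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/api/v1/query_range"

	q := url.Values{}
	q.Set("query", query)
	q.Set("start", start)
	q.Set("end", end)
	q.Set("step", step)
	q.Set("optimize_repeated_binary_op_subexprs", "1")
	u.RawQuery = q.Encode() // Encode sorts keys and escapes values

	return u.String(), nil
}

func main() {
	// Hypothetical query whose two sides are built on the same rollup.
	query := `sum(rate(http_requests_total[5m])) / count(rate(http_requests_total[5m]))`
	s, err := buildQueryRangeURL("http://localhost:8428", query, "1700000000", "1700003600", "60s")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(s)
}
```

Building the query string with `url.Values` keeps the MetricsQL expression correctly escaped, which matters for characters such as `[`, `(`, and `/`.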