mirror of
https://github.com/VictoriaMetrics/VictoriaMetrics.git
synced 2026-05-17 08:36:55 +03:00
app/vmagent: prevent dropping persistent queue if -remoteWrite.showURL changed
Previously, if the value of the `-remoteWrite.showURL` command-line flag changed, vmagent dropped the contents of its persistent queues. This is unexpected behavior and may lead to loss of the queued data.

Furthermore, if `-remoteWrite.showURL` is set to `true`, any change to the URL query arguments leads to a persistent queue drop. This most commonly affects the Kafka and GCP Pub/Sub integrations, which use URL query arguments for client configuration.

It also complicates copying the contents of a persistent queue between vmagents, since that requires properly changing the name inside metainfo.json.

This commit removes the persistent queue name equality check from `lib/persistentqueue`. The check was added as additional protection against on-disk data corruption. It is safe to skip it for vmagent, because vmagent encodes remoteWrite.url as part of the path to the queue, which guarantees that there will be no collision.

Related issue: https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8477

### Checklist

The following checks are **mandatory**:

- [x] My change adheres to the [VictoriaMetrics contributing guidelines](https://docs.victoriametrics.com/contributing/).

---------

Signed-off-by: f41gh7 <nik@victoriametrics.com>
Co-authored-by: f41gh7 <nik@victoriametrics.com>
@@ -89,9 +89,9 @@ var (
 
 	disableOnDiskQueue = flagutil.NewArrayBool("remoteWrite.disableOnDiskQueue", "Whether to disable storing pending data to -remoteWrite.tmpDataPath "+
 		"when the remote storage system at the corresponding -remoteWrite.url cannot keep up with the data ingestion rate. "+
-		"See https://docs.victoriametrics.com/vmagent#disabling-on-disk-persistence . See also -remoteWrite.dropSamplesOnOverload")
+		"See https://docs.victoriametrics.com/vmagent#on-disk-persistence-and-how-to-disable-it . See also -remoteWrite.dropSamplesOnOverload")
 	dropSamplesOnOverload = flag.Bool("remoteWrite.dropSamplesOnOverload", false, "Whether to drop samples when -remoteWrite.disableOnDiskQueue is set and if the samples "+
-		"cannot be pushed into the configured -remoteWrite.url systems in a timely manner. See https://docs.victoriametrics.com/vmagent#disabling-on-disk-persistence")
+		"cannot be pushed into the configured -remoteWrite.url systems in a timely manner. See https://docs.victoriametrics.com/vmagent#on-disk-persistence-and-how-to-disable-it")
 )
 
 var (
@@ -28,6 +28,7 @@ See also [LTS releases](https://docs.victoriametrics.com/lts-releases/).
 * BUGFIX: [Single-node VictoriaMetrics](https://docs.victoriametrics.com/) and [vmstorage](https://docs.victoriametrics.com/victoriametrics/): fix metric that shows number of active time series when per-day index is disabled. Previously, once per-day index was disabled, the active time series metric would stop being populated and the `Active time series` chart would show 0. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8411) for details.
 * BUGFIX: [MetricsQL](https://docs.victoriametrics.com/metricsql/): prevent from `too big duration` panic when the query contains too big `Ni` durations because of too big `step` value. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8447).
 * BUGFIX: [vmalert](https://docs.victoriametrics.com/vmalert/), [vmctl](https://docs.victoriametrics.com/vmctl/), [vmbackup](https://docs.victoriametrics.com/vmbackup/), [vmrestore](https://docs.victoriametrics.com/vmrestore/), [vmbackupmanager](https://docs.victoriametrics.com/vmbackupmanager/): properly apply TLS settings for URLs with scheme other than `https`. Previously, TLS settings were ignored for such URLs. That could lead to unexpected behavior when a request was receiving a redirect response to a URL with `https` scheme. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8494) for details.
+* BUGFIX: [vmagent](https://docs.victoriametrics.com/vmagent/): prevent dropping persistent queue data when changes happened for `-remoteWrite.showURL` flag, query params or fragment in remote write URL. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8477).
 
 ## [v1.113.0](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/tag/v1.113.0)
 
@@ -1030,7 +1030,7 @@ scrape_configs:
 - "Proxy-Auth: top-secret"
 ```
 
-## Disabling on-disk persistence
+## On-disk persistence and how to disable it
 
 By default `vmagent` stores pending data, which cannot be sent to the configured remote storage systems in a timely manner, in the folder set
 by `-remoteWrite.tmpDataPath` command-line flag. By default `vmagent` writes all the pending data to this folder until this data is sent to the configured
@@ -1038,6 +1038,32 @@ by `-remoteWrite.tmpDataPath` command-line flag. By default `vmagent` writes all
 per every configured `-remoteWrite.url`, can be limited via `-remoteWrite.maxDiskUsagePerURL` command-line flag.
 When this limit is reached, `vmagent` drops the oldest data from disk in order to save newly ingested data.
 
+The folder structure of persistence data is as follows. Each remote write URL corresponds to a folder similar to `1_B9EB7BE220B91E9D`.
+
+```
+<remoteWrite.tmpDataPath>
+└── persistent-queue
+    └── 1_B9EB7BE220B91E9D
+```
+
+It's generated based on the following information:
+1. The **sequence order** of this remote write URL in the remote write configuration, starting from **1**.
+2. The **hash result** of the remote write URL itself, excluding query parameters and fragments.
+
+For the following remote write configs:
+```
+-remoteWrite.url=http://example-1:8428/prometheus/api/v1/write?foo=bar#baz
+-remoteWrite.url=http://user:pass@example-2:8428/prometheus/api/v1/write?qux=quux#quuz
+```
+
+vmagent will generate persistent queue folders:
+```bash
+# 1_<hash(http://example-1:8428/prometheus/api/v1/write)>, query parameters foo=bar and fragment baz are removed.
+1_BA6E4303DCFA0D45
+# 2_<hash(http://user:pass@example-2:8428/prometheus/api/v1/write)>, query parameters qux=quux and fragment quuz are removed.
+2_0AAFDF53E314A72A
+```
+
 There are cases when it is better disabling on-disk persistence for pending data at `vmagent` side:
 
 - When the persistent disk performance isn't enough for the given data processing rate.
@@ -2147,11 +2173,11 @@ See the docs at https://docs.victoriametrics.com/vmagent/ .
      Supports an array of values separated by comma or specified via multiple flags.
      Value can contain comma inside single-quoted or double-quoted string, {}, [] and () braces.
   -remoteWrite.disableOnDiskQueue array
-     Whether to disable storing pending data to -remoteWrite.tmpDataPath when the remote storage system at the corresponding -remoteWrite.url cannot keep up with the data ingestion rate. See https://docs.victoriametrics.com/vmagent#disabling-on-disk-persistence . See also -remoteWrite.dropSamplesOnOverload
+     Whether to disable storing pending data to -remoteWrite.tmpDataPath when the remote storage system at the corresponding -remoteWrite.url cannot keep up with the data ingestion rate. See https://docs.victoriametrics.com/vmagent#on-disk-persistence-and-how-to-disable-it . See also -remoteWrite.dropSamplesOnOverload
      Supports array of values separated by comma or specified via multiple flags.
      Empty values are set to false.
   -remoteWrite.dropSamplesOnOverload
-     Whether to drop samples when -remoteWrite.disableOnDiskQueue is set and if the samples cannot be pushed into the configured -remoteWrite.url systems in a timely manner. See https://docs.victoriametrics.com/vmagent#disabling-on-disk-persistence
+     Whether to drop samples when -remoteWrite.disableOnDiskQueue is set and if the samples cannot be pushed into the configured -remoteWrite.url systems in a timely manner. See https://docs.victoriametrics.com/vmagent#on-disk-persistence-and-how-to-disable-it
   -remoteWrite.flushInterval duration
      Interval for flushing the data to remote storage. This option takes effect only when less than 10K data points per second are pushed to -remoteWrite.url (default 1s)
   -remoteWrite.forcePromProto array
@@ -201,9 +201,6 @@ func tryOpeningQueue(path, name string, chunkFileSize, maxBlockSize, maxPendingB
 		filepath := q.chunkFilePath(0)
 		fs.MustWriteAtomic(filepath, nil, false)
 	}
-	if mi.Name != q.name {
-		return nil, fmt.Errorf("unexpected queue name; got %q; want %q", mi.Name, q.name)
-	}
 
 	// Locate reader and writer chunks in the path.
 	des := fs.MustReadDir(path)
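
The folder naming rule added to the vmagent docs in this commit (sequence number starting from 1, plus a hash of the remote write URL without query parameters and fragment) can be illustrated with a small Go sketch. This is not the vmagent implementation: the helper name `queueDirName` is hypothetical, and FNV-1a is used only as a stand-in because the docs do not name the hash function, so the folder names it produces will not match the ones vmagent generates. It only shows why changing `?foo=bar` or `#baz` leaves the folder name untouched.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"net/url"
)

// queueDirName is a hypothetical helper that builds "<seq>_<HASH>" for the
// remote write URL at zero-based position idx. Query parameters and the
// fragment are dropped before hashing, as described in the docs change.
// ASSUMPTION: FNV-1a stands in for whatever hash vmagent really uses.
func queueDirName(idx int, rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", fmt.Errorf("cannot parse %q: %w", rawURL, err)
	}
	// Remove the parts that must not influence the folder name.
	u.RawQuery = ""
	u.ForceQuery = false
	u.Fragment = ""
	u.RawFragment = ""

	h := fnv.New64a()
	h.Write([]byte(u.String()))
	// The sequence number in the folder name starts from 1.
	return fmt.Sprintf("%d_%016X", idx+1, h.Sum64()), nil
}

func main() {
	urls := []string{
		"http://example-1:8428/prometheus/api/v1/write?foo=bar#baz",
		"http://user:pass@example-2:8428/prometheus/api/v1/write?qux=quux#quuz",
	}
	for i, u := range urls {
		name, err := queueDirName(i, u)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(name)
	}
}
```

Because the name depends only on the position and the stripped URL, reordering the `-remoteWrite.url` flags or changing the host or path would still map to a different folder under this scheme, while edits to query arguments would not.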