Previously (*writeconcurrencylimiter.Reader).Read() could permanently leak concurrency tokens from the -maxConcurrentInserts semaphore.
Consider the following example:
* GetReader() acquires a token, then PutReader() unconditionally releases it.
* Read() calls DecConcurrency() before the underlying I/O and IncConcurrency() after it. If IncConcurrency() returns an error, Read() returns without holding a token.
* Each such failure permanently removes one slot from the concurrencyLimitCh semaphore. Slots leak one by one until the channel is fully drained, at which point DecConcurrency() blocks forever, deadlocking ingestion on vmstorage.
This commit adds tracking for obtained tokens to the reader. Which prevents possible tokens leakage.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10784
Wait for the first byte from the reader passed to ReadUncompressedData()
before obtaining concurrency token from -maxConcurrentInserts and before allocating
buffers needed for reading the request body in memory.
This should limit the amounts of memory needed for processing a big number of concurrent
HTTP requests via Prometheus remote_write protocol and via other HTTP-based data ingestion
protocols where every request contains a single block of data to process.
Now the maximum memory usage is limited by -maxConcurrentInserts, while the server
can process much more than -maxConcurrentInserts concurrent HTTP requests by pausing the excess requests.
Previously the memory usage wasn't limited by -maxConcurrentInserts, since buffers for reading the data
from concurrent connections were allocated before obtaining the concurrency token from -maxConcurrentInserts.
While at it, use protoparserutil.ReadUncompressedData() in lib/protoparser/promremotewrite/stream.Parse()
for the sake of consistency across parsers for protocols, which send the full block of data per every incoming HTTP request.
This is a follow-up for the commit d107dee9c7
Call decConcurrency() inside Reader.Read() before calling the Read() at the underlying reader.
This reduces chances of improper use of the writeconcurrencylimiter.Reader by callers.
While at it, move the creation of writeconcurrencylimiter.GetReader() to the top of stream parser functions
at lib/protoparser/* packages, and call incConcurrency() inside GetReader() call.
This reduces the frequency of decConcurrency() / incConcurrency() calls
for typical buffered reads when parsing the incoming data. This, in turn,
reduces the contention on the concurrencyLimitCh.
### Describe Your Changes
Under heavy load, vmagent's wirte concurrency limiter
(2ab53acce4/lib/writeconcurrencylimiter/concurrencylimiter.go (L111))
queues incoming requests. If a client's timeout is shorter than the wait
time in the
queue, the client may close the connection before vmagent starts
processing it. When vmagent then tries to read the request body, it
encounters an ambiguous `unexpected EOF` error
(https://github.com/VictoriaMetrics/VictoriaMetrics/issues/8675).
This commit adds more context to such errors to help users diagnose and
resolve
the issue when it's related to vmagent's own load and queuing behavior.
Possible user actions include:
- Lowering `-insert.maxQueueDuration` below the client's timeout.
- Increasing the client-side timeout, if applicable.
- Scaling up vmagent (e.g., adding more CPU resources).
- Increasing `-maxConcurrentInserts` if CPU capacity allows.
Steps to reproduce:
https://gist.github.com/makasim/6984e20f57bfd944411f56a7ebe5b6bf
### Checklist
The following checks are **mandatory**:
- [x] My change adheres to [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
The change also removes misleading `default` value from README for `maxConcurrentInserts`
cmd-line flag.
Signed-off-by: hagen1778 <roman@victoriametrics.com>
Previously the -maxConcurrentInserts was limiting the number of established client connections,
which write data to VictoriaMetrics. Some of these connections could be idle.
Such connections do not consume big amounts of CPU and RAM, so there is a little sense in limiting
the number of such connections. So now the -maxConcurrentInserts command-line option
limits the number of concurrently executed insert requests, not including idle connections.
It is recommended removing -maxConcurrentInserts command-line option, since the default value
for this option should work good for most cases.