Previously inmemoryPart refCount was not properly decremented.
Previous behavior:
* createInmemoryPart called newPartWrapperFromInmemoryPart and returns a partWrapper with refCount=1
* multiple parts are merged in mustMergeInmemoryPartsFinal, which creates a new merged part
* the source partWrappers are never decRef'd
* Since refCount never reaches 0, putInmemoryPart and (*part).MustClose are never called
This commit properly decrements refCount at mustMergeInmemoryPartsFinal.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10086
indexDB mergeset has an edge for single item inmemoryBlock. It stores
such items blocks in-memory at blockheader firstItem. So there is no
need to perform on-disk read operations and storing copy of it at cache.
It also may result in incorrect search results, inmemoryBlock with a
single item has always zero index block offset. Which causes collisions
if it's cached with the next index block at part.
Fixes https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10239
Probably fixes
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/10063
indexDB has 3 block caches. These caches export metrics. Storage
collects these
metrics for each indexDB it has (currently prev and curr only).
There is a potential problem:
- These caches are shared by all indexDBs
- Each indexDB reports the block cache metrics.
- Storage collects the metrics of all indexDBs by adding them together.
I.e. it is possible to count block cache metrics several times.
It is not the case in current implementation because the addition of the
metrics
is not performed intentionally.
The added unit test 1) demonstrates that the resulting counts are
reported
correctly and 2) protects from future unintentional changes in this
behavior.
Additionally a code comment is added to explain why block cache metrics
are not summed up.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
This should catch possible errors related to improper release of Table parts.
Fix such an error at TestTableCreateSnapshotAt by properly closing all the initialized
TableSearch instances.
Thanks to @rtm0 for pointing to this issue.
This allows performing a single MustFsyncPath() for the parent directory after multiple calls to these functions.
This clarifies code paths, which call these functions, and makes them more maintainable.
This also removes a redundant fsync() call for the parent directory when creating a file-based part.
Previously the first fsync() was indirectly called when the directory was created via MustMkdirFailIfExist()
and the second fsync() was called via MustSyncPathAndParentDir() after all the data is written to the part.
The lib/fs.MustCloseParallel() accepts a slice of MustWriter items, which must implement only
a single method - MustWrite(). The previous lib/filestream.MustCloseWritersParallel() was
accepting CloseWriter items, which must implement Write() and Path() methods additionally
to MustClose() method. This was adding artificial restrictions on the applicability
of the MustCloseWritersParallel() method. Remove these restrictions.
- Drop the code needed for asynchronous removal of the directory on NFS shares.
This code was needed when VictoriaMetrics could keep open files after their deletion
or renaming. This is no longer the case after the commit 43b24164ef .
Now files are deleted only after all the readers close them.
This updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/61
- Unify MustRemoveAll() and MustRemoveDirAtomic() into MustRemoveDir() and MustRemovePath()
functions:
- The MustRemoveDir() deletes the given directory with all its contents, in an "atomic" way:
it creates a special `.delete-this-dir` file in the directory, then removes all its contents
except of this file, and later removes the `.delete-this-dir` file together with the directory
itself. This makes possible easily determining whether the given directory needs to be deleted
after unclean shutdown - if it contains the `.delete-this-dir` file or if it is empty, it must be deleted.
Add IsPartiallyRemovedDir() function, which can be used for detecting whether the given directory must be removed
at starup.
Previously the MustRemoveDirAtomic() was using a "trick" for atomic directory removal: it was "atomically" renaming
the directory to a temporary directory with '.must-remove.' marker in the directory name, and after that it
was removing the renamed directory. On startup all the directories with the `.must-remove.` marker were deleted
if they are left after unclean shutdown. This "trick" doesn't work for NFS and object storage such as S3,
since these storage systems do not support atomic renaming of directories with multiple entries inside.
The new MustRemoveDir() function doesn't use this "trick", so it can be safely used in NFS and S3-like storage systems.
This is based on the pull request from @func25 - https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9486/files .
- The MustRemovePath() deletes the given file or an empty directory.
- Delete the existing parts and partitions at startup if they were partially deleted.
- Consistently use fs.MustRemoveDir() and fs.MustRemovePath() instead of os.RemoveAll() across the codebase.
This reduces the amounts of bolierplate code related to error handling.
- Consistently use fs.MustWriteSync() instead of os.WriteFile() across the codebase.
The newly created data part could become missing after unclean shutdown (such as hardware power off),
since the contents of the parent directory wasn't synced to disk before storing the newly created data part in the parts.json file.
Fix this by syncing the parent directory contents before storing the newly created part in the parts.json file.
This commit is based on https://github.com/VictoriaMetrics/VictoriaLogs/pull/507
This commit adds tmp inmemory and data blocks buffers for
index search requests. It allows to reduce memory allocations on block
cache misses. Since block cache puts block into cache only on after
configured number of cache misses.
Related PR https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9324
It appears the `lib/mergeset.Item` violates the 3rd rule
https://pkg.go.dev/unsafe#Pointer
> (3) Conversion of a Pointer to a uintptr and back, with arithmetic.
>
> Unlike in C, it is not valid to advance a pointer just beyond the end
of its original allocation:
> ```
> // INVALID: end points outside allocated space.
> b := make([]byte, n)`
> end = unsafe.Pointer(uintptr(unsafe.Pointer(&b[0])) + uintptr(n))
> ```
`CGO_ENABLED=0 go test -trimpath -gcflags=all=-d=checkptr -run
TestTableAddItemsSerial`
```
fatal error: checkptr: pointer arithmetic result points to invalid allocation
This commit adds a special path to item encoding, that prevents invalid memory references.
See this PR for details:
https://github.com/VictoriaMetrics/VictoriaMetrics/pull/9264
Using this package lets to manipulate time. In this particular case, it
lets to advance the time 61 second forward instantly.
A few side changes were necessary:
- Do not use fasttime in unit tests. The fasttime package starts a
goroutine outside the test bubble which causes the clock to be real, not
fake.
- Stop the time.Ticker explicitly and also stop idbNext. These two
create goroutines with infinite loops which causes the unit tests that
use synctest to hang forever. All goroutines created inside the bubble
must exit in order for the syntest to finish.
- synctest is an experimental package and requires an environment
variable to be set. The Makefile was changed to set it.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
When creating and listing snapshots, panic instead of returning an error
since errors are not recoverable anyway.
Also do not cleanup the filesystem on panic. Leave as is for further
manual inspection.
Signed-off-by: Artem Fetishev <rtm@victoriametrics.com>
### Describe Your Changes
`its'` -> `its`
### Checklist
The following checks are **mandatory**:
- [x] My change adheres [VictoriaMetrics contributing
guidelines](https://docs.victoriametrics.com/contributing/).
### Describe Your Changes
Fix many spelling errors and some grammar, including misspellings in
filenames.
The change also fixes a typo in metric `vm_mmaped_files` to `vm_mmapped_files`.
While this is a breaking change, this metric isn't used in alerts or dashboards.
So it seems to have low impact on users.
The change also deprecates `cspell` as it is much heavier and less usable.
---------
Co-authored-by: Andrii Chubatiuk <achubatiuk@victoriametrics.com>
Co-authored-by: Andrii Chubatiuk <andrew.chubatiuk@gmail.com>
This commit adds lib/chunkedbuffer.Buffer - an in-memory chunked buffer
optimized for random access via MustReadAt() function.
It is better than bytesutil.ByteBuffer for storing large volumes of data,
since it stores the data in chunks of a fixed size (4KiB at the moment)
instead of using a contiguous memory region. This has the following benefits over bytesutil.ByteBuffer:
- reduced memory fragmentation
- reduced memory re-allocations when new data is written to the buffer
- reduced memory usage, since the allocated chunks can be re-used
by other Buffer instances after Buffer.Reset() call
Performance tests show up to 2x memory reduction for VictoriaLogs
when ingesting logs with big number of fields (aka wide events) under high speed.
This allows using different intervals for flushing in-memory data among different mergeset.Table instances.
The initial user of this feature is lib/logstorage.Storage, which explicitly passes Storage.flushInterval
to every created mereset.Table instance. Previously mergeset.Table instances were using 5 seconds
flush interval, which didn't depend on the Storage.flushInterval.
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4775
Hint allows to choose type of cache to be used for index search:
- in-memory parts are storing recently ingested samples and should use
main cache. This improves ingestion speed and cache hit ration for
queries accessing recently ingested samples.
- merges of file parts is performed in background, using a separate
cache allows avoiding pollution of the main cache with irrelevant
entries.
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182
---------
Signed-off-by: f41gh7 <nik@victoriametrics.com>
Related issue:
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/7182
- add a separate index cache for searches which might read through large
amounts of random entries. Primary use-case for this is retention and
downsampling filters, when applying filters background merge needs to
fetch large amount of random entries which pollutes an index cache.
Using different caches allows to reduce effect on memory usage and cache
efficiency of the main cache while still having high cache hit rate. A
separate cache size is 5% of allowed memory.
- reduce size of indexdb/dataBlocks cache in order to free memory for
new sparse cache. Reduced size by 5% and moved this to a separate cache.
- add a separate metricName search which does not cache metric names -
this is needed in order to allow disabling metric name caching when
applying downsampling/retention filters. Applying filters during
background merge accesses random entries, this fills up cache and does
not provide an actual improvement due to random access nature.
Merge performance and memory usage stats before and after the change:
- before

- after

---------
Signed-off-by: Zakhar Bessarab <z.bessarab@victoriametrics.com>
Pending rows and items unconditionally remain in memory for up to pending{Items,Rows}FlushInterval,
so there is no any sense in setting dataFlushInterval (the interval for guaranteed flush of in-memory data to disk)
to values smaller than pending{Items,Rows}FlushInterval, since this doesn't affect the interval
for flushing pending rows and items from memory to disk.
This is a follow-up for 4c80b17027
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/pull/6221
'any' type is supported starting from Go1.18. Let's consistently use it
instead of 'interface{}' type across the code base, since `any` is easier to read than 'interface{}'.
Change the return values for these functions - now they return the unmarshaled result plus
the size of the unmarshaled result in bytes, so the caller could re-slice the src for further unmarshaling.
This improves performance of these functions in hot loops of VictoriaLogs a bit.
This should improve debuggability of unexpected deletion of directories inside partitions.
While at it, log the proper path to parts.json when the directory for big part is missing in the partition.
parts.json is located inside directory with small parts, and there is no parts.json file inside directory with big parts.