app/vmstorage: add -bigMergeConcurrency and -smallMergeConcurrency flags for tuning the maximum number of CPU cores used during merges

lib/storage: small cleanup in Storage.add
README.md: update information about vm_rows{type="indexdb"} metric
2026-06-08 11:23:53 +03:00 · 2019-10-31 16:19:13 +02:00 · 2019-10-31 14:30:34 +02:00 · 2019-10-31 13:30:29 +02:00 · 2019-10-31 13:24:59 +02:00 · 2019-10-30 02:04:56 +02:00
299 changed files with 9725 additions and 2549 deletions
--- a/.github/workflows/main.yml
+++ b/.github/workflows/main.yml
@@ -1,6 +1,7 @@
 name: main
 on:
  - push
+  - pull_request
 jobs:
  build:
    name: Build
@@ -9,7 +10,7 @@ jobs:
      - name: Setup Go
        uses: actions/setup-go@v1
        with:
-          go-version: 1.12
+          go-version: 1.13
        id: go
      - name: Code checkout
        uses: actions/checkout@v1
@@ -28,6 +29,7 @@ jobs:
            git diff --exit-code
            make test-full
            make test-pure
+            make test-full-386
            make victoria-metrics
            make victoria-metrics-pure
            make victoria-metrics-arm
--- a/Makefile
+++ b/Makefile
@@ -61,6 +61,9 @@ test-pure:
 test-full:
 	GO111MODULE=on go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...

+test-full-386:
+	GO111MODULE=on GOARCH=386 go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
+
 benchmark:
 	GO111MODULE=on go test -mod=vendor -bench=. ./lib/...
 	GO111MODULE=on go test -mod=vendor -bench=. ./app/...
@@ -89,7 +92,7 @@ install-qtc:


 golangci-lint: install-golangci-lint
-	golangci-lint run --exclude '(SA4003|SA1019):' -D errcheck
+	golangci-lint run --exclude '(SA4003|SA1019):' -D errcheck -D structcheck

 install-golangci-lint:
 	which golangci-lint || GO111MODULE=off go get -u github.com/golangci/golangci-lint/cmd/golangci-lint
--- a/README.md
+++ b/README.md
@@ -89,6 +89,7 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
  - [Troubleshooting](#troubleshooting)
  - [Backfilling](#backfilling)
  - [Profiling](#profiling)
+- [Integrations](#integrations)
 - [Roadmap](#roadmap)
 - [Contacts](#contacts)
 - [Community and contributions](#community-and-contributions)
@@ -107,8 +108,8 @@ or [docker image](https://hub.docker.com/r/victoriametrics/victoria-metrics/) wi

 The following command-line flags are used the most:

-* `-storageDataPath` - path to data directory. VictoriaMetrics stores all the data in this directory.
-* `-retentionPeriod` - retention period in months for the data. Older data is automatically deleted.
+* `-storageDataPath` - path to data directory. VictoriaMetrics stores all the data in this directory. The default path is `victoria-metrics-data` in the current working directory.
+* `-retentionPeriod` - retention period in months for the data. Older data is automatically deleted. The default period is 1 month.
 * `-httpListenAddr` - TCP address to listen to for http requests. By default, it listens port `8428` on all the network interfaces.
 * `-graphiteListenAddr` - TCP and UDP address to listen to for Graphite data. By default, it is disabled.
 * `-opentsdbListenAddr` - TCP and UDP address to listen to for OpenTSDB data over telnet protocol. By default, it is disabled.
@@ -156,7 +157,7 @@ The label name may be arbitrary - `datacenter` is just an example. The label val
 across Prometheus instances, so those time series may be filtered and grouped by this label.


-It is recommended upgrading Prometheus to [v2.10.0](https://github.com/prometheus/prometheus/releases) or newer,
+It is recommended to upgrade Prometheus to [v2.12.0](https://github.com/prometheus/prometheus/releases) or newer,
 since the previous versions may have issues with `remote_write`.


@@ -171,7 +172,7 @@ http://<victoriametrics-addr>:8428
 Substitute `<victoriametrics-addr>` with the hostname or IP address of VictoriaMetrics.

 Then build graphs with the created datasource using [Prometheus query language](https://prometheus.io/docs/prometheus/latest/querying/basics/).
-VictoriaMetrics supports native PromQL and [extends it with useful features](ExtendedPromQL).
+VictoriaMetrics supports native PromQL and [extends it with useful features](https://github.com/VictoriaMetrics/VictoriaMetrics/wiki/ExtendedPromQL).


 ### How to upgrade VictoriaMetrics?
@@ -253,8 +254,8 @@ curl -G 'http://localhost:8428/api/v1/export' -d 'match={__name__!=""}'
 The `/api/v1/export` endpoint should return the following response:

 ```
-{"metric":{"__name__":"measurement.field1","tag1":"value1","tag2":"value2"},"values":[123],"timestamps":[1560272508147]}
-{"metric":{"__name__":"measurement.field2","tag1":"value1","tag2":"value2"},"values":[1.23],"timestamps":[1560272508147]}
+{"metric":{"__name__":"measurement_field1","tag1":"value1","tag2":"value2"},"values":[123],"timestamps":[1560272508147]}
+{"metric":{"__name__":"measurement_field2","tag1":"value1","tag2":"value2"},"values":[1.23],"timestamps":[1560272508147]}
 ```

 Note that Influx line protocol expects [timestamps in *nanoseconds* by default](https://docs.influxdata.com/influxdb/v1.7/write_protocols/line_protocol_tutorial/#timestamp),
@@ -511,7 +512,7 @@ at `http://<victoriametrics-addr>:8428/federate?match[]=<timeseries_selector_for

 Optional `start` and `end` args may be added to the request in order to scrape the last point for each selected time series on the `[start ... end]` interval.
 `start` and `end` may contain either unix timestamp in seconds or [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) values. By default, the last point
-on the interval `[now - max_lookback ... now]` is scraped for each time series. The default value for `max_lookback` is `5m` (5 minutes), but can be overridden.
+on the interval `[now - max_lookback ... now]` is scraped for each time series. The default value for `max_lookback` is `5m` (5 minutes), but it can be overridden.
 For instance, `/federate?match[]=up&max_lookback=1h` would return last points on the `[now - 1h ... now]` interval. This may be useful for time series federation
 with scrape intervals exceeding `5m`.

@@ -527,7 +528,7 @@ A rough estimation of the required resources for ingestion path:
  VictoriaMetrics stores various caches in RAM. Memory size for these caches may be limited by `-memory.allowedPercent` flag.

 * CPU cores: a CPU core per 300K inserted data points per second. So, ~4 CPU cores are required for processing
-  the insert stream of 1M data points per second. The ingestion rate may be lower for high cardinality data.
+  the insert stream of 1M data points per second. The ingestion rate may be lower for high cardinality data or for time series with a high number of labels.
  See [this article](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) for details.
  If you see lower numbers per CPU core, then it is likely active time series info doesn't fit caches,
  so you need more RAM for lowering CPU usage.
@@ -652,6 +653,14 @@ For example, substitute `-graphiteListenAddr=:2003` with `-graphiteListenAddr=<i
 * There is no need in Operating System tuning since VictoriaMetrics is optimized for default OS settings.
  The only option is increasing the limit on [the number of open files in the OS](https://medium.com/@muhammadtriwibowo/set-permanently-ulimit-n-open-files-in-ubuntu-4d61064429a),
  so Prometheus instances could establish more connections to VictoriaMetrics.
+* The recommended filesystem is `ext4`. The recommended persistent storage is [persistent HDD-based disk on GCP](https://cloud.google.com/compute/docs/disks/#pdspecs),
+  since it is protected from hardware failures via internal replication and it can be [resized on the fly](https://cloud.google.com/compute/docs/disks/add-persistent-disk#resize_pd).
+  If you plan to store more than 1TB of data on an `ext4` partition or plan to extend it beyond 16TB,
+  then pass the following options to `mkfs.ext4`:
+
+```
+mkfs.ext4 ... -O 64bit,huge_file,extent -T huge
+```


 ### Monitoring
@@ -664,10 +673,7 @@ The most interesting metrics are:

 * `vm_cache_entries{type="storage/hour_metric_ids"}` - the number of time series with new data points during the last hour
  aka active time series.
-* `vm_rows{type="indexdb"}` - the number of rows in inverted index. Each label in each unique time series adds a single
-  row into the inverted index. An approximate number of time series in the database may be calculated as
-  `vm_rows{type="indexdb"} / (avg_labels_per_series + 1)`, where `avg_labels_per_series` is the average number of labels
-  per each time series.
+* `vm_rows{type="indexdb"}` - the number of rows in the inverted index. A high value usually means a high churn rate for time series.
 * Sum of `vm_rows{type="storage/big"}` and `vm_rows{type="storage/small"}` - total number of `(timestamp, value)` data points
  in the database.
 * Sum of all the `vm_cache_size_bytes` metrics - the total size of all the caches in the database.
@@ -678,6 +684,9 @@ The most interesting metrics are:

 ### Troubleshooting

+* It is recommended to use default command-line flag values (i.e. don't set them explicitly) until the need
+  to tweak these flag values arises.
+
 * If VictoriaMetrics works slowly and eats more than a CPU core per 100K ingested data points per second,
  then it is likely you have too many active time series for the current amount of RAM.
  It is recommended increasing the amount of RAM on the node with VictoriaMetrics in order to improve
@@ -723,6 +732,14 @@ The command for collecting CPU profile waits for 30 seconds before returning.
 The collected profiles may be analyzed with [go tool pprof](https://github.com/google/pprof).


+## Integrations
+
+* [netdata](https://github.com/netdata/netdata) can push data into VictoriaMetrics via `Prometheus remote_write API`.
+  See [these docs](https://github.com/netdata/netdata#integrations).
+* [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi) can use VictoriaMetrics as a time series backend.
+  See [this example](/blob/master/cmd/carbonapi/carbonapi.example.prometheus.yaml).
+
+
 ## Roadmap

 - [ ] Replication [#118](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/118)
@@ -745,8 +762,8 @@ Contact us with any questions regarding VictoriaMetrics at [info@victoriametrics
 Feel free asking any questions regarding VictoriaMetrics:

 - [slack](http://slack.victoriametrics.com/)
- [telergam-en](https://t.me/VictoriaMetrics_en)
- [telergam-ru](https://t.me/VictoriaMetrics_ru1)
+- [telegram-en](https://t.me/VictoriaMetrics_en)
+- [telegram-ru](https://t.me/VictoriaMetrics_ru1)
 - [google groups](https://groups.google.com/forum/#!forum/victorametrics-users)


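The README hunk above mentions that `/federate` accepts an optional `max_lookback` argument, e.g. `/federate?match[]=up&max_lookback=1h`. A minimal sketch of building such a URL from Go follows. The helper name `federateURL` and the `localhost:8428` address are assumptions for illustration only. `url.Values.Encode` percent-encodes the brackets in `match[]`, which servers decode back to the same key.

```go
package main

import (
	"fmt"
	"net/url"
)

// federateURL is a hypothetical helper that builds a /federate request URL.
// base is the VictoriaMetrics address, selector is a time series selector,
// and maxLookback is optional (an empty string keeps the server default).
func federateURL(base, selector, maxLookback string) string {
	q := url.Values{}
	q.Add("match[]", selector)
	if maxLookback != "" {
		q.Set("max_lookback", maxLookback)
	}
	// Encode sorts keys and percent-encodes reserved characters.
	return base + "/federate?" + q.Encode()
}

func main() {
	// Assumed local address; substitute the real VictoriaMetrics host.
	fmt.Println(federateURL("http://localhost:8428", "up", "1h"))
}
```

Leaving `maxLookback` empty omits the argument, so the server-side default lookback applies.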
--- a/app/victoria-metrics/Makefile
+++ b/app/victoria-metrics/Makefile
@@ -32,6 +32,12 @@ victoria-metrics-arm64:
 victoria-metrics-arm64-prod:
 	APP_NAME=victoria-metrics APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker

+victoria-metrics-386:
+	CGO_ENABLED=0 GOOS=linux GOARCH=386 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-386 ./app/victoria-metrics
+
+victoria-metrics-386-prod:
+	APP_NAME=victoria-metrics APP_SUFFIX='-386' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=386' $(MAKE) app-via-docker
+
 victoria-metrics-pure:
 	APP_NAME=victoria-metrics $(MAKE) app-local-pure

--- a/app/victoria-metrics/main_test.go
+++ b/app/victoria-metrics/main_test.go
@@ -7,6 +7,7 @@ import (
 	"encoding/json"
 	"flag"
 	"fmt"
+	"io"
 	"io/ioutil"
 	"log"
 	"net"
@@ -18,6 +19,7 @@ import (
 	"testing"
 	"time"

+	testutil "github.com/VictoriaMetrics/VictoriaMetrics/app/victoria-metrics/test"
 	"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert"
 	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect"
 	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
@@ -27,18 +29,21 @@ import (
 )

 const (
-	testFixturesDir        = "testdata"
-	testStorageSuffix      = "vm-test-storage"
-	testHTTPListenAddr     = ":7654"
-	testStatsDListenAddr   = ":2003"
-	testOpenTSDBListenAddr = ":4242"
-	testLogLevel           = "INFO"
+	testFixturesDir            = "testdata"
+	testStorageSuffix          = "vm-test-storage"
+	testHTTPListenAddr         = ":7654"
+	testStatsDListenAddr       = ":2003"
+	testOpenTSDBListenAddr     = ":4242"
+	testOpenTSDBHTTPListenAddr = ":4243"
+	testLogLevel               = "INFO"
 )

 const (
-	testReadHTTPPath   = "http://127.0.0.1" + testHTTPListenAddr
-	testWriteHTTPPath  = "http://127.0.0.1" + testHTTPListenAddr + "/write"
-	testHealthHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/health"
+	testReadHTTPPath          = "http://127.0.0.1" + testHTTPListenAddr
+	testWriteHTTPPath         = "http://127.0.0.1" + testHTTPListenAddr + "/write"
+	testOpenTSDBWriteHTTPPath = "http://127.0.0.1" + testOpenTSDBHTTPListenAddr + "/api/put"
+	testPromWriteHTTPPath     = "http://127.0.0.1" + testHTTPListenAddr + "/api/v1/write"
+	testHealthHTTPPath        = "http://127.0.0.1" + testHTTPListenAddr + "/health"
 )

 const (
@@ -51,18 +56,69 @@ var (
 )

 type test struct {
-	Name   string `json:"name"`
-	Data   string `json:"data"`
-	Query  string `json:"query"`
-	Result []Row  `json:"result"`
+	Name             string     `json:"name"`
+	Data             []string   `json:"data"`
+	Query            []string   `json:"query"`
+	ResultMetrics    []Metric   `json:"result_metrics"`
+	ResultSeries     Series     `json:"result_series"`
+	ResultQuery      Query      `json:"result_query"`
+	ResultQueryRange QueryRange `json:"result_query_range"`
+	Issue            string     `json:"issue"`
 }

-type Row struct {
+type Metric struct {
 	Metric     map[string]string `json:"metric"`
 	Values     []float64         `json:"values"`
 	Timestamps []int64           `json:"timestamps"`
 }

+func (r *Metric) UnmarshalJSON(b []byte) error {
+	type plain Metric
+	return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(r))
+}
+
+type Series struct {
+	Status string              `json:"status"`
+	Data   []map[string]string `json:"data"`
+}
+type Query struct {
+	Status string    `json:"status"`
+	Data   QueryData `json:"data"`
+}
+type QueryData struct {
+	ResultType string            `json:"resultType"`
+	Result     []QueryDataResult `json:"result"`
+}
+
+type QueryDataResult struct {
+	Metric map[string]string `json:"metric"`
+	Value  []interface{}     `json:"value"`
+}
+
+func (r *QueryDataResult) UnmarshalJSON(b []byte) error {
+	type plain QueryDataResult
+	return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(r))
+}
+
+type QueryRange struct {
+	Status string         `json:"status"`
+	Data   QueryRangeData `json:"data"`
+}
+type QueryRangeData struct {
+	ResultType string                 `json:"resultType"`
+	Result     []QueryRangeDataResult `json:"result"`
+}
+
+type QueryRangeDataResult struct {
+	Metric map[string]string `json:"metric"`
+	Values [][]interface{}   `json:"values"`
+}
+
+func (r *QueryRangeDataResult) UnmarshalJSON(b []byte) error {
+	type plain QueryRangeDataResult
+	return json.Unmarshal(testutil.PopulateTimeTpl(b, insertionTime), (*plain)(r))
+}
+
 func TestMain(m *testing.M) {
 	setUp()
 	code := m.Run()
@@ -102,6 +158,7 @@ func processFlags() {
 		{flag: "graphiteListenAddr", value: testStatsDListenAddr},
 		{flag: "opentsdbListenAddr", value: testOpenTSDBListenAddr},
 		{flag: "loggerLevel", value: testLogLevel},
+		{flag: "opentsdbHTTPListenAddr", value: testOpenTSDBHTTPListenAddr},
 	} {
 		// panics if flag doesn't exist
 		if err := flag.Lookup(fv.flag).Value.Set(fv.value); err != nil {
@@ -123,7 +180,7 @@ func waitFor(timeout time.Duration, f func() bool) error {

 func tearDown() {
 	if err := httpserver.Stop(*httpListenAddr); err != nil {
-		log.Fatalf("cannot stop the webservice: %s", err)
+		log.Printf("cannot stop the webservice: %s", err)
 	}
 	vminsert.Stop()
 	vmstorage.Stop()
@@ -136,54 +193,112 @@ func TestWriteRead(t *testing.T) {
 	t.Run("write", testWrite)
 	time.Sleep(1 * time.Second)
 	vmstorage.Stop()
-
 	// open storage after stop in write
 	vmstorage.InitWithoutMetrics()
 	t.Run("read", testRead)
 }

 func testWrite(t *testing.T) {
+	t.Run("prometheus", func(t *testing.T) {
+		for _, test := range readIn("prometheus", t, insertionTime) {
+			s := newSuite(t)
+			r := testutil.WriteRequest{}
+			s.noError(json.Unmarshal([]byte(strings.Join(test.Data, "\n")), &r.Timeseries))
+			s.greaterThan(len(r.Timeseries), 0)
+			data, err := testutil.Compress(r)
+			if err != nil {
+				// Stop immediately: there is nothing to send without a compressed payload.
+				t.Fatalf("error compressing %v: %s", r, err)
+			}
+			httpWrite(t, testPromWriteHTTPPath, bytes.NewBuffer(data))
+		}
+	})
+
 	t.Run("influxdb", func(t *testing.T) {
-		for _, test := range readIn("influxdb", t, fmt.Sprintf("%d", insertionTime.UnixNano())) {
+		for _, x := range readIn("influxdb", t, insertionTime) {
+			test := x
 			t.Run(test.Name, func(t *testing.T) {
 				t.Parallel()
-				httpWrite(t, testWriteHTTPPath, test.Data)
+				httpWrite(t, testWriteHTTPPath, bytes.NewBufferString(strings.Join(test.Data, "\n")))
 			})
 		}
 	})
 	t.Run("graphite", func(t *testing.T) {
-		for _, test := range readIn("graphite", t, fmt.Sprintf("%d", insertionTime.Unix())) {
+		for _, x := range readIn("graphite", t, insertionTime) {
+			test := x
 			t.Run(test.Name, func(t *testing.T) {
 				t.Parallel()
-				tcpWrite(t, "127.0.0.1"+testStatsDListenAddr, test.Data)
+				tcpWrite(t, "127.0.0.1"+testStatsDListenAddr, strings.Join(test.Data, "\n"))
 			})
 		}
 	})
 	t.Run("opentsdb", func(t *testing.T) {
-		for _, test := range readIn("opentsdb", t, fmt.Sprintf("%d", insertionTime.Unix())) {
+		for _, x := range readIn("opentsdb", t, insertionTime) {
+			test := x
 			t.Run(test.Name, func(t *testing.T) {
 				t.Parallel()
-				tcpWrite(t, "127.0.0.1"+testOpenTSDBListenAddr, test.Data)
+				tcpWrite(t, "127.0.0.1"+testOpenTSDBListenAddr, strings.Join(test.Data, "\n"))
+			})
+		}
+	})
+	t.Run("opentsdbhttp", func(t *testing.T) {
+		for _, x := range readIn("opentsdbhttp", t, insertionTime) {
+			test := x
+			t.Run(test.Name, func(t *testing.T) {
+				t.Parallel()
+				logger.Infof("writing %s", test.Data)
+				httpWrite(t, testOpenTSDBWriteHTTPPath, bytes.NewBufferString(strings.Join(test.Data, "\n")))
 			})
 		}
 	})
 }

 func testRead(t *testing.T) {
-	for _, engine := range []string{"graphite", "opentsdb", "influxdb"} {
+	for _, engine := range []string{"prometheus", "graphite", "opentsdb", "influxdb", "opentsdbhttp"} {
 		t.Run(engine, func(t *testing.T) {
-			for _, test := range readIn(engine, t, fmt.Sprintf("%d", insertionTime.UnixNano())) {
-				test := test
+			for _, x := range readIn(engine, t, insertionTime) {
+				test := x
 				t.Run(test.Name, func(t *testing.T) {
 					t.Parallel()
-					rowContains(t, httpRead(t, testReadHTTPPath, test.Query), test.Result)
+					if test.Issue != "" {
+						test.Issue = "Regression in " + test.Issue
+					}
+					for _, q := range test.Query {
+						q = testutil.PopulateTimeTplString(q, insertionTime)
+						switch {
+						case strings.HasPrefix(q, "/api/v1/export"):
+							if err := checkMetricsResult(httpReadMetrics(t, testReadHTTPPath, q), test.ResultMetrics); err != nil {
+								t.Fatalf("Export. %s fails with error %s.%s", q, err, test.Issue)
+							}
+						case strings.HasPrefix(q, "/api/v1/series"):
+							s := Series{}
+							httpReadStruct(t, testReadHTTPPath, q, &s)
+							if err := checkSeriesResult(s, test.ResultSeries); err != nil {
+								t.Fatalf("Series. %s fails with error %s.%s", q, err, test.Issue)
+							}
+						case strings.HasPrefix(q, "/api/v1/query_range"):
+							queryResult := QueryRange{}
+							httpReadStruct(t, testReadHTTPPath, q, &queryResult)
+							if err := checkQueryRangeResult(queryResult, test.ResultQueryRange); err != nil {
+								t.Fatalf("Query Range. %s fails with error %s.%s", q, err, test.Issue)
+							}
+						case strings.HasPrefix(q, "/api/v1/query"):
+							queryResult := Query{}
+							httpReadStruct(t, testReadHTTPPath, q, &queryResult)
+							if err := checkQueryResult(queryResult, test.ResultQuery); err != nil {
+								t.Fatalf("Query. %s fails with error %s.%s", q, err, test.Issue)
+							}
+						default:
+							t.Fatalf("unsupported read query %s", q)
+						}
+					}
 				})
 			}
 		})
 	}
 }

-func readIn(readFor string, t *testing.T, timeStr string) []test {
+func readIn(readFor string, t *testing.T, insertTime time.Time) []test {
 	t.Helper()
 	s := newSuite(t)
 	var tt []test
@@ -195,7 +310,9 @@ func readIn(readFor string, t *testing.T, timeStr string) []test {
 		s.noError(err)
 		item := test{}
 		s.noError(json.Unmarshal(b, &item))
-		item.Data = strings.Replace(item.Data, "{TIME}", timeStr, 1)
+		for i := range item.Data {
+			item.Data[i] = testutil.PopulateTimeTplString(item.Data[i], insertTime)
+		}
 		tt = append(tt, item)
 		return nil
 	}))
@@ -205,10 +322,10 @@ func readIn(readFor string, t *testing.T, timeStr string) []test {
 	return tt
 }

-func httpWrite(t *testing.T, address string, data string) {
+func httpWrite(t *testing.T, address string, r io.Reader) {
 	t.Helper()
 	s := newSuite(t)
-	resp, err := http.Post(address, "", bytes.NewBufferString(data))
+	resp, err := http.Post(address, "", r)
 	s.noError(err)
 	s.noError(resp.Body.Close())
 	s.equalInt(resp.StatusCode, 204)
@@ -225,35 +342,122 @@ func tcpWrite(t *testing.T, address string, data string) {
 	s.equalInt(n, len(data))
 }

-func httpRead(t *testing.T, address, query string) []Row {
+func httpReadMetrics(t *testing.T, address, query string) []Metric {
 	t.Helper()
 	s := newSuite(t)
 	resp, err := http.Get(address + query)
 	s.noError(err)
 	defer resp.Body.Close()
 	s.equalInt(resp.StatusCode, 200)
-	var rows []Row
+	var rows []Metric
 	for dec := json.NewDecoder(resp.Body); dec.More(); {
-		var row Row
+		var row Metric
 		s.noError(dec.Decode(&row))
 		rows = append(rows, row)
 	}
 	return rows
 }
-
-func rowContains(t *testing.T, rows, contains []Row) {
+func httpReadStruct(t *testing.T, address, query string, dst interface{}) {
 	t.Helper()
-	for _, r := range rows {
-		contains = removeIfFound(r, contains)
-	}
-	if len(contains) > 0 {
-		t.Fatalf("result rows %+v not found in %+v", contains, rows)
-	}
+	s := newSuite(t)
+	resp, err := http.Get(address + query)
+	s.noError(err)
+	defer resp.Body.Close()
+	s.equalInt(resp.StatusCode, 200)
+	s.noError(json.NewDecoder(resp.Body).Decode(dst))
 }

-func removeIfFound(r Row, contains []Row) []Row {
+func checkMetricsResult(got, want []Metric) error {
+	for _, r := range append([]Metric(nil), got...) {
+		want = removeIfFoundMetrics(r, want)
+	}
+	if len(want) > 0 {
+		return fmt.Errorf("expected metrics %+v not found in %+v", want, got)
+	}
+	return nil
+}
+
+func removeIfFoundMetrics(r Metric, contains []Metric) []Metric {
+	for i, item := range contains {
+		if reflect.DeepEqual(r.Metric, item.Metric) && reflect.DeepEqual(r.Values, item.Values) &&
+			reflect.DeepEqual(r.Timestamps, item.Timestamps) {
+			contains[i] = contains[len(contains)-1]
+			return contains[:len(contains)-1]
+		}
+	}
+	return contains
+}
+
+func checkSeriesResult(got, want Series) error {
+	if got.Status != want.Status {
+		return fmt.Errorf("status mismatch %q - %q", want.Status, got.Status)
+	}
+	wantData := append([]map[string]string(nil), want.Data...)
+	for _, r := range got.Data {
+		wantData = removeIfFoundSeries(r, wantData)
+	}
+	if len(wantData) > 0 {
+		return fmt.Errorf("expected series %+v not found in %+v", wantData, got.Data)
+	}
+	return nil
+}
+
+func removeIfFoundSeries(r map[string]string, contains []map[string]string) []map[string]string {
+	for i, item := range contains {
+		if reflect.DeepEqual(r, item) {
+			contains[i] = contains[len(contains)-1]
+			return contains[:len(contains)-1]
+		}
+	}
+	return contains
+}
+
+func checkQueryResult(got, want Query) error {
+	if got.Status != want.Status {
+		return fmt.Errorf("status mismatch %q - %q", want.Status, got.Status)
+	}
+	if got.Data.ResultType != want.Data.ResultType {
+		return fmt.Errorf("result type mismatch %q - %q", want.Data.ResultType, got.Data.ResultType)
+	}
+	wantData := append([]QueryDataResult(nil), want.Data.Result...)
+	for _, r := range got.Data.Result {
+		wantData = removeIfFoundQueryData(r, wantData)
+	}
+	if len(wantData) > 0 {
+		return fmt.Errorf("expected query result %+v not found in %+v", wantData, got.Data.Result)
+	}
+	return nil
+}
+
+func removeIfFoundQueryData(r QueryDataResult, contains []QueryDataResult) []QueryDataResult {
+	for i, item := range contains {
+		if reflect.DeepEqual(r.Metric, item.Metric) && reflect.DeepEqual(r.Value[0], item.Value[0]) && reflect.DeepEqual(r.Value[1], item.Value[1]) {
+			contains[i] = contains[len(contains)-1]
+			return contains[:len(contains)-1]
+		}
+	}
+	return contains
+}
+
+func checkQueryRangeResult(got, want QueryRange) error {
+	if got.Status != want.Status {
+		return fmt.Errorf("status mismatch %q - %q", want.Status, got.Status)
+	}
+	if got.Data.ResultType != want.Data.ResultType {
+		return fmt.Errorf("result type mismatch %q - %q", want.Data.ResultType, got.Data.ResultType)
+	}
+	wantData := append([]QueryRangeDataResult(nil), want.Data.Result...)
+	for _, r := range got.Data.Result {
+		wantData = removeIfFoundQueryRangeData(r, wantData)
+	}
+	if len(wantData) > 0 {
+		return fmt.Errorf("expected query range result %+v not found in %+v", wantData, got.Data.Result)
+	}
+	return nil
+}
+
+func removeIfFoundQueryRangeData(r QueryRangeDataResult, contains []QueryRangeDataResult) []QueryRangeDataResult {
 	for i, item := range contains {
-		// todo check time
 		if reflect.DeepEqual(r.Metric, item.Metric) && reflect.DeepEqual(r.Values, item.Values) {
 			contains[i] = contains[len(contains)-1]
 			return contains[:len(contains)-1]
@@ -281,3 +485,11 @@ func (s *suite) equalInt(a, b int) {
 		s.t.FailNow()
 	}
 }
+
+func (s *suite) greaterThan(a, b int) {
+	s.t.Helper()
+	if a <= b {
+		s.t.Errorf("%d is less than or equal to %d", a, b)
+		s.t.FailNow()
+	}
+}
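The `check*Result` helpers in the diff above compare results without regard to order: each received item removes one equal item from a copy of the expected list, and anything left over is reported as missing. A minimal self-contained sketch of that technique follows. The function name `missing` and the sample data are illustrative assumptions, not part of the commit.

```go
package main

import (
	"fmt"
	"reflect"
)

// missing returns the entries of want that have no equal counterpart in got.
// It works on a copy of want, so the caller's slice is left untouched.
func missing(got, want []map[string]string) []map[string]string {
	rest := append([]map[string]string(nil), want...)
	for _, g := range got {
		for i, w := range rest {
			if reflect.DeepEqual(g, w) {
				// Swap-remove: overwrite the match with the last item and shrink.
				rest[i] = rest[len(rest)-1]
				rest = rest[:len(rest)-1]
				break
			}
		}
	}
	return rest
}

func main() {
	got := []map[string]string{{"a": "1"}, {"b": "2"}, {"c": "3"}}
	want := []map[string]string{{"c": "3"}, {"d": "4"}}
	fmt.Printf("missing: %v\n", missing(got, want))
}
```

Swap-remove keeps each deletion O(1) at the cost of reordering the remaining expected items, which is acceptable here because only membership matters.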
--- a/app/victoria-metrics/test/parser.go
+++ b/app/victoria-metrics/test/parser.go
@@ -0,0 +1,52 @@
+package test
+
+import (
+	"fmt"
+	"log"
+	"regexp"
+	"strings"
+	"time"
+)
+
+var (
+	parseTimeExpRegex = regexp.MustCompile(`"?{TIME[^}]*}"?`)
+	extractRegex      = regexp.MustCompile(`"?{([^}]*)}"?`)
+)
+
+// PopulateTimeTplString substitutes {TIME_*} placeholders in s with timestamps derived from t and returns the result.
+func PopulateTimeTplString(s string, t time.Time) string {
+	return string(PopulateTimeTpl([]byte(s), t))
+}
+
+// PopulateTimeTpl substitutes {TIME_*} placeholders in b with timestamps derived from tGlobal and returns the result.
+func PopulateTimeTpl(b []byte, tGlobal time.Time) []byte {
+	return parseTimeExpRegex.ReplaceAllFunc(b, func(repl []byte) []byte {
+		t := tGlobal
+		repl = extractRegex.FindSubmatch(repl)[1]
+		parts := strings.SplitN(string(repl), "-", 2)
+		if len(parts) == 2 {
+			duration, err := time.ParseDuration(strings.TrimSpace(parts[1]))
+			if err != nil {
+				log.Fatalf("error %s parsing duration %s in %s", err, parts[1], repl)
+			}
+			t = t.Add(-duration)
+		}
+		switch strings.TrimSpace(parts[0]) {
+		case `TIME_S`:
+			return []byte(fmt.Sprintf("%d", t.Unix()))
+		case `TIME_MSZ`:
+			return []byte(fmt.Sprintf("%d", t.Unix()*1e3))
+		case `TIME_MS`:
+			return []byte(fmt.Sprintf("%d", timeToMillis(t)))
+		case `TIME_NS`:
+			return []byte(fmt.Sprintf("%d", t.UnixNano()))
+		default:
+			log.Fatalf("unknown time pattern %s in %s", parts[0], repl)
+		}
+		return repl
+	})
+}
+
+func timeToMillis(t time.Time) int64 {
+	return t.UnixNano() / 1e6
+}
--- a/app/victoria-metrics/test/parser_test.go
+++ b/app/victoria-metrics/test/parser_test.go
@@ -0,0 +1,24 @@
+package test
+
+import (
+	"testing"
+	"time"
+)
+
+func TestPopulateTimeTplString(t *testing.T) {
+	now, err := time.Parse(time.RFC3339, "2006-01-02T15:04:05Z")
+	if err != nil {
+		t.Fatalf("unexpected error when parsing time: %s", err)
+	}
+	f := func(s, resultExpected string) {
+		t.Helper()
+		result := PopulateTimeTplString(s, now)
+		if result != resultExpected {
+			t.Fatalf("unexpected result; got %q; want %q", result, resultExpected)
+		}
+	}
+	f("", "")
+	f("{TIME_S}", "1136214245")
+	f("now: {TIME_S}, past 30s: {TIME_MS-30s}, now: {TIME_S}", "now: 1136214245, past 30s: 1136214215000, now: 1136214245")
+	f("now: {TIME_MS}, past 30m: {TIME_MSZ-30m}, past 2h: {TIME_NS-2h}", "now: 1136214245000, past 30m: 1136212445000, past 2h: 1136207045000000000")
+}
--- a/app/victoria-metrics/test/prom_types.go
+++ b/app/victoria-metrics/test/prom_types.go
@@ -0,0 +1,338 @@
+// +build integration
+
+// Source: https://github.com/prometheus/prometheus/blob/master/prompb/remote.pb.go (the code is copy-pasted and cleaned up).
+package test
+
+import (
+	"encoding/binary"
+	"math"
+	"math/bits"
+)
+
+type WriteRequest struct {
+	Timeseries []TimeSeries `protobuf:"bytes,1,rep,name=timeseries,proto3" json:"timeseries"`
+}
+
+func (m *WriteRequest) Size() (n int) {
+	if m == nil {
+		return 0
+	}
+	var l int
+	_ = l
+	if len(m.Timeseries) > 0 {
+		for _, e := range m.Timeseries {
+			l = e.Size()
+			n += 1 + l + sovRemote(uint64(l))
+		}
+	}
+	return n
+}
+func sovRemote(x uint64) (n int) {
+	return (bits.Len64(x|1) + 6) / 7
+}
+
+func (m *WriteRequest) Marshal() (dAtA []byte, err error) {
+	size := m.Size()
+	dAtA = make([]byte, size)
+	n, err := m.MarshalToSizedBuffer(dAtA[:size])
+	if err != nil {
+		return nil, err
+	}
+	return dAtA[:n], nil
+}
+
+func (m *WriteRequest) MarshalTo(dAtA []byte) (int, error) {
+	size := m.Size()
+	return m.MarshalToSizedBuffer(dAtA[:size])
+}
+
+func (m *WriteRequest) MarshalToSizedBuffer(dAtA []byte) (int, error) {
+	i := len(dAtA)
+	if len(m.Timeseries) > 0 {
+		for iNdEx := len(m.Timeseries) - 1; iNdEx >= 0; iNdEx-- {
+			{
+				size, err := m.Timeseries[iNdEx].MarshalToSizedBuffer(dAtA[:i])
+				if err != nil {
+					return 0, err
+				}
+				i -= size
+				i = encodeVarintRemote(dAtA, i, uint64(size))
+			}
+			i--
+			dAtA[i] = 0xa
+		}
+	}
+	return len(dAtA) - i, nil
+}
+
+func encodeVarintRemote(dAtA []byte, offset int, v uint64) int {
+	offset -= sovRemote(v)
+	base := offset
+	for v >= 1<<7 {
+		dAtA[offset] = uint8(v&0x7f | 0x80)
+		v >>= 7
+		offset++
+	}
+	dAtA[offset] = uint8(v)
+	return base
+}
+
+type Sample struct {
+	Value     float64 `protobuf:"fixed64,1,opt,name=value,proto3" json:"value,omitempty"`
+	Timestamp int64   `protobuf:"varint,2,opt,name=timestamp,proto3" json:"timestamp,omitempty"`
+}
+
+func (m *Sample) Reset() { *m = Sample{} }
+
+// TimeSeries represents samples and labels for a single time series.
+type TimeSeries struct {
+	Labels  []Label  `protobuf:"bytes,1,rep,name=labels,proto3" json:"labels"`
+	Samples []Sample `protobuf:"bytes,2,rep,name=samples,proto3" json:"samples"`
+}
+
+func (m *TimeSeries) Reset() { *m = TimeSeries{} }
+
+type Label struct {
+	Name  string `protobuf:"bytes,1,opt,name=name,proto3" json:"name,omitempty"`
+	Value string `protobuf:"bytes,2,opt,name=value,proto3" json:"value,omitempty"`
+}
+
+func (m *Label) Reset() { *m = Label{} }
+
+type Labels struct {
+	Labels []Label `protobuf:"bytes,1,rep,name=labels,proto3" json:"labels"`
+}
+
+func (m *Labels) Reset() { *m = Labels{} }
+
+func (m *Sample) Marshal() (dAtA []byte, err error) {
+	size := m.Size()
+	dAtA = make([]byte, size)
+	n, err := m.MarshalToSizedBuffer(dAtA[:size])
+	if err != nil {
+		return nil, err
+	}
+	return dAtA[:n], nil
+}
+
+func (m *Sample) MarshalTo(dAtA []byte) (int, error) {
+	size := m.Size()
+	return m.MarshalToSizedBuffer(dAtA[:size])
+}
+
+func (m *Sample) MarshalToSizedBuffer(dAtA []byte) (int, error) {
+	i := len(dAtA)
+	if m.Timestamp != 0 {
+		i = encodeVarintTypes(dAtA, i, uint64(m.Timestamp))
+		i--
+		dAtA[i] = 0x10
+	}
+	if m.Value != 0 {
+		i -= 8
+		binary.LittleEndian.PutUint64(dAtA[i:], uint64(math.Float64bits(float64(m.Value))))
+		i--
+		dAtA[i] = 0x9
+	}
+	return len(dAtA) - i, nil
+}
+
+func (m *TimeSeries) Marshal() (dAtA []byte, err error) {
+	size := m.Size()
+	dAtA = make([]byte, size)
+	n, err := m.MarshalToSizedBuffer(dAtA[:size])
+	if err != nil {
+		return nil, err
+	}
+	return dAtA[:n], nil
+}
+
+func (m *TimeSeries) MarshalTo(dAtA []byte) (int, error) {
+	size := m.Size()
+	return m.MarshalToSizedBuffer(dAtA[:size])
+}
+
+func (m *TimeSeries) MarshalToSizedBuffer(dAtA []byte) (int, error) {
+	i := len(dAtA)
+	if len(m.Samples) > 0 {
+		for iNdEx := len(m.Samples) - 1; iNdEx >= 0; iNdEx-- {
+			{
+				size, err := m.Samples[iNdEx].MarshalToSizedBuffer(dAtA[:i])
+				if err != nil {
+					return 0, err
+				}
+				i -= size
+				i = encodeVarintTypes(dAtA, i, uint64(size))
+			}
+			i--
+			dAtA[i] = 0x12
+		}
+	}
+	if len(m.Labels) > 0 {
+		for iNdEx := len(m.Labels) - 1; iNdEx >= 0; iNdEx-- {
+			{
+				size, err := m.Labels[iNdEx].MarshalToSizedBuffer(dAtA[:i])
+				if err != nil {
+					return 0, err
+				}
+				i -= size
+				i = encodeVarintTypes(dAtA, i, uint64(size))
+			}
+			i--
+			dAtA[i] = 0xa
+		}
+	}
+	return len(dAtA) - i, nil
+}
+
+func (m *Label) Marshal() (dAtA []byte, err error) {
+	size := m.Size()
+	dAtA = make([]byte, size)
+	n, err := m.MarshalToSizedBuffer(dAtA[:size])
+	if err != nil {
+		return nil, err
+	}
+	return dAtA[:n], nil
+}
+
+func (m *Label) MarshalTo(dAtA []byte) (int, error) {
+	size := m.Size()
+	return m.MarshalToSizedBuffer(dAtA[:size])
+}
+
+func (m *Label) MarshalToSizedBuffer(dAtA []byte) (int, error) {
+	i := len(dAtA)
+	_ = i
+	var l int
+	_ = l
+	if len(m.Value) > 0 {
+		i -= len(m.Value)
+		copy(dAtA[i:], m.Value)
+		i = encodeVarintTypes(dAtA, i, uint64(len(m.Value)))
+		i--
+		dAtA[i] = 0x12
+	}
+	if len(m.Name) > 0 {
+		i -= len(m.Name)
+		copy(dAtA[i:], m.Name)
+		i = encodeVarintTypes(dAtA, i, uint64(len(m.Name)))
+		i--
+		dAtA[i] = 0xa
+	}
+	return len(dAtA) - i, nil
+}
+
+func (m *Labels) Marshal() (dAtA []byte, err error) {
+	size := m.Size()
+	dAtA = make([]byte, size)
+	n, err := m.MarshalToSizedBuffer(dAtA[:size])
+	if err != nil {
+		return nil, err
+	}
+	return dAtA[:n], nil
+}
+
+func (m *Labels) MarshalTo(dAtA []byte) (int, error) {
+	size := m.Size()
+	return m.MarshalToSizedBuffer(dAtA[:size])
+}
+
+func (m *Labels) MarshalToSizedBuffer(dAtA []byte) (int, error) {
+	i := len(dAtA)
+	if len(m.Labels) > 0 {
+		for iNdEx := len(m.Labels) - 1; iNdEx >= 0; iNdEx-- {
+			{
+				size, err := m.Labels[iNdEx].MarshalToSizedBuffer(dAtA[:i])
+				if err != nil {
+					return 0, err
+				}
+				i -= size
+				i = encodeVarintTypes(dAtA, i, uint64(size))
+			}
+			i--
+			dAtA[i] = 0xa
+		}
+	}
+	return len(dAtA) - i, nil
+}
+
+func encodeVarintTypes(dAtA []byte, offset int, v uint64) int {
+	offset -= sovTypes(v)
+	base := offset
+	for v >= 1<<7 {
+		dAtA[offset] = uint8(v&0x7f | 0x80)
+		v >>= 7
+		offset++
+	}
+	dAtA[offset] = uint8(v)
+	return base
+}
+
+func (m *Sample) Size() (n int) {
+	if m == nil {
+		return 0
+	}
+	if m.Value != 0 {
+		n += 9
+	}
+	if m.Timestamp != 0 {
+		n += 1 + sovTypes(uint64(m.Timestamp))
+	}
+	return n
+}
+
+func (m *TimeSeries) Size() (n int) {
+	if m == nil {
+		return 0
+	}
+	var l int
+	_ = l
+	if len(m.Labels) > 0 {
+		for _, e := range m.Labels {
+			l = e.Size()
+			n += 1 + l + sovTypes(uint64(l))
+		}
+	}
+	if len(m.Samples) > 0 {
+		for _, e := range m.Samples {
+			l = e.Size()
+			n += 1 + l + sovTypes(uint64(l))
+		}
+	}
+	return n
+}
+
+func (m *Label) Size() (n int) {
+	if m == nil {
+		return 0
+	}
+	var l int
+	_ = l
+	l = len(m.Name)
+	if l > 0 {
+		n += 1 + l + sovTypes(uint64(l))
+	}
+	l = len(m.Value)
+	if l > 0 {
+		n += 1 + l + sovTypes(uint64(l))
+	}
+	return n
+}
+
+func (m *Labels) Size() (n int) {
+	if m == nil {
+		return 0
+	}
+	var l int
+	_ = l
+	if len(m.Labels) > 0 {
+		for _, e := range m.Labels {
+			l = e.Size()
+			n += 1 + l + sovTypes(uint64(l))
+		}
+	}
+	return n
+}
+
+func sovTypes(x uint64) (n int) {
+	return (bits.Len64(x|1) + 6) / 7
+}
--- a/app/victoria-metrics/test/prom_writter.go
+++ b/app/victoria-metrics/test/prom_writter.go
@@ -0,0 +1,13 @@
+// +build integration
+
+package test
+
+import "github.com/golang/snappy"
+
+func Compress(wr WriteRequest) ([]byte, error) {
+	data, err := wr.Marshal()
+	if err != nil {
+		return nil, err
+	}
+	return snappy.Encode(nil, data), nil
+}
--- a/app/victoria-metrics/testdata/graphite/basic.json
+++ b/app/victoria-metrics/testdata/graphite/basic.json
@@ -1,8 +1,8 @@
 {
  "name": "basic_insertion",
-  "data": "graphite.foo.bar.baz;tag1=value1;tag2=value2 123 {TIME}",
-  "query": "/api/v1/export?match={__name__!=\"\"}",
-  "result": [
-    {"metric":{"__name__":"graphite.foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123]}
+  "data": ["graphite.foo.bar.baz;tag1=value1;tag2=value2 123 {TIME_S}"],
+  "query": ["/api/v1/export?match={__name__!=''}"],
+  "result_metrics": [
+    {"metric":{"__name__":"graphite.foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123], "timestamps": ["{TIME_MSZ}"]}
  ]
 }
--- a/app/victoria-metrics/testdata/graphite/comparison-not-inf-not-nan.json
+++ b/app/victoria-metrics/testdata/graphite/comparison-not-inf-not-nan.json
@@ -0,0 +1,16 @@
+{
+  "name": "comparison-not-inf-not-nan",
+  "issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150",
+  "data": [
+    "not_nan_not_inf;item=x 1 {TIME_S-1m}",
+    "not_nan_not_inf;item=x 1 {TIME_S-2m}",
+    "not_nan_not_inf;item=y 3 {TIME_S-1m}",
+    "not_nan_not_inf;item=y 1 {TIME_S-2m}"],
+  "query": ["/api/v1/query_range?query=1/(not_nan_not_inf-1)!=inf!=nan&start={TIME_S-3m}&end={TIME_S}&step=60"],
+  "result_query_range": {
+    "status":"success",
+    "data":{"resultType":"matrix",
+      "result":[
+	      {"metric":{"item":"y"},"values":[["{TIME_S-1m}","0.5"],["{TIME_S}","0.5"]]}
+      ]}}
+}
--- a/app/victoria-metrics/testdata/graphite/max_lookback_set.json
+++ b/app/victoria-metrics/testdata/graphite/max_lookback_set.json
@@ -0,0 +1,24 @@
+{
+  "name": "max_lookback_set",
+  "issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209",
+  "data": [
+    "max_lookback_set 1 {TIME_S-30s}",
+    "max_lookback_set 2 {TIME_S-60s}",
+    "max_lookback_set 3 {TIME_S-120s}",
+    "max_lookback_set 4 {TIME_S-150s}"
+  ],
+  "query": ["/api/v1/query_range?query=max_lookback_set&start={TIME_S-150s}&end={TIME_S}&step=10s&max_lookback=1s"],
+  "result_query_range": {
+    "status":"success",
+    "data":{"resultType":"matrix",
+      "result":[{"metric":{"__name__":"max_lookback_set"},"values":[
+	      ["{TIME_S-150s}","4"],
+	      ["{TIME_S-140s}","4"],
+	      ["{TIME_S-120s}","3"],
+	      ["{TIME_S-110s}","3"],
+	      ["{TIME_S-60s}","2"],
+	      ["{TIME_S-50s}","2"],
+	      ["{TIME_S-30s}","1"],
+	      ["{TIME_S-20s}","1"]
+      ]}]}}
+}
--- a/app/victoria-metrics/testdata/graphite/max_lookback_unset.json
+++ b/app/victoria-metrics/testdata/graphite/max_lookback_unset.json
@@ -0,0 +1,32 @@
+{
+  "name": "max_lookback_unset",
+  "issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209",
+  "data": [
+    "max_lookback_unset 1 {TIME_S-30s}",
+    "max_lookback_unset 2 {TIME_S-60s}",
+    "max_lookback_unset 3 {TIME_S-120s}",
+    "max_lookback_unset 4 {TIME_S-150s}"
+  ],
+  "query": ["/api/v1/query_range?query=max_lookback_unset&start={TIME_S-150s}&end={TIME_S}&step=10s"],
+  "result_query_range": {
+    "status":"success",
+    "data":{"resultType":"matrix",
+      "result":[{"metric":{"__name__":"max_lookback_unset"},"values":[
+	      ["{TIME_S-150s}","4"],
+	      ["{TIME_S-140s}","4"],
+	      ["{TIME_S-130s}","4"],
+	      ["{TIME_S-120s}","3"],
+	      ["{TIME_S-110s}","3"],
+	      ["{TIME_S-100s}","3"],
+	      ["{TIME_S-90s}","3"],
+	      ["{TIME_S-80s}","3"],
+	      ["{TIME_S-70s}","3"],
+	      ["{TIME_S-60s}","2"],
+	      ["{TIME_S-50s}","2"],
+	      ["{TIME_S-40s}","2"],
+	      ["{TIME_S-30s}","1"],
+	      ["{TIME_S-20s}","1"],
+	      ["{TIME_S-10s}","1"],
+	      ["{TIME_S}","1"]
+      ]}]}}
+}
--- a/app/victoria-metrics/testdata/graphite/not-nan-as-missing-data.json
+++ b/app/victoria-metrics/testdata/graphite/not-nan-as-missing-data.json
@@ -0,0 +1,18 @@
+{
+  "name": "not-nan-as-missing-data",
+  "issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/153",
+  "data": [
+    "not_nan_as_missing_data;item=x 2 {TIME_S-2m}",
+    "not_nan_as_missing_data;item=x 1 {TIME_S-1m}",
+    "not_nan_as_missing_data;item=y 4 {TIME_S-2m}",
+    "not_nan_as_missing_data;item=y 3 {TIME_S-1m}"
+  ],
+  "query": ["/api/v1/query_range?query=not_nan_as_missing_data>1&start={TIME_S-2m}&end={TIME_S}&step=60"],
+  "result_query_range": {
+    "status":"success",
+    "data":{"resultType":"matrix",
+      "result":[
+	      {"metric":{"__name__":"not_nan_as_missing_data","item":"x"},"values":[["{TIME_S-2m}","2"]]},
+	      {"metric":{"__name__":"not_nan_as_missing_data","item":"y"},"values":[["{TIME_S-2m}","4"],["{TIME_S-1m}","3"],["{TIME_S}","3"]]}
+      ]}}
+}
--- a/app/victoria-metrics/testdata/graphite/subquery-aggregation.json
+++ b/app/victoria-metrics/testdata/graphite/subquery-aggregation.json
@@ -0,0 +1,14 @@
+{
+  "name": "subquery-aggregation",
+  "issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184",
+  "data": [
+    "forms_daily_count;item=x 1 {TIME_S-1m}",
+    "forms_daily_count;item=x 2 {TIME_S-2m}",
+    "forms_daily_count;item=y 3 {TIME_S-1m}",
+    "forms_daily_count;item=y 4 {TIME_S-2m}"],
+  "query": ["/api/v1/query?query=min%20by%20(item)%20(min_over_time(forms_daily_count[10m:1m]))&time={TIME_S-1m}"],
+  "result_query": {
+    "status":"success",
+    "data":{"resultType":"vector","result":[{"metric":{"item":"x"},"value":["{TIME_S-1m}","1"]},{"metric":{"item":"y"},"value":["{TIME_S-1m}","3"]}]}
+  }
+}
--- a/app/victoria-metrics/testdata/influxdb/basic.json
+++ b/app/victoria-metrics/testdata/influxdb/basic.json
@@ -1,9 +1,9 @@
 {
  "name": "basic_insertion",
-  "data": "measurement,tag1=value1,tag2=value2 field1=1.23,field2=123",
-  "query": "/api/v1/export?match={__name__!=\"\"}",
-  "result": [
-    {"metric":{"__name__":"measurement_field2","tag1":"value1","tag2":"value2"},"values":[123]},
-    {"metric":{"__name__":"measurement_field1","tag1":"value1","tag2":"value2"},"values":[1.23]}
+  "data": ["measurement,tag1=value1,tag2=value2 field1=1.23,field2=123 {TIME_NS}"],
+  "query": ["/api/v1/export?match={__name__!=''}"],
+  "result_metrics": [
+    {"metric":{"__name__":"measurement_field2","tag1":"value1","tag2":"value2"},"values":[123], "timestamps": ["{TIME_MS}"]},
+    {"metric":{"__name__":"measurement_field1","tag1":"value1","tag2":"value2"},"values":[1.23], "timestamps": ["{TIME_MS}"]}
  ]
 }
--- a/app/victoria-metrics/testdata/opentsdb/basic.json
+++ b/app/victoria-metrics/testdata/opentsdb/basic.json
@@ -1,8 +1,8 @@
 {
  "name": "basic_insertion",
-  "data": "put openstdb.foo.bar.baz {TIME} 123 tag1=value1 tag2=value2",
-  "query": "/api/v1/export?match={__name__!=\"\"}",
-  "result": [
-    {"metric":{"__name__":"openstdb.foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123]}
+  "data": ["put openstdb.foo.bar.baz {TIME_S} 123 tag1=value1 tag2=value2"],
+  "query": ["/api/v1/export?match={__name__!=''}"],
+  "result_metrics": [
+    {"metric":{"__name__":"openstdb.foo.bar.baz","tag1":"value1","tag2":"value2"},"values":[123], "timestamps": ["{TIME_MSZ}"]}
  ]
 }
--- a/app/victoria-metrics/testdata/opentsdbhttp/basic.json
+++ b/app/victoria-metrics/testdata/opentsdbhttp/basic.json
@@ -0,0 +1,8 @@
+{
+  "name": "basic_insertion",
+  "data": ["{\"metric\": \"opentsdbhttp.foo\", \"value\": 1001, \"timestamp\": {TIME_S}, \"tags\": {\"bar\":\"baz\", \"x\": \"y\"}}"],
+  "query": ["/api/v1/export?match={__name__!=''}"],
+  "result_metrics": [
+    {"metric":{"__name__":"opentsdbhttp.foo","bar":"baz","x":"y"},"values":[1001], "timestamps": ["{TIME_MSZ}"]}
+  ]
+}
--- a/app/victoria-metrics/testdata/opentsdbhttp/multi_line.json
+++ b/app/victoria-metrics/testdata/opentsdbhttp/multi_line.json
@@ -0,0 +1,9 @@
+{
+  "name": "multiline",
+  "data": ["[{\"metric\": \"opentsdbhttp.multiline1\", \"value\": 1001, \"timestamp\": \"{TIME_S}\", \"tags\": {\"bar\":\"baz\", \"x\": \"y\"}}, {\"metric\": \"opentsdbhttp.multiline2\", \"value\": 1002, \"timestamp\": {TIME_S}}]"],
+  "query": ["/api/v1/export?match={__name__!=''}"],
+  "result_metrics": [
+    {"metric":{"__name__":"opentsdbhttp.multiline1","bar":"baz","x":"y"},"values":[1001], "timestamps": ["{TIME_MSZ}"]},
+    {"metric":{"__name__":"opentsdbhttp.multiline2"},"values":[1002], "timestamps": ["{TIME_MSZ}"]}
+  ]
+}
--- a/app/victoria-metrics/testdata/prometheus/basic.json
+++ b/app/victoria-metrics/testdata/prometheus/basic.json
@@ -0,0 +1,8 @@
+{
+  "name": "basic_insertion",
+  "data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.bar\"},{\"name\":\"baz\",\"value\":\"qux\"}],\"samples\":[{\"value\":100000,\"timestamp\":\"{TIME_MS}\"}]}]"],
+  "query": ["/api/v1/export?match={__name__!=''}"],
+  "result_metrics": [
+    {"metric":{"__name__":"prometheus.bar","baz":"qux"},"values":[100000], "timestamps": ["{TIME_MS}"]}
+  ]
+}
--- a/app/victoria-metrics/testdata/prometheus/case-sensitive-regex.json
+++ b/app/victoria-metrics/testdata/prometheus/case-sensitive-regex.json
@@ -0,0 +1,10 @@
+{
+  "name": "case-sensitive-regex",
+  "issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161",
+  "data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.sensitiveRegex\"},{\"name\":\"label\",\"value\":\"sensitiveRegex\"}],\"samples\":[{\"value\":2,\"timestamp\":\"{TIME_MS}\"}]},{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.sensitiveRegex\"},{\"name\":\"label\",\"value\":\"SensitiveRegex\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]}]"],
+  "query": ["/api/v1/export?match={label=~'(?i)sensitiveregex'}"],
+  "result_metrics": [
+    {"metric":{"__name__":"prometheus.sensitiveRegex","label":"sensitiveRegex"},"values":[2], "timestamps": ["{TIME_MS}"]},
+    {"metric":{"__name__":"prometheus.sensitiveRegex","label":"SensitiveRegex"},"values":[1], "timestamps": ["{TIME_MS}"]}
+  ]
+}
--- a/app/victoria-metrics/testdata/prometheus/duplicate-label.json
+++ b/app/victoria-metrics/testdata/prometheus/duplicate-label.json
@@ -0,0 +1,9 @@
+{
+  "name": "duplicate_label",
+  "issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/172",
+  "data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"prometheus.duplicate_label\"},{\"name\":\"duplicate\",\"value\":\"label\"},{\"name\":\"duplicate\",\"value\":\"label\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]}]"],
+  "query": ["/api/v1/export?match={__name__!=''}"],
+  "result_metrics": [
+    {"metric":{"__name__":"prometheus.duplicate_label","duplicate":"label"},"values":[1], "timestamps": ["{TIME_MS}"]}
+  ]
+}
--- a/app/victoria-metrics/testdata/prometheus/match-series.json
+++ b/app/victoria-metrics/testdata/prometheus/match-series.json
@@ -0,0 +1,15 @@
+{
+  "name": "match_series",
+  "issue": "https://github.com/VictoriaMetrics/VictoriaMetrics/issues/155",
+  "data": ["[{\"labels\":[{\"name\":\"__name__\",\"value\":\"MatchSeries\"},{\"name\":\"db\",\"value\":\"TenMinute\"},{\"name\":\"TurbineType\",\"value\":\"V112\"},{\"name\":\"Park\",\"value\":\"1\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]},{\"labels\":[{\"name\":\"__name__\",\"value\":\"MatchSeries\"},{\"name\":\"db\",\"value\":\"TenMinute\"},{\"name\":\"TurbineType\",\"value\":\"V112\"},{\"name\":\"Park\",\"value\":\"2\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]},{\"labels\":[{\"name\":\"__name__\",\"value\":\"MatchSeries\"},{\"name\":\"db\",\"value\":\"TenMinute\"},{\"name\":\"TurbineType\",\"value\":\"V112\"},{\"name\":\"Park\",\"value\":\"3\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]},{\"labels\":[{\"name\":\"__name__\",\"value\":\"MatchSeries\"},{\"name\":\"db\",\"value\":\"TenMinute\"},{\"name\":\"TurbineType\",\"value\":\"V112\"},{\"name\":\"Park\",\"value\":\"4\"}],\"samples\":[{\"value\":1,\"timestamp\":\"{TIME_MS}\"}]}]"],
+  "query": ["/api/v1/series?match[]={__name__='MatchSeries'}", "/api/v1/series?match[]={__name__=~'MatchSeries.*'}"],
+  "result_series": {
+    "status": "success",
+    "data": [
+      {"__name__":"MatchSeries","db":"TenMinute","Park":"1","TurbineType":"V112"},
+      {"__name__":"MatchSeries","db":"TenMinute","Park":"2","TurbineType":"V112"},
+      {"__name__":"MatchSeries","db":"TenMinute","Park":"3","TurbineType":"V112"},
+      {"__name__":"MatchSeries","db":"TenMinute","Park":"4","TurbineType":"V112"}
+    ]
+  }
+}
--- a/app/vminsert/graphite/parser_test.go
+++ b/app/vminsert/graphite/parser_test.go
@@ -85,6 +85,15 @@ func TestRowsUnmarshalSuccess(t *testing.T) {
 		}},
 	})

+	// Timestamp bigger than 1<<31
+	f("aaa 1123 429496729600", &Rows{
+		Rows: []Row{{
+			Metric:    "aaa",
+			Value:     1123,
+			Timestamp: 429496729600,
+		}},
+	})
+
 	// Tags
 	f("foo;bar=baz 1 2", &Rows{
 		Rows: []Row{{
--- a/app/vminsert/opentsdbhttp/server.go
+++ b/app/vminsert/opentsdbhttp/server.go
@@ -38,7 +38,7 @@ func Serve(addr string, maxReqSize int64) {
 			return
 		}
 		if err != nil {
-			logger.Fatalf("FATAL: error serving HTTP OpenTSDB: %s", err)
+			logger.Fatalf("error serving HTTP OpenTSDB: %s", err)
 		}
 	}()
 }
@@ -65,6 +65,6 @@ func Stop() {
 	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
 	defer cancel()
 	if err := httpServer.Shutdown(ctx); err != nil {
-		logger.Fatalf("FATAL: cannot close HTTP OpenTSDB server: %s", err)
+		logger.Fatalf("cannot close HTTP OpenTSDB server: %s", err)
 	}
 }
--- a/app/vmselect/netstorage/fadvise_darwin.go
+++ b/app/vmselect/netstorage/fadvise_darwin.go
@@ -4,6 +4,6 @@ import (
 	"os"
 )

-func mustFadviseRandomRead(f *os.File) {
+func mustFadviseSequentialRead(f *os.File) {
 	// Do nothing :)
 }
--- a/app/vmselect/netstorage/fadvise_freebsd.go
+++ b/app/vmselect/netstorage/fadvise_freebsd.go
@@ -7,9 +7,9 @@ import (
 	"golang.org/x/sys/unix"
 )

-func mustFadviseRandomRead(f *os.File) {
+func mustFadviseSequentialRead(f *os.File) {
 	fd := int(f.Fd())
-	if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_RANDOM|unix.FADV_WILLNEED); err != nil {
-		logger.Panicf("FATAL: error returned from unix.Fadvise(RANDOM|WILLNEED): %s", err)
+	if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_SEQUENTIAL|unix.FADV_WILLNEED); err != nil {
+		logger.Panicf("FATAL: error returned from unix.Fadvise(SEQUENTIAL|WILLNEED): %s", err)
 	}
 }
--- a/app/vmselect/netstorage/fadvise_linux.go
+++ b/app/vmselect/netstorage/fadvise_linux.go
@@ -7,9 +7,9 @@ import (
 	"golang.org/x/sys/unix"
 )

-func mustFadviseRandomRead(f *os.File) {
+func mustFadviseSequentialRead(f *os.File) {
 	fd := int(f.Fd())
-	if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_RANDOM|unix.FADV_WILLNEED); err != nil {
-		logger.Panicf("FATAL: error returned from unix.Fadvise(RANDOM|WILLNEED): %s", err)
+	if err := unix.Fadvise(int(fd), 0, 0, unix.FADV_SEQUENTIAL|unix.FADV_WILLNEED); err != nil {
+		logger.Panicf("FATAL: error returned from unix.Fadvise(SEQUENTIAL|WILLNEED): %s", err)
 	}
 }
--- a/app/vmselect/netstorage/netstorage.go
+++ b/app/vmselect/netstorage/netstorage.go
@@ -484,9 +484,12 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadli
 	tbf := getTmpBlocksFile()
 	m := make(map[string][]tmpBlockAddr)
 	blocksRead := 0
+	bb := tmpBufPool.Get()
+	defer tmpBufPool.Put(bb)
 	for sr.NextMetricBlock() {
 		blocksRead++
-		addr, err := tbf.WriteBlock(sr.MetricBlock.Block)
+		bb.B = storage.MarshalBlock(bb.B[:0], sr.MetricBlock.Block)
+		addr, err := tbf.WriteBlockData(bb.B)
 		if err != nil {
 			putTmpBlocksFile(tbf)
 			return nil, fmt.Errorf("cannot write data block #%d to temporary blocks file: %s", blocksRead, err)
@@ -520,6 +523,15 @@ func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadli
 		pts.metricName = metricName
 		pts.addrs = addrs
 	}
+
+	// Sort rss.packedTimeseries by the first addr offset in order
+	// to reduce the number of disk seeks during unpacking in RunParallel.
+	// In this case tmpBlocksFile must be read almost sequentially.
+	sort.Slice(rss.packedTimeseries, func(i, j int) bool {
+		pts := rss.packedTimeseries
+		return pts[i].addrs[0].offset < pts[j].addrs[0].offset
+	})
+
 	return &rss, nil
 }

--- a/app/vmselect/netstorage/tmp_blocks_file.go
+++ b/app/vmselect/netstorage/tmp_blocks_file.go
@@ -82,22 +82,18 @@ func (addr tmpBlockAddr) String() string {

 var tmpBlocksFilesCreated = metrics.NewCounter(`vm_tmp_blocks_files_created_total`)

-// WriteBlock writes b to tbf.
+// WriteBlockData writes b to tbf.
 //
 // It returns errors since the operation may fail on space shortage
 // and this must be handled.
-func (tbf *tmpBlocksFile) WriteBlock(b *storage.Block) (tmpBlockAddr, error) {
-	bb := tmpBufPool.Get()
-	defer tmpBufPool.Put(bb)
-	bb.B = storage.MarshalBlock(bb.B[:0], b)
-
+func (tbf *tmpBlocksFile) WriteBlockData(b []byte) (tmpBlockAddr, error) {
 	var addr tmpBlockAddr
 	addr.offset = tbf.offset
-	addr.size = len(bb.B)
+	addr.size = len(b)
 	tbf.offset += uint64(addr.size)
-	if len(tbf.buf)+len(bb.B) <= cap(tbf.buf) {
+	if len(tbf.buf)+len(b) <= cap(tbf.buf) {
 		// Fast path - the data fits tbf.buf
-		tbf.buf = append(tbf.buf, bb.B...)
+		tbf.buf = append(tbf.buf, b...)
 		return addr, nil
 	}

@@ -111,7 +107,7 @@ func (tbf *tmpBlocksFile) WriteBlock(b *storage.Block) (tmpBlockAddr, error) {
 		tmpBlocksFilesCreated.Inc()
 	}
 	_, err := tbf.f.Write(tbf.buf)
-	tbf.buf = append(tbf.buf[:0], bb.B...)
+	tbf.buf = append(tbf.buf[:0], b...)
 	if err != nil {
 		return addr, fmt.Errorf("cannot write block to %q: %s", tbf.f.Name(), err)
 	}
@@ -129,7 +125,10 @@ func (tbf *tmpBlocksFile) Finalize() error {
 	if _, err := tbf.f.Seek(0, 0); err != nil {
 		logger.Panicf("FATAL: cannot seek to the start of file: %s", err)
 	}
-	mustFadviseRandomRead(tbf.f)
+	// Hint the OS that the file is read almost sequentially.
+	// This should reduce the number of disk seeks, which is important
+	// for HDDs.
+	mustFadviseSequentialRead(tbf.f)
 	return nil
 }

--- a/app/vmselect/netstorage/tmp_blocks_file_test.go
+++ b/app/vmselect/netstorage/tmp_blocks_file_test.go
@@ -77,9 +77,12 @@ func testTmpBlocksFile() error {
 			// Write blocks until their summary size exceeds `size`.
 			var addrs []tmpBlockAddr
 			var blocks []*storage.Block
+			bb := tmpBufPool.Get()
+			defer tmpBufPool.Put(bb)
 			for tbf.offset < uint64(size) {
 				b := createBlock()
-				addr, err := tbf.WriteBlock(b)
+				bb.B = storage.MarshalBlock(bb.B[:0], b)
+				addr, err := tbf.WriteBlockData(bb.B)
 				if err != nil {
 					return fmt.Errorf("cannot write block at offset %d: %s", tbf.offset, err)
 				}
--- a/app/vmselect/prometheus/export.qtpl
+++ b/app/vmselect/prometheus/export.qtpl
@@ -13,7 +13,7 @@
 	{% for i, ts := range rs.Timestamps %}
 		{%z= bb.B %}{% space %}
 		{%f= rs.Values[i] %}{% space %}
-		{%d= int(ts) %}{% newline %}
+		{%dl= ts %}{% newline %}
 	{% endfor %}
 	{% code quicktemplate.ReleaseByteBuffer(bb) %}
 {% endfunc %}
@@ -35,10 +35,10 @@
 		"timestamps":[
 			{% if len(rs.Timestamps) > 0 %}
 				{% code timestamps := rs.Timestamps %}
-				{%d= int(timestamps[0]) %}
+				{%dl= timestamps[0] %}
 				{% code timestamps = timestamps[1:] %}
 				{% for _, ts := range timestamps %}
-					,{%d= int(ts) %}
+					,{%dl= ts %}
 				{% endfor %}
 			{% endif %}
 		]
--- a/app/vmselect/prometheus/export.qtpl.go
+++ b/app/vmselect/prometheus/export.qtpl.go
@@ -49,7 +49,7 @@ func StreamExportPrometheusLine(qw422016 *qt422016.Writer, rs *netstorage.Result
 //line app/vmselect/prometheus/export.qtpl:15
 		qw422016.N().S(` `)
 //line app/vmselect/prometheus/export.qtpl:16
-		qw422016.N().D(int(ts))
+		qw422016.N().DL(ts)
 //line app/vmselect/prometheus/export.qtpl:16
 		qw422016.N().S(`
 `)
@@ -129,7 +129,7 @@ func StreamExportJSONLine(qw422016 *qt422016.Writer, rs *netstorage.Result) {
 		timestamps := rs.Timestamps

 //line app/vmselect/prometheus/export.qtpl:38
-		qw422016.N().D(int(timestamps[0]))
+		qw422016.N().DL(timestamps[0])
 //line app/vmselect/prometheus/export.qtpl:39
 		timestamps = timestamps[1:]

@@ -138,7 +138,7 @@ func StreamExportJSONLine(qw422016 *qt422016.Writer, rs *netstorage.Result) {
 //line app/vmselect/prometheus/export.qtpl:40
 			qw422016.N().S(`,`)
 //line app/vmselect/prometheus/export.qtpl:41
-			qw422016.N().D(int(ts))
+			qw422016.N().DL(ts)
 //line app/vmselect/prometheus/export.qtpl:42
 		}
 //line app/vmselect/prometheus/export.qtpl:43
--- a/app/vmselect/prometheus/federate.qtpl
+++ b/app/vmselect/prometheus/federate.qtpl
@@ -10,7 +10,7 @@
 	{% if len(rs.Timestamps) == 0 || len(rs.Values) == 0 %}{% return %}{% endif %}
 	{%= prometheusMetricName(&rs.MetricName) %}{% space %}
 	{%f= rs.Values[len(rs.Values)-1] %}{% space %}
-	{%d= int(rs.Timestamps[len(rs.Timestamps)-1]) %}{% newline %}
+	{%dl= rs.Timestamps[len(rs.Timestamps)-1] %}{% newline %}
 {% endfunc %}

 {% endstripspace %}
--- a/app/vmselect/prometheus/federate.qtpl.go
+++ b/app/vmselect/prometheus/federate.qtpl.go
@@ -41,7 +41,7 @@ func StreamFederate(qw422016 *qt422016.Writer, rs *netstorage.Result) {
 //line app/vmselect/prometheus/federate.qtpl:12
 	qw422016.N().S(` `)
 //line app/vmselect/prometheus/federate.qtpl:13
-	qw422016.N().D(int(rs.Timestamps[len(rs.Timestamps)-1]))
+	qw422016.N().DL(rs.Timestamps[len(rs.Timestamps)-1])
 //line app/vmselect/prometheus/federate.qtpl:13
 	qw422016.N().S(`
 `)
--- a/app/vmselect/prometheus/prometheus.go
+++ b/app/vmselect/prometheus/prometheus.go
@@ -21,17 +21,17 @@ import (
 )

 var (
+	latencyOffset = flag.Duration("search.latencyOffset", time.Second*60, "The time when data points become visible in query results after the collection. "+
+		"Too small a value can result in incomplete last points for query results")
 	maxQueryDuration = flag.Duration("search.maxQueryDuration", time.Second*30, "The maximum time for search query execution")
 	maxQueryLen      = flag.Int("search.maxQueryLen", 16*1024, "The maximum search query length in bytes")
+	maxLookback      = flag.Duration("search.maxLookback", 0, "Synonym for `-search.lookback-delta` from Prometheus. "+
+		"The value is dynamically detected from the interval between time series datapoints if not set. It can be overridden on a per-query basis via `max_lookback` arg")
 )

 // Default step used if not set.
 const defaultStep = 5 * 60 * 1000

-// Latency for data processing pipeline, i.e. the time between data is ignested
-// into the system and the time it becomes visible to search.
-const latencyOffset = 60 * 1000
-
 // FederateHandler implements /federate . See https://prometheus.io/docs/prometheus/latest/federation/
 func FederateHandler(w http.ResponseWriter, r *http.Request) error {
 	startTime := time.Now()
@@ -43,11 +43,14 @@ func FederateHandler(w http.ResponseWriter, r *http.Request) error {
 	if len(matches) == 0 {
 		return fmt.Errorf("missing `match[]` arg")
 	}
-	maxLookback, err := getDuration(r, "max_lookback", defaultStep)
+	lookbackDelta, err := getMaxLookback(r)
 	if err != nil {
 		return err
 	}
-	start, err := getTime(r, "start", ct-maxLookback)
+	if lookbackDelta <= 0 {
+		lookbackDelta = defaultStep
+	}
+	start, err := getTime(r, "start", ct-lookbackDelta)
 	if err != nil {
 		return err
 	}
@@ -463,17 +466,22 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
 	if err != nil {
 		return err
 	}
-	step, err := getDuration(r, "step", latencyOffset)
+	queryOffset := getLatencyOffsetMilliseconds()
+	step, err := getDuration(r, "step", queryOffset)
 	if err != nil {
 		return err
 	}
 	deadline := getDeadline(r)
+	lookbackDelta, err := getMaxLookback(r)
+	if err != nil {
+		return err
+	}

 	if len(query) > *maxQueryLen {
 		return fmt.Errorf(`too long query; got %d bytes; mustn't exceed %d bytes`, len(query), *maxQueryLen)
 	}
-	if ct-start < latencyOffset {
-		start -= latencyOffset
+	if ct-start < queryOffset {
+		start -= queryOffset
 	}
 	if childQuery, windowStr, offsetStr := promql.IsMetricSelectorWithRollup(query); childQuery != "" {
 		var window int64
@@ -503,10 +511,11 @@ func QueryHandler(w http.ResponseWriter, r *http.Request) error {
 	}

 	ec := promql.EvalConfig{
-		Start:    start,
-		End:      start,
-		Step:     step,
-		Deadline: deadline,
+		Start:         start,
+		End:           start,
+		Step:          step,
+		Deadline:      deadline,
+		LookbackDelta: lookbackDelta,
 	}
 	result, err := promql.Exec(&ec, query, true)
 	if err != nil {
@@ -546,6 +555,10 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
 	}
 	deadline := getDeadline(r)
 	mayCache := !getBool(r, "nocache")
+	lookbackDelta, err := getMaxLookback(r)
+	if err != nil {
+		return err
+	}

 	// Validate input args.
 	if len(query) > *maxQueryLen {
@@ -562,17 +575,19 @@ func QueryRangeHandler(w http.ResponseWriter, r *http.Request) error {
 	}

 	ec := promql.EvalConfig{
-		Start:    start,
-		End:      end,
-		Step:     step,
-		Deadline: deadline,
-		MayCache: mayCache,
+		Start:         start,
+		End:           end,
+		Step:          step,
+		Deadline:      deadline,
+		MayCache:      mayCache,
+		LookbackDelta: lookbackDelta,
 	}
 	result, err := promql.Exec(&ec, query, false)
 	if err != nil {
 		return fmt.Errorf("cannot execute %q: %s", query, err)
 	}
-	if ct-end < latencyOffset {
+	queryOffset := getLatencyOffsetMilliseconds()
+	if ct-end < queryOffset {
 		result = adjustLastPoints(result)
 	}

@@ -726,6 +741,11 @@ func getDuration(r *http.Request, argKey string, defaultValue int64) (int64, err

 const maxDurationMsecs = 100 * 365 * 24 * 3600 * 1000

+func getMaxLookback(r *http.Request) (int64, error) {
+	d := int64(*maxLookback / time.Millisecond)
+	return getDuration(r, "max_lookback", d)
+}
+
 func getDeadline(r *http.Request) netstorage.Deadline {
 	d, err := getDuration(r, "timeout", 0)
 	if err != nil {
@@ -764,3 +784,11 @@ func getTagFilterssFromMatches(matches []string) ([][]storage.TagFilter, error)
 	}
 	return tagFilterss, nil
 }
+
+func getLatencyOffsetMilliseconds() int64 {
+	d := int64(*latencyOffset / time.Millisecond)
+	if d <= 1000 {
+		d = 1000
+	}
+	return d
+}
--- a/app/vmselect/prometheus/series_count_response.qtpl
+++ b/app/vmselect/prometheus/series_count_response.qtpl
@@ -3,7 +3,7 @@ SeriesCountResponse generates response for /api/v1/series/count .
 {% func SeriesCountResponse(n uint64) %}
 {
 	"status":"success",
-	"data":[{%d int(n) %}]
+	"data":[{%dl int64(n) %}]
 }
 {% endfunc %}
 {% endstripspace %}
--- a/app/vmselect/prometheus/series_count_response.qtpl.go
+++ b/app/vmselect/prometheus/series_count_response.qtpl.go
@@ -24,7 +24,7 @@ func StreamSeriesCountResponse(qw422016 *qt422016.Writer, n uint64) {
 //line app/vmselect/prometheus/series_count_response.qtpl:3
 	qw422016.N().S(`{"status":"success","data":[`)
 //line app/vmselect/prometheus/series_count_response.qtpl:6
-	qw422016.N().D(int(n))
+	qw422016.N().DL(int64(n))
 //line app/vmselect/prometheus/series_count_response.qtpl:6
 	qw422016.N().S(`]}`)
 //line app/vmselect/prometheus/series_count_response.qtpl:8
--- a/app/vmselect/promql/arch_386.go
+++ b/app/vmselect/promql/arch_386.go
@@ -0,0 +1,3 @@
+package promql
+
+const maxByteSliceLen = 1<<31 - 1
--- a/app/vmselect/promql/binary_op.go
+++ b/app/vmselect/promql/binary_op.go
@@ -292,24 +292,14 @@ func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeser
 	}

 	// Slow path: `vector op vector` or `a op {on|ignoring} {group_left|group_right} b`
-	ensureOneX := func(side string, tss []*timeseries) error {
-		if len(tss) == 0 {
-			logger.Panicf("BUG: tss must contain at least one value")
-		}
-		if len(tss) == 1 {
-			return nil
-		}
-		if mergeNonOverlappingTimeseries(tss) {
-			return nil
-		}
-		return fmt.Errorf(`duplicate timeseries on the %s side of %s %s: %s and %s`, side, be.Op, be.GroupModifier.AppendString(nil),
-			stringMetricTags(&tss[0].MetricName), stringMetricTags(&tss[1].MetricName))
-	}
-
 	var rvsLeft, rvsRight []*timeseries
 	mLeft, mRight := createTimeseriesMapByTagSet(be, left, right)
 	joinOp := strings.ToLower(be.JoinModifier.Op)
-	joinTags := be.JoinModifier.Args
+	groupOp := strings.ToLower(be.GroupModifier.Op)
+	if len(groupOp) == 0 {
+		groupOp = "ignoring"
+	}
+	groupTags := be.GroupModifier.Args
 	for k, tssLeft := range mLeft {
 		tssRight := mRight[k]
 		if len(tssRight) == 0 {
@@ -317,39 +307,38 @@ func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeser
 		}
 		switch joinOp {
 		case "group_left":
-			if err := ensureOneX("right", tssRight); err != nil {
+			var err error
+			rvsLeft, rvsRight, err = groupJoin("right", be, rvsLeft, rvsRight, tssLeft, tssRight)
+			if err != nil {
 				return nil, nil, nil, err
 			}
-			src := tssRight[0]
-			for _, ts := range tssLeft {
-				resetMetricGroupIfRequired(be, ts)
-				ts.MetricName.AddMissingTags(joinTags, &src.MetricName)
-				rvsLeft = append(rvsLeft, ts)
-				rvsRight = append(rvsRight, src)
-			}
 		case "group_right":
-			if err := ensureOneX("left", tssLeft); err != nil {
+			var err error
+			rvsRight, rvsLeft, err = groupJoin("left", be, rvsRight, rvsLeft, tssRight, tssLeft)
+			if err != nil {
 				return nil, nil, nil, err
 			}
-			src := tssLeft[0]
-			for _, ts := range tssRight {
-				resetMetricGroupIfRequired(be, ts)
-				ts.MetricName.AddMissingTags(joinTags, &src.MetricName)
-				rvsLeft = append(rvsLeft, src)
-				rvsRight = append(rvsRight, ts)
-			}
 		case "":
-			if err := ensureOneX("left", tssLeft); err != nil {
+			if err := ensureSingleTimeseries("left", be, tssLeft); err != nil {
 				return nil, nil, nil, err
 			}
-			if err := ensureOneX("right", tssRight); err != nil {
+			if err := ensureSingleTimeseries("right", be, tssRight); err != nil {
 				return nil, nil, nil, err
 			}
-			resetMetricGroupIfRequired(be, tssLeft[0])
-			rvsLeft = append(rvsLeft, tssLeft[0])
+			tsLeft := tssLeft[0]
+			resetMetricGroupIfRequired(be, tsLeft)
+			switch groupOp {
+			case "on":
+				tsLeft.MetricName.RemoveTagsOn(groupTags)
+			case "ignoring":
+				tsLeft.MetricName.RemoveTagsIgnoring(groupTags)
+			default:
+				logger.Panicf("BUG: unexpected binary op modifier %q", groupOp)
+			}
+			rvsLeft = append(rvsLeft, tsLeft)
 			rvsRight = append(rvsRight, tssRight[0])
 		default:
-			return nil, nil, nil, fmt.Errorf(`unexpected join modifier %q`, joinOp)
+			logger.Panicf("BUG: unexpected join modifier %q", joinOp)
 		}
 	}
 	dst := rvsLeft
@@ -359,6 +348,90 @@ func adjustBinaryOpTags(be *binaryOpExpr, left, right []*timeseries) ([]*timeser
 	return rvsLeft, rvsRight, dst, nil
 }

+func ensureSingleTimeseries(side string, be *binaryOpExpr, tss []*timeseries) error {
+	if len(tss) == 0 {
+		logger.Panicf("BUG: tss must contain at least one value")
+	}
+	for len(tss) > 1 {
+		if !mergeNonOverlappingTimeseries(tss[0], tss[len(tss)-1]) {
+			return fmt.Errorf(`duplicate time series on the %s side of %s %s: %s and %s`, side, be.Op, be.GroupModifier.AppendString(nil),
+				stringMetricTags(&tss[0].MetricName), stringMetricTags(&tss[len(tss)-1].MetricName))
+		}
+		tss = tss[:len(tss)-1]
+	}
+	return nil
+}
+
+func groupJoin(singleTimeseriesSide string, be *binaryOpExpr, rvsLeft, rvsRight, tssLeft, tssRight []*timeseries) ([]*timeseries, []*timeseries, error) {
+	joinTags := be.JoinModifier.Args
+	var m map[string]*timeseries
+	for _, tsLeft := range tssLeft {
+		resetMetricGroupIfRequired(be, tsLeft)
+		if len(tssRight) == 1 {
+			// Easy case - right part contains only a single matching time series.
+			tsLeft.MetricName.AddMissingTags(joinTags, &tssRight[0].MetricName)
+			rvsLeft = append(rvsLeft, tsLeft)
+			rvsRight = append(rvsRight, tssRight[0])
+			continue
+		}
+
+		// Hard case - right part contains multiple matching time series.
+		// Verify it doesn't result in duplicate MetricName values after adding missing tags.
+		if m == nil {
+			m = make(map[string]*timeseries, len(tssRight))
+		} else {
+			for k := range m {
+				delete(m, k)
+			}
+		}
+		bb := bbPool.Get()
+		for _, tsRight := range tssRight {
+			var tsCopy timeseries
+			tsCopy.CopyFromShallowTimestamps(tsLeft)
+			tsCopy.MetricName.AddMissingTags(joinTags, &tsRight.MetricName)
+			bb.B = marshalMetricTagsSorted(bb.B[:0], &tsCopy.MetricName)
+			if tsExisting := m[string(bb.B)]; tsExisting != nil {
+				// Try merging tsExisting with tsRight if they don't overlap.
+				if mergeNonOverlappingTimeseries(tsExisting, tsRight) {
+					continue
+				}
+				return nil, nil, fmt.Errorf("duplicate time series on the %s side of `%s %s %s`: %s and %s",
+					singleTimeseriesSide, be.Op, be.GroupModifier.AppendString(nil), be.JoinModifier.AppendString(nil),
+					stringMetricTags(&tsExisting.MetricName), stringMetricTags(&tsRight.MetricName))
+			}
+			m[string(bb.B)] = tsRight
+			rvsLeft = append(rvsLeft, &tsCopy)
+			rvsRight = append(rvsRight, tsRight)
+		}
+		bbPool.Put(bb)
+	}
+	return rvsLeft, rvsRight, nil
+}
+
+func mergeNonOverlappingTimeseries(dst, src *timeseries) bool {
+	// Verify whether the time series can be merged.
+	srcValues := src.Values
+	dstValues := dst.Values
+	_ = dstValues[len(srcValues)-1]
+	for i, v := range srcValues {
+		if math.IsNaN(v) {
+			continue
+		}
+		if !math.IsNaN(dstValues[i]) {
+			return false
+		}
+	}
+
+	// Time series can be merged. Merge them.
+	for i, v := range srcValues {
+		if math.IsNaN(v) {
+			continue
+		}
+		dstValues[i] = v
+	}
+	return true
+}
+
 func resetMetricGroupIfRequired(be *binaryOpExpr, ts *timeseries) {
 	if isBinaryOpCmp(be.Op) && !be.Bool {
 		// Do not reset MetricGroup for non-boolean `compare` binary ops like Prometheus does.
@@ -535,26 +608,3 @@ func isScalar(arg []*timeseries) bool {
 	}
 	return len(mn.Tags) == 0
 }
-
-func mergeNonOverlappingTimeseries(tss []*timeseries) bool {
-	if len(tss) < 2 {
-		logger.Panicf("BUG: expecting at least two timeseries. Got %d", len(tss))
-	}
-
-	// Check whether time series in tss overlap.
-	var dst timeseries
-	dst.CopyFromShallowTimestamps(tss[0])
-	dstValues := dst.Values
-	for _, ts := range tss[1:] {
-		for i, value := range ts.Values {
-			if math.IsNaN(dstValues[i]) {
-				dstValues[i] = value
-			} else if !math.IsNaN(value) {
-				// Time series overlap.
-				return false
-			}
-		}
-	}
-	tss[0].CopyFromShallowTimestamps(&dst)
-	return true
-}
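The rewritten `mergeNonOverlappingTimeseries(dst, src)` above works in two passes: it first checks that no index holds a non-NaN value on both sides, and only then copies values from `src` into `dst`. The following is a minimal standalone sketch of that rule on plain `[]float64` slices. The name `mergeNonOverlapping` and the length guard are assumptions added for illustration and are not part of the patch.

```go
package main

import (
	"fmt"
	"math"
)

var nan = math.NaN()

// mergeNonOverlapping folds src into dst only if no index holds a real
// (non-NaN) value in both slices. It returns false and leaves dst
// untouched when the slices overlap.
func mergeNonOverlapping(dst, src []float64) bool {
	// Hypothetical guard: the patched code assumes equal lengths instead.
	if len(dst) < len(src) {
		return false
	}
	// Pass 1: verify that the slices can be merged.
	for i, v := range src {
		if !math.IsNaN(v) && !math.IsNaN(dst[i]) {
			return false
		}
	}
	// Pass 2: copy the non-NaN values from src into dst.
	for i, v := range src {
		if !math.IsNaN(v) {
			dst[i] = v
		}
	}
	return true
}

func main() {
	a := []float64{1, 2, nan, nan}
	b := []float64{nan, nan, 3, 4}
	fmt.Println(mergeNonOverlapping(a, b), a)

	// Index 1 is set on both sides, so this merge is rejected.
	fmt.Println(mergeNonOverlapping(a, []float64{nan, 9, nan, nan}))
}
```

Checking before writing means a failed merge leaves `dst` unchanged, which lets the caller report both conflicting series in the error message.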
--- a/app/vmselect/promql/eval.go
+++ b/app/vmselect/promql/eval.go
@@ -70,6 +70,9 @@ type EvalConfig struct {

 	MayCache bool

+	// LookbackDelta is analogous to `-query.lookback-delta` from Prometheus.
+	LookbackDelta int64
+
 	timestamps     []int64
 	timestampsOnce sync.Once
 }
@@ -82,6 +85,7 @@ func newEvalConfig(src *EvalConfig) *EvalConfig {
 	ec.Step = src.Step
 	ec.Deadline = src.Deadline
 	ec.MayCache = src.MayCache
+	ec.LookbackDelta = src.LookbackDelta

 	// do not copy src.timestamps - they must be generated again.
 	return &ec
@@ -290,10 +294,10 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
 		return fe, nrf
 	}
 	if re, ok := e.(*rollupExpr); ok {
-		if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() {
+		if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
 			return nil, nil
 		}
-		// e = rollupExpr(metricExpr)
+		// e = metricExpr[d]
 		fe := &funcExpr{
 			Name: "default_rollup",
 			Args: []expr{re},
@@ -315,15 +319,17 @@ func tryGetArgRollupFuncWithMetricExpr(ae *aggrFuncExpr) (*funcExpr, newRollupFu
 		if me.IsEmpty() {
 			return nil, nil
 		}
+		// e = rollupFunc(metricExpr)
 		return &funcExpr{
 			Name: fe.Name,
 			Args: []expr{me},
 		}, nrf
 	}
 	if re, ok := arg.(*rollupExpr); ok {
-		if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() {
+		if me, ok := re.Expr.(*metricExpr); !ok || me.IsEmpty() || re.ForSubquery() {
 			return nil, nil
 		}
+		// e = rollupFunc(metricExpr[d])
 		return fe, nrf
 	}
 	return nil, nil
@@ -368,8 +374,8 @@ func getRollupExprArg(arg expr) *rollupExpr {
 			Expr: arg,
 		}
 	}
-	if len(re.Step) == 0 && !re.InheritStep {
-		// Return standard rollup if it doesn't set step.
+	if !re.ForSubquery() {
+		// Return standard rollup if it doesn't contain subquery.
 		return re
 	}
 	me, ok := re.Expr.(*metricExpr)
@@ -463,7 +469,7 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
 	}

 	sharedTimestamps := getTimestamps(ec.Start, ec.End, ec.Step)
-	preFunc, rcs := getRollupConfigs(name, rf, ec.Start, ec.End, ec.Step, window, sharedTimestamps)
+	preFunc, rcs := getRollupConfigs(name, rf, ec.Start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)
 	tss := make([]*timeseries, 0, len(tssSQ)*len(rcs))
 	var tssLock sync.Mutex
 	removeMetricGroup := !rollupFuncsKeepMetricGroup[name]
@@ -584,7 +590,7 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
 		return tss, nil
 	}
 	sharedTimestamps := getTimestamps(start, ec.End, ec.Step)
-	preFunc, rcs := getRollupConfigs(name, rf, start, ec.End, ec.Step, window, sharedTimestamps)
+	preFunc, rcs := getRollupConfigs(name, rf, start, ec.End, ec.Step, window, ec.LookbackDelta, sharedTimestamps)

 	// Verify timeseries fit available memory after the rollup.
 	// Take into account points from tssCached.
@@ -687,7 +693,8 @@ func doRollupForTimeseries(rc *rollupConfig, tsDst *timeseries, mnSrc *storage.M
 	tsDst.denyReuse = true
 }

-func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64, sharedTimestamps []int64) (func(values []float64, timestamps []int64), []*rollupConfig) {
+func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64, lookbackDelta int64, sharedTimestamps []int64) (
+	func(values []float64, timestamps []int64), []*rollupConfig) {
 	preFunc := func(values []float64, timestamps []int64) {}
 	if rollupFuncsRemoveCounterResets[name] {
 		preFunc = func(values []float64, timestamps []int64) {
@@ -703,6 +710,7 @@ func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64
 			Step:            step,
 			Window:          window,
 			MayAdjustWindow: rollupFuncsMayAdjustWindow[name],
+			LookbackDelta:   lookbackDelta,
 			Timestamps:      sharedTimestamps,
 		}
 	}
--- a/app/vmselect/promql/exec.go
+++ b/app/vmselect/promql/exec.go
@@ -194,11 +194,14 @@ type parseCacheValue struct {
 }

 type parseCache struct {
-	m  map[string]*parseCacheValue
-	mu sync.RWMutex
+	// Move atomic counters to the top of struct for 8-byte alignment on 32-bit arch.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212

 	requests uint64
 	misses   uint64
+
+	m  map[string]*parseCacheValue
+	mu sync.RWMutex
 }

 func (pc *parseCache) Requests() uint64 {
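The field reordering in `parseCache` above relies on a documented `sync/atomic` guarantee: on 32-bit platforms, only the first word of an allocated struct is guaranteed to be 64-bit aligned. A minimal sketch of the same layout, using a hypothetical `cacheStats` type rather than the real `parseCache`:

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"unsafe"
)

// cacheStats is a hypothetical stand-in for parseCache / regexpCache.
// The uint64 counters come first, so they are 8-byte aligned even on
// 32-bit architectures such as GOARCH=386.
type cacheStats struct {
	requests uint64
	misses   uint64

	m  map[string]int
	mu sync.RWMutex
}

// get looks up k and updates the counters atomically.
func (c *cacheStats) get(k string) (int, bool) {
	atomic.AddUint64(&c.requests, 1)
	c.mu.RLock()
	v, ok := c.m[k]
	c.mu.RUnlock()
	if !ok {
		atomic.AddUint64(&c.misses, 1)
	}
	return v, ok
}

// requestsOffset reports the byte offset of the first counter in the struct.
func requestsOffset() uintptr {
	var c cacheStats
	return unsafe.Offsetof(c.requests)
}

func main() {
	c := &cacheStats{m: map[string]int{"a": 1}}
	c.get("a")
	c.get("b")
	fmt.Println(atomic.LoadUint64(&c.requests), atomic.LoadUint64(&c.misses), requestsOffset())
}
```

With the map and mutex placed first, as in the old layout, the counters could land on a 4-byte boundary on 386 and make 64-bit atomic operations panic.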
--- a/app/vmselect/promql/exec_test.go
+++ b/app/vmselect/promql/exec_test.go
@@ -369,6 +369,17 @@ func TestExecSuccess(t *testing.T) {
 		resultExpected := []netstorage.Result{r}
 		f(q, resultExpected)
 	})
+	t.Run("timestamp(time()>=1600)", func(t *testing.T) {
+		t.Parallel()
+		q := `timestamp(time()>=1600)`
+		r := netstorage.Result{
+			MetricName: metricNameExpected,
+			Values:     []float64{nan, nan, nan, 1600, 1800, 2000},
+			Timestamps: timestampsExpected,
+		}
+		resultExpected := []netstorage.Result{r}
+		f(q, resultExpected)
+	})
 	t.Run("time()/100", func(t *testing.T) {
 		t.Parallel()
 		q := `time()/100`
@@ -1826,10 +1837,6 @@ func TestExecSuccess(t *testing.T) {
 			Timestamps: timestampsExpected,
 		}
 		r.MetricName.Tags = []storage.Tag{
-			{
-				Key:   []byte("aa"),
-				Value: []byte("bb"),
-			},
 			{
 				Key:   []byte("foo"),
 				Value: []byte("bar"),
@@ -1851,12 +1858,75 @@ func TestExecSuccess(t *testing.T) {
 				Key:   []byte("foo"),
 				Value: []byte("bar"),
 			},
+		}
+		resultExpected := []netstorage.Result{r}
+		f(q, resultExpected)
+	})
+	t.Run(`vector * on(foo) group_left(additional_tag) duplicate_timeseries_differ_by_additional_tag`, func(t *testing.T) {
+		t.Parallel()
+		q := `sort(label_set(time()/10, "foo", "bar", "xx", "yy", "__name__", "qwert") + on(foo) group_left(op) (
+			label_set(time() < 1400, "foo", "bar", "op", "le"),
+			label_set(time() >= 1400, "foo", "bar", "op", "ge"),
+		))`
+		r1 := netstorage.Result{
+			MetricName: metricNameExpected,
+			Values:     []float64{1100, 1320, nan, nan, nan, nan},
+			Timestamps: timestampsExpected,
+		}
+		r1.MetricName.Tags = []storage.Tag{
+			{
+				Key:   []byte("foo"),
+				Value: []byte("bar"),
+			},
+			{
+				Key:   []byte("op"),
+				Value: []byte("le"),
+			},
 			{
 				Key:   []byte("xx"),
 				Value: []byte("yy"),
 			},
 		}
-		resultExpected := []netstorage.Result{r}
+		r2 := netstorage.Result{
+			MetricName: metricNameExpected,
+			Values:     []float64{nan, nan, 1540, 1760, 1980, 2200},
+			Timestamps: timestampsExpected,
+		}
+		r2.MetricName.Tags = []storage.Tag{
+			{
+				Key:   []byte("foo"),
+				Value: []byte("bar"),
+			},
+			{
+				Key:   []byte("op"),
+				Value: []byte("ge"),
+			},
+			{
+				Key:   []byte("xx"),
+				Value: []byte("yy"),
+			},
+		}
+		resultExpected := []netstorage.Result{r1, r2}
+		f(q, resultExpected)
+	})
+	t.Run(`vector * on(foo) duplicate_nonoverlapping_timeseries`, func(t *testing.T) {
+		t.Parallel()
+		q := `label_set(time()/10, "foo", "bar", "xx", "yy", "__name__", "qwert") + on(foo) (
+			label_set(time() < 1400, "foo", "bar", "op", "le"),
+			label_set(time() >= 1400, "foo", "bar", "op", "ge"),
+		)`
+		r1 := netstorage.Result{
+			MetricName: metricNameExpected,
+			Values:     []float64{1100, 1320, 1540, 1760, 1980, 2200},
+			Timestamps: timestampsExpected,
+		}
+		r1.MetricName.Tags = []storage.Tag{
+			{
+				Key:   []byte("foo"),
+				Value: []byte("bar"),
+			},
+		}
+		resultExpected := []netstorage.Result{r1}
 		f(q, resultExpected)
 	})
 	t.Run(`vector * on(foo) group_left() duplicate_nonoverlapping_timeseries`, func(t *testing.T) {
@@ -2043,10 +2113,6 @@ func TestExecSuccess(t *testing.T) {
 			Timestamps: timestampsExpected,
 		}
 		r.MetricName.Tags = []storage.Tag{
-			{
-				Key:   []byte("t1"),
-				Value: []byte("v123"),
-			},
 			{
 				Key:   []byte("t2"),
 				Value: []byte("v3"),
@@ -2152,10 +2218,6 @@ func TestExecSuccess(t *testing.T) {
 			Timestamps: timestampsExpected,
 		}
 		r.MetricName.Tags = []storage.Tag{
-			{
-				Key:   []byte("t1"),
-				Value: []byte("v123"),
-			},
 			{
 				Key:   []byte("t2"),
 				Value: []byte("v3"),
@@ -2620,6 +2682,28 @@ func TestExecSuccess(t *testing.T) {
 		resultExpected := []netstorage.Result{r}
 		f(q, resultExpected)
 	})
+	t.Run(`increases_over_time`, func(t *testing.T) {
+		t.Parallel()
+		q := `increases_over_time(rand(0)[200s:10s])`
+		r := netstorage.Result{
+			MetricName: metricNameExpected,
+			Values:     []float64{11, 9, 9, 12, 9, 8},
+			Timestamps: timestampsExpected,
+		}
+		resultExpected := []netstorage.Result{r}
+		f(q, resultExpected)
+	})
+	t.Run(`decreases_over_time`, func(t *testing.T) {
+		t.Parallel()
+		q := `decreases_over_time(rand(0)[200s:10s])`
+		r := netstorage.Result{
+			MetricName: metricNameExpected,
+			Values:     []float64{9, 11, 11, 8, 11, 12},
+			Timestamps: timestampsExpected,
+		}
+		resultExpected := []netstorage.Result{r}
+		f(q, resultExpected)
+	})
 	t.Run(`limitk(-1)`, func(t *testing.T) {
 		t.Parallel()
 		q := `limitk(-1, label_set(10, "foo", "bar") or label_set(time()/150, "baz", "sss"))`
@@ -3472,7 +3556,7 @@ func TestExecSuccess(t *testing.T) {
 		}}
 		r4 := netstorage.Result{
 			MetricName: metricNameExpected,
-			Values:     []float64{0.85, 0.94, 0.97, 0.93, 0.98, 0.92},
+			Values:     []float64{0.9, 0.94, 0.97, 0.93, 0.98, 0.92},
 			Timestamps: timestampsExpected,
 		}
 		r4.MetricName.Tags = []storage.Tag{{
@@ -3520,7 +3604,7 @@ func TestExecSuccess(t *testing.T) {
 		q := `sort(rollup(time()[:50s]))`
 		r1 := netstorage.Result{
 			MetricName: metricNameExpected,
-			Values:     []float64{850, 1050, 1250, 1450, 1650, 1850},
+			Values:     []float64{800, 1000, 1200, 1400, 1600, 1800},
 			Timestamps: timestampsExpected,
 		}
 		r1.MetricName.Tags = []storage.Tag{{
--- a/app/vmselect/promql/parser.go
+++ b/app/vmselect/promql/parser.go
@@ -1550,6 +1550,10 @@ type rollupExpr struct {
 	InheritStep bool
 }

+func (re *rollupExpr) ForSubquery() bool {
+	return len(re.Step) > 0 || re.InheritStep
+}
+
 func (re *rollupExpr) AppendString(dst []byte) []byte {
 	needParens := func() bool {
 		if _, ok := re.Expr.(*rollupExpr); ok {
--- a/app/vmselect/promql/regexp_cache.go
+++ b/app/vmselect/promql/regexp_cache.go
@@ -51,11 +51,14 @@ type regexpCacheValue struct {
 }

 type regexpCache struct {
-	m  map[string]*regexpCacheValue
-	mu sync.RWMutex
+	// Move atomic counters to the top of struct for 8-byte alignment on 32-bit arch.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212

 	requests uint64
 	misses   uint64
+
+	m  map[string]*regexpCacheValue
+	mu sync.RWMutex
 }

 func (rc *regexpCache) Requests() uint64 {
--- a/app/vmselect/promql/rollup.go
+++ b/app/vmselect/promql/rollup.go
@@ -38,21 +38,23 @@ var rollupFuncs = map[string]newRollupFunc{
 	"stdvar_over_time":   newRollupFuncOneArg(rollupStdvar),

 	// Additional rollup funcs.
-	"sum2_over_time":     newRollupFuncOneArg(rollupSum2),
-	"geomean_over_time":  newRollupFuncOneArg(rollupGeomean),
-	"first_over_time":    newRollupFuncOneArg(rollupFirst),
-	"last_over_time":     newRollupFuncOneArg(rollupLast),
-	"distinct_over_time": newRollupFuncOneArg(rollupDistinct),
-	"integrate":          newRollupFuncOneArg(rollupIntegrate),
-	"ideriv":             newRollupFuncOneArg(rollupIderiv),
-	"lifetime":           newRollupFuncOneArg(rollupLifetime),
-	"scrape_interval":    newRollupFuncOneArg(rollupScrapeInterval),
-	"rollup":             newRollupFuncOneArg(rollupFake),
-	"rollup_rate":        newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
-	"rollup_deriv":       newRollupFuncOneArg(rollupFake),
-	"rollup_delta":       newRollupFuncOneArg(rollupFake),
-	"rollup_increase":    newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
-	"rollup_candlestick": newRollupFuncOneArg(rollupFake),
+	"sum2_over_time":      newRollupFuncOneArg(rollupSum2),
+	"geomean_over_time":   newRollupFuncOneArg(rollupGeomean),
+	"first_over_time":     newRollupFuncOneArg(rollupFirst),
+	"last_over_time":      newRollupFuncOneArg(rollupLast),
+	"distinct_over_time":  newRollupFuncOneArg(rollupDistinct),
+	"increases_over_time": newRollupFuncOneArg(rollupIncreases),
+	"decreases_over_time": newRollupFuncOneArg(rollupDecreases),
+	"integrate":           newRollupFuncOneArg(rollupIntegrate),
+	"ideriv":              newRollupFuncOneArg(rollupIderiv),
+	"lifetime":            newRollupFuncOneArg(rollupLifetime),
+	"scrape_interval":     newRollupFuncOneArg(rollupScrapeInterval),
+	"rollup":              newRollupFuncOneArg(rollupFake),
+	"rollup_rate":         newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
+	"rollup_deriv":        newRollupFuncOneArg(rollupFake),
+	"rollup_delta":        newRollupFuncOneArg(rollupFake),
+	"rollup_increase":     newRollupFuncOneArg(rollupFake), // + rollupFuncsRemoveCounterResets
+	"rollup_candlestick":  newRollupFuncOneArg(rollupFake),
 }

 var rollupFuncsMayAdjustWindow = map[string]bool{
@@ -147,6 +149,9 @@ type rollupConfig struct {
 	MayAdjustWindow bool

 	Timestamps []int64
+
+	// LookbackDelta is analogous to `-query.lookback-delta` from Prometheus.
+	LookbackDelta int64
 }

 var (
@@ -184,6 +189,9 @@ func (rc *rollupConfig) Do(dstValues []float64, values []float64, timestamps []i
 	dstValues = decimal.ExtendFloat64sCapacity(dstValues, len(rc.Timestamps))

 	maxPrevInterval := getMaxPrevInterval(timestamps)
+	if rc.LookbackDelta > 0 && maxPrevInterval > rc.LookbackDelta {
+		maxPrevInterval = rc.LookbackDelta
+	}
 	window := rc.Window
 	if window <= 0 {
 		window = rc.Step
@@ -531,11 +539,14 @@ func rollupAvg(rfa *rollupFuncArg) float64 {
 func rollupMin(rfa *rollupFuncArg) float64 {
 	// There is no need in handling NaNs here, since they must be cleaned up
 	// before calling rollup funcs.
+	minValue := rfa.prevValue
 	values := rfa.values
-	if len(values) == 0 {
-		return rfa.prevValue
+	if math.IsNaN(minValue) {
+		if len(values) == 0 {
+			return nan
+		}
+		minValue = values[0]
 	}
-	minValue := values[0]
 	for _, v := range values {
 		if v < minValue {
 			minValue = v
@@ -547,11 +558,14 @@ func rollupMin(rfa *rollupFuncArg) float64 {
 func rollupMax(rfa *rollupFuncArg) float64 {
 	// There is no need in handling NaNs here, since they must be cleaned up
 	// before calling rollup funcs.
+	maxValue := rfa.prevValue
 	values := rfa.values
-	if len(values) == 0 {
-		return rfa.prevValue
+	if math.IsNaN(maxValue) {
+		if len(values) == 0 {
+			return nan
+		}
+		maxValue = values[0]
 	}
-	maxValue := values[0]
 	for _, v := range values {
 		if v > maxValue {
 			maxValue = v
@@ -565,7 +579,10 @@ func rollupSum(rfa *rollupFuncArg) float64 {
 	// before calling rollup funcs.
 	values := rfa.values
 	if len(values) == 0 {
-		return rfa.prevValue
+		if math.IsNaN(rfa.prevValue) {
+			return nan
+		}
+		return 0
 	}
 	var sum float64
 	for _, v := range values {
@@ -820,6 +837,37 @@ func rollupChanges(rfa *rollupFuncArg) float64 {
 	return float64(n)
 }

+func rollupIncreases(rfa *rollupFuncArg) float64 {
+	// There is no need in handling NaNs here, since they must be cleaned up
+	// before calling rollup funcs.
+	values := rfa.values
+	if len(values) == 0 {
+		if math.IsNaN(rfa.prevValue) {
+			return nan
+		}
+		return 0
+	}
+	prevValue := rfa.prevValue
+	if math.IsNaN(prevValue) {
+		prevValue = values[0]
+		values = values[1:]
+	}
+	if len(values) == 0 {
+		return 0
+	}
+	n := 0
+	for _, v := range values {
+		if v > prevValue {
+			n++
+		}
+		prevValue = v
+	}
+	return float64(n)
+}
+
+// `decreases_over_time` logic is the same as `resets` logic.
+var rollupDecreases = rollupResets
+
 func rollupResets(rfa *rollupFuncArg) float64 {
 	// There is no need in handling NaNs here, since they must be cleaned up
 	// before calling rollup funcs.
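The counting rule in `rollupIncreases` above can be sketched outside the rollup machinery. The function below takes the previous value and the window values directly. The name `countIncreases` and its signature are assumptions for illustration, while the logic mirrors the patched function.

```go
package main

import (
	"fmt"
	"math"
)

var nan = math.NaN()

// countIncreases returns the number of points in values that are greater
// than their predecessor. prevValue is the last point before the window,
// or NaN if there is none.
func countIncreases(prevValue float64, values []float64) float64 {
	if len(values) == 0 {
		if math.IsNaN(prevValue) {
			return nan // no data at all
		}
		return 0
	}
	if math.IsNaN(prevValue) {
		// No point before the window: start from the first value.
		prevValue = values[0]
		values = values[1:]
	}
	n := 0
	for _, v := range values {
		if v > prevValue {
			n++
		}
		prevValue = v
	}
	return float64(n)
}

func main() {
	values := []float64{1, 3, 2, 5, 5, 7}
	fmt.Println(countIncreases(nan, values)) // 1->3, 2->5, 5->7
	fmt.Println(countIncreases(0, values))   // also counts 0->1
}
```

Equal neighbouring values count as neither an increase nor a decrease.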
--- a/app/vmselect/promql/rollup_test.go
+++ b/app/vmselect/promql/rollup_test.go
@@ -294,6 +294,8 @@ func TestRollupNewRollupFuncSuccess(t *testing.T) {
 	f("integrate", 61.0275)
 	f("distinct_over_time", 8)
 	f("ideriv", 0)
+	f("decreases_over_time", 5)
+	f("increases_over_time", 5)
 }

 func TestRollupNewRollupFuncError(t *testing.T) {
@@ -486,6 +488,51 @@ func TestRollupWindowPartialPoints(t *testing.T) {
 	})
 }

+func TestRollupFuncsLookbackDelta(t *testing.T) {
+	t.Run("1", func(t *testing.T) {
+		rc := rollupConfig{
+			Func:          rollupFirst,
+			Start:         80,
+			End:           140,
+			Step:          10,
+			LookbackDelta: 1,
+		}
+		rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
+		values := rc.Do(nil, testValues, testTimestamps)
+		valuesExpected := []float64{99, 12, 44, nan, 32, 34, nan}
+		timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
+		testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
+	})
+	t.Run("7", func(t *testing.T) {
+		rc := rollupConfig{
+			Func:          rollupFirst,
+			Start:         80,
+			End:           140,
+			Step:          10,
+			LookbackDelta: 7,
+		}
+		rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
+		values := rc.Do(nil, testValues, testTimestamps)
+		valuesExpected := []float64{99, 12, 44, 44, 32, 34, nan}
+		timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
+		testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
+	})
+	t.Run("0", func(t *testing.T) {
+		rc := rollupConfig{
+			Func:          rollupFirst,
+			Start:         80,
+			End:           140,
+			Step:          10,
+			LookbackDelta: 0,
+		}
+		rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
+		values := rc.Do(nil, testValues, testTimestamps)
+		valuesExpected := []float64{34, 12, 12, 44, 44, 34, nan}
+		timestampsExpected := []int64{80, 90, 100, 110, 120, 130, 140}
+		testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
+	})
+}
+
 func TestRollupFuncsNoWindow(t *testing.T) {
 	t.Run("first", func(t *testing.T) {
 		rc := rollupConfig{
@@ -525,7 +572,7 @@ func TestRollupFuncsNoWindow(t *testing.T) {
 		}
 		rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
 		values := rc.Do(nil, testValues, testTimestamps)
-		valuesExpected := []float64{nan, 21, 12, 32, 34}
+		valuesExpected := []float64{nan, 21, 12, 12, 34}
 		timestampsExpected := []int64{0, 40, 80, 120, 160}
 		testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
 	})
@@ -811,7 +858,7 @@ func testRowsEqual(t *testing.T, values []float64, timestamps []int64, valuesExp
 			}
 			continue
 		}
-		if v != vExpected {
+		if math.Abs(v-vExpected) > 1e-15 {
 			t.Fatalf("unexpected value at values[%d]; got %f; want %f\nvalues=\n%v\nvaluesExpected=\n%v",
 				i, v, vExpected, values, valuesExpected)
 		}
--- a/app/vmselect/promql/transform.go
+++ b/app/vmselect/promql/transform.go
@@ -1121,7 +1121,10 @@ func transformTimestamp(tfa *transformFuncArg) ([]*timeseries, error) {
 		ts.MetricName.ResetMetricGroup()
 		values := ts.Values
 		for i, t := range ts.Timestamps {
-			values[i] = float64(t) / 1e3
+			v := values[i]
+			if !math.IsNaN(v) {
+				values[i] = float64(t) / 1e3
+			}
 		}
 	}
 	return rvs, nil
--- a/app/vmstorage/main.go
+++ b/app/vmstorage/main.go
@@ -24,6 +24,9 @@ var (

 	// DataPath is a path to storage data.
 	DataPath = flag.String("storageDataPath", "victoria-metrics-data", "Path to storage data")
+
+	bigMergeConcurrency   = flag.Int("bigMergeConcurrency", 0, "The maximum number of CPU cores to use for big merges. Default value is used if set to 0")
+	smallMergeConcurrency = flag.Int("smallMergeConcurrency", 0, "The maximum number of CPU cores to use for small merges. Default value is used if set to 0")
 )

 // Init initializes vmstorage.
@@ -39,6 +42,10 @@ func InitWithoutMetrics() {
 	if err := encoding.CheckPrecisionBits(uint8(*precisionBits)); err != nil {
 		logger.Fatalf("invalid `-precisionBits`: %s", err)
 	}
+
+	storage.SetBigMergeWorkersCount(*bigMergeConcurrency)
+	storage.SetSmallMergeWorkersCount(*smallMergeConcurrency)
+
 	logger.Infof("opening storage at %q with retention period %d months", *DataPath, *retentionPeriod)
 	startTime := time.Now()
 	WG = syncwg.WaitGroup{}
--- a/dashboards/victoriametrics.json
+++ b/dashboards/victoriametrics.json
@@ -14,7 +14,7 @@
      "type": "grafana",
      "id": "grafana",
      "name": "Grafana",
-      "version": "6.2.1"
+      "version": "6.3.5"
    },
    {
      "type": "panel",
@@ -60,12 +60,12 @@
      }
    ]
  },
-  "description": "Overview for single node VictoriaMetrics v1.22.2 or higher",
+  "description": "Overview for single node VictoriaMetrics v1.28.0 or higher",
  "editable": true,
  "gnetId": 10229,
  "graphTooltip": 0,
  "id": null,
-  "iteration": 1563651131627,
+  "iteration": 1572208904768,
  "links": [
    {
      "icon": "doc",
@@ -121,7 +121,6 @@
        {
          "targetBlank": true,
          "title": "VictoriaMetrics releases",
-          "type": "absolute",
          "url": "https://github.com/VictoriaMetrics/VictoriaMetrics/releases"
        }
      ],
@@ -490,6 +489,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -513,7 +513,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null as zero",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -580,6 +582,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "The less time it takes is better.\n* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -603,7 +606,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null as zero",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -670,6 +675,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "Shows the number of active time series with new data points inserted during the last hour. High value may result in ingestion slowdown. \n\nSee following link for details:",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -692,12 +698,13 @@
        {
          "targetBlank": true,
          "title": "troubleshooting",
-          "type": "absolute",
          "url": "https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/README.md#troubleshooting"
        }
      ],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -764,6 +771,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "VictoriaMetrics stores various caches in RAM. Memory size for these caches may be limited with -`memory.allowedPercent` flag. Line `max allowed` shows max allowed memory size for cache.",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -784,7 +792,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -865,6 +875,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "* `*` - unsupported query path\n* `/write` - insert into VM\n* `/metrics` - query VM system metrics\n* `/query` - query instant values\n* `/query_range` - query over a range of time\n* `/series` - match a certain label set\n* `/label/{}/values` - query a list of label values (variables mostly)",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -888,7 +899,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null as zero",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -953,55 +966,74 @@
      "dashLength": 10,
      "dashes": false,
      "datasource": "${DS_PROMETHEUS}",
-      "description": "",
+      "description": "Shows how many ongoing insertions are taking place.\n* `max` - equal to number of CPU * 2\n* `current` - current number of goroutines busy with inserting rows into storage\n\nWhen `current` hits `max` constantly, it means storage is overloaded and requires more CPU.",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
        "y": 27
      },
-      "id": 49,
+      "id": 59,
      "legend": {
-        "avg": false,
-        "current": false,
+        "alignAsTable": true,
+        "avg": true,
+        "current": true,
+        "hideEmpty": false,
+        "hideZero": false,
        "max": false,
        "min": false,
-        "show": false,
+        "show": true,
+        "sort": "current",
+        "sortDesc": true,
        "total": false,
-        "values": false
+        "values": true
      },
      "lines": true,
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
      "renderer": "flot",
-      "seriesOverrides": [],
+      "seriesOverrides": [
+        {
+          "alias": "max",
+          "color": "#C4162A"
+        }
+      ],
      "spaceLength": 10,
      "stack": false,
      "steppedLine": false,
      "targets": [
        {
-          "expr": "sum(rate(vm_tcplistener_accepts_total{job=\"$job\"}[$__interval]))",
+          "expr": "sum(vm_concurrent_addrows_capacity{job=\"$job\"})",
          "format": "time_series",
-          "hide": false,
          "intervalFactor": 1,
-          "legendFormat": "connections",
+          "legendFormat": "max",
          "refId": "A"
+        },
+        {
+          "expr": "sum(vm_concurrent_addrows_current{job=\"$job\"})",
+          "format": "time_series",
+          "intervalFactor": 1,
+          "legendFormat": "current",
+          "refId": "B"
        }
      ],
      "thresholds": [],
      "timeFrom": null,
      "timeRegions": [],
      "timeShift": null,
-      "title": "TCP connections rate",
+      "title": "Concurrent inserts",
      "tooltip": {
        "shared": true,
-        "sort": 0,
+        "sort": 2,
        "value_type": "individual"
      },
      "type": "graph",
@@ -1014,7 +1046,7 @@
      },
      "yaxes": [
        {
-          "decimals": null,
+          "decimals": 0,
          "format": "short",
          "label": null,
          "logBase": 1,
@@ -1023,6 +1055,7 @@
          "show": true
        },
        {
+          "decimals": 0,
          "format": "short",
          "label": null,
          "logBase": 1,
@@ -1044,6 +1077,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -1064,7 +1098,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1125,6 +1161,98 @@
        "alignLevel": null
      }
    },
+    {
+      "aliasColors": {},
+      "bars": false,
+      "dashLength": 10,
+      "dashes": false,
+      "datasource": "${DS_PROMETHEUS}",
+      "description": "",
+      "fill": 1,
+      "fillGradient": 0,
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 12,
+        "y": 35
+      },
+      "id": 49,
+      "legend": {
+        "avg": false,
+        "current": false,
+        "max": false,
+        "min": false,
+        "show": false,
+        "total": false,
+        "values": false
+      },
+      "lines": true,
+      "linewidth": 1,
+      "links": [],
+      "nullPointMode": "null",
+      "options": {
+        "dataLinks": []
+      },
+      "percentage": false,
+      "pointradius": 2,
+      "points": false,
+      "renderer": "flot",
+      "seriesOverrides": [],
+      "spaceLength": 10,
+      "stack": false,
+      "steppedLine": false,
+      "targets": [
+        {
+          "expr": "sum(rate(vm_tcplistener_accepts_total{job=\"$job\"}[$__interval]))",
+          "format": "time_series",
+          "hide": false,
+          "intervalFactor": 1,
+          "legendFormat": "connections",
+          "refId": "A"
+        }
+      ],
+      "thresholds": [],
+      "timeFrom": null,
+      "timeRegions": [],
+      "timeShift": null,
+      "title": "TCP connections rate",
+      "tooltip": {
+        "shared": true,
+        "sort": 0,
+        "value_type": "individual"
+      },
+      "type": "graph",
+      "xaxis": {
+        "buckets": null,
+        "mode": "time",
+        "name": null,
+        "show": true,
+        "values": []
+      },
+      "yaxes": [
+        {
+          "decimals": null,
+          "format": "short",
+          "label": null,
+          "logBase": 1,
+          "max": null,
+          "min": null,
+          "show": true
+        },
+        {
+          "format": "short",
+          "label": null,
+          "logBase": 1,
+          "max": null,
+          "min": null,
+          "show": true
+        }
+      ],
+      "yaxis": {
+        "align": false,
+        "alignLevel": null
+      }
+    },
    {
      "collapsed": false,
      "gridPos": {
@@ -1146,6 +1274,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "How many datapoints are inserted into storage per second",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -1168,7 +1297,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null as zero",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1242,6 +1373,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "How many datapoints are in RAM queue waiting to be written into storage. The less is better.",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -1262,7 +1394,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1344,6 +1478,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "How many datapoints are in the storage.",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -1364,7 +1499,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1431,6 +1568,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "Data parts of LSM tree.\nHigh number of parts could be an evidence of slow merge performance - check the resource utilization.\n* `indexdb` - inverted index\n* `storage/small` - recently added parts of data ingested into storage(hot data)\n* `storage/big` -  small parts gradually merged into big parts (cold data)",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -1451,7 +1589,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1518,6 +1658,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "Shows amount of on-disk space occupied by data points.",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -1538,7 +1679,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1605,6 +1748,7 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "Shows amount of on-disk space occupied by inverted index.",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
@@ -1625,7 +1769,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1683,13 +1829,105 @@
        "alignLevel": null
      }
    },
+    {
+      "aliasColors": {},
+      "bars": false,
+      "dashLength": 10,
+      "dashes": false,
+      "datasource": "${DS_PROMETHEUS}",
+      "description": "Shows how many rows were ignored on insertion due to corrupted or out of retention timestamps.",
+      "fill": 1,
+      "fillGradient": 0,
+      "gridPos": {
+        "h": 8,
+        "w": 12,
+        "x": 0,
+        "y": 68
+      },
+      "id": 58,
+      "legend": {
+        "avg": false,
+        "current": false,
+        "max": false,
+        "min": false,
+        "show": false,
+        "total": false,
+        "values": false
+      },
+      "lines": true,
+      "linewidth": 1,
+      "links": [],
+      "nullPointMode": "null",
+      "options": {
+        "dataLinks": []
+      },
+      "percentage": false,
+      "pointradius": 2,
+      "points": false,
+      "renderer": "flot",
+      "seriesOverrides": [],
+      "spaceLength": 10,
+      "stack": false,
+      "steppedLine": false,
+      "targets": [
+        {
+          "expr": "sum(vm_rows_ignored_total{job=\"$job\"}) by (reason) > 0",
+          "format": "time_series",
+          "hide": false,
+          "intervalFactor": 1,
+          "legendFormat": "{{reason}}",
+          "refId": "A"
+        }
+      ],
+      "thresholds": [],
+      "timeFrom": null,
+      "timeRegions": [],
+      "timeShift": null,
+      "title": "Rows ignored",
+      "tooltip": {
+        "shared": true,
+        "sort": 0,
+        "value_type": "individual"
+      },
+      "type": "graph",
+      "xaxis": {
+        "buckets": null,
+        "mode": "time",
+        "name": null,
+        "show": true,
+        "values": []
+      },
+      "yaxes": [
+        {
+          "decimals": null,
+          "format": "short",
+          "label": null,
+          "logBase": 1,
+          "max": null,
+          "min": null,
+          "show": true
+        },
+        {
+          "format": "short",
+          "label": null,
+          "logBase": 1,
+          "max": null,
+          "min": null,
+          "show": true
+        }
+      ],
+      "yaxis": {
+        "align": false,
+        "alignLevel": null
+      }
+    },
    {
      "collapsed": false,
      "gridPos": {
        "h": 1,
        "w": 24,
        "x": 0,
-        "y": 68
+        "y": 76
      },
      "id": 46,
      "panels": [],
@@ -1704,11 +1942,12 @@
      "datasource": "${DS_PROMETHEUS}",
      "description": "",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
-        "y": 69
+        "y": 77
      },
      "id": 44,
      "legend": {
@@ -1724,7 +1963,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1806,11 +2047,12 @@
      "dashLength": 10,
      "dashes": false,
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
-        "y": 69
+        "y": 77
      },
      "id": 57,
      "legend": {
@@ -1826,7 +2068,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1892,11 +2136,12 @@
      "dashes": false,
      "datasource": "${DS_PROMETHEUS}",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
-        "y": 77
+        "y": 85
      },
      "id": 47,
      "legend": {
@@ -1912,7 +2157,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -1979,11 +2226,12 @@
      "dashes": false,
      "datasource": "${DS_PROMETHEUS}",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 12,
-        "y": 77
+        "y": 85
      },
      "id": 42,
      "legend": {
@@ -1999,7 +2247,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -2065,11 +2315,12 @@
      "dashes": false,
      "datasource": "${DS_PROMETHEUS}",
      "fill": 1,
+      "fillGradient": 0,
      "gridPos": {
        "h": 8,
        "w": 12,
        "x": 0,
-        "y": 85
+        "y": 93
      },
      "id": 48,
      "legend": {
@@ -2085,7 +2336,9 @@
      "linewidth": 1,
      "links": [],
      "nullPointMode": "null",
-      "options": {},
+      "options": {
+        "dataLinks": []
+      },
      "percentage": false,
      "pointradius": 2,
      "points": false,
@@ -2147,7 +2400,7 @@
    }
  ],
  "refresh": "30s",
-  "schemaVersion": 18,
+  "schemaVersion": 19,
  "style": "dark",
  "tags": [],
  "templating": {
@@ -2230,5 +2483,5 @@
  "timezone": "",
  "title": "VictoriaMetrics",
  "uid": "wNf0q_kZk",
-  "version": 2
-}
+  "version": 3
+}
--- a/deployment/docker/Makefile
+++ b/deployment/docker/Makefile
@@ -1,5 +1,5 @@
 DOCKER_NAMESPACE := victoriametrics
-BUILDER_IMAGE := local/builder:go1.13.0
+BUILDER_IMAGE := local/builder:go1.13.3
 CERTS_IMAGE := local/certs:1.0.2

 package-certs:
@@ -21,7 +21,8 @@ app-via-docker: package-certs package-builder
 		--env GO111MODULE=on \
 		$(DOCKER_OPTS) \
 		$(BUILDER_IMAGE) \
-		go build $(RACE) -mod=vendor -ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" -tags 'netgo osusergo' -o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)
+		go build $(RACE) -mod=vendor -trimpath -ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" -tags 'netgo osusergo' \
+			-o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)

 package-via-docker:
 	(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE)') || (\
--- a/deployment/docker/builder/Dockerfile
+++ b/deployment/docker/builder/Dockerfile
@@ -1,2 +1,2 @@
-FROM golang:1.13.0
+FROM golang:1.13.3
 STOPSIGNAL SIGINT
--- a/deployment/docker/docker-compose.yml
+++ b/deployment/docker/docker-compose.yml
@@ -2,7 +2,7 @@ version: '3.5'
 services:
  prometheus:
    container_name: prometheus
-    image: prom/prometheus:v2.10.0
+    image: prom/prometheus:v2.12.0
    depends_on:
      - "victoriametrics"
    ports:
@@ -35,7 +35,7 @@ services:
    restart: always
  grafana:
    container_name: grafana
-    image: grafana/grafana:6.2.1
+    image: grafana/grafana:6.3.5
    entrypoint: >
      /bin/sh -c "
      cd /var/lib/grafana &&
--- a/go.mod
+++ b/go.mod
@@ -1,18 +1,18 @@
 module github.com/VictoriaMetrics/VictoriaMetrics

 require (
-	github.com/VictoriaMetrics/fastcache v1.5.1
-	github.com/VictoriaMetrics/metrics v1.7.1
-	github.com/cespare/xxhash/v2 v2.0.1-0.20190104013014-3767db7a7e18
+	github.com/VictoriaMetrics/fastcache v1.5.2
+	github.com/VictoriaMetrics/metrics v1.7.2
+	github.com/cespare/xxhash/v2 v2.1.0
 	github.com/golang/snappy v0.0.1
 	github.com/google/go-cmp v0.3.0 // indirect
-	github.com/klauspost/compress v1.7.6
-	github.com/spaolacci/murmur3 v1.1.0 // indirect
+	github.com/klauspost/compress v1.9.1
 	github.com/valyala/fastjson v1.4.1
-	github.com/valyala/gozstd v1.6.1
+	github.com/valyala/fastrand v1.0.0
+	github.com/valyala/gozstd v1.6.2
 	github.com/valyala/histogram v1.0.1
-	github.com/valyala/quicktemplate v1.2.0
-	golang.org/x/sys v0.0.0-20190813064441-fde4db37ae7a
+	github.com/valyala/quicktemplate v1.3.1
+	golang.org/x/sys v0.0.0-20191027211539-f8518d3b3627
 )

 go 1.12
--- a/go.sum
+++ b/go.sum
@@ -1,16 +1,18 @@
 github.com/OneOfOne/xxhash v1.2.2/go.mod h1:HSdplMjZKSmBqAxg5vPj2TmRDmfkzw+cTzAElWljhcU=
 github.com/OneOfOne/xxhash v1.2.5 h1:zl/OfRA6nftbBK9qTohYBJ5xvw6C/oNKizR7cZGl3cI=
 github.com/OneOfOne/xxhash v1.2.5/go.mod h1:eZbhyaAYD41SGSSsnmcpxVoRiQ/MPUTjUdIIOT9Um7Q=
-github.com/VictoriaMetrics/fastcache v1.5.1 h1:qHgHjyoNFV7jgucU8QZUuU4gcdhfs8QW1kw68OD2Lag=
-github.com/VictoriaMetrics/fastcache v1.5.1/go.mod h1:+jv9Ckb+za/P1ZRg/sulP5Ni1v49daAVERr0H3CuscE=
-github.com/VictoriaMetrics/metrics v1.7.1 h1:g2qrY6Upn8rvlvR40cGHFY0crwi4hpqF0n9vJMNsCSg=
-github.com/VictoriaMetrics/metrics v1.7.1/go.mod h1:LU2j9qq7xqZYXz8tF3/RQnB2z2MbZms5TDiIg9/NHiQ=
+github.com/VictoriaMetrics/fastcache v1.5.2 h1:Erd8iIuBAL9kke8JzM4+WxkKuFkHh3ktwLanJvDgR44=
+github.com/VictoriaMetrics/fastcache v1.5.2/go.mod h1:+jv9Ckb+za/P1ZRg/sulP5Ni1v49daAVERr0H3CuscE=
+github.com/VictoriaMetrics/metrics v1.7.2 h1:PzC0SEo5lbbNK7xaYwclCCdoaIGRmXOfflIMF3LpSW4=
+github.com/VictoriaMetrics/metrics v1.7.2/go.mod h1:LU2j9qq7xqZYXz8tF3/RQnB2z2MbZms5TDiIg9/NHiQ=
 github.com/allegro/bigcache v1.2.1-0.20190218064605-e24eb225f156 h1:eMwmnE/GDgah4HI848JfFxHt+iPb26b4zyfspmqY0/8=
 github.com/allegro/bigcache v1.2.1-0.20190218064605-e24eb225f156/go.mod h1:Cb/ax3seSYIx7SuZdm2G2xzfwmv3TPSk2ucNfQESPXM=
 github.com/cespare/xxhash v1.1.0 h1:a6HrQnmkObjyL+Gs60czilIUGqrzKutQD6XZog3p+ko=
 github.com/cespare/xxhash v1.1.0/go.mod h1:XrSqR1VqqWfGrhpAt58auRo0WTKS1nRRg3ghfAqPWnc=
 github.com/cespare/xxhash/v2 v2.0.1-0.20190104013014-3767db7a7e18 h1:pl4eWIqvFe/Kg3zkn7NxevNzILnZYWDCG7qbA1CJik0=
 github.com/cespare/xxhash/v2 v2.0.1-0.20190104013014-3767db7a7e18/go.mod h1:HD5P3vAIAh+Y2GAxg0PrPN1P8WkepXGpjbUPDHJqqKM=
+github.com/cespare/xxhash/v2 v2.1.0 h1:yTUvW7Vhb89inJ+8irsUqiWjh8iT6sQPZiQzI6ReGkA=
+github.com/cespare/xxhash/v2 v2.1.0/go.mod h1:dgIUBU3pDso/gPgZ1osOZ0iQf77oPR28Tjxl5dIMyVM=
 github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
 github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
 github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
@@ -20,8 +22,8 @@ github.com/google/go-cmp v0.3.0 h1:crn/baboCvb5fXaQ0IJ1SGTsTVrWpDsCWC8EGETZijY=
 github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
 github.com/klauspost/compress v1.4.0/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
 github.com/klauspost/compress v1.4.1/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
-github.com/klauspost/compress v1.7.6 h1:GH2karLOcuZtA5a3+KuzSU33A2cvcHGbtEWM6K4t7oU=
-github.com/klauspost/compress v1.7.6/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
+github.com/klauspost/compress v1.9.1 h1:TWy0o9J9c6LK9C8t7Msh6IAJNXbsU/nvKLTQUU5HdaY=
+github.com/klauspost/compress v1.9.1/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
 github.com/klauspost/cpuid v0.0.0-20180405133222-e7e905edc00e/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
 github.com/klauspost/cpuid v1.2.0 h1:NMpwD2G9JSFOE1/TJjGSo5zG7Yb2bTe7eq1jH+irmeE=
 github.com/klauspost/cpuid v1.2.0/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
@@ -29,8 +31,6 @@ github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZb
 github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
 github.com/spaolacci/murmur3 v0.0.0-20180118202830-f09979ecbc72/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
 github.com/spaolacci/murmur3 v1.0.1-0.20190317074736-539464a789e9/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
-github.com/spaolacci/murmur3 v1.1.0 h1:7c1g84S4BPRrfL5Xrdp6fOJ206sU9y293DDHaoy0bLI=
-github.com/spaolacci/murmur3 v1.1.0/go.mod h1:JwIasOWyU6f++ZhiEuf87xNszmSA2myDM2Kzu9HwQUA=
 github.com/stretchr/objx v0.1.0/go.mod h1:HFkY916IF+rwdDfMAkV7OtwuqBVzrE8GR6GFx+wExME=
 github.com/stretchr/testify v1.3.0 h1:TivCn/peBQ7UY8ooIcPgZFpTNSz0Q2U6UrFlUfqbe0Q=
 github.com/stretchr/testify v1.3.0/go.mod h1:M5WIy9Dh21IEIfnGCwXGc5bZfKNJtfHm1UVUgZn+9EI=
@@ -41,13 +41,13 @@ github.com/valyala/fastjson v1.4.1 h1:hrltpHpIpkaxll8QltMU8c3QZ5+qIiCL8yKqPFJI/y
 github.com/valyala/fastjson v1.4.1/go.mod h1:nV6MsjxL2IMJQUoHDIrjEI7oLyeqK6aBD7EFWPsvP8o=
 github.com/valyala/fastrand v1.0.0 h1:LUKT9aKer2dVQNUi3waewTbKV+7H17kvWFNKs2ObdkI=
 github.com/valyala/fastrand v1.0.0/go.mod h1:HWqCzkrkg6QXT8V2EXWvXCoow7vLwOFN002oeRzjapQ=
-github.com/valyala/gozstd v1.6.1 h1:oFN2mNW0kOr1fEKJuLpDwakNb6Y9fElVEBZmPEsFTUw=
-github.com/valyala/gozstd v1.6.1/go.mod h1:y5Ew47GLlP37EkTB+B4s7r6A5rdaeB7ftbl9zoYiIPQ=
+github.com/valyala/gozstd v1.6.2 h1:MgBfNm0I8IKm51LUTTKfO9vi4BtmoH7kBXeUvgaiZVU=
+github.com/valyala/gozstd v1.6.2/go.mod h1:y5Ew47GLlP37EkTB+B4s7r6A5rdaeB7ftbl9zoYiIPQ=
 github.com/valyala/histogram v1.0.1 h1:FzA7n2Tz/wKRMejgu3PV1vw3htAklTjjuoI6z3d4KDg=
 github.com/valyala/histogram v1.0.1/go.mod h1:lQy0xA4wUz2+IUnf97SivorsJIp8FxsnRd6x25q7Mto=
-github.com/valyala/quicktemplate v1.2.0 h1:BaO1nHTkspYzmAjPXj0QiDJxai96tlcZyKcI9dyEGvM=
-github.com/valyala/quicktemplate v1.2.0/go.mod h1:EH+4AkTd43SvgIbQHYu59/cJyxDoOVRUAfrukLPuGJ4=
+github.com/valyala/quicktemplate v1.3.1 h1:V9Ixd/ONuoT6C1ipx8XR2dNGSDgIVnvT4ezZ38ZWllU=
+github.com/valyala/quicktemplate v1.3.1/go.mod h1:EH+4AkTd43SvgIbQHYu59/cJyxDoOVRUAfrukLPuGJ4=
 github.com/valyala/tcplisten v0.0.0-20161114210144-ceec8f93295a/go.mod h1:v3UYOV9WzVtRmSR+PDvWpU/qWl4Wa5LApYYX4ZtKbio=
 golang.org/x/net v0.0.0-20180911220305-26e67e76b6c3/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
-golang.org/x/sys v0.0.0-20190813064441-fde4db37ae7a h1:aYOabOQFp6Vj6W1F80affTUvO9UxmJRx8K0gsfABByQ=
-golang.org/x/sys v0.0.0-20190813064441-fde4db37ae7a/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
+golang.org/x/sys v0.0.0-20191027211539-f8518d3b3627 h1:/FZUR3d/QsXe4AcJyJFCc40TOj3y6Hs23Y3YJlvVkWo=
+golang.org/x/sys v0.0.0-20191027211539-f8518d3b3627/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
--- a/lib/decimal/decimal.go
+++ b/lib/decimal/decimal.go
@@ -265,61 +265,83 @@ var (
 // For instance, for f = -1.234 it returns v = -1234, e = -3.
 //
 // FromFloat doesn't work properly with NaN values, so don't pass them here.
-func FromFloat(f float64) (v int64, e int16) {
-	if math.IsInf(f, 0) {
-		// Special case for Inf
-		if math.IsInf(f, 1) {
-			return vInfPos, 0
-		}
-		return vInfNeg, 0
-	}
-
-	minus := false
-	if f < 0 {
-		f = -f
-		minus = true
-	}
+func FromFloat(f float64) (int64, int16) {
 	if f == 0 {
-		// Special case for 0.0 and -0.0
 		return 0, 0
 	}
-	v, e = positiveFloatToDecimal(f)
-	if minus {
-		v = -v
+	if math.IsInf(f, 0) {
+		return fromFloatInf(f)
 	}
-	if v == 0 {
-		e = 0
-	} else if v > vMax {
-		v = vMax
-	} else if v < vMin {
+	if f > 0 {
+		v, e := positiveFloatToDecimal(f)
+		if v > vMax {
+			v = vMax
+		}
+		return v, e
+	}
+	v, e := positiveFloatToDecimal(-f)
+	v = -v
+	if v < vMin {
 		v = vMin
 	}
 	return v, e
 }

+func fromFloatInf(f float64) (int64, int16) {
+	// Special case for Inf
+	if math.IsInf(f, 1) {
+		return vInfPos, 0
+	}
+	return vInfNeg, 0
+}
+
 func positiveFloatToDecimal(f float64) (int64, int16) {
+	// There is no need to check for f == 0, since the caller should have already checked it.
+	u := uint64(f)
+	if float64(u) != f {
+		return positiveFloatToDecimalSlow(f)
+	}
+	// Fast path for integers.
+	if u < 1<<55 && u%10 != 0 {
+		return int64(u), 0
+	}
+	return getDecimalAndScale(u)
+}
+
+func getDecimalAndScale(u uint64) (int64, int16) {
 	var scale int16
-	v := int64(f)
-	if f == float64(v) {
-		// Fast path for integers.
-		u := uint64(v)
-		if u%10 != 0 {
-			return v, 0
-		}
-		// Minimize v by converting trailing zeros to scale.
+	for u >= 1<<55 {
+		// Remove trailing garbage bits left after float64->uint64 conversion,
+		// since float64 contains only 53 significant bits.
+		// See https://en.wikipedia.org/wiki/Double-precision_floating-point_format
 		u /= 10
 		scale++
-		for u != 0 && u%10 == 0 {
-			u /= 10
-			scale++
-		}
+	}
+	if u%10 != 0 {
 		return int64(u), scale
 	}
+	// Minimize v by converting trailing zeros to scale.
+	u /= 10
+	scale++
+	for u != 0 && u%10 == 0 {
+		u /= 10
+		scale++
+	}
+	return int64(u), scale
+}

+func positiveFloatToDecimalSlow(f float64) (int64, int16) {
 	// Slow path for floating point numbers.
+	var scale int16
+	prec := conversionPrecision
 	if f > 1e6 || f < 1e-6 {
 		// Normalize f, so it is in the small range suitable
 		// for the next loop.
+		if f > 1e6 {
+			// Increase conversion precision for big numbers.
+			// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/213
+			prec = 1e15
+		}
 		_, exp := math.Frexp(f)
 		scale = int16(float64(exp) * math.Ln2 / math.Ln10)
 		f *= math.Pow10(-int(scale))
@@ -327,13 +349,13 @@ func positiveFloatToDecimal(f float64) (int64, int16) {

 	// Multiply f by 100 until the fractional part becomes
 	// too small comparing to integer part.
-	for f < conversionPrecision {
+	for f < prec {
 		x, frac := math.Modf(f)
-		if frac*conversionPrecision < x {
+		if frac*prec < x {
 			f = x
 			break
 		}
-		if (1-frac)*conversionPrecision < x {
+		if (1-frac)*prec < x {
 			f = x + 1
 			break
 		}
--- a/lib/decimal/decimal_test.go
+++ b/lib/decimal/decimal_test.go
@@ -7,6 +7,44 @@ import (
 	"testing"
 )

+func TestPositiveFloatToDecimal(t *testing.T) {
+	f := func(f float64, decimalExpected int64, exponentExpected int16) {
+		t.Helper()
+		decimal, exponent := positiveFloatToDecimal(f)
+		if decimal != decimalExpected {
+			t.Fatalf("unexpected decimal for positiveFloatToDecimal(%f); got %d; want %d", f, decimal, decimalExpected)
+		}
+		if exponent != exponentExpected {
+			t.Fatalf("unexpected exponent for positiveFloatToDecimal(%f); got %d; want %d", f, exponent, exponentExpected)
+		}
+	}
+	f(0, 0, 1) // An exponent of 1 is OK here. See the comment in positiveFloatToDecimal.
+	f(1, 1, 0)
+	f(30, 3, 1)
+	f(12345678900000000, 123456789, 8)
+	f(12345678901234567, 12345678901234568, 0)
+	f(1234567890123456789, 12345678901234567, 2)
+	f(12345678901234567890, 12345678901234567, 3)
+	f(18446744073670737131, 18446744073670737, 3)
+	f(123456789012345678901, 12345678901234568, 4)
+	f(1<<53, 1<<53, 0)
+	f(1<<54, 18014398509481984, 0)
+	f(1<<55, 3602879701896396, 1)
+	f(1<<62, 4611686018427387, 3)
+	f(1<<63, 9223372036854775, 3)
+	f(1<<64, 18446744073709548, 3)
+	f(1<<65, 368934881474191, 5)
+	f(1<<66, 737869762948382, 5)
+	f(1<<67, 1475739525896764, 5)
+
+	f(0.1, 1, -1)
+	f(123456789012345678e-5, 12345678901234568, -4)
+	f(1234567890123456789e-10, 12345678901234568, -8)
+	f(1234567890123456789e-14, 1234567890123, -8)
+	f(1234567890123456789e-17, 12345678901234, -12)
+	f(1234567890123456789e-20, 1234567890123, -14)
+}
+
 func TestAppendDecimalToFloat(t *testing.T) {
 	testAppendDecimalToFloat(t, []int64{}, 0, nil)
 	testAppendDecimalToFloat(t, []int64{0}, 0, []float64{0})
@@ -168,7 +206,7 @@ func TestAppendFloatToDecimal(t *testing.T) {
 	// no-op
 	testAppendFloatToDecimal(t, []float64{}, nil, 0)
 	testAppendFloatToDecimal(t, []float64{0}, []int64{0}, 0)
-	testAppendFloatToDecimal(t, []float64{0, 1, -1, 12345678, -123456789}, []int64{0, 1, -1, 12345678, -123456789}, 0)
+	testAppendFloatToDecimal(t, []float64{0, -0, 1, -1, 12345678, -123456789}, []int64{0, 0, 1, -1, 12345678, -123456789}, 0)

 	// upExp
 	testAppendFloatToDecimal(t, []float64{-24, 0, 4.123, 0.3}, []int64{-24000, 0, 4123, 300}, -3)
@@ -248,8 +286,8 @@ func TestFloatToDecimal(t *testing.T) {

 	f(math.Inf(1), vInfPos, 0)
 	f(math.Inf(-1), vInfNeg, 0)
-	f(1<<63-1, 922337203685, 7)
-	f(-1<<63, -922337203685, 7)
+	f(1<<63-1, 9223372036854775, 3)
+	f(-1<<63, -9223372036854775, 3)

 	// Test precision loss due to conversionPrecision.
 	f(0.1234567890123456, 12345678901234, -14)
--- a/lib/encoding/int.go
+++ b/lib/encoding/int.go
@@ -1,6 +1,7 @@
 package encoding

 import (
+	"encoding/binary"
 	"fmt"
 	"sync"
 )
@@ -12,8 +13,8 @@ func MarshalUint16(dst []byte, u uint16) []byte {

 // UnmarshalUint16 returns unmarshaled uint32 from src.
 func UnmarshalUint16(src []byte) uint16 {
-	_ = src[1]
-	return uint16(src[0])<<8 | uint16(src[1])
+	// This is faster than the manual conversion.
+	return binary.BigEndian.Uint16(src)
 }

 // MarshalUint32 appends marshaled v to dst and returns the result.
@@ -23,8 +24,8 @@ func MarshalUint32(dst []byte, u uint32) []byte {

 // UnmarshalUint32 returns unmarshaled uint32 from src.
 func UnmarshalUint32(src []byte) uint32 {
-	_ = src[3]
-	return uint32(src[0])<<24 | uint32(src[1])<<16 | uint32(src[2])<<8 | uint32(src[3])
+	// This is faster than the manual conversion.
+	return binary.BigEndian.Uint32(src)
 }

 // MarshalUint64 appends marshaled v to dst and returns the result.
@@ -34,8 +35,8 @@ func MarshalUint64(dst []byte, u uint64) []byte {

 // UnmarshalUint64 returns unmarshaled uint64 from src.
 func UnmarshalUint64(src []byte) uint64 {
-	_ = src[7]
-	return uint64(src[0])<<56 | uint64(src[1])<<48 | uint64(src[2])<<40 | uint64(src[3])<<32 | uint64(src[4])<<24 | uint64(src[5])<<16 | uint64(src[6])<<8 | uint64(src[7])
+	// This is faster than the manual conversion.
+	return binary.BigEndian.Uint64(src)
 }

 // MarshalInt16 appends marshaled v to dst and returns the result.
@@ -48,8 +49,8 @@ func MarshalInt16(dst []byte, v int16) []byte {

 // UnmarshalInt16 returns unmarshaled int16 from src.
 func UnmarshalInt16(src []byte) int16 {
-	_ = src[1]
-	u := uint16(src[0])<<8 | uint16(src[1])
+	// This is faster than the manual conversion.
+	u := binary.BigEndian.Uint16(src)
 	v := int16(u>>1) ^ (int16(u<<15) >> 15) // zig-zag decoding without branching.
 	return v
 }
@@ -64,8 +65,8 @@ func MarshalInt64(dst []byte, v int64) []byte {

 // UnmarshalInt64 returns unmarshaled int64 from src.
 func UnmarshalInt64(src []byte) int64 {
-	_ = src[7]
-	u := uint64(src[0])<<56 | uint64(src[1])<<48 | uint64(src[2])<<40 | uint64(src[3])<<32 | uint64(src[4])<<24 | uint64(src[5])<<16 | uint64(src[6])<<8 | uint64(src[7])
+	// This is faster than the manual conversion.
+	u := binary.BigEndian.Uint64(src)
 	v := int64(u>>1) ^ (int64(u<<63) >> 63) // zig-zag decoding without branching.
 	return v
 }
--- a/lib/encoding/int_timing_test.go
+++ b/lib/encoding/int_timing_test.go
@@ -6,6 +6,33 @@ import (
 	"testing"
 )

+func BenchmarkMarshalUint64(b *testing.B) {
+	b.ReportAllocs()
+	b.SetBytes(1)
+	b.RunParallel(func(pb *testing.PB) {
+		var dst []byte
+		var sink uint64
+		for pb.Next() {
+			dst = MarshalUint64(dst[:0], sink)
+			sink += uint64(len(dst))
+		}
+		atomic.AddUint64(&Sink, sink)
+	})
+}
+
+func BenchmarkUnmarshalUint64(b *testing.B) {
+	b.ReportAllocs()
+	b.SetBytes(1)
+	b.RunParallel(func(pb *testing.PB) {
+		var sink uint64
+		for pb.Next() {
+			v := UnmarshalUint64(testMarshaledUint64Data)
+			sink += v
+		}
+		atomic.AddUint64(&Sink, sink)
+	})
+}
+
 func BenchmarkMarshalInt64(b *testing.B) {
 	b.ReportAllocs()
 	b.SetBytes(1)
@@ -120,3 +147,4 @@ func benchmarkUnmarshalVarInt64s(b *testing.B, maxValue int64) {
 }

 var testMarshaledInt64Data = MarshalInt64(nil, 1234567890)
+var testMarshaledUint64Data = MarshalUint64(nil, 1234567890)
--- a/lib/fs/fs.go
+++ b/lib/fs/fs.go
@@ -92,7 +92,7 @@ var tmpFileNum uint64

 // WriteFileAtomically atomically writes data to the given file path.
 //
-// WriteFile returns only after the file is fully written and synced
+// WriteFileAtomically returns only after the file is fully written and synced
 // to the underlying storage.
 func WriteFileAtomically(path string, data []byte) error {
 	// Check for the existing file. It is expected that
--- a/lib/memory/memory_linux.go
+++ b/lib/memory/memory_linux.go
@@ -9,12 +9,17 @@ import (
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
 )

+const maxInt = int(^uint(0) >> 1)
+
 func sysTotalMemory() int {
 	var si syscall.Sysinfo_t
 	if err := syscall.Sysinfo(&si); err != nil {
 		logger.Panicf("FATAL: error in syscall.Sysinfo: %s", err)
 	}
-	totalMem := int(si.Totalram) * int(si.Unit)
+	totalMem := maxInt
+	if uint64(maxInt)/uint64(si.Totalram) > uint64(si.Unit) {
+		totalMem = int(uint64(si.Totalram) * uint64(si.Unit))
+	}

 	// Try determining the amount of memory inside docker container.
 	// See https://stackoverflow.com/questions/42187085/check-mem-limit-within-a-docker-container .
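The lib/memory/memory_linux.go change guards the `Totalram * Unit` product against overflowing `int`, which matters on 32-bit builds such as the new `test-full-386` target. A minimal sketch of the same idea follows. `saturatingMul` is a hypothetical helper, not part of the repository; it adds a zero guard and uses `>=` instead of the strict `>` in the diff, which differs only when the product lands exactly on the boundary.

```go
package main

import "fmt"

// maxInt is the largest value representable by int on the current platform.
const maxInt = int(^uint(0) >> 1)

// saturatingMul returns a*b as an int, or maxInt when the product does not fit.
func saturatingMul(a, b uint64) int {
	if a == 0 || b == 0 {
		// Hypothetical guard: avoids division by zero below.
		return 0
	}
	// floor(maxInt/a) >= b guarantees a*b <= maxInt, so the conversion is safe.
	if uint64(maxInt)/a >= b {
		return int(a * b)
	}
	return maxInt
}

func main() {
	fmt.Println(saturatingMul(4, 1024))
	fmt.Println(saturatingMul(1<<62, 4) == maxInt)
}
```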
--- a/lib/mergeset/block_header.go
+++ b/lib/mergeset/block_header.go
@@ -158,7 +158,7 @@ func unmarshalBlockHeaders(dst []blockHeader, src []byte, blockHeadersCount int)
 	newBHS := dst[dstLen:]

 	// Verify that block headers are sorted by firstItem.
-	if !sort.SliceIsSorted(newBHS, func(i, j int) bool { return string(newBHS[i].firstItem) <= string(newBHS[j].firstItem) }) {
+	if !sort.SliceIsSorted(newBHS, func(i, j int) bool { return string(newBHS[i].firstItem) < string(newBHS[j].firstItem) }) {
 		return nil, fmt.Errorf("block headers must be sorted by firstItem; unmarshaled unsorted block headers: %#v", newBHS)
 	}

--- a/lib/mergeset/encoding.go
+++ b/lib/mergeset/encoding.go
@@ -2,6 +2,7 @@ package mergeset

 import (
 	"fmt"
+	"os"
 	"sort"
 	"strings"
 	"sync"
@@ -11,10 +12,20 @@ import (
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
 )

+type byteSliceSorter [][]byte
+
+func (s byteSliceSorter) Len() int { return len(s) }
+func (s byteSliceSorter) Less(i, j int) bool {
+	return string(s[i]) < string(s[j])
+}
+func (s byteSliceSorter) Swap(i, j int) {
+	s[i], s[j] = s[j], s[i]
+}
+
 type inmemoryBlock struct {
 	commonPrefix []byte
 	data         []byte
-	items        [][]byte
+	items        byteSliceSorter
 }

 func (ib *inmemoryBlock) Reset() {
@@ -77,12 +88,9 @@ func (ib *inmemoryBlock) Add(x []byte) bool {
 // It must fit CPU cache size, i.e. 64KB for the current CPUs.
 const maxInmemoryBlockSize = 64 * 1024

-func (ib *inmemoryBlock) itemsLess(i, j int) bool {
-	return string(ib.items[i]) < string(ib.items[j])
-}
-
 func (ib *inmemoryBlock) sort() {
-	sort.Slice(ib.items, ib.itemsLess)
+	// Use sort.Sort instead of sort.Slice in order to eliminate memory allocation.
+	sort.Sort(&ib.items)
 	bb := bbPool.Get()
 	b := bytesutil.Resize(bb.B, len(ib.data))
 	b = b[:0]
@@ -120,7 +128,8 @@ func checkMarshalType(mt marshalType) error {
 }

 func (ib *inmemoryBlock) isSorted() bool {
-	return sort.SliceIsSorted(ib.items, ib.itemsLess)
+	// Use sort.IsSorted instead of sort.SliceIsSorted in order to eliminate memory allocation.
+	return sort.IsSorted(&ib.items)
 }

 // MarshalUnsortedData marshals unsorted items from ib to sb.
@@ -138,6 +147,10 @@ func (ib *inmemoryBlock) MarshalUnsortedData(sb *storageBlock, firstItemDst, com
 	return ib.marshalData(sb, firstItemDst, commonPrefixDst, compressLevel)
 }

+var isInTest = func() bool {
+	return strings.HasSuffix(os.Args[0], ".test")
+}()
+
 // MarshalUnsortedData marshals sorted items from ib to sb.
 //
 // It also:
@@ -146,17 +159,22 @@ func (ib *inmemoryBlock) MarshalUnsortedData(sb *storageBlock, firstItemDst, com
 // - returns the number of items encoded including the first item.
 // - returns the marshal type used for the encoding.
 func (ib *inmemoryBlock) MarshalSortedData(sb *storageBlock, firstItemDst, commonPrefixDst []byte, compressLevel int) ([]byte, []byte, uint32, marshalType) {
-	// if !ib.isSorted() {
-	//	logger.Panicf("BUG: %d items must be sorted; items:\n%s", len(ib.items), ib.debugItemsString())
-	// }
+	if isInTest && !ib.isSorted() {
+		logger.Panicf("BUG: %d items must be sorted; items:\n%s", len(ib.items), ib.debugItemsString())
+	}
 	ib.updateCommonPrefix()
 	return ib.marshalData(sb, firstItemDst, commonPrefixDst, compressLevel)
 }

 func (ib *inmemoryBlock) debugItemsString() string {
 	var sb strings.Builder
+	var prevItem []byte
 	for i, item := range ib.items {
+		if string(item) < string(prevItem) {
+			fmt.Fprintf(&sb, "!!! the next item is smaller than the previous item !!!\n")
+		}
 		fmt.Fprintf(&sb, "%05d %X\n", i, item)
+		prevItem = item
 	}
 	return sb.String()
 }
@@ -175,7 +193,7 @@ func (ib *inmemoryBlock) marshalData(sb *storageBlock, firstItemDst, commonPrefi
 	firstItemDst = append(firstItemDst, ib.items[0]...)
 	commonPrefixDst = append(commonPrefixDst, ib.commonPrefix...)

-	if len(ib.data)-len(ib.commonPrefix)*len(ib.items) < 64 || len(ib.items) < 10 {
+	if len(ib.data)-len(ib.commonPrefix)*len(ib.items) < 64 || len(ib.items) < 2 {
 		// Use plain encoding form small block, since it is cheaper.
 		ib.marshalDataPlain(sb)
 		return firstItemDst, commonPrefixDst, uint32(len(ib.items)), marshalTypePlain
--- a/lib/mergeset/merge.go
+++ b/lib/mergeset/merge.go
@@ -5,18 +5,31 @@ import (
 	"fmt"
 	"sync"
 	"sync/atomic"
+
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
 )

+// PrepareBlockCallback can transform the passed items allocated at the given data.
+//
+// The callback is called during merge before flushing a full block of the given items
+// to persistent storage.
+//
+// The callback must return sorted items. The first and the last item must be unchanged.
+// The callback can re-use data and items for storing the result.
+type PrepareBlockCallback func(data []byte, items [][]byte) ([]byte, [][]byte)
+
 // mergeBlockStreams merges bsrs and writes result to bsw.
 //
 // It also fills ph.
 //
+// prepareBlock is optional.
+//
 // The function immediately returns when stopCh is closed.
 //
 // It also atomically adds the number of items merged to itemsMerged.
-func mergeBlockStreams(ph *partHeader, bsw *blockStreamWriter, bsrs []*blockStreamReader, stopCh <-chan struct{}, itemsMerged *uint64) error {
+func mergeBlockStreams(ph *partHeader, bsw *blockStreamWriter, bsrs []*blockStreamReader, prepareBlock PrepareBlockCallback, stopCh <-chan struct{}, itemsMerged *uint64) error {
 	bsm := bsmPool.Get().(*blockStreamMerger)
-	if err := bsm.Init(bsrs); err != nil {
+	if err := bsm.Init(bsrs, prepareBlock); err != nil {
 		return fmt.Errorf("cannot initialize blockStreamMerger: %s", err)
 	}
 	err := bsm.Merge(bsw, ph, stopCh, itemsMerged)
@@ -39,15 +52,24 @@ var bsmPool = &sync.Pool{
 }

 type blockStreamMerger struct {
+	prepareBlock PrepareBlockCallback
+
 	bsrHeap bsrHeap

 	// ib is a scratch block with pending items.
 	ib inmemoryBlock

 	phFirstItemCaught bool
+
+	// These are auxiliary buffers used in flushIB
+	// for consistency checks after prepareBlock call.
+	firstItem []byte
+	lastItem  []byte
 }

 func (bsm *blockStreamMerger) reset() {
+	bsm.prepareBlock = nil
+
 	for i := range bsm.bsrHeap {
 		bsm.bsrHeap[i] = nil
 	}
@@ -57,8 +79,9 @@ func (bsm *blockStreamMerger) reset() {
 	bsm.phFirstItemCaught = false
 }

-func (bsm *blockStreamMerger) Init(bsrs []*blockStreamReader) error {
+func (bsm *blockStreamMerger) Init(bsrs []*blockStreamReader, prepareBlock PrepareBlockCallback) error {
 	bsm.reset()
+	bsm.prepareBlock = prepareBlock
 	for _, bsr := range bsrs {
 		if bsr.Next() {
 			bsm.bsrHeap = append(bsm.bsrHeap, bsr)
@@ -95,25 +118,23 @@ again:

 	bsr := heap.Pop(&bsm.bsrHeap).(*blockStreamReader)

-	if !bsm.phFirstItemCaught {
-		ph.firstItem = append(ph.firstItem[:0], bsr.Block.items[0]...)
-		bsm.phFirstItemCaught = true
-	}
-
 	var nextItem []byte
 	hasNextItem := false
 	if len(bsm.bsrHeap) > 0 {
 		nextItem = bsm.bsrHeap[0].bh.firstItem
 		hasNextItem = true
 	}
-	for bsr.blockItemIdx < len(bsr.Block.items) && (!hasNextItem || string(bsr.Block.items[bsr.blockItemIdx]) <= string(nextItem)) {
-		if bsm.ib.Add(bsr.Block.items[bsr.blockItemIdx]) {
-			bsr.blockItemIdx++
+	for bsr.blockItemIdx < len(bsr.Block.items) {
+		item := bsr.Block.items[bsr.blockItemIdx]
+		if hasNextItem && string(item) > string(nextItem) {
+			break
+		}
+		if !bsm.ib.Add(item) {
+			// The bsm.ib is full. Flush it to bsw and continue.
+			bsm.flushIB(bsw, ph, itemsMerged)
 			continue
 		}
-
-		// The bsm.ib is full. Flush it to bsw and continue.
-		bsm.flushIB(bsw, ph, itemsMerged)
+		bsr.blockItemIdx++
 	}
 	if bsr.blockItemIdx == len(bsr.Block.items) {
 		// bsr.Block is fully read. Proceed to the next block.
@@ -139,9 +160,35 @@ func (bsm *blockStreamMerger) flushIB(bsw *blockStreamWriter, ph *partHeader, it
 		// Nothing to flush.
 		return
 	}
-	itemsCount := uint64(len(bsm.ib.items))
-	ph.itemsCount += itemsCount
-	atomic.AddUint64(itemsMerged, itemsCount)
+	atomic.AddUint64(itemsMerged, uint64(len(bsm.ib.items)))
+	if bsm.prepareBlock != nil {
+		bsm.firstItem = append(bsm.firstItem[:0], bsm.ib.items[0]...)
+		bsm.lastItem = append(bsm.lastItem[:0], bsm.ib.items[len(bsm.ib.items)-1]...)
+		bsm.ib.data, bsm.ib.items = bsm.prepareBlock(bsm.ib.data, bsm.ib.items)
+		if len(bsm.ib.items) == 0 {
+			// Nothing to flush
+			return
+		}
+		// Consistency checks after prepareBlock call.
+		firstItem := bsm.ib.items[0]
+		if string(firstItem) != string(bsm.firstItem) {
+			logger.Panicf("BUG: prepareBlock must return first item equal to the original first item;\ngot\n%X\nwant\n%X", firstItem, bsm.firstItem)
+		}
+		lastItem := bsm.ib.items[len(bsm.ib.items)-1]
+		if string(lastItem) != string(bsm.lastItem) {
+			logger.Panicf("BUG: prepareBlock must return last item equal to the original last item;\ngot\n%X\nwant\n%X", lastItem, bsm.lastItem)
+		}
+		// Verify that bsm.ib.items are sorted only in tests, since this
+		// can be an expensive check in prod for items with a long common prefix.
+		if isInTest && !bsm.ib.isSorted() {
+			logger.Panicf("BUG: prepareBlock must return sorted items;\ngot\n%s", bsm.ib.debugItemsString())
+		}
+	}
+	ph.itemsCount += uint64(len(bsm.ib.items))
+	if !bsm.phFirstItemCaught {
+		ph.firstItem = append(ph.firstItem[:0], bsm.ib.items[0]...)
+		bsm.phFirstItemCaught = true
+	}
 	ph.lastItem = append(ph.lastItem[:0], bsm.ib.items[len(bsm.ib.items)-1]...)
 	bsw.WriteBlock(&bsm.ib)
 	bsm.ib.Reset()
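Review note on the lib/mergeset/merge.go change above: `PrepareBlockCallback` must return sorted items and keep the first and the last item unchanged. As an illustration of a callback that satisfies this contract, the sketch below drops adjacent duplicates from an already sorted block. The function `dropAdjacentDuplicates` is hypothetical and is not the callback used by the project; only the `PrepareBlockCallback` type is copied from the patch.

```go
package main

import (
	"bytes"
	"fmt"
)

// PrepareBlockCallback has the signature introduced in the patch.
type PrepareBlockCallback func(data []byte, items [][]byte) ([]byte, [][]byte)

// dropAdjacentDuplicates is a hypothetical callback. It assumes items are sorted
// and removes adjacent equal items, re-using the items slice for the result.
// The first item is always kept, and the last returned item equals
// the original last item, so the contract from the patch holds.
func dropAdjacentDuplicates(data []byte, items [][]byte) ([]byte, [][]byte) {
	if len(items) < 2 {
		return data, items
	}
	dst := items[:1]
	for _, item := range items[1:] {
		if bytes.Equal(item, dst[len(dst)-1]) {
			// Skip the duplicate of the previously kept item.
			continue
		}
		dst = append(dst, item)
	}
	return data, dst
}

func main() {
	var cb PrepareBlockCallback = dropAdjacentDuplicates
	data := []byte("aabcc")
	items := [][]byte{data[0:1], data[1:2], data[2:3], data[3:4], data[4:5]}
	_, items = cb(data, items)
	for _, item := range items {
		fmt.Printf("%s\n", item)
	}
}
```

The in-place filtering is safe here because the write position never runs ahead of the read position.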
--- a/lib/mergeset/merge_test.go
+++ b/lib/mergeset/merge_test.go
@@ -30,14 +30,14 @@ func TestMultilevelMerge(t *testing.T) {
 	var dstIP1 inmemoryPart
 	var bsw1 blockStreamWriter
 	bsw1.InitFromInmemoryPart(&dstIP1, 0)
-	if err := mergeBlockStreams(&dstIP1.ph, &bsw1, bsrs[:5], nil, &itemsMerged); err != nil {
+	if err := mergeBlockStreams(&dstIP1.ph, &bsw1, bsrs[:5], nil, nil, &itemsMerged); err != nil {
 		t.Fatalf("cannot merge first level part 1: %s", err)
 	}

 	var dstIP2 inmemoryPart
 	var bsw2 blockStreamWriter
 	bsw2.InitFromInmemoryPart(&dstIP2, 0)
-	if err := mergeBlockStreams(&dstIP2.ph, &bsw2, bsrs[5:], nil, &itemsMerged); err != nil {
+	if err := mergeBlockStreams(&dstIP2.ph, &bsw2, bsrs[5:], nil, nil, &itemsMerged); err != nil {
 		t.Fatalf("cannot merge first level part 2: %s", err)
 	}

@@ -54,7 +54,7 @@ func TestMultilevelMerge(t *testing.T) {
 		newTestBlockStreamReader(&dstIP2),
 	}
 	bsw.InitFromInmemoryPart(&dstIP, 0)
-	if err := mergeBlockStreams(&dstIP.ph, &bsw, bsrsTop, nil, &itemsMerged); err != nil {
+	if err := mergeBlockStreams(&dstIP.ph, &bsw, bsrsTop, nil, nil, &itemsMerged); err != nil {
 		t.Fatalf("cannot merge second level: %s", err)
 	}
 	if itemsMerged != uint64(len(items)) {
@@ -76,7 +76,7 @@ func TestMergeForciblyStop(t *testing.T) {
 	ch := make(chan struct{})
 	var itemsMerged uint64
 	close(ch)
-	if err := mergeBlockStreams(&dstIP.ph, &bsw, bsrs, ch, &itemsMerged); err != errForciblyStopped {
+	if err := mergeBlockStreams(&dstIP.ph, &bsw, bsrs, nil, ch, &itemsMerged); err != errForciblyStopped {
 		t.Fatalf("unexpected error during merge: got %v; want %v", err, errForciblyStopped)
 	}
 	if itemsMerged != 0 {
@@ -120,7 +120,7 @@ func testMergeBlockStreamsSerial(blocksToMerge, maxItemsPerBlock int) error {
 	var dstIP inmemoryPart
 	var bsw blockStreamWriter
 	bsw.InitFromInmemoryPart(&dstIP, 0)
-	if err := mergeBlockStreams(&dstIP.ph, &bsw, bsrs, nil, &itemsMerged); err != nil {
+	if err := mergeBlockStreams(&dstIP.ph, &bsw, bsrs, nil, nil, &itemsMerged); err != nil {
 		return fmt.Errorf("cannot merge block streams: %s", err)
 	}
 	if itemsMerged != uint64(len(items)) {
--- a/lib/mergeset/part.go
+++ b/lib/mergeset/part.go
@@ -5,6 +5,7 @@ import (
 	"path/filepath"
 	"sync"
 	"sync/atomic"
+	"unsafe"

 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/filestream"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
@@ -13,7 +14,7 @@ import (

 func getMaxCachedIndexBlocksPerPart() int {
 	maxCachedIndexBlocksPerPartOnce.Do(func() {
-		n := memory.Allowed() / 1024 / 1024 / 2
+		n := memory.Allowed() / 1024 / 1024 / 4
 		if n == 0 {
 			n = 10
 		}
@@ -29,7 +30,7 @@ var (

 func getMaxCachedInmemoryBlocksPerPart() int {
 	maxCachedInmemoryBlocksPerPartOnce.Do(func() {
-		n := memory.Allowed() / 1024 / 1024 / 2
+		n := memory.Allowed() / 1024 / 1024 / 4
 		if n == 0 {
 			n = 10
 		}
@@ -43,7 +44,7 @@ var (
 	maxCachedInmemoryBlocksPerPartOnce sync.Once
 )

-type part struct {
+type partInternals struct {
 	ph partHeader

 	path string
@@ -55,7 +56,14 @@ type part struct {
 	indexFile fs.ReadAtCloser
 	itemsFile fs.ReadAtCloser
 	lensFile  fs.ReadAtCloser
+}

+type part struct {
+	partInternals
+
+	// Align atomic counters inside caches by 8 bytes on 32-bit architectures.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212 .
+	_         [(8 - (unsafe.Sizeof(partInternals{}) % 8)) % 8]byte
 	idxbCache indexBlockCache
 	ibCache   inmemoryBlockCache
 }
@@ -114,15 +122,15 @@ func newPart(ph *partHeader, path string, size uint64, metaindexReader filestrea
 	}
 	metaindexReader.MustClose()

-	p := &part{
-		path: path,
-		size: size,
-		mrs:  mrs,
+	var p part
+	p.path = path
+	p.size = size
+	p.mrs = mrs
+
+	p.indexFile = indexFile
+	p.itemsFile = itemsFile
+	p.lensFile = lensFile

-		indexFile: indexFile,
-		itemsFile: itemsFile,
-		lensFile:  lensFile,
-	}
 	p.ph.CopyFrom(ph)
 	p.idxbCache.Init()
 	p.ibCache.Init()
@@ -133,7 +141,7 @@ func newPart(ph *partHeader, path string, size uint64, metaindexReader filestrea
 		p.MustClose()
 		return nil, err
 	}
-	return p, nil
+	return &p, nil
 }

 func (p *part) MustClose() {
@@ -165,12 +173,15 @@ func putIndexBlock(idxb *indexBlock) {
 var indexBlockPool sync.Pool

 type indexBlockCache struct {
+	// Atomically updated counters must go first in the struct, so they are properly
+	// aligned to 8 bytes on 32-bit architectures.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+	requests uint64
+	misses   uint64
+
 	m         map[uint64]*indexBlock
 	missesMap map[uint64]uint64
 	mu        sync.RWMutex
-
-	requests uint64
-	misses   uint64
 }

 func (idxbc *indexBlockCache) Init() {
@@ -274,12 +285,15 @@ func (idxbc *indexBlockCache) Misses() uint64 {
 }

 type inmemoryBlockCache struct {
+	// Atomically updated counters must go first in the struct, so they are properly
+	// aligned to 8 bytes on 32-bit architectures.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+	requests uint64
+	misses   uint64
+
 	m         map[inmemoryBlockCacheKey]*inmemoryBlock
 	missesMap map[inmemoryBlockCacheKey]uint64
 	mu        sync.RWMutex
-
-	requests uint64
-	misses   uint64
 }

 type inmemoryBlockCacheKey struct {
--- a/lib/mergeset/part_search.go
+++ b/lib/mergeset/part_search.go
@@ -31,6 +31,8 @@ type partSearch struct {
 	// Pointer to inmemory block, which may be reused.
 	inmemoryBlockReuse *inmemoryBlock

+	shouldCacheBlock func(item []byte) bool
+
 	idxbCache *indexBlockCache
 	ibCache   *inmemoryBlockCache

@@ -59,6 +61,7 @@ func (ps *partSearch) reset() {
 		putInmemoryBlock(ps.inmemoryBlockReuse)
 		ps.inmemoryBlockReuse = nil
 	}
+	ps.shouldCacheBlock = nil
 	ps.idxbCache = nil
 	ps.ibCache = nil
 	ps.err = nil
@@ -75,7 +78,7 @@ func (ps *partSearch) reset() {
 // Init initializes ps for search in the p.
 //
 // Use Seek for search in p.
-func (ps *partSearch) Init(p *part) {
+func (ps *partSearch) Init(p *part, shouldCacheBlock func(item []byte) bool) {
 	ps.reset()

 	ps.p = p
@@ -324,6 +327,16 @@ func (ps *partSearch) readIndexBlock(mr *metaindexRow) (*indexBlock, error) {
 }

 func (ps *partSearch) getInmemoryBlock(bh *blockHeader) (*inmemoryBlock, bool, error) {
+	if ps.shouldCacheBlock != nil {
+		if !ps.shouldCacheBlock(bh.firstItem) {
+			ib, err := ps.readInmemoryBlock(bh)
+			if err != nil {
+				return nil, false, err
+			}
+			return ib, true, nil
+		}
+	}
+
 	var ibKey inmemoryBlockCacheKey
 	ibKey.Init(bh)
 	ib := ps.ibCache.Get(ibKey)
@@ -371,7 +384,7 @@ func binarySearchKey(items [][]byte, key []byte) int {
 	i, j := uint(0), n
 	for i < j {
 		h := uint(i+j) >> 1
-		if string(key) > string(items[h]) {
+		if h >= 0 && h < uint(len(items)) && string(key) > string(items[h]) {
 			i = h + 1
 		} else {
 			j = h
--- a/lib/mergeset/part_search_test.go
+++ b/lib/mergeset/part_search_test.go
@@ -51,7 +51,7 @@ func testPartSearchConcurrent(p *part, items []string) error {
 func testPartSearchSerial(p *part, items []string) error {
 	var ps partSearch

-	ps.Init(p)
+	ps.Init(p, nil)
 	var k []byte

 	// Search for the item smaller than the items[0]
@@ -150,7 +150,7 @@ func newTestPart(blocksCount, maxItemsPerBlock int) (*part, []string, error) {
 	var ip inmemoryPart
 	var bsw blockStreamWriter
 	bsw.InitFromInmemoryPart(&ip, 0)
-	if err := mergeBlockStreams(&ip.ph, &bsw, bsrs, nil, &itemsMerged); err != nil {
+	if err := mergeBlockStreams(&ip.ph, &bsw, bsrs, nil, nil, &itemsMerged); err != nil {
 		return nil, nil, fmt.Errorf("cannot merge blocks: %s", err)
 	}
 	if itemsMerged != uint64(len(items)) {
--- a/lib/mergeset/table.go
+++ b/lib/mergeset/table.go
@@ -15,6 +15,7 @@ import (
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/syncwg"
 )

@@ -49,7 +50,19 @@ const maxItemsPerPart = 100e9
 //
 // Such parts are usually frequently accessed, so it is good to cache their
 // contents in OS page cache.
-const maxItemsPerCachedPart = 100e6
+func maxItemsPerCachedPart() uint64 {
+	mem := memory.Remaining()
+	// Production data shows that each item occupies ~4 bytes in the compressed part.
+	// It is expected that no more than defaultPartsToMerge/2 parts exist
+	// in the OS page cache before they are merged into a bigger part.
+	// Half of the remaining RAM must be left for lib/storage parts,
+	// so maxItems is calculated using the code below:
+	maxItems := uint64(mem) / (4 * defaultPartsToMerge)
+	if maxItems < 1e6 {
+		maxItems = 1e6
+	}
+	return maxItems
+}

 // The interval for flushing (converting) recent raw items into parts,
 // so they become visible to search.
@@ -57,10 +70,23 @@ const rawItemsFlushInterval = time.Second

 // Table represents mergeset table.
 type Table struct {
+	// Atomically updated counters must go first in the struct, so they are properly
+	// aligned to 8 bytes on 32-bit architectures.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+
+	activeMerges   uint64
+	mergesCount    uint64
+	itemsMerged    uint64
+	assistedMerges uint64
+
+	mergeIdx uint64
+
 	path string

 	flushCallback func()

+	prepareBlock PrepareBlockCallback
+
 	partsLock sync.Mutex
 	parts     []*partWrapper

@@ -68,8 +94,6 @@ type Table struct {
 	rawItemsLock          sync.Mutex
 	rawItemsLastFlushTime time.Time

-	mergeIdx uint64
-
 	snapshotLock sync.RWMutex

 	flockF *os.File
@@ -81,13 +105,10 @@ type Table struct {

 	rawItemsFlusherWG sync.WaitGroup

+	convertersWG sync.WaitGroup
+
 	// Use syncwg instead of sync, since Add/Wait may be called from concurrent goroutines.
 	rawItemsPendingFlushesWG syncwg.WaitGroup
-
-	activeMerges   uint64
-	mergesCount    uint64
-	itemsMerged    uint64
-	assistedMerges uint64
 }

 type partWrapper struct {
@@ -126,8 +147,11 @@ func (pw *partWrapper) decRef() {
 // Optional flushCallback is called every time new data batch is flushed
 // to the underlying storage and becomes visible to search.
 //
+// Optional prepareBlock is called during merge before flushing the prepared block
+// to persistent storage.
+//
 // The table is created if it doesn't exist yet.
-func OpenTable(path string, flushCallback func()) (*Table, error) {
+func OpenTable(path string, flushCallback func(), prepareBlock PrepareBlockCallback) (*Table, error) {
 	path = filepath.Clean(path)
 	logger.Infof("opening table %q...", path)
 	startTime := time.Now()
@@ -152,6 +176,7 @@ func OpenTable(path string, flushCallback func()) (*Table, error) {
 	tb := &Table{
 		path:          path,
 		flushCallback: flushCallback,
+		prepareBlock:  prepareBlock,
 		parts:         pws,
 		mergeIdx:      uint64(time.Now().UnixNano()),
 		flockF:        flockF,
@@ -165,6 +190,12 @@ func OpenTable(path string, flushCallback func()) (*Table, error) {
 	logger.Infof("table %q has been opened in %s; partsCount: %d; blocksCount: %d, itemsCount: %d; sizeBytes: %d",
 		path, time.Since(startTime), m.PartsCount, m.BlocksCount, m.ItemsCount, m.SizeBytes)

+	tb.convertersWG.Add(1)
+	go func() {
+		tb.convertToV1280()
+		tb.convertersWG.Done()
+	}()
+
 	return tb, nil
 }

@@ -177,6 +208,11 @@ func (tb *Table) MustClose() {
 	tb.rawItemsFlusherWG.Wait()
 	logger.Infof("raw items flusher stopped in %s on %q", time.Since(startTime), tb.path)

+	logger.Infof("waiting for converters to stop on %q...", tb.path)
+	startTime = time.Now()
+	tb.convertersWG.Wait()
+	logger.Infof("converters stopped in %s on %q", time.Since(startTime), tb.path)
+
 	logger.Infof("waiting for part mergers to stop on %q...", tb.path)
 	startTime = time.Now()
 	tb.partMergersWG.Wait()
@@ -203,7 +239,7 @@ func (tb *Table) MustClose() {
 	}
 	tb.partsLock.Unlock()

-	if err := tb.mergePartsOptimal(pws); err != nil {
+	if err := tb.mergePartsOptimal(pws, nil); err != nil {
 		logger.Panicf("FATAL: cannot flush inmemory parts to files in %q: %s", tb.path, err)
 	}
 	logger.Infof("%d inmemory parts have been flushed to files in %s on %q", len(pws), time.Since(startTime), tb.path)
@@ -380,15 +416,67 @@ func (tb *Table) rawItemsFlusher() {
 	}
 }

-func (tb *Table) mergePartsOptimal(pws []*partWrapper) error {
+const convertToV1280FileName = "converted-to-v1.28.0"
+
+func (tb *Table) convertToV1280() {
+	// Convert tag->metricID rows into tag->metricIDs rows when upgrading to v1.28.0+.
+	flagFilePath := tb.path + "/" + convertToV1280FileName
+	if fs.IsPathExist(flagFilePath) {
+		// The conversion has already been performed.
+		return
+	}
+
+	getAllPartsForMerge := func() []*partWrapper {
+		var pws []*partWrapper
+		tb.partsLock.Lock()
+		for _, pw := range tb.parts {
+			if pw.isInMerge {
+				continue
+			}
+			pw.isInMerge = true
+			pws = append(pws, pw)
+		}
+		tb.partsLock.Unlock()
+		return pws
+	}
+	pws := getAllPartsForMerge()
+	if len(pws) > 0 {
+		logger.Infof("started round 1 of background conversion of %q to v1.28.0 format; merge %d parts", tb.path, len(pws))
+		startTime := time.Now()
+		if err := tb.mergePartsOptimal(pws, tb.stopCh); err != nil {
+			logger.Errorf("failed round 1 of background conversion of %q to v1.28.0 format: %s", tb.path, err)
+			return
+		}
+		logger.Infof("finished round 1 of background conversion of %q to v1.28.0 format in %s", tb.path, time.Since(startTime))
+
+		// The second round is needed in order to merge small blocks
+		// with tag->metricIDs rows left after the first round.
+		pws = getAllPartsForMerge()
+		logger.Infof("started round 2 of background conversion of %q to v1.28.0 format; merge %d parts", tb.path, len(pws))
+		startTime = time.Now()
+		if len(pws) > 0 {
+			if err := tb.mergePartsOptimal(pws, tb.stopCh); err != nil {
+				logger.Errorf("failed round 2 of background conversion of %q to v1.28.0 format: %s", tb.path, err)
+				return
+			}
+		}
+		logger.Infof("finished round 2 of background conversion of %q to v1.28.0 format in %s", tb.path, time.Since(startTime))
+	}
+
+	if err := fs.WriteFileAtomically(flagFilePath, []byte("ok")); err != nil {
+		logger.Panicf("FATAL: cannot create %q: %s", flagFilePath, err)
+	}
+}
+
+func (tb *Table) mergePartsOptimal(pws []*partWrapper, stopCh <-chan struct{}) error {
 	for len(pws) > defaultPartsToMerge {
-		if err := tb.mergeParts(pws[:defaultPartsToMerge], nil, false); err != nil {
+		if err := tb.mergeParts(pws[:defaultPartsToMerge], stopCh, false); err != nil {
 			return fmt.Errorf("cannot merge %d parts: %s", defaultPartsToMerge, err)
 		}
 		pws = pws[defaultPartsToMerge:]
 	}
 	if len(pws) > 0 {
-		if err := tb.mergeParts(pws, nil, false); err != nil {
+		if err := tb.mergeParts(pws, stopCh, false); err != nil {
 			return fmt.Errorf("cannot merge %d parts: %s", len(pws), err)
 		}
 	}
@@ -464,7 +552,7 @@ func (tb *Table) mergeRawItemsBlocks(blocksToMerge []*inmemoryBlock) {
 		}

 		// The added part exceeds maxParts count. Assist with merging other parts.
-		err := tb.mergeSmallParts(false)
+		err := tb.mergeExistingParts(false)
 		if err == nil {
 			atomic.AddUint64(&tb.assistedMerges, 1)
 			continue
@@ -528,7 +616,7 @@ func (tb *Table) mergeInmemoryBlocks(blocksToMerge []*inmemoryBlock) *partWrappe
 	// Merge parts.
 	// The merge shouldn't be interrupted by stopCh,
 	// since it may be final after stopCh is closed.
-	if err := mergeBlockStreams(&mpDst.ph, bsw, bsrs, nil, &tb.itemsMerged); err != nil {
+	if err := mergeBlockStreams(&mpDst.ph, bsw, bsrs, tb.prepareBlock, nil, &tb.itemsMerged); err != nil {
 		logger.Panicf("FATAL: cannot merge inmemoryBlocks: %s", err)
 	}
 	putBlockStreamWriter(bsw)
@@ -545,7 +633,7 @@ func (tb *Table) mergeInmemoryBlocks(blocksToMerge []*inmemoryBlock) *partWrappe
 }

 func (tb *Table) startPartMergers() {
-	for i := 0; i < mergeWorkers; i++ {
+	for i := 0; i < mergeWorkersCount; i++ {
 		tb.partMergersWG.Add(1)
 		go func() {
 			if err := tb.partMerger(); err != nil {
@@ -556,7 +644,7 @@ func (tb *Table) startPartMergers() {
 	}
 }

-func (tb *Table) mergeSmallParts(isFinal bool) error {
+func (tb *Table) mergeExistingParts(isFinal bool) error {
 	maxItems := tb.maxOutPartItems()
 	if maxItems > maxItemsPerPart {
 		maxItems = maxItemsPerPart
@@ -580,7 +668,7 @@ func (tb *Table) partMerger() error {
 	isFinal := false
 	t := time.NewTimer(sleepTime)
 	for {
-		err := tb.mergeSmallParts(isFinal)
+		err := tb.mergeExistingParts(isFinal)
 		if err == nil {
 			// Try merging additional parts.
 			sleepTime = minMergeSleepTime
@@ -595,7 +683,7 @@ func (tb *Table) partMerger() error {
 		if err != errNothingToMerge {
 			return err
 		}
-		if time.Since(lastMergeTime) > 10*time.Second {
+		if time.Since(lastMergeTime) > 30*time.Second {
 			// We have free time for merging into bigger parts.
 			// This should improve select performance.
 			lastMergeTime = time.Now()
@@ -670,13 +758,10 @@ func (tb *Table) mergeParts(pws []*partWrapper, stopCh <-chan struct{}, isOuterP
 		outItemsCount += pw.p.ph.itemsCount
 	}
 	nocache := true
-	if outItemsCount < maxItemsPerCachedPart {
+	if outItemsCount < maxItemsPerCachedPart() {
 		// Cache small (i.e. recent) output parts in OS file cache,
 		// since there is high chance they will be read soon.
 		nocache = false
-
-		// Do not interrupt small merges.
-		stopCh = nil
 	}

 	// Prepare blockStreamWriter for destination part.
@@ -690,7 +775,7 @@ func (tb *Table) mergeParts(pws []*partWrapper, stopCh <-chan struct{}, isOuterP

 	// Merge parts into a temporary location.
 	var ph partHeader
-	err := mergeBlockStreams(&ph, bsw, bsrs, stopCh, &tb.itemsMerged)
+	err := mergeBlockStreams(&ph, bsw, bsrs, tb.prepareBlock, stopCh, &tb.itemsMerged)
 	putBlockStreamWriter(bsw)
 	if err != nil {
 		if err == errForciblyStopped {
@@ -816,12 +901,12 @@ func (tb *Table) maxOutPartItemsSlow() uint64 {

 	// Calculate the maximum number of items in the output merge part
 	// by dividing the freeSpace by 4 and by the number of concurrent
-	// mergeWorkers.
+	// merge workers (mergeWorkersCount).
 	// This assumes each item is compressed into 4 bytes.
-	return freeSpace / uint64(mergeWorkers) / 4
+	return freeSpace / uint64(mergeWorkersCount) / 4
 }

-var mergeWorkers = func() int {
+var mergeWorkersCount = func() int {
 	return runtime.GOMAXPROCS(-1)
 }()

@@ -940,11 +1025,20 @@ func (tb *Table) CreateSnapshotAt(dstDir string) error {
 		return fmt.Errorf("cannot read directory: %s", err)
 	}
 	for _, fi := range fis {
+		fn := fi.Name()
 		if !fs.IsDirOrSymlink(fi) {
-			// Skip non-directories.
+			switch fn {
+			case convertToV1280FileName:
+				srcPath := srcDir + "/" + fn
+				dstPath := dstDir + "/" + fn
+				if err := os.Link(srcPath, dstPath); err != nil {
+					return fmt.Errorf("cannot hard link from %q to %q: %s", srcPath, dstPath, err)
+				}
+			default:
+				// Skip other non-directories.
+			}
 			continue
 		}
-		fn := fi.Name()
 		if isSpecialDir(fn) {
 			// Skip special dirs.
 			continue
@@ -1152,30 +1246,31 @@ func appendPartsToMerge(dst, src []*partWrapper, maxPartsToMerge int, maxItems u
 	for i := 2; i <= n; i++ {
 		for j := 0; j <= len(src)-i; j++ {
 			itemsSum := uint64(0)
-			for _, pw := range src[j : j+i] {
+			a := src[j : j+i]
+			for _, pw := range a {
 				itemsSum += pw.p.ph.itemsCount
 			}
 			if itemsSum > maxItems {
-				continue
+				// There is no sense in checking the remaining bigger parts.
+				break
 			}
-			m := float64(itemsSum) / float64(src[j+i-1].p.ph.itemsCount)
+			m := float64(itemsSum) / float64(a[len(a)-1].p.ph.itemsCount)
 			if m < maxM {
 				continue
 			}
 			maxM = m
-			pws = src[j : j+i]
+			pws = a
 		}
 	}

-	minM := float64(maxPartsToMerge / 2)
-	if minM < 2 {
-		minM = 2
+	minM := float64(maxPartsToMerge) / 2
+	if minM < 1.7 {
+		minM = 1.7
 	}
 	if maxM < minM {
 		// There is no sense in merging parts with too small m.
 		return dst
 	}
-
 	return append(dst, pws...)
 }

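Review note on the lib/mergeset/table.go change above: `maxItemsPerCachedPart` now derives its limit from the remaining memory instead of a fixed `100e6`. With ~4 bytes per item, at most `defaultPartsToMerge/2` cached parts and half of the RAM reserved for lib/storage, the divisor becomes `4 * defaultPartsToMerge`. The sketch below repeats that arithmetic with explicit parameters. `maxCachedItems` and its arguments are hypothetical, and the value 15 used in `main` is only an assumed example, since the real `defaultPartsToMerge` is not shown in this part of the diff.

```go
package main

import "fmt"

// minCachedItems is the lower bound used in the patch (1e6 items).
const minCachedItems = 1e6

// maxCachedItems is a hypothetical, parameterized version of the calculation:
// (remainingMem / 2) / (4 bytes * partsToMerge / 2) == remainingMem / (4 * partsToMerge).
func maxCachedItems(remainingMem, partsToMerge uint64) uint64 {
	if partsToMerge == 0 {
		// Guard added in this sketch only, to avoid division by zero.
		return minCachedItems
	}
	maxItems := remainingMem / (4 * partsToMerge)
	if maxItems < minCachedItems {
		maxItems = minCachedItems
	}
	return maxItems
}

func main() {
	// 1 GiB of remaining memory and an assumed partsToMerge of 15.
	fmt.Println(maxCachedItems(1<<30, 15))
	// Tiny memory budgets fall back to the lower bound.
	fmt.Println(maxCachedItems(1024, 15))
}
```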
--- a/lib/mergeset/table_search.go
+++ b/lib/mergeset/table_search.go
@@ -58,7 +58,7 @@ func (ts *TableSearch) reset() {
 // Init initializes ts for searching in the tb.
 //
 // MustClose must be called when the ts is no longer needed.
-func (ts *TableSearch) Init(tb *Table) {
+func (ts *TableSearch) Init(tb *Table, shouldCacheBlock func(item []byte) bool) {
 	if ts.needClosing {
 		logger.Panicf("BUG: missing MustClose call before the next call to Init")
 	}
@@ -76,7 +76,7 @@ func (ts *TableSearch) Init(tb *Table) {
 	}
 	ts.psPool = ts.psPool[:len(ts.pws)]
 	for i, pw := range ts.pws {
-		ts.psPool[i].Init(pw.p)
+		ts.psPool[i].Init(pw.p, shouldCacheBlock)
 	}
 }

--- a/lib/mergeset/table_search_test.go
+++ b/lib/mergeset/table_search_test.go
@@ -40,7 +40,7 @@ func TestTableSearchSerial(t *testing.T) {

 	func() {
 		// Re-open the table and verify the search works.
-		tb, err := OpenTable(path, nil)
+		tb, err := OpenTable(path, nil, nil)
 		if err != nil {
 			t.Fatalf("cannot open table: %s", err)
 		}
@@ -75,7 +75,7 @@ func TestTableSearchConcurrent(t *testing.T) {

 	// Re-open the table and verify the search works.
 	func() {
-		tb, err := OpenTable(path, nil)
+		tb, err := OpenTable(path, nil, nil)
 		if err != nil {
 			t.Fatalf("cannot open table: %s", err)
 		}
@@ -109,7 +109,7 @@ func testTableSearchConcurrent(tb *Table, items []string) error {

 func testTableSearchSerial(tb *Table, items []string) error {
 	var ts TableSearch
-	ts.Init(tb)
+	ts.Init(tb, nil)
 	for _, key := range []string{
 		"",
 		"123",
@@ -151,7 +151,7 @@ func newTestTable(path string, itemsCount int) (*Table, []string, error) {
 	flushCallback := func() {
 		atomic.AddUint64(&flushes, 1)
 	}
-	tb, err := OpenTable(path, flushCallback)
+	tb, err := OpenTable(path, flushCallback, nil)
 	if err != nil {
 		return nil, nil, fmt.Errorf("cannot open table: %s", err)
 	}
--- a/lib/mergeset/table_search_timing_test.go
+++ b/lib/mergeset/table_search_timing_test.go
@@ -32,7 +32,7 @@ func benchmarkTableSearch(b *testing.B, itemsCount int) {

 	// Force finishing pending merges
 	tb.MustClose()
-	tb, err = OpenTable(path, nil)
+	tb, err = OpenTable(path, nil, nil)
 	if err != nil {
 		b.Fatalf("unexpected error when re-opening table %q: %s", path, err)
 	}
@@ -81,7 +81,7 @@ func benchmarkTableSearchKeysExt(b *testing.B, tb *Table, keys [][]byte, stripSu
 	b.SetBytes(int64(searchKeysCount * rowsToScan))
 	b.RunParallel(func(pb *testing.PB) {
 		var ts TableSearch
-		ts.Init(tb)
+		ts.Init(tb, nil)
 		defer ts.MustClose()
 		for pb.Next() {
 			startIdx := rand.Intn(len(keys) - searchKeysCount)
--- a/lib/mergeset/table_test.go
+++ b/lib/mergeset/table_test.go
@@ -21,7 +21,7 @@ func TestTableOpenClose(t *testing.T) {
 	}()

 	// Create a new table
-	tb, err := OpenTable(path, nil)
+	tb, err := OpenTable(path, nil, nil)
 	if err != nil {
 		t.Fatalf("cannot create new table: %s", err)
 	}
@@ -31,7 +31,7 @@ func TestTableOpenClose(t *testing.T) {

 	// Re-open created table multiple times.
 	for i := 0; i < 10; i++ {
-		tb, err := OpenTable(path, nil)
+		tb, err := OpenTable(path, nil, nil)
 		if err != nil {
 			t.Fatalf("cannot open created table: %s", err)
 		}
@@ -45,14 +45,14 @@ func TestTableOpenMultipleTimes(t *testing.T) {
 		_ = os.RemoveAll(path)
 	}()

-	tb1, err := OpenTable(path, nil)
+	tb1, err := OpenTable(path, nil, nil)
 	if err != nil {
 		t.Fatalf("cannot open table: %s", err)
 	}
 	defer tb1.MustClose()

 	for i := 0; i < 10; i++ {
-		tb2, err := OpenTable(path, nil)
+		tb2, err := OpenTable(path, nil, nil)
 		if err == nil {
 			tb2.MustClose()
 			t.Fatalf("expecting non-nil error when opening already opened table")
@@ -73,7 +73,7 @@ func TestTableAddItemSerial(t *testing.T) {
 	flushCallback := func() {
 		atomic.AddUint64(&flushes, 1)
 	}
-	tb, err := OpenTable(path, flushCallback)
+	tb, err := OpenTable(path, flushCallback, nil)
 	if err != nil {
 		t.Fatalf("cannot open %q: %s", path, err)
 	}
@@ -99,7 +99,7 @@ func TestTableAddItemSerial(t *testing.T) {
 	testReopenTable(t, path, itemsCount)

 	// Add more items in order to verify merge between inmemory parts and file-based parts.
-	tb, err = OpenTable(path, nil)
+	tb, err = OpenTable(path, nil, nil)
 	if err != nil {
 		t.Fatalf("cannot open %q: %s", path, err)
 	}
@@ -132,7 +132,7 @@ func TestTableCreateSnapshotAt(t *testing.T) {
 		_ = os.RemoveAll(path)
 	}()

-	tb, err := OpenTable(path, nil)
+	tb, err := OpenTable(path, nil, nil)
 	if err != nil {
 		t.Fatalf("cannot open %q: %s", path, err)
 	}
@@ -163,23 +163,23 @@ func TestTableCreateSnapshotAt(t *testing.T) {
 	}()

 	// Verify snapshots contain all the data.
-	tb1, err := OpenTable(snapshot1, nil)
+	tb1, err := OpenTable(snapshot1, nil, nil)
 	if err != nil {
 		t.Fatalf("cannot open %q: %s", path, err)
 	}
 	defer tb1.MustClose()

-	tb2, err := OpenTable(snapshot2, nil)
+	tb2, err := OpenTable(snapshot2, nil, nil)
 	if err != nil {
 		t.Fatalf("cannot open %q: %s", path, err)
 	}
 	defer tb2.MustClose()

 	var ts, ts1, ts2 TableSearch
-	ts.Init(tb)
-	ts1.Init(tb1)
+	ts.Init(tb, nil)
+	ts1.Init(tb1, nil)
 	defer ts1.MustClose()
-	ts2.Init(tb2)
+	ts2.Init(tb2, nil)
 	defer ts2.MustClose()
 	for i := 0; i < itemsCount; i++ {
 		key := []byte(fmt.Sprintf("item %d", i))
@@ -217,7 +217,12 @@ func TestTableAddItemsConcurrent(t *testing.T) {
 	flushCallback := func() {
 		atomic.AddUint64(&flushes, 1)
 	}
-	tb, err := OpenTable(path, flushCallback)
+	var itemsMerged uint64
+	prepareBlock := func(data []byte, items [][]byte) ([]byte, [][]byte) {
+		atomic.AddUint64(&itemsMerged, uint64(len(items)))
+		return data, items
+	}
+	tb, err := OpenTable(path, flushCallback, prepareBlock)
 	if err != nil {
 		t.Fatalf("cannot open %q: %s", path, err)
 	}
@@ -230,6 +235,10 @@ func TestTableAddItemsConcurrent(t *testing.T) {
 	if atomic.LoadUint64(&flushes) == 0 {
 		t.Fatalf("unexpected zero flushes")
 	}
+	n := atomic.LoadUint64(&itemsMerged)
+	if n < itemsCount {
+		t.Fatalf("too low number of items merged; got %v; must be at least %v", n, itemsCount)
+	}

 	var m TableMetrics
 	tb.UpdateMetrics(&m)
@@ -243,7 +252,7 @@ func TestTableAddItemsConcurrent(t *testing.T) {
 	testReopenTable(t, path, itemsCount)

 	// Add more items in order to verify merge between inmemory parts and file-based parts.
-	tb, err = OpenTable(path, nil)
+	tb, err = OpenTable(path, nil, nil)
 	if err != nil {
 		t.Fatalf("cannot open %q: %s", path, err)
 	}
@@ -285,7 +294,7 @@ func testReopenTable(t *testing.T, path string, itemsCount int) {
 	t.Helper()

 	for i := 0; i < 10; i++ {
-		tb, err := OpenTable(path, nil)
+		tb, err := OpenTable(path, nil, nil)
 		if err != nil {
 			t.Fatalf("cannot re-open %q: %s", path, err)
 		}
--- a/lib/netutil/conn.go
+++ b/lib/netutil/conn.go
@@ -43,6 +43,11 @@ func (cm *connMetrics) init(group, name, addr string) {
 }

 type statConn struct {
+	// Move atomic counters to the top of struct in order to properly align them on 32-bit arch.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+
+	closeCalls uint64
+
 	readTimeout  time.Duration
 	lastReadTime time.Time

@@ -52,8 +57,6 @@ type statConn struct {
 	net.Conn

 	cm *connMetrics
-
-	closeCalls uint64
 }

 func (sc *statConn) Read(p []byte) (int, error) {
--- a/lib/prompb/README.md
+++ b/lib/prompb/README.md
@@ -1,14 +0,0 @@
-The compiled protobufs are version controlled and you won't normally need to
-re-compile them when building Prometheus.
-
-If however you have modified the defs and do need to re-compile, run
-`./scripts/genproto.sh` from the parent dir.
-
-In order for the script to run, you'll need `protoc` (version 3.5) in your
-PATH, and the following Go packages installed:
-
- github.com/gogo/protobuf
- github.com/gogo/protobuf/protoc-gen-gogofast
- github.com/grpc-ecosystem/grpc-gateway/protoc-gen-grpc-gateway/
- github.com/grpc-ecosystem/grpc-gateway/protoc-gen-swagger
- golang.org/x/tools/cmd/goimports
--- a/lib/storage/block_stream_reader_test.go
+++ b/lib/storage/block_stream_reader_test.go
@@ -59,7 +59,7 @@ func TestBlockStreamReaderManyTSIDManyRows(t *testing.T) {
 	r.PrecisionBits = defaultPrecisionBits
 	const blocks = 123
 	for i := 0; i < 3210; i++ {
-		r.TSID.MetricID = uint64((1e12 - i) % blocks)
+		r.TSID.MetricID = uint64((1e9 - i) % blocks)
 		r.Value = rand.Float64()
 		r.Timestamp = int64(rand.Float64() * 1e9)
 		rows = append(rows, r)
@@ -73,7 +73,7 @@ func TestBlockStreamReaderReadConcurrent(t *testing.T) {
 	r.PrecisionBits = defaultPrecisionBits
 	const blocks = 123
 	for i := 0; i < 3210; i++ {
-		r.TSID.MetricID = uint64((1e12 - i) % blocks)
+		r.TSID.MetricID = uint64((1e9 - i) % blocks)
 		r.Value = rand.Float64()
 		r.Timestamp = int64(rand.Float64() * 1e9)
 		rows = append(rows, r)
--- a/lib/storage/index_db.go
+++ b/lib/storage/index_db.go
--- a/lib/storage/index_db_test.go
+++ b/lib/storage/index_db_test.go
@@ -12,9 +12,331 @@ import (
 	"time"

 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache"
 )

+func TestMergeTagToMetricIDsRows(t *testing.T) {
+	f := func(items []string, expectedItems []string) {
+		t.Helper()
+		var data []byte
+		var itemsB [][]byte
+		for _, item := range items {
+			data = append(data, item...)
+			itemsB = append(itemsB, data[len(data)-len(item):])
+		}
+		if err := checkItemsSorted(itemsB); err != nil {
+			t.Fatalf("source items aren't sorted: %s", err)
+		}
+		resultData, resultItemsB := mergeTagToMetricIDsRows(data, itemsB)
+		if len(resultItemsB) != len(expectedItems) {
+			t.Fatalf("unexpected len(resultItemsB); got %d; want %d", len(resultItemsB), len(expectedItems))
+		}
+		if err := checkItemsSorted(resultItemsB); err != nil {
+			t.Fatalf("result items aren't sorted: %s", err)
+		}
+		for i, item := range resultItemsB {
+			if !bytes.HasPrefix(resultData, item) {
+				t.Fatalf("unexpected prefix for resultData #%d;\ngot\n%X\nwant\n%X", i, resultData, item)
+			}
+			resultData = resultData[len(item):]
+		}
+		if len(resultData) != 0 {
+			t.Fatalf("unexpected tail left in resultData: %X", resultData)
+		}
+		var resultItems []string
+		for _, item := range resultItemsB {
+			resultItems = append(resultItems, string(item))
+		}
+		if !reflect.DeepEqual(expectedItems, resultItems) {
+			t.Fatalf("unexpected items;\ngot\n%X\nwant\n%X", resultItems, expectedItems)
+		}
+	}
+	x := func(key, value string, metricIDs []uint64) string {
+		dst := marshalCommonPrefix(nil, nsPrefixTagToMetricIDs)
+		t := &Tag{
+			Key:   []byte(key),
+			Value: []byte(value),
+		}
+		dst = t.Marshal(dst)
+		for _, metricID := range metricIDs {
+			dst = encoding.MarshalUint64(dst, metricID)
+		}
+		return string(dst)
+	}
+
+	f(nil, nil)
+	f([]string{}, nil)
+	f([]string{"foo"}, []string{"foo"})
+	f([]string{"a", "b", "c", "def"}, []string{"a", "b", "c", "def"})
+	f([]string{"\x00", "\x00b", "\x00c", "\x00def"}, []string{"\x00", "\x00b", "\x00c", "\x00def"})
+	f([]string{
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+	}, []string{
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+	})
+	f([]string{
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		"xyz",
+	}, []string{
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		"xyz",
+	})
+	f([]string{
+		"\x00asdf",
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+	}, []string{
+		"\x00asdf",
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+	})
+	f([]string{
+		"\x00asdf",
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		x("", "", []uint64{0}),
+		"xyz",
+	}, []string{
+		"\x00asdf",
+		x("", "", []uint64{0}),
+		"xyz",
+	})
+	f([]string{
+		"\x00asdf",
+		x("", "", []uint64{1}),
+		x("", "", []uint64{2}),
+		x("", "", []uint64{3}),
+		x("", "", []uint64{4}),
+		"xyz",
+	}, []string{
+		"\x00asdf",
+		x("", "", []uint64{1, 2, 3, 4}),
+		"xyz",
+	})
+	f([]string{
+		"\x00asdf",
+		x("", "", []uint64{1}),
+		x("", "", []uint64{2}),
+		x("", "", []uint64{3}),
+		x("", "", []uint64{4}),
+	}, []string{
+		"\x00asdf",
+		x("", "", []uint64{1, 2, 3}),
+		x("", "", []uint64{4}),
+	})
+	f([]string{
+		"\x00asdf",
+		x("", "", []uint64{1}),
+		x("", "", []uint64{2, 3, 4}),
+		x("", "", []uint64{2, 3, 4, 5}),
+		x("", "", []uint64{3, 5}),
+		"foo",
+	}, []string{
+		"\x00asdf",
+		x("", "", []uint64{1, 2, 3, 4, 5}),
+		"foo",
+	})
+	f([]string{
+		"\x00asdf",
+		x("", "", []uint64{1}),
+		x("", "a", []uint64{2, 3, 4}),
+		x("", "a", []uint64{2, 3, 4, 5}),
+		x("", "b", []uint64{3, 5}),
+		"foo",
+	}, []string{
+		"\x00asdf",
+		x("", "", []uint64{1}),
+		x("", "a", []uint64{2, 3, 4, 5}),
+		x("", "b", []uint64{3, 5}),
+		"foo",
+	})
+	f([]string{
+		"\x00asdf",
+		x("", "", []uint64{1}),
+		x("x", "a", []uint64{2, 3, 4}),
+		x("y", "", []uint64{2, 3, 4, 5}),
+		x("y", "x", []uint64{3, 5}),
+		"foo",
+	}, []string{
+		"\x00asdf",
+		x("", "", []uint64{1}),
+		x("x", "a", []uint64{2, 3, 4}),
+		x("y", "", []uint64{2, 3, 4, 5}),
+		x("y", "x", []uint64{3, 5}),
+		"foo",
+	})
+	f([]string{
+		"\x00asdf",
+		x("sdf", "aa", []uint64{1, 1, 3}),
+		x("sdf", "aa", []uint64{1, 2}),
+		"foo",
+	}, []string{
+		"\x00asdf",
+		x("sdf", "aa", []uint64{1, 2, 3}),
+		"foo",
+	})
+	f([]string{
+		"\x00asdf",
+		x("sdf", "aa", []uint64{1, 2, 2, 4}),
+		x("sdf", "aa", []uint64{1, 2, 3}),
+		"foo",
+	}, []string{
+		"\x00asdf",
+		x("sdf", "aa", []uint64{1, 2, 3, 4}),
+		"foo",
+	})
+
+	// Construct big source chunks
+	var metricIDs []uint64
+
+	metricIDs = metricIDs[:0]
+	for i := 0; i < maxMetricIDsPerRow-1; i++ {
+		metricIDs = append(metricIDs, uint64(i))
+	}
+	f([]string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", metricIDs),
+		"x",
+	}, []string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		"x",
+	})
+
+	metricIDs = metricIDs[:0]
+	for i := 0; i < maxMetricIDsPerRow; i++ {
+		metricIDs = append(metricIDs, uint64(i))
+	}
+	f([]string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", metricIDs),
+		"x",
+	}, []string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", metricIDs),
+		"x",
+	})
+
+	metricIDs = metricIDs[:0]
+	for i := 0; i < 3*maxMetricIDsPerRow; i++ {
+		metricIDs = append(metricIDs, uint64(i))
+	}
+	f([]string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", metricIDs),
+		"x",
+	}, []string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", metricIDs),
+		"x",
+	})
+	f([]string{
+		"\x00aa",
+		x("foo", "bar", []uint64{0, 0, 1, 2, 3}),
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", metricIDs),
+		"x",
+	}, []string{
+		"\x00aa",
+		x("foo", "bar", []uint64{0, 1, 2, 3}),
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", metricIDs),
+		"x",
+	})
+
+	// Check for duplicate metricIDs removal
+	metricIDs = metricIDs[:0]
+	for i := 0; i < maxMetricIDsPerRow-1; i++ {
+		metricIDs = append(metricIDs, 123)
+	}
+	f([]string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", metricIDs),
+		"x",
+	}, []string{
+		"\x00aa",
+		x("foo", "bar", []uint64{123}),
+		"x",
+	})
+
+	// Check fallback to the original items when merging would result in incorrect ordering.
+	metricIDs = metricIDs[:0]
+	for i := 0; i < maxMetricIDsPerRow-3; i++ {
+		metricIDs = append(metricIDs, uint64(123))
+	}
+	f([]string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", []uint64{123, 123, 125}),
+		x("foo", "bar", []uint64{123, 124}),
+		"x",
+	}, []string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", []uint64{123, 123, 125}),
+		x("foo", "bar", []uint64{123, 124}),
+		"x",
+	})
+	f([]string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", []uint64{123, 123, 125}),
+		x("foo", "bar", []uint64{123, 124}),
+	}, []string{
+		"\x00aa",
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", []uint64{123, 123, 125}),
+		x("foo", "bar", []uint64{123, 124}),
+	})
+	f([]string{
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", []uint64{123, 123, 125}),
+		x("foo", "bar", []uint64{123, 124}),
+	}, []string{
+		x("foo", "bar", metricIDs),
+		x("foo", "bar", []uint64{123, 123, 125}),
+		x("foo", "bar", []uint64{123, 124}),
+	})
+}
+
+func TestRemoveDuplicateMetricIDs(t *testing.T) {
+	f := func(metricIDs, expectedMetricIDs []uint64) {
+		t.Helper()
+		a := removeDuplicateMetricIDs(metricIDs)
+		if !reflect.DeepEqual(a, expectedMetricIDs) {
+			t.Fatalf("unexpected result from removeDuplicateMetricIDs:\ngot\n%d\nwant\n%d", a, expectedMetricIDs)
+		}
+	}
+	f(nil, nil)
+	f([]uint64{123}, []uint64{123})
+	f([]uint64{123, 123}, []uint64{123})
+	f([]uint64{123, 123, 123}, []uint64{123})
+	f([]uint64{123, 1234, 1235}, []uint64{123, 1234, 1235})
+	f([]uint64{0, 1, 1, 2}, []uint64{0, 1, 2})
+	f([]uint64{0, 0, 0, 1, 1, 2}, []uint64{0, 1, 2})
+	f([]uint64{0, 1, 1, 2, 2}, []uint64{0, 1, 2})
+	f([]uint64{0, 1, 2, 2}, []uint64{0, 1, 2})
+}
+
 func TestMarshalUnmarshalTSIDs(t *testing.T) {
 	f := func(tsids []TSID) {
 		t.Helper()
@@ -280,6 +602,7 @@ func testIndexDBGetOrCreateTSIDByName(db *indexDB, accountsCount, projectsCount,
 	is := db.getIndexSearch()
 	defer db.putIndexSearch(is)

+	var metricNameBuf []byte
 	for i := 0; i < 4e2+1; i++ {
 		var mn MetricName

@@ -294,11 +617,11 @@ func testIndexDBGetOrCreateTSIDByName(db *indexDB, accountsCount, projectsCount,
 			mn.AddTag(key, value)
 		}
 		mn.sortTags()
-		metricName := mn.Marshal(nil)
+		metricNameBuf = mn.Marshal(metricNameBuf[:0])

 		// Create tsid for the metricName.
 		var tsid TSID
-		if err := is.GetOrCreateTSIDByName(&tsid, metricName); err != nil {
+		if err := is.GetOrCreateTSIDByName(&tsid, metricNameBuf); err != nil {
 			return nil, nil, fmt.Errorf("unexpected error when creating tsid for mn:\n%s: %s", &mn, err)
 		}

@@ -306,22 +629,22 @@ func testIndexDBGetOrCreateTSIDByName(db *indexDB, accountsCount, projectsCount,
 		tsids = append(tsids, tsid)
 	}

+	// fill Date -> MetricID cache
+	date := uint64(timestampFromTime(time.Now())) / msecPerDay
+	for i := range tsids {
+		tsid := &tsids[i]
+		if err := db.storeDateMetricID(date, tsid.MetricID); err != nil {
+			return nil, nil, fmt.Errorf("error in storeDateMetricID(%d, %d): %s", date, tsid.MetricID, err)
+		}
+	}
+
+	// Flush index to disk, so it becomes visible for search
 	db.tb.DebugFlush()

 	return mns, tsids, nil
 }

 func testIndexDBCheckTSIDByName(db *indexDB, mns []MetricName, tsids []TSID, isConcurrent bool) error {
-	// fill Date -> MetricID cache
-	date := uint64(timestampFromTime(time.Now())) / msecPerDay
-	for i := range tsids {
-		tsid := &tsids[i]
-		if err := db.storeDateMetricID(date, tsid.MetricID); err != nil {
-			return fmt.Errorf("error in storeDateMetricID(%d, %d): %s", date, tsid.MetricID, err)
-		}
-	}
-	db.tb.DebugFlush()
-
 	hasValue := func(tvs []string, v []byte) bool {
 		for _, tv := range tvs {
 			if string(v) == tv {
@@ -361,7 +684,7 @@ func testIndexDBCheckTSIDByName(db *indexDB, mns []MetricName, tsids []TSID, isC
 		var err error
 		metricNameCopy, err = db.searchMetricName(metricNameCopy[:0], tsidCopy.MetricID)
 		if err != nil {
-			return fmt.Errorf("error in searchMetricName: %s", err)
+			return fmt.Errorf("error in searchMetricName for metricID=%d; i=%d: %s", tsidCopy.MetricID, i, err)
 		}
 		if !bytes.Equal(metricName, metricNameCopy) {
 			return fmt.Errorf("unexpected mn for metricID=%d;\ngot\n%q\nwant\n%q", tsidCopy.MetricID, metricNameCopy, metricName)
@@ -451,7 +774,7 @@ func testIndexDBCheckTSIDByName(db *indexDB, mns []MetricName, tsids []TSID, isC
 			return fmt.Errorf("cannot search by exact tag filter: %s", err)
 		}
 		if !testHasTSID(tsidsFound, tsid) {
-			return fmt.Errorf("tsids is missing in exact tsidsFound\ntsid=%+v\ntsidsFound=%+v\ntfs=%s\nmn=%s", tsid, tsidsFound, tfs, mn)
+			return fmt.Errorf("tsids is missing in exact tsidsFound\ntsid=%+v\ntsidsFound=%+v\ntfs=%s\nmn=%s\ni=%d", tsid, tsidsFound, tfs, mn, i)
 		}

 		// Verify tag cache.
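
The expectations in TestRemoveDuplicateMetricIDs above (sorted input, adjacent duplicates collapsed, nil passed through) can be illustrated with a minimal sketch. The name `dedupSorted` is hypothetical, and the code is an assumption about one way to satisfy those test cases, not the `removeDuplicateMetricIDs` implementation from `lib/storage/index_db.go`, which is not shown in this diff.

```go
package main

import "fmt"

// dedupSorted is a hypothetical helper: it removes adjacent duplicates
// from a sorted slice in place and returns the shortened slice.
// It assumes the input is sorted, as the test cases above are.
func dedupSorted(a []uint64) []uint64 {
	if len(a) < 2 {
		// nil, empty and single-item slices have no duplicates.
		return a
	}
	dst := a[:1]
	for _, v := range a[1:] {
		if v != dst[len(dst)-1] {
			// The write index never overtakes the read index,
			// so reusing the backing array is safe.
			dst = append(dst, v)
		}
	}
	return dst
}

func main() {
	fmt.Println(dedupSorted([]uint64{0, 0, 0, 1, 1, 2}))
}
```

Reusing the input's backing array avoids an allocation per row, which matters if such a helper runs on every merged block; the caller must not rely on the original slice contents afterwards.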
--- a/lib/storage/merge.go
+++ b/lib/storage/merge.go
@@ -6,6 +6,7 @@ import (

 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/uint64set"
 )

 // mergeBlockStreams merges bsrs into bsw and updates ph.
@@ -14,7 +15,7 @@ import (
 //
 // rowsMerged is atomically updated with the number of merged rows during the merge.
 func mergeBlockStreams(ph *partHeader, bsw *blockStreamWriter, bsrs []*blockStreamReader, stopCh <-chan struct{}, rowsMerged *uint64,
-	deletedMetricIDs map[uint64]struct{}, rowsDeleted *uint64) error {
+	deletedMetricIDs *uint64set.Set, rowsDeleted *uint64) error {
 	ph.Reset()

 	bsm := bsmPool.Get().(*blockStreamMerger)
@@ -41,7 +42,7 @@ var bsmPool = &sync.Pool{
 var errForciblyStopped = fmt.Errorf("forcibly stopped")

 func mergeBlockStreamsInternal(ph *partHeader, bsw *blockStreamWriter, bsm *blockStreamMerger, stopCh <-chan struct{}, rowsMerged *uint64,
-	deletedMetricIDs map[uint64]struct{}, rowsDeleted *uint64) error {
+	deletedMetricIDs *uint64set.Set, rowsDeleted *uint64) error {
 	// Search for the first block to merge
 	var pendingBlock *Block
 	for bsm.NextBlock() {
@@ -50,7 +51,7 @@ func mergeBlockStreamsInternal(ph *partHeader, bsw *blockStreamWriter, bsm *bloc
 			return errForciblyStopped
 		default:
 		}
-		if _, deleted := deletedMetricIDs[bsm.Block.bh.TSID.MetricID]; deleted {
+		if deletedMetricIDs.Has(bsm.Block.bh.TSID.MetricID) {
 			// Skip blocks for deleted metrics.
 			*rowsDeleted += uint64(bsm.Block.bh.RowsCount)
 			continue
@@ -72,7 +73,7 @@ func mergeBlockStreamsInternal(ph *partHeader, bsw *blockStreamWriter, bsm *bloc
 			return errForciblyStopped
 		default:
 		}
-		if _, deleted := deletedMetricIDs[bsm.Block.bh.TSID.MetricID]; deleted {
+		if deletedMetricIDs.Has(bsm.Block.bh.TSID.MetricID) {
 			// Skip blocks for deleted metrics.
 			*rowsDeleted += uint64(bsm.Block.bh.RowsCount)
 			continue
--- a/lib/storage/merge_test.go
+++ b/lib/storage/merge_test.go
@@ -25,7 +25,7 @@ func TestMergeBlockStreamsOneStreamOneBlockManyRows(t *testing.T) {
 	minTimestamp := int64(1<<63 - 1)
 	maxTimestamp := int64(-1 << 63)
 	for i := 0; i < maxRowsPerBlock; i++ {
-		r.Timestamp = int64(rand.Intn(1e15))
+		r.Timestamp = int64(rand.Intn(1e9))
 		r.Value = rand.NormFloat64() * 2332
 		rows = append(rows, r)

@@ -51,7 +51,7 @@ func TestMergeBlockStreamsOneStreamManyBlocksOneRow(t *testing.T) {
 	for i := 0; i < blocksCount; i++ {
 		initTestTSID(&r.TSID)
 		r.TSID.MetricID = uint64(i * 123)
-		r.Timestamp = int64(rand.Intn(1e15))
+		r.Timestamp = int64(rand.Intn(1e9))
 		r.Value = rand.NormFloat64() * 2332
 		rows = append(rows, r)

@@ -78,7 +78,7 @@ func TestMergeBlockStreamsOneStreamManyBlocksManyRows(t *testing.T) {
 	maxTimestamp := int64(-1 << 63)
 	for i := 0; i < rowsCount; i++ {
 		r.TSID.MetricID = uint64(i % blocksCount)
-		r.Timestamp = int64(rand.Intn(1e15))
+		r.Timestamp = int64(rand.Intn(1e9))
 		r.Value = rand.NormFloat64() * 2332
 		rows = append(rows, r)

@@ -175,7 +175,7 @@ func TestMergeBlockStreamsTwoStreamsManyBlocksManyRows(t *testing.T) {
 	const rowsCount1 = 4938
 	for i := 0; i < rowsCount1; i++ {
 		r.TSID.MetricID = uint64(i % blocksCount)
-		r.Timestamp = int64(rand.Intn(1e15))
+		r.Timestamp = int64(rand.Intn(1e9))
 		r.Value = rand.NormFloat64() * 2332
 		rows = append(rows, r)

@@ -192,7 +192,7 @@ func TestMergeBlockStreamsTwoStreamsManyBlocksManyRows(t *testing.T) {
 	const rowsCount2 = 3281
 	for i := 0; i < rowsCount2; i++ {
 		r.TSID.MetricID = uint64((i + 17) % blocksCount)
-		r.Timestamp = int64(rand.Intn(1e15))
+		r.Timestamp = int64(rand.Intn(1e9))
 		r.Value = rand.NormFloat64() * 2332
 		rows = append(rows, r)

@@ -310,7 +310,7 @@ func TestMergeBlockStreamsManyStreamsManyBlocksManyRows(t *testing.T) {
 		var rows []rawRow
 		for j := 0; j < rowsPerStream; j++ {
 			r.TSID.MetricID = uint64(j % blocksCount)
-			r.Timestamp = int64(rand.Intn(1e10))
+			r.Timestamp = int64(rand.Intn(1e9))
 			r.Value = rand.NormFloat64()
 			rows = append(rows, r)

@@ -343,7 +343,7 @@ func TestMergeForciblyStop(t *testing.T) {
 		var rows []rawRow
 		for j := 0; j < rowsPerStream; j++ {
 			r.TSID.MetricID = uint64(j % blocksCount)
-			r.Timestamp = int64(rand.Intn(1e10))
+			r.Timestamp = int64(rand.Intn(1e9))
 			r.Value = rand.NormFloat64()
 			rows = append(rows, r)

--- a/lib/storage/metric_name.go
+++ b/lib/storage/metric_name.go
@@ -25,6 +25,17 @@ type Tag struct {
 	Value []byte
 }

+// Reset resets the tag.
+func (tag *Tag) Reset() {
+	tag.Key = tag.Key[:0]
+	tag.Value = tag.Value[:0]
+}
+
+// Equal returns true if tag equals t.
+func (tag *Tag) Equal(t *Tag) bool {
+	return string(tag.Key) == string(t.Key) && string(tag.Value) == string(t.Value)
+}
+
 // Marshal appends marshaled tag to dst and returns the result.
 func (tag *Tag) Marshal(dst []byte) []byte {
 	dst = marshalTagValue(dst, tag.Key)
--- a/lib/storage/part.go
+++ b/lib/storage/part.go
@@ -5,6 +5,7 @@ import (
 	"path/filepath"
 	"sync"
 	"sync/atomic"
+	"unsafe"

 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/filestream"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
@@ -27,8 +28,7 @@ var (
 	maxCachedIndexBlocksPerPartOnce sync.Once
 )

-// part represents a searchable part containing time series data.
-type part struct {
+type partInternals struct {
 	ph partHeader

 	// Filesystem path to the part.
@@ -44,7 +44,15 @@ type part struct {
 	indexFile      fs.ReadAtCloser

 	metaindex []metaindexRow
+}

+// part represents a searchable part containing time series data.
+type part struct {
+	partInternals
+
+	// Align ibCache to 8 bytes in order to align internal counters on 32-bit architectures.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+	_       [(8 - (unsafe.Sizeof(partInternals{}) % 8)) % 8]byte
 	ibCache indexBlockCache
 }

@@ -107,27 +115,26 @@ func newPart(ph *partHeader, path string, size uint64, metaindexReader filestrea
 	}
 	metaindexReader.MustClose()

-	p := &part{
-		ph:             *ph,
-		path:           path,
-		size:           size,
-		timestampsFile: timestampsFile,
-		valuesFile:     valuesFile,
-		indexFile:      indexFile,
+	var p part
+	p.ph = *ph
+	p.path = path
+	p.size = size
+	p.timestampsFile = timestampsFile
+	p.valuesFile = valuesFile
+	p.indexFile = indexFile

-		metaindex: metaindex,
-	}
+	p.metaindex = metaindex

 	if len(errors) > 0 {
 		// Return only the first error, since it has no sense in returning all errors.
-		err = fmt.Errorf("cannot initialize part %q: %s", p, errors[0])
+		err = fmt.Errorf("cannot initialize part %q: %s", &p, errors[0])
 		p.MustClose()
 		return nil, err
 	}

 	p.ibCache.Init()

-	return p, nil
+	return &p, nil
 }

 // String returns human-readable representation of p.
@@ -168,12 +175,14 @@ func putIndexBlock(ib *indexBlock) {
 var indexBlockPool sync.Pool

 type indexBlockCache struct {
+	// Put atomic counters to the top of struct in order to align them to 8 bytes on 32-bit architectures.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+	requests uint64
+	misses   uint64
+
 	m         map[uint64]*indexBlock
 	missesMap map[uint64]uint64
 	mu        sync.RWMutex
-
-	requests uint64
-	misses   uint64
 }

 func (ibc *indexBlockCache) Init() {
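
The padding trick used for `part.ibCache` above can be checked in isolation. The sketch below is an illustration under assumptions: the types `internals`, `counters` and `wrapper` and the helper `cacheOffset` are hypothetical stand-ins, not types from `lib/storage`. It shows that a zero-length-or-more padding array computed from `unsafe.Sizeof` places the following struct at an offset divisible by 8, so the `uint64` counters at its top stay 8-byte aligned on 32-bit architectures as long as the outer struct itself is 64-bit aligned.

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// internals is a hypothetical stand-in for the non-atomic fields.
// Its size is 12 bytes on 386 and 24 bytes on amd64.
type internals struct {
	name string
	flag bool
}

// counters keeps the atomically updated fields at the top of the struct.
type counters struct {
	requests uint64
	misses   uint64
	m        map[uint64]int
}

// wrapper pads internals up to a multiple of 8 bytes, so c starts
// at an offset divisible by 8 on both 32-bit and 64-bit targets.
type wrapper struct {
	internals
	_ [(8 - (unsafe.Sizeof(internals{}) % 8)) % 8]byte
	c counters
}

// cacheOffset returns the byte offset of the counters inside wrapper.
func cacheOffset() uintptr {
	var w wrapper
	return unsafe.Offsetof(w.c)
}

func main() {
	var w wrapper
	atomic.AddUint64(&w.c.requests, 1)
	fmt.Println(cacheOffset()%8 == 0, atomic.LoadUint64(&w.c.requests))
}
```

Running the same program with `GOARCH=386` is a quick way to confirm the layout on a 32-bit target, in the spirit of the `test-full-386` Makefile target added by this commit.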
--- a/lib/storage/part_search.go
+++ b/lib/storage/part_search.go
@@ -3,7 +3,9 @@ package storage
 import (
 	"fmt"
 	"io"
+	"os"
 	"sort"
+	"strings"

 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
@@ -49,7 +51,7 @@ type partSearch struct {
 func (ps *partSearch) reset() {
 	ps.Block.Reset()
 	ps.p = nil
-	ps.tsids = ps.tsids[:0]
+	ps.tsids = nil
 	ps.tsidIdx = 0
 	ps.fetchData = true
 	ps.metaindex = nil
@@ -64,16 +66,24 @@ func (ps *partSearch) reset() {
 	ps.err = nil
 }

+var isInTest = func() bool {
+	return strings.HasSuffix(os.Args[0], ".test")
+}()
+
 // Init initializes the ps with the given p, tsids and tr.
+//
+// tsids must be sorted.
+// tsids cannot be modified after the Init call, since it is owned by ps.
 func (ps *partSearch) Init(p *part, tsids []TSID, tr TimeRange, fetchData bool) {
 	ps.reset()
 	ps.p = p

 	if p.ph.MinTimestamp <= tr.MaxTimestamp && p.ph.MaxTimestamp >= tr.MinTimestamp {
-		if !sort.SliceIsSorted(tsids, func(i, j int) bool { return tsids[i].Less(&tsids[j]) }) {
+		if isInTest && !sort.SliceIsSorted(tsids, func(i, j int) bool { return tsids[i].Less(&tsids[j]) }) {
 			logger.Panicf("BUG: tsids must be sorted; got %+v", tsids)
 		}
-		ps.tsids = append(ps.tsids[:0], tsids...)
+		// Take ownership of tsids.
+		ps.tsids = tsids
 	}
 	ps.tr = tr
 	ps.fetchData = fetchData
--- a/lib/storage/partition.go
+++ b/lib/storage/partition.go
@@ -19,17 +19,23 @@ import (
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/uint64set"
 )

 func maxRowsPerSmallPart() uint64 {
 	// Small parts are cached in the OS page cache,
-	// so limit the number of rows for small part
-	// by the remaining free RAM.
+	// so limit the number of rows for small part by the remaining free RAM.
 	mem := memory.Remaining()
-	if mem <= 0 {
-		return 100e6
+	// Production data shows that each row occupies ~1 byte in the compressed part.
+	// It is expected that no more than defaultPartsToMerge/2 parts exist
+	// in the OS page cache before they are merged into a bigger part.
+	// Half of the remaining RAM must be left for lib/mergeset parts,
+	// so maxRows is calculated using the code below:
+	maxRows := uint64(mem) / defaultPartsToMerge
+	if maxRows < 10e6 {
+		maxRows = 10e6
 	}
-	return uint64(mem) / defaultPartsToMerge
+	return maxRows
 }

 // The maximum number of rows per big part.
@@ -52,7 +58,7 @@ const defaultPartsToMerge = 15
 // It must be smaller than defaultPartsToMerge.
 // Lower value improves select performance at the cost of increased
 // write amplification.
-const finalPartsToMerge = 3
+const finalPartsToMerge = 2

 // getMaxRowsPerPartition returns the maximum number of rows that haven't been converted into parts yet.
 func getMaxRawRowsPerPartition() int {
@@ -84,11 +90,27 @@ const inmemoryPartsFlushInterval = 5 * time.Second

 // partition represents a partition.
 type partition struct {
+	// Put atomic counters to the top of struct, so they are aligned to 8 bytes on 32-bit arch.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+
+	activeBigMerges   uint64
+	activeSmallMerges uint64
+	bigMergesCount    uint64
+	smallMergesCount  uint64
+	bigRowsMerged     uint64
+	smallRowsMerged   uint64
+	bigRowsDeleted    uint64
+	smallRowsDeleted  uint64
+
+	smallAssistedMerges uint64
+
+	mergeIdx uint64
+
 	smallPartsPath string
 	bigPartsPath   string

 	// The callack that returns deleted metric ids which must be skipped during merge.
-	getDeletedMetricIDs func() map[uint64]struct{}
+	getDeletedMetricIDs func() *uint64set.Set

 	// Name is the name of the partition in the form YYYY_MM.
 	name string
@@ -117,8 +139,6 @@ type partition struct {
 	// rawRowsLastFlushTime is the last time rawRows are flushed.
 	rawRowsLastFlushTime time.Time

-	mergeIdx uint64
-
 	snapshotLock sync.RWMutex

 	stopCh chan struct{}
@@ -127,30 +147,22 @@ type partition struct {
 	bigPartsMergerWG       sync.WaitGroup
 	rawRowsFlusherWG       sync.WaitGroup
 	inmemoryPartsFlusherWG sync.WaitGroup
-
-	activeBigMerges   uint64
-	activeSmallMerges uint64
-	bigMergesCount    uint64
-	smallMergesCount  uint64
-	bigRowsMerged     uint64
-	smallRowsMerged   uint64
-	bigRowsDeleted    uint64
-	smallRowsDeleted  uint64
-
-	smallAssistedMerges uint64
 }

 // partWrapper is a wrapper for the part.
 type partWrapper struct {
+	// Put atomic counters to the top of struct, so they are aligned to 8 bytes on 32-bit arch.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+
+	// The number of references to the part.
+	refCount uint64
+
 	// The part itself.
 	p *part

 	// non-nil if the part is inmemoryPart.
 	mp *inmemoryPart

-	// The number of references to the part.
-	refCount uint64
-
 	// Whether the part is in merge now.
 	isInMerge bool
 }
@@ -178,7 +190,7 @@ func (pw *partWrapper) decRef() {

 // createPartition creates new partition for the given timestamp and the given paths
 // to small and big partitions.
-func createPartition(timestamp int64, smallPartitionsPath, bigPartitionsPath string, getDeletedMetricIDs func() map[uint64]struct{}) (*partition, error) {
+func createPartition(timestamp int64, smallPartitionsPath, bigPartitionsPath string, getDeletedMetricIDs func() *uint64set.Set) (*partition, error) {
 	name := timestampToPartitionName(timestamp)
 	smallPartsPath := filepath.Clean(smallPartitionsPath) + "/" + name
 	bigPartsPath := filepath.Clean(bigPartitionsPath) + "/" + name
@@ -213,7 +225,7 @@ func (pt *partition) Drop() {
 }

 // openPartition opens the existing partition from the given paths.
-func openPartition(smallPartsPath, bigPartsPath string, getDeletedMetricIDs func() map[uint64]struct{}) (*partition, error) {
+func openPartition(smallPartsPath, bigPartsPath string, getDeletedMetricIDs func() *uint64set.Set) (*partition, error) {
 	smallPartsPath = filepath.Clean(smallPartsPath)
 	bigPartsPath = filepath.Clean(bigPartsPath)

@@ -250,7 +262,7 @@ func openPartition(smallPartsPath, bigPartsPath string, getDeletedMetricIDs func
 	return pt, nil
 }

-func newPartition(name, smallPartsPath, bigPartsPath string, getDeletedMetricIDs func() map[uint64]struct{}) *partition {
+func newPartition(name, smallPartsPath, bigPartsPath string, getDeletedMetricIDs func() *uint64set.Set) *partition {
 	return &partition{
 		name:           name,
 		smallPartsPath: smallPartsPath,
@@ -722,7 +734,7 @@ func (pt *partition) mergePartsOptimal(pws []*partWrapper) error {
 	return nil
 }

-var mergeWorkers = func() int {
+var mergeWorkersCount = func() int {
 	n := runtime.GOMAXPROCS(-1) / 2
 	if n <= 0 {
 		n = 1
@@ -730,16 +742,47 @@ var mergeWorkers = func() int {
 	return n
 }()

+var (
+	bigMergeWorkersCount   = uint64(mergeWorkersCount)
+	smallMergeWorkersCount = uint64(mergeWorkersCount)
+)
+
+var (
+	bigMergeConcurrencyLimitCh   = make(chan struct{}, bigMergeWorkersCount)
+	smallMergeConcurrencyLimitCh = make(chan struct{}, smallMergeWorkersCount)
+)
+
+// SetBigMergeWorkersCount sets the maximum number of concurrent mergers for big blocks.
+//
+// The function must be called before opening or creating any storage.
+func SetBigMergeWorkersCount(n int) {
+	if n <= 0 {
+		// Do nothing
+		return
+	}
+	bigMergeConcurrencyLimitCh = make(chan struct{}, n)
+}
+
+// SetSmallMergeWorkersCount sets the maximum number of concurrent mergers for small parts.
+//
+// The function must be called before opening or creating any storage.
+func SetSmallMergeWorkersCount(n int) {
+	if n <= 0 {
+		return
+	}
+	atomic.StoreUint64(&smallMergeWorkersCount, uint64(n))
+	smallMergeConcurrencyLimitCh = make(chan struct{}, n)
+}
+
 func (pt *partition) startMergeWorkers() {
-	for i := 0; i < mergeWorkers; i++ {
+	for i := 0; i < mergeWorkersCount; i++ {
 		pt.smallPartsMergerWG.Add(1)
 		go func() {
 			pt.smallPartsMerger()
 			pt.smallPartsMergerWG.Done()
 		}()
 	}
-
-	for i := 0; i < mergeWorkers; i++ {
+	for i := 0; i < mergeWorkersCount; i++ {
 		pt.bigPartsMergerWG.Add(1)
 		go func() {
 			pt.bigPartsMerger()
@@ -786,7 +829,7 @@ func (pt *partition) partsMerger(mergerFunc func(isFinal bool) error) error {
 		if err != errNothingToMerge {
 			return err
 		}
-		if time.Since(lastMergeTime) > 10*time.Second {
+		if time.Since(lastMergeTime) > 30*time.Second {
 			// We have free time for merging into bigger parts.
 			// This should improve select performance.
 			lastMergeTime = time.Now()
@@ -813,11 +856,11 @@ func maxRowsByPath(path string) uint64 {

 	// Calculate the maximum number of rows in the output merge part
 	// by dividing the freeSpace by the number of concurrent
-	// mergeWorkers for big parts.
+	// mergeWorkersCount for big parts.
 	// This assumes each row is compressed into 1 byte. Production
 	// simulation shows that each row usually occupies up to 0.5 bytes,
 	// so this is quite safe assumption.
-	maxRows := freeSpace / uint64(mergeWorkers)
+	maxRows := freeSpace / uint64(mergeWorkersCount)
 	if maxRows > maxRowsPerBigPart {
 		maxRows = maxRowsPerBigPart
 	}
@@ -854,6 +897,11 @@ type freeSpaceEntry struct {
 }

 func (pt *partition) mergeBigParts(isFinal bool) error {
+	bigMergeConcurrencyLimitCh <- struct{}{}
+	defer func() {
+		<-bigMergeConcurrencyLimitCh
+	}()
+
 	maxRows := maxRowsByPath(pt.bigPartsPath)

 	pt.partsLock.Lock()
@@ -873,10 +921,15 @@ func (pt *partition) mergeBigParts(isFinal bool) error {
 }

 func (pt *partition) mergeSmallParts(isFinal bool) error {
+	smallMergeConcurrencyLimitCh <- struct{}{}
+	defer func() {
+		<-smallMergeConcurrencyLimitCh
+	}()
+
 	maxRows := maxRowsByPath(pt.smallPartsPath)
 	if maxRows > maxRowsPerSmallPart() {
 		// The output part may go to big part,
-		// so make sure it as enough space.
+		// so make sure it has enough space.
 		maxBigPartRows := maxRowsByPath(pt.bigPartsPath)
 		if maxRows > maxBigPartRows {
 			maxRows = maxBigPartRows
@@ -1153,13 +1206,10 @@ func appendPartsToMerge(dst, src []*partWrapper, maxPartsToMerge int, maxRows ui
 	sort.Slice(src, func(i, j int) bool {
 		a := &src[i].p.ph
 		b := &src[j].p.ph
-		if a.RowsCount < b.RowsCount {
-			return true
+		if a.RowsCount == b.RowsCount {
+			return a.MinTimestamp > b.MinTimestamp
 		}
-		if a.RowsCount > b.RowsCount {
-			return false
-		}
-		return a.MinTimestamp > b.MinTimestamp
+		return a.RowsCount < b.RowsCount
 	})

 	n := maxPartsToMerge
@@ -1173,31 +1223,32 @@ func appendPartsToMerge(dst, src []*partWrapper, maxPartsToMerge int, maxRows ui
 	maxM := float64(0)
 	for i := 2; i <= n; i++ {
 		for j := 0; j <= len(src)-i; j++ {
+			a := src[j : j+i]
 			rowsSum := uint64(0)
-			for _, pw := range src[j : j+i] {
+			for _, pw := range a {
 				rowsSum += pw.p.ph.RowsCount
 			}
 			if rowsSum > maxRows {
-				continue
+				// There is no need to check the remaining windows, since src is sorted by rows count and their sums can only be bigger.
+				break
 			}
-			m := float64(rowsSum) / float64(src[j+i-1].p.ph.RowsCount)
+			m := float64(rowsSum) / float64(a[len(a)-1].p.ph.RowsCount)
 			if m < maxM {
 				continue
 			}
 			maxM = m
-			pws = src[j : j+i]
+			pws = a
 		}
 	}

-	minM := float64(maxPartsToMerge / 2)
-	if minM < 2 {
-		minM = 2
+	minM := float64(maxPartsToMerge) / 2
+	if minM < 1.7 {
+		minM = 1.7
 	}
 	if maxM < minM {
 		// There is no sense in merging parts with too small m.
 		return dst
 	}
-
 	return append(dst, pws...)
 }

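The concurrency limit added to `mergeBigParts` and `mergeSmallParts` above uses a buffered channel as a counting semaphore. A minimal standalone sketch of that pattern follows. The names `runLimited` and `limitCh` and the task counts are hypothetical and chosen only for illustration; they are not part of `lib/storage`.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runLimited starts `tasks` goroutines, but lets at most `limit` of them
// execute the guarded section at the same time. It returns the highest
// number of goroutines observed inside the guarded section.
func runLimited(tasks, limit int) int64 {
	limitCh := make(chan struct{}, limit)
	var running, maxRunning int64
	var wg sync.WaitGroup
	for i := 0; i < tasks; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			limitCh <- struct{}{}        // acquire a slot; blocks while all slots are taken
			defer func() { <-limitCh }() // release the slot

			// Guarded section: track how many goroutines are inside it.
			n := atomic.AddInt64(&running, 1)
			for {
				old := atomic.LoadInt64(&maxRunning)
				if n <= old || atomic.CompareAndSwapInt64(&maxRunning, old, n) {
					break
				}
			}
			atomic.AddInt64(&running, -1)
		}()
	}
	wg.Wait()
	return atomic.LoadInt64(&maxRunning)
}

func main() {
	// The observed concurrency never exceeds the channel capacity.
	fmt.Println(runLimited(100, 4) <= 4)
}
```

The capacity of a buffered channel is fixed when the channel is created, so a limiter of this kind has to be created with its final size before any worker starts using it.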
--- a/lib/storage/partition_search.go
+++ b/lib/storage/partition_search.go
@@ -55,7 +55,10 @@ func (pts *partitionSearch) reset() {

 // Init initializes the search in the given partition for the given tsid and tr.
 //
-// MustClose must be called when partition search is done.
+// tsids must be sorted.
+// tsids cannot be modified after the Init call, since it is owned by pts.
+//
+// MustClose must be called when partition search is done.
 func (pts *partitionSearch) Init(pt *partition, tsids []TSID, tr TimeRange, fetchData bool) {
 	if pts.needClosing {
 		logger.Panicf("BUG: missing partitionSearch.MustClose call before the next call to Init")
--- a/lib/storage/partition_search_test.go
+++ b/lib/storage/partition_search_test.go
@@ -7,6 +7,8 @@ import (
 	"sort"
 	"testing"
 	"time"
+
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/uint64set"
 )

 func TestPartitionSearch(t *testing.T) {
@@ -284,6 +286,6 @@ func testPartitionSearchSerial(pt *partition, tsids []TSID, tr TimeRange, rbsExp
 	return nil
 }

-func nilGetDeletedMetricIDs() map[uint64]struct{} {
+func nilGetDeletedMetricIDs() *uint64set.Set {
 	return nil
 }
--- a/lib/storage/partition_test.go
+++ b/lib/storage/partition_test.go
@@ -14,34 +14,34 @@ func TestPartitionMaxRowsByPath(t *testing.T) {
 }

 func TestAppendPartsToMerge(t *testing.T) {
-	testAppendPartsToMerge(t, 2, []int{}, nil)
-	testAppendPartsToMerge(t, 2, []int{123}, nil)
-	testAppendPartsToMerge(t, 2, []int{4, 2}, nil)
-	testAppendPartsToMerge(t, 2, []int{128, 64, 32, 16, 8, 4, 2, 1}, nil)
-	testAppendPartsToMerge(t, 4, []int{128, 64, 32, 10, 9, 7, 2, 1}, []int{2, 7, 9, 10})
-	testAppendPartsToMerge(t, 2, []int{128, 64, 32, 16, 8, 4, 2, 2}, []int{2, 2})
-	testAppendPartsToMerge(t, 4, []int{128, 64, 32, 16, 8, 4, 2, 2}, []int{2, 2, 4, 8})
-	testAppendPartsToMerge(t, 2, []int{1, 1}, []int{1, 1})
-	testAppendPartsToMerge(t, 2, []int{2, 2, 2}, []int{2, 2})
-	testAppendPartsToMerge(t, 2, []int{4, 2, 4}, []int{4, 4})
-	testAppendPartsToMerge(t, 2, []int{1, 3, 7, 2}, nil)
-	testAppendPartsToMerge(t, 3, []int{1, 3, 7, 2}, []int{1, 2, 3})
-	testAppendPartsToMerge(t, 4, []int{1, 3, 7, 2}, []int{1, 2, 3})
-	testAppendPartsToMerge(t, 3, []int{11, 1, 10, 100, 10}, []int{10, 10, 11})
+	testAppendPartsToMerge(t, 2, []uint64{}, nil)
+	testAppendPartsToMerge(t, 2, []uint64{123}, nil)
+	testAppendPartsToMerge(t, 2, []uint64{4, 2}, nil)
+	testAppendPartsToMerge(t, 2, []uint64{128, 64, 32, 16, 8, 4, 2, 1}, nil)
+	testAppendPartsToMerge(t, 4, []uint64{128, 64, 32, 10, 9, 7, 2, 1}, []uint64{2, 7, 9, 10})
+	testAppendPartsToMerge(t, 2, []uint64{128, 64, 32, 16, 8, 4, 2, 2}, []uint64{2, 2})
+	testAppendPartsToMerge(t, 4, []uint64{128, 64, 32, 16, 8, 4, 2, 2}, []uint64{2, 2, 4, 8})
+	testAppendPartsToMerge(t, 2, []uint64{1, 1}, []uint64{1, 1})
+	testAppendPartsToMerge(t, 2, []uint64{2, 2, 2}, []uint64{2, 2})
+	testAppendPartsToMerge(t, 2, []uint64{4, 2, 4}, []uint64{4, 4})
+	testAppendPartsToMerge(t, 2, []uint64{1, 3, 7, 2}, nil)
+	testAppendPartsToMerge(t, 3, []uint64{1, 3, 7, 2}, []uint64{1, 2, 3})
+	testAppendPartsToMerge(t, 4, []uint64{1, 3, 7, 2}, []uint64{1, 2, 3})
+	testAppendPartsToMerge(t, 3, []uint64{11, 1, 10, 100, 10}, []uint64{10, 10, 11})
 }

 func TestAppendPartsToMergeManyParts(t *testing.T) {
 	// Verify that big number of parts are merged into minimal number of parts
 	// using minimum merges.
-	var a []int
+	var a []uint64
 	maxOutPartRows := uint64(0)
 	for i := 0; i < 1024; i++ {
-		n := int(rand.NormFloat64() * 1e9)
+		n := uint64(uint32(rand.NormFloat64() * 1e9))
 		if n < 0 {
 			n = -n
 		}
 		n++
-		maxOutPartRows += uint64(n)
+		maxOutPartRows += n
 		a = append(a, n)
 	}
 	pws := newTestPartWrappersForRowsCount(a)
@@ -67,11 +67,10 @@ func TestAppendPartsToMergeManyParts(t *testing.T) {
 			}
 		}
 		pw := &partWrapper{
-			p: &part{
-				ph: partHeader{
-					RowsCount: rowsCount,
-				},
-			},
+			p: &part{},
+		}
+		pw.p.ph = partHeader{
+			RowsCount: rowsCount,
 		}
 		rowsMerged += rowsCount
 		pwsNew = append(pwsNew, pw)
@@ -94,7 +93,7 @@ func TestAppendPartsToMergeManyParts(t *testing.T) {
 	}
 }

-func testAppendPartsToMerge(t *testing.T, maxPartsToMerge int, initialRowsCount, expectedRowsCount []int) {
+func testAppendPartsToMerge(t *testing.T, maxPartsToMerge int, initialRowsCount, expectedRowsCount []uint64) {
 	t.Helper()

 	pws := newTestPartWrappersForRowsCount(initialRowsCount)
@@ -111,8 +110,10 @@ func testAppendPartsToMerge(t *testing.T, maxPartsToMerge int, initialRowsCount,
 	prefix := []*partWrapper{
 		{
 			p: &part{
-				ph: partHeader{
-					RowsCount: 1234,
+				partInternals: partInternals{
+					ph: partHeader{
+						RowsCount: 1234,
+					},
 				},
 			},
 		},
@@ -132,21 +133,23 @@ func testAppendPartsToMerge(t *testing.T, maxPartsToMerge int, initialRowsCount,
 	}
 }

-func newTestRowsCountFromPartWrappers(pws []*partWrapper) []int {
-	var rowsCount []int
+func newTestRowsCountFromPartWrappers(pws []*partWrapper) []uint64 {
+	var rowsCount []uint64
 	for _, pw := range pws {
-		rowsCount = append(rowsCount, int(pw.p.ph.RowsCount))
+		rowsCount = append(rowsCount, pw.p.ph.RowsCount)
 	}
 	return rowsCount
 }

-func newTestPartWrappersForRowsCount(rowsCount []int) []*partWrapper {
+func newTestPartWrappersForRowsCount(rowsCount []uint64) []*partWrapper {
 	var pws []*partWrapper
 	for _, rc := range rowsCount {
 		pw := &partWrapper{
 			p: &part{
-				ph: partHeader{
-					RowsCount: uint64(rc),
+				partInternals: partInternals{
+					ph: partHeader{
+						RowsCount: rc,
+					},
 				},
 			},
 		}
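The expectations in `TestAppendPartsToMerge` can be reproduced with a simplified model of the selection loop from the `appendPartsToMerge` hunks. This is a sketch under stated assumptions, not the function itself: it works on bare row counts instead of `partWrapper` values, ignores the `MinTimestamp` tie-break, and omits any logic that lies outside the hunks shown in this diff. The name `pickPartsToMerge` is hypothetical.

```go
package main

import (
	"fmt"
	"sort"
)

// pickPartsToMerge returns the window of up to maxPartsToMerge adjacent
// (by size) row counts with the highest ratio sum(window)/max(window),
// or nil if the best ratio is too small to justify a merge.
func pickPartsToMerge(rows []uint64, maxPartsToMerge int, maxRows uint64) []uint64 {
	if len(rows) < 2 {
		return nil
	}
	src := append([]uint64(nil), rows...)
	sort.Slice(src, func(i, j int) bool { return src[i] < src[j] })

	n := maxPartsToMerge
	if len(src) < n {
		n = len(src)
	}
	var best []uint64
	maxM := float64(0)
	for i := 2; i <= n; i++ {
		for j := 0; j <= len(src)-i; j++ {
			a := src[j : j+i]
			sum := uint64(0)
			for _, r := range a {
				sum += r
			}
			if sum > maxRows {
				// src is sorted, so every following window has a bigger sum.
				break
			}
			m := float64(sum) / float64(a[len(a)-1])
			if m < maxM {
				continue
			}
			maxM = m
			best = a
		}
	}

	minM := float64(maxPartsToMerge) / 2
	if minM < 1.7 {
		minM = 1.7
	}
	if maxM < minM {
		return nil
	}
	return best
}

func main() {
	fmt.Println(pickPartsToMerge([]uint64{128, 64, 32, 10, 9, 7, 2, 1}, 4, 1<<40)) // [2 7 9 10]
	fmt.Println(pickPartsToMerge([]uint64{4, 2}, 2, 1<<40))                         // []
}
```

In the first call the window `{2, 7, 9, 10}` wins with a ratio of 28/10 = 2.8, which is above the threshold of 2. In the second call the only window has a ratio of 6/4 = 1.5, which is below 1.7, so nothing is merged.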
--- a/lib/storage/raw_row.go
+++ b/lib/storage/raw_row.go
@@ -48,42 +48,30 @@ type rawRowsSort []rawRow
 func (rrs *rawRowsSort) Len() int { return len(*rrs) }
 func (rrs *rawRowsSort) Less(i, j int) bool {
 	x := *rrs
+	if i < 0 || j < 0 || i >= len(x) || j >= len(x) {
+		// This check proves to the compiler that i and j are in range,
+		// so it can omit bounds checks and panic code for x[i] and x[j] below.
+		return false
+	}
 	a := &x[i]
 	b := &x[j]
 	ta := &a.TSID
 	tb := &b.TSID
-	if ta.MetricID == tb.MetricID {
-		// Fast path - identical TSID values.
-		return a.Timestamp < b.Timestamp
-	}

-	// Slow path - compare TSIDs.
 	// Manually inline TSID.Less here, since the compiler doesn't inline it yet :(
-	if ta.MetricGroupID < tb.MetricGroupID {
-		return true
+	if ta.MetricGroupID != tb.MetricGroupID {
+		return ta.MetricGroupID < tb.MetricGroupID
 	}
-	if ta.MetricGroupID > tb.MetricGroupID {
-		return false
+	if ta.JobID != tb.JobID {
+		return ta.JobID < tb.JobID
 	}
-	if ta.JobID < tb.JobID {
-		return true
+	if ta.InstanceID != tb.InstanceID {
+		return ta.InstanceID < tb.InstanceID
 	}
-	if ta.JobID > tb.JobID {
-		return false
+	if ta.MetricID != tb.MetricID {
+		return ta.MetricID < tb.MetricID
 	}
-	if ta.InstanceID < tb.InstanceID {
-		return true
-	}
-	if ta.InstanceID > tb.InstanceID {
-		return false
-	}
-	if ta.MetricID < tb.MetricID {
-		return true
-	}
-	if ta.MetricID > tb.MetricID {
-		return false
-	}
-	return false
+	return a.Timestamp < b.Timestamp
 }
 func (rrs *rawRowsSort) Swap(i, j int) {
 	x := *rrs
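The rewritten `Less` above is a lexicographic comparison: each field is checked for inequality in priority order, and the timestamp decides only when all TSID fields are equal. A minimal sketch of the same shape, using a hypothetical `row` type that stands in for `rawRow` and carries only the compared fields:

```go
package main

import (
	"fmt"
	"sort"
)

// row is a hypothetical stand-in for rawRow with only the compared fields.
type row struct {
	group, job, instance, metric uint64
	timestamp                    int64
}

// less orders rows by (group, job, instance, metric, timestamp).
func less(a, b *row) bool {
	if a.group != b.group {
		return a.group < b.group
	}
	if a.job != b.job {
		return a.job < b.job
	}
	if a.instance != b.instance {
		return a.instance < b.instance
	}
	if a.metric != b.metric {
		return a.metric < b.metric
	}
	return a.timestamp < b.timestamp
}

// sortRows sorts rows in place using less.
func sortRows(rows []row) {
	sort.Slice(rows, func(i, j int) bool { return less(&rows[i], &rows[j]) })
}

func main() {
	rows := []row{
		{1, 2, 3, 4, 20},
		{1, 2, 3, 4, 10},
		{0, 9, 9, 9, 30},
	}
	sortRows(rows)
	fmt.Println(rows) // [{0 9 9 9 30} {1 2 3 4 10} {1 2 3 4 20}]
}
```

Each field costs one `!=` test on the common path, compared with the two ordered comparisons per field in the removed code.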
--- a/lib/storage/storage.go
+++ b/lib/storage/storage.go
@@ -20,6 +20,7 @@ import (
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/memory"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/uint64set"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/workingsetcache"
 	"github.com/VictoriaMetrics/fastcache"
 )
@@ -28,6 +29,15 @@ const maxRetentionMonths = 12 * 100

 // Storage represents TSDB storage.
 type Storage struct {
+	// Atomic counters must go at the top of the structure in order to properly align by 8 bytes on 32-bit archs.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212 .
+	tooSmallTimestampRows uint64
+	tooBigTimestampRows   uint64
+
+	addRowsConcurrencyLimitReached uint64
+	addRowsConcurrencyLimitTimeout uint64
+	addRowsConcurrencyDroppedRows  uint64
+
 	path            string
 	cachePath       string
 	retentionMonths int
@@ -59,19 +69,12 @@ type Storage struct {

 	// Pending MetricID values to be added to currHourMetricIDs.
 	pendingHourMetricIDsLock sync.Mutex
-	pendingHourMetricIDs     map[uint64]struct{}
+	pendingHourMetricIDs     *uint64set.Set

 	stop chan struct{}

 	currHourMetricIDsUpdaterWG sync.WaitGroup
 	retentionWatcherWG         sync.WaitGroup
-
-	tooSmallTimestampRows uint64
-	tooBigTimestampRows   uint64
-
-	addRowsConcurrencyLimitReached uint64
-	addRowsConcurrencyLimitTimeout uint64
-	addRowsConcurrencyDroppedRows  uint64
 }

 // OpenStorage opens storage on the given path with the given number of retention months.
@@ -122,7 +125,7 @@ func OpenStorage(path string, retentionMonths int) (*Storage, error) {
 	hmPrev := s.mustLoadHourMetricIDs(hour-1, "prev_hour_metric_ids")
 	s.currHourMetricIDs.Store(hmCurr)
 	s.prevHourMetricIDs.Store(hmPrev)
-	s.pendingHourMetricIDs = make(map[uint64]struct{})
+	s.pendingHourMetricIDs = &uint64set.Set{}

 	// Load indexdb
 	idbPath := path + "/indexdb"
@@ -158,7 +161,7 @@ func (s *Storage) debugFlush() {
 	s.idb().tb.DebugFlush()
 }

-func (s *Storage) getDeletedMetricIDs() map[uint64]struct{} {
+func (s *Storage) getDeletedMetricIDs() *uint64set.Set {
 	return s.idb().getDeletedMetricIDs()
 }

@@ -364,9 +367,9 @@ func (s *Storage) UpdateMetrics(m *Metrics) {

 	hmCurr := s.currHourMetricIDs.Load().(*hourMetricIDs)
 	hmPrev := s.prevHourMetricIDs.Load().(*hourMetricIDs)
-	hourMetricIDsLen := len(hmPrev.m)
-	if len(hmCurr.m) > hourMetricIDsLen {
-		hourMetricIDsLen = len(hmCurr.m)
+	hourMetricIDsLen := hmPrev.m.Len()
+	if hmCurr.m.Len() > hourMetricIDsLen {
+		hourMetricIDsLen = hmCurr.m.Len()
 	}
 	m.HourMetricIDCacheSize += uint64(hourMetricIDsLen)

@@ -508,11 +511,11 @@ func (s *Storage) mustLoadHourMetricIDs(hour uint64, name string) *hourMetricIDs
 		logger.Errorf("discarding %s, since it has broken body; got %d bytes; want %d bytes", path, len(src), 8*hmLen)
 		return &hourMetricIDs{}
 	}
-	m := make(map[uint64]struct{}, hmLen)
+	m := &uint64set.Set{}
 	for i := uint64(0); i < hmLen; i++ {
 		metricID := encoding.UnmarshalUint64(src)
 		src = src[8:]
-		m[metricID] = struct{}{}
+		m.Add(metricID)
 	}
 	logger.Infof("loaded %s from %q in %s; entriesCount: %d; sizeBytes: %d", name, path, time.Since(startTime), hmLen, srcOrigLen)
 	return &hourMetricIDs{
@@ -526,21 +529,21 @@ func (s *Storage) mustSaveHourMetricIDs(hm *hourMetricIDs, name string) {
 	path := s.cachePath + "/" + name
 	logger.Infof("saving %s to %q...", name, path)
 	startTime := time.Now()
-	dst := make([]byte, 0, len(hm.m)*8+24)
+	dst := make([]byte, 0, hm.m.Len()*8+24)
 	isFull := uint64(0)
 	if hm.isFull {
 		isFull = 1
 	}
 	dst = encoding.MarshalUint64(dst, isFull)
 	dst = encoding.MarshalUint64(dst, hm.hour)
-	dst = encoding.MarshalUint64(dst, uint64(len(hm.m)))
-	for metricID := range hm.m {
+	dst = encoding.MarshalUint64(dst, uint64(hm.m.Len()))
+	for _, metricID := range hm.m.AppendTo(nil) {
 		dst = encoding.MarshalUint64(dst, metricID)
 	}
 	if err := ioutil.WriteFile(path, dst, 0644); err != nil {
 		logger.Panicf("FATAL: cannot write %d bytes to %q: %s", len(dst), path, err)
 	}
-	logger.Infof("saved %s to %q in %s; entriesCount: %d; sizeBytes: %d", name, path, time.Since(startTime), len(hm.m), len(dst))
+	logger.Infof("saved %s to %q in %s; entriesCount: %d; sizeBytes: %d", name, path, time.Since(startTime), hm.m.Len(), len(dst))
 }

 func (s *Storage) mustLoadCache(info, name string, sizeBytes int) *workingsetcache.Cache {
@@ -579,7 +582,7 @@ func nextRetentionDuration(retentionMonths int) time.Duration {
 	return deadline.Sub(t)
 }

-// searchTSIDs returns TSIDs for the given tfss and the given tr.
+// searchTSIDs returns sorted TSIDs for the given tfss and the given tr.
 func (s *Storage) searchTSIDs(tfss []*TagFilters, tr TimeRange, maxMetrics int) ([]TSID, error) {
 	// Do not cache tfss -> tsids here, since the caching is performed
 	// on idb level.
@@ -770,7 +773,7 @@ var (

 func (s *Storage) add(rows []rawRow, mrs []MetricRow, precisionBits uint8) ([]rawRow, error) {
 	// Return only the last error, since it has no sense in returning all errors.
-	var lastError error
+	var lastWarn error

 	var is *indexSearch
 	var mn *MetricName
@@ -794,13 +797,13 @@ func (s *Storage) add(rows []rawRow, mrs []MetricRow, precisionBits uint8) ([]ra
 		}
 		if mr.Timestamp < minTimestamp {
 			// Skip rows with too small timestamps outside the retention.
-			lastError = fmt.Errorf("cannot insert row with too small timestamp %d outside the retention; minimum allowed timestamp is %d", mr.Timestamp, minTimestamp)
+			lastWarn = fmt.Errorf("cannot insert row with too small timestamp %d outside the retention; minimum allowed timestamp is %d", mr.Timestamp, minTimestamp)
 			atomic.AddUint64(&s.tooSmallTimestampRows, 1)
 			continue
 		}
 		if mr.Timestamp > maxTimestamp {
 			// Skip rows with too big timestamps significantly exceeding the current time.
-			lastError = fmt.Errorf("cannot insert row with too big timestamp %d exceeding the current time; maximum allowd timestamp is %d", mr.Timestamp, maxTimestamp)
+			lastWarn = fmt.Errorf("cannot insert row with too big timestamp %d exceeding the current time; maximum allowed timestamp is %d", mr.Timestamp, maxTimestamp)
 			atomic.AddUint64(&s.tooBigTimestampRows, 1)
 			continue
 		}
@@ -810,11 +813,7 @@ func (s *Storage) add(rows []rawRow, mrs []MetricRow, precisionBits uint8) ([]ra
 		r.Value = mr.Value
 		r.PrecisionBits = precisionBits
 		if s.getTSIDFromCache(&r.TSID, mr.MetricNameRaw) {
-			if len(dmis) == 0 {
-				// Fast path - the TSID for the given MetricName has been found in cache and isn't deleted.
-				continue
-			}
-			if _, deleted := dmis[r.TSID.MetricID]; !deleted {
+			if !dmis.Has(r.TSID.MetricID) {
 				// Fast path - the TSID for the given MetricName has been found in cache and isn't deleted.
 				continue
 			}
@@ -830,7 +829,7 @@ func (s *Storage) add(rows []rawRow, mrs []MetricRow, precisionBits uint8) ([]ra
 			// Do not stop adding rows on error - just skip invalid row.
 			// This guarantees that invalid rows don't prevent
 			// from adding valid rows into the storage.
-			lastError = fmt.Errorf("cannot unmarshal MetricNameRaw %q: %s", mr.MetricNameRaw, err)
+			lastWarn = fmt.Errorf("cannot unmarshal MetricNameRaw %q: %s", mr.MetricNameRaw, err)
 			j--
 			continue
 		}
@@ -840,12 +839,15 @@ func (s *Storage) add(rows []rawRow, mrs []MetricRow, precisionBits uint8) ([]ra
 			// Do not stop adding rows on error - just skip invalid row.
 			// This guarantees that invalid rows don't prevent
 			// from adding valid rows into the storage.
-			lastError = fmt.Errorf("cannot obtain TSID for MetricName %q: %s", kb.B, err)
+			lastWarn = fmt.Errorf("cannot obtain TSID for MetricName %q: %s", kb.B, err)
 			j--
 			continue
 		}
 		s.putTSIDToCache(&r.TSID, mr.MetricNameRaw)
 	}
+	if lastWarn != nil {
+		logger.Errorf("warning occurred during rows addition: %s", lastWarn)
+	}
 	if is != nil {
 		kbPool.Put(kb)
 		PutMetricName(mn)
@@ -853,12 +855,15 @@ func (s *Storage) add(rows []rawRow, mrs []MetricRow, precisionBits uint8) ([]ra
 	}
 	rows = rows[:rowsLen+j]

+	var lastError error
 	if err := s.tb.AddRows(rows); err != nil {
 		lastError = fmt.Errorf("cannot add rows to table: %s", err)
 	}
-	lastError = s.updateDateMetricIDCache(rows, lastError)
+	if err := s.updateDateMetricIDCache(rows, lastError); err != nil {
+		lastError = err
+	}
 	if lastError != nil {
-		return rows, fmt.Errorf("errors occurred during rows addition: %s", lastError)
+		return rows, fmt.Errorf("error occurred during rows addition: %s", lastError)
 	}
 	return rows, nil
 }
@@ -884,12 +889,12 @@ func (s *Storage) updateDateMetricIDCache(rows []rawRow, lastError error) error
 		hm := s.currHourMetricIDs.Load().(*hourMetricIDs)
 		if hour == hm.hour {
 			// The r belongs to the current hour. Check for the current hour cache.
-			if _, ok := hm.m[metricID]; ok {
+			if hm.m.Has(metricID) {
 				// Fast path: the metricID is in the current hour cache.
 				continue
 			}
 			s.pendingHourMetricIDsLock.Lock()
-			s.pendingHourMetricIDs[metricID] = struct{}{}
+			s.pendingHourMetricIDs.Add(metricID)
 			s.pendingHourMetricIDsLock.Unlock()
 		}

@@ -915,7 +920,7 @@ func (s *Storage) updateDateMetricIDCache(rows []rawRow, lastError error) error
 func (s *Storage) updateCurrHourMetricIDs() {
 	hm := s.currHourMetricIDs.Load().(*hourMetricIDs)
 	s.pendingHourMetricIDsLock.Lock()
-	newMetricIDsLen := len(s.pendingHourMetricIDs)
+	newMetricIDsLen := s.pendingHourMetricIDs.Len()
 	s.pendingHourMetricIDsLock.Unlock()
 	hour := uint64(timestampFromTime(time.Now())) / msecPerHour
 	if newMetricIDsLen == 0 && hm.hour == hour {
@@ -924,23 +929,20 @@ func (s *Storage) updateCurrHourMetricIDs() {
 	}

 	// Slow path: hm.m must be updated with non-empty s.pendingHourMetricIDs.
-	var m map[uint64]struct{}
+	var m *uint64set.Set
 	isFull := hm.isFull
 	if hm.hour == hour {
-		m = make(map[uint64]struct{}, len(hm.m)+newMetricIDsLen)
-		for metricID := range hm.m {
-			m[metricID] = struct{}{}
-		}
+		m = hm.m.Clone()
 	} else {
-		m = make(map[uint64]struct{}, newMetricIDsLen)
+		m = &uint64set.Set{}
 		isFull = true
 	}
 	s.pendingHourMetricIDsLock.Lock()
-	newMetricIDs := s.pendingHourMetricIDs
-	s.pendingHourMetricIDs = make(map[uint64]struct{}, len(newMetricIDs))
+	newMetricIDs := s.pendingHourMetricIDs.AppendTo(nil)
+	s.pendingHourMetricIDs = &uint64set.Set{}
 	s.pendingHourMetricIDsLock.Unlock()
-	for metricID := range newMetricIDs {
-		m[metricID] = struct{}{}
+	for _, metricID := range newMetricIDs {
+		m.Add(metricID)
 	}

 	hmNew := &hourMetricIDs{
@@ -955,7 +957,7 @@ func (s *Storage) updateCurrHourMetricIDs() {
 }

 type hourMetricIDs struct {
-	m      map[uint64]struct{}
+	m      *uint64set.Set
 	hour   uint64
 	isFull bool
 }
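The `Storage` change above moves the counters updated through `sync/atomic` to the top of the struct. The `sync/atomic` documentation states that on ARM, 386 and 32-bit MIPS the caller must arrange 64-bit alignment for 64-bit words accessed atomically, and that the first word in an allocated struct can be relied upon to be 64-bit aligned. A small sketch of the layout rule, with a hypothetical `counters` type that is not part of the repository:

```go
package main

import (
	"fmt"
	"sync/atomic"
	"unsafe"
)

// counters is a hypothetical struct that follows the same layout rule:
// 64-bit fields accessed atomically go first, other fields follow.
type counters struct {
	hits   uint64
	misses uint64

	name  string
	ready bool
}

// bump atomically updates both counters and returns their new values.
func bump(c *counters) (uint64, uint64) {
	h := atomic.AddUint64(&c.hits, 1)
	m := atomic.AddUint64(&c.misses, 2)
	return h, m
}

// offsetsAligned reports whether both counters sit at offsets divisible by 8.
func offsetsAligned() bool {
	var c counters
	return unsafe.Offsetof(c.hits)%8 == 0 && unsafe.Offsetof(c.misses)%8 == 0
}

func main() {
	c := &counters{name: "demo"}
	fmt.Println(bump(c))
	fmt.Println(offsetsAligned())
}
```

Placing a smaller field such as `ready bool` before the counters could shift them to a 4-byte boundary on 32-bit targets, which is the failure mode that the new `test-full-386` target is meant to catch.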
--- a/lib/storage/storage_test.go
+++ b/lib/storage/storage_test.go
@@ -9,6 +9,8 @@ import (
 	"testing"
 	"testing/quick"
 	"time"
+
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/uint64set"
 )

 func TestUpdateCurrHourMetricIDs(t *testing.T) {
@@ -16,19 +18,18 @@ func TestUpdateCurrHourMetricIDs(t *testing.T) {
 		var s Storage
 		s.currHourMetricIDs.Store(&hourMetricIDs{})
 		s.prevHourMetricIDs.Store(&hourMetricIDs{})
-		s.pendingHourMetricIDs = make(map[uint64]struct{})
+		s.pendingHourMetricIDs = &uint64set.Set{}
 		return &s
 	}
 	t.Run("empty_pedning_metric_ids_stale_curr_hour", func(t *testing.T) {
 		s := newStorage()
 		hour := uint64(timestampFromTime(time.Now())) / msecPerHour
 		hmOrig := &hourMetricIDs{
-			m: map[uint64]struct{}{
-				12: {},
-				34: {},
-			},
+			m:    &uint64set.Set{},
 			hour: 123,
 		}
+		hmOrig.m.Add(12)
+		hmOrig.m.Add(34)
 		s.currHourMetricIDs.Store(hmOrig)
 		s.updateCurrHourMetricIDs()
 		hmCurr := s.currHourMetricIDs.Load().(*hourMetricIDs)
@@ -39,8 +40,8 @@ func TestUpdateCurrHourMetricIDs(t *testing.T) {
 				t.Fatalf("unexpected hmCurr.hour; got %d; want %d", hmCurr.hour, hour)
 			}
 		}
-		if len(hmCurr.m) != 0 {
-			t.Fatalf("unexpected length of hm.m; got %d; want %d", len(hmCurr.m), 0)
+		if hmCurr.m.Len() != 0 {
+			t.Fatalf("unexpected length of hm.m; got %d; want %d", hmCurr.m.Len(), 0)
 		}
 		if !hmCurr.isFull {
 			t.Fatalf("unexpected hmCurr.isFull; got %v; want %v", hmCurr.isFull, true)
@@ -51,20 +52,19 @@ func TestUpdateCurrHourMetricIDs(t *testing.T) {
 			t.Fatalf("unexpected hmPrev; got %v; want %v", hmPrev, hmOrig)
 		}

-		if len(s.pendingHourMetricIDs) != 0 {
-			t.Fatalf("unexpected len(s.pendingHourMetricIDs); got %d; want %d", len(s.pendingHourMetricIDs), 0)
+		if s.pendingHourMetricIDs.Len() != 0 {
+			t.Fatalf("unexpected s.pendingHourMetricIDs.Len(); got %d; want %d", s.pendingHourMetricIDs.Len(), 0)
 		}
 	})
 	t.Run("empty_pedning_metric_ids_valid_curr_hour", func(t *testing.T) {
 		s := newStorage()
 		hour := uint64(timestampFromTime(time.Now())) / msecPerHour
 		hmOrig := &hourMetricIDs{
-			m: map[uint64]struct{}{
-				12: {},
-				34: {},
-			},
+			m:    &uint64set.Set{},
 			hour: hour,
 		}
+		hmOrig.m.Add(12)
+		hmOrig.m.Add(34)
 		s.currHourMetricIDs.Store(hmOrig)
 		s.updateCurrHourMetricIDs()
 		hmCurr := s.currHourMetricIDs.Load().(*hourMetricIDs)
@@ -90,27 +90,25 @@ func TestUpdateCurrHourMetricIDs(t *testing.T) {
 			t.Fatalf("unexpected hmPrev; got %v; want %v", hmPrev, hmEmpty)
 		}

-		if len(s.pendingHourMetricIDs) != 0 {
-			t.Fatalf("unexpected len(s.pendingHourMetricIDs); got %d; want %d", len(s.pendingHourMetricIDs), 0)
+		if s.pendingHourMetricIDs.Len() != 0 {
+			t.Fatalf("unexpected s.pendingHourMetricIDs.Len(); got %d; want %d", s.pendingHourMetricIDs.Len(), 0)
 		}
 	})
 	t.Run("nonempty_pending_metric_ids_stale_curr_hour", func(t *testing.T) {
 		s := newStorage()
-		pendingHourMetricIDs := map[uint64]struct{}{
-			343:     {},
-			32424:   {},
-			8293432: {},
-		}
+		pendingHourMetricIDs := &uint64set.Set{}
+		pendingHourMetricIDs.Add(343)
+		pendingHourMetricIDs.Add(32424)
+		pendingHourMetricIDs.Add(8293432)
 		s.pendingHourMetricIDs = pendingHourMetricIDs

 		hour := uint64(timestampFromTime(time.Now())) / msecPerHour
 		hmOrig := &hourMetricIDs{
-			m: map[uint64]struct{}{
-				12: {},
-				34: {},
-			},
+			m:    &uint64set.Set{},
 			hour: 123,
 		}
+		hmOrig.m.Add(12)
+		hmOrig.m.Add(34)
 		s.currHourMetricIDs.Store(hmOrig)
 		s.updateCurrHourMetricIDs()
 		hmCurr := s.currHourMetricIDs.Load().(*hourMetricIDs)
@@ -133,27 +131,25 @@ func TestUpdateCurrHourMetricIDs(t *testing.T) {
 			t.Fatalf("unexpected hmPrev; got %v; want %v", hmPrev, hmOrig)
 		}

-		if len(s.pendingHourMetricIDs) != 0 {
-			t.Fatalf("unexpected len(s.pendingHourMetricIDs); got %d; want %d", len(s.pendingHourMetricIDs), 0)
+		if s.pendingHourMetricIDs.Len() != 0 {
+			t.Fatalf("unexpected s.pendingHourMetricIDs.Len(); got %d; want %d", s.pendingHourMetricIDs.Len(), 0)
 		}
 	})
 	t.Run("nonempty_pending_metric_ids_valid_curr_hour", func(t *testing.T) {
 		s := newStorage()
-		pendingHourMetricIDs := map[uint64]struct{}{
-			343:     {},
-			32424:   {},
-			8293432: {},
-		}
+		pendingHourMetricIDs := &uint64set.Set{}
+		pendingHourMetricIDs.Add(343)
+		pendingHourMetricIDs.Add(32424)
+		pendingHourMetricIDs.Add(8293432)
 		s.pendingHourMetricIDs = pendingHourMetricIDs

 		hour := uint64(timestampFromTime(time.Now())) / msecPerHour
 		hmOrig := &hourMetricIDs{
-			m: map[uint64]struct{}{
-				12: {},
-				34: {},
-			},
+			m:    &uint64set.Set{},
 			hour: hour,
 		}
+		hmOrig.m.Add(12)
+		hmOrig.m.Add(34)
 		s.currHourMetricIDs.Store(hmOrig)
 		s.updateCurrHourMetricIDs()
 		hmCurr := s.currHourMetricIDs.Load().(*hourMetricIDs)
@@ -166,9 +162,10 @@ func TestUpdateCurrHourMetricIDs(t *testing.T) {
 			// Do not run other checks, since they may fail.
 			return
 		}
-		m := getMetricIDsCopy(pendingHourMetricIDs)
-		for metricID := range hmOrig.m {
-			m[metricID] = struct{}{}
+		m := pendingHourMetricIDs.Clone()
+		origMetricIDs := hmOrig.m.AppendTo(nil)
+		for _, metricID := range origMetricIDs {
+			m.Add(metricID)
 		}
 		if !reflect.DeepEqual(hmCurr.m, m) {
 			t.Fatalf("unexpected hm.m; got %v; want %v", hmCurr.m, m)
@@ -183,8 +180,8 @@ func TestUpdateCurrHourMetricIDs(t *testing.T) {
 			t.Fatalf("unexpected hmPrev; got %v; want %v", hmPrev, hmEmpty)
 		}

-		if len(s.pendingHourMetricIDs) != 0 {
-			t.Fatalf("unexpected len(s.pendingHourMetricIDs); got %d; want %d", len(s.pendingHourMetricIDs), 0)
+		if s.pendingHourMetricIDs.Len() != 0 {
+			t.Fatalf("unexpected s.pendingHourMetricIDs.Len(); got %d; want %d", s.pendingHourMetricIDs.Len(), 0)
 		}
 	})
 }
--- a/lib/storage/table.go
+++ b/lib/storage/table.go
@@ -10,6 +10,7 @@ import (

 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
 	"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
+	"github.com/VictoriaMetrics/VictoriaMetrics/lib/uint64set"
 )

 // table represents a single table with time series data.
@@ -18,7 +19,7 @@ type table struct {
 	smallPartitionsPath string
 	bigPartitionsPath   string

-	getDeletedMetricIDs func() map[uint64]struct{}
+	getDeletedMetricIDs func() *uint64set.Set

 	ptws     []*partitionWrapper
 	ptwsLock sync.Mutex
@@ -33,11 +34,15 @@ type table struct {

 // partitionWrapper provides refcounting mechanism for the partition.
 type partitionWrapper struct {
-	pt       *partition
+	// Atomic counters must be at the top of struct for proper 8-byte alignment on 32-bit archs.
+	// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212
+
 	refCount uint64

 	// The partition must be dropped if mustDrop > 0
 	mustDrop uint64
+
+	pt *partition
 }

 func (ptw *partitionWrapper) incRef() {
@@ -75,7 +80,7 @@ func (ptw *partitionWrapper) scheduleToDrop() {
 // The table is created if it doesn't exist.
 //
 // Data older than the retentionMonths may be dropped at any time.
-func openTable(path string, retentionMonths int, getDeletedMetricIDs func() map[uint64]struct{}) (*table, error) {
+func openTable(path string, retentionMonths int, getDeletedMetricIDs func() *uint64set.Set) (*table, error) {
 	path = filepath.Clean(path)

 	// Create a directory for the table if it doesn't exist yet.
@@ -430,7 +435,7 @@ func (tb *table) PutPartitions(ptws []*partitionWrapper) {
 	}
 }

-func openPartitions(smallPartitionsPath, bigPartitionsPath string, getDeletedMetricIDs func() map[uint64]struct{}) ([]*partition, error) {
+func openPartitions(smallPartitionsPath, bigPartitionsPath string, getDeletedMetricIDs func() *uint64set.Set) ([]*partition, error) {
 	smallD, err := os.Open(smallPartitionsPath)
 	if err != nil {
 		return nil, fmt.Errorf("cannot open directory with small partitions %q: %s", smallPartitionsPath, err)
--- a/lib/storage/table_search.go
+++ b/lib/storage/table_search.go
@@ -54,6 +54,9 @@ func (ts *tableSearch) reset() {

 // Init initializes the ts.
 //
+// tsids must be sorted.
+// tsids cannot be modified after the Init call, since it is owned by ts.
+//
 // MustClose must be called then the tableSearch is done.
 func (ts *tableSearch) Init(tb *table, tsids []TSID, tr TimeRange, fetchData bool) {
 	if ts.needClosing {
--- a/lib/storage/tag_filters.go
+++ b/lib/storage/tag_filters.go
@@ -19,14 +19,14 @@ type TagFilters struct {
 	tfs []tagFilter

 	// Common prefix for all the tag filters.
-	// Contains encoded nsPrefixTagToMetricID.
+	// Contains encoded nsPrefixTagToMetricIDs.
 	commonPrefix []byte
 }

 // NewTagFilters returns new TagFilters.
 func NewTagFilters() *TagFilters {
 	return &TagFilters{
-		commonPrefix: marshalCommonPrefix(nil, nsPrefixTagToMetricID),
+		commonPrefix: marshalCommonPrefix(nil, nsPrefixTagToMetricIDs),
 	}
 }

@@ -78,7 +78,7 @@ func (tfs *TagFilters) String() string {
 // Reset resets the tf
 func (tfs *TagFilters) Reset() {
 	tfs.tfs = tfs.tfs[:0]
-	tfs.commonPrefix = marshalCommonPrefix(tfs.commonPrefix[:0], nsPrefixTagToMetricID)
+	tfs.commonPrefix = marshalCommonPrefix(tfs.commonPrefix[:0], nsPrefixTagToMetricIDs)
 }

 func (tfs *TagFilters) marshal(dst []byte) []byte {
@@ -95,7 +95,7 @@ type tagFilter struct {
 	isNegative bool
 	isRegexp   bool

-	// Prefix always contains {nsPrefixTagToMetricID, key}.
+	// Prefix always contains {nsPrefixTagToMetricIDs, key}.
 	// Additionally it contains:
 	//  - value ending with tagSeparatorChar if !isRegexp.
 	//  - non-regexp prefix if isRegexp.
@@ -317,6 +317,9 @@ func getSingleValueFuncExt(re *syntax.Regexp) func(b []byte) bool {
 	case syntax.OpCapture:
 		return getSingleValueFuncExt(re.Sub[0])
 	case syntax.OpLiteral:
+		if !isLiteral(re) {
+			return nil
+		}
 		s := string(re.Rune)
 		return func(b []byte) bool {
 			return string(b) == s
@@ -399,7 +402,7 @@ func isLiteral(re *syntax.Regexp) bool {
 	if re.Op == syntax.OpCapture {
 		return isLiteral(re.Sub[0])
 	}
-	return re.Op == syntax.OpLiteral
+	return re.Op == syntax.OpLiteral && re.Flags&syntax.FoldCase == 0
 }

 func getOrValues(expr string) []string {
@@ -420,6 +423,9 @@ func getOrValuesExt(re *syntax.Regexp) []string {
 	case syntax.OpCapture:
 		return getOrValuesExt(re.Sub[0])
 	case syntax.OpLiteral:
+		if !isLiteral(re) {
+			return nil
+		}
 		return []string{string(re.Rune)}
 	case syntax.OpEmptyMatch:
 		return []string{""}
@@ -592,13 +598,13 @@ func extractRegexpPrefix(b []byte) ([]byte, []byte) {
 	if re == emptyRegexp {
 		return nil, nil
 	}
-	if re.Op == syntax.OpLiteral && re.Flags&syntax.FoldCase == 0 {
+	if isLiteral(re) {
 		return []byte(string(re.Rune)), nil
 	}
 	var prefix []byte
 	if re.Op == syntax.OpConcat {
 		sub0 := re.Sub[0]
-		if sub0.Op == syntax.OpLiteral && sub0.Flags&syntax.FoldCase == 0 {
+		if isLiteral(sub0) {
 			prefix = []byte(string(sub0.Rune))
 			re.Sub = re.Sub[1:]
 			if len(re.Sub) == 0 {
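Review note: the `isLiteral` change above depends on how `regexp/syntax` marks case-insensitive literals. The sketch below is a minimal, standalone illustration of that check, not the code from `tag_filters.go`. The helper name `isCaseSensitiveLiteral` and the use of `syntax.Perl` as the parse flags are assumptions made for the example.

```go
package main

import (
	"fmt"
	"regexp/syntax"
)

// isCaseSensitiveLiteral is a hypothetical helper for illustration only.
// It reports whether expr parses to a plain literal (optionally wrapped
// in capture groups) that does not carry the FoldCase flag.
func isCaseSensitiveLiteral(expr string) (bool, error) {
	re, err := syntax.Parse(expr, syntax.Perl)
	if err != nil {
		return false, err
	}
	// Unwrap capture groups such as "(foo)".
	for re.Op == syntax.OpCapture {
		re = re.Sub[0]
	}
	// A literal parsed under (?i) keeps Op == OpLiteral but has FoldCase set,
	// so comparing its runes byte-for-byte would miss "Foo" or "FOO".
	return re.Op == syntax.OpLiteral && re.Flags&syntax.FoldCase == 0, nil
}

func main() {
	for _, expr := range []string{"foo", "(foo)", "(?i)foo", "foo.*"} {
		ok, err := isCaseSensitiveLiteral(expr)
		fmt.Println(expr, ok, err)
	}
}
```

Only expressions that pass this check are safe to treat as exact-match strings. Everything else has to fall back to regular regexp matching.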
--- a/lib/storage/tag_filters_test.go
+++ b/lib/storage/tag_filters_test.go
@@ -53,11 +53,17 @@ func TestGetRegexpFromCache(t *testing.T) {
 	f("((.*)foo(.*))", nil, []string{"foo", "xfoo", "foox", "xfoobar"}, []string{"", "bar", "foxx"})
 	f(".+foo", nil, []string{"afoo", "bbfoo"}, []string{"foo", "foobar", "afoox", ""})
 	f("a|b", []string{"a", "b"}, []string{"a", "b"}, []string{"xa", "bx", "xab", ""})
+	f("(a|b)", []string{"a", "b"}, []string{"a", "b"}, []string{"xa", "bx", "xab", ""})
 	f("foo.+", nil, []string{"foox", "foobar"}, []string{"foo", "afoox", "afoo", ""})
 	f(".*foo.*bar", nil, []string{"foobar", "xfoobar", "xfooxbar", "fooxbar"}, []string{"", "foobarx", "afoobarx", "aaa"})
 	f("foo.*bar", nil, []string{"foobar", "fooxbar"}, []string{"xfoobar", "", "foobarx", "aaa"})
 	f("foo.*bar.*", nil, []string{"foobar", "fooxbar", "foobarx", "fooxbarx"}, []string{"", "afoobarx", "aaa", "afoobar"})

+	f("(?i)foo", nil, []string{"foo", "Foo", "FOO"}, []string{"xfoo", "foobar", "xFOObar"})
+	f("(?i).+foo", nil, []string{"xfoo", "aaFoo", "bArFOO"}, []string{"foosdf", "xFOObar"})
+	f("(?i)(foo|bar)", nil, []string{"foo", "Foo", "BAR", "bAR"}, []string{"foobar", "xfoo", "xFOObAR"})
+	f("(?i)foo.*bar", nil, []string{"foobar", "FooBAR", "FOOxxbaR"}, []string{"xfoobar", "foobarx", "xFOObarx"})
+
 	f(".*", nil, []string{"", "a", "foo", "foobar"}, nil)
 	f("foo|.*", nil, []string{"", "a", "foo", "foobar"}, nil)
 	f(".+", nil, []string{"a", "foo"}, []string{""})
@@ -323,6 +329,76 @@ func TestTagFilterMatchSuffix(t *testing.T) {
 		mismatch("bar")
 		match("xhttpbar")
 	})
+	t.Run("regexp-iflag-no-suffix", func(t *testing.T) {
+		value := "(?i)http"
+		isNegative := false
+		isRegexp := true
+		expectedPrefix := tvNoTrailingTagSeparator("")
+		init(value, isNegative, isRegexp, expectedPrefix)
+
+		// Must match case-insensitive http
+		match("http")
+		match("HTTP")
+		match("hTTp")
+
+		mismatch("")
+		mismatch("foobar")
+		mismatch("xhttp")
+		mismatch("xhttp://")
+		mismatch("hTTp://foobar.com")
+	})
+	t.Run("negative-regexp-iflag-no-suffix", func(t *testing.T) {
+		value := "(?i)http"
+		isNegative := true
+		isRegexp := true
+		expectedPrefix := tvNoTrailingTagSeparator("")
+		init(value, isNegative, isRegexp, expectedPrefix)
+
+		// Mustn't match case-insensitive http
+		mismatch("http")
+		mismatch("HTTP")
+		mismatch("hTTp")
+
+		match("")
+		match("foobar")
+		match("xhttp")
+		match("xhttp://")
+		match("hTTp://foobar.com")
+	})
+	t.Run("regexp-iflag-any-suffix", func(t *testing.T) {
+		value := "(?i)http.*"
+		isNegative := false
+		isRegexp := true
+		expectedPrefix := tvNoTrailingTagSeparator("")
+		init(value, isNegative, isRegexp, expectedPrefix)
+
+		// Must match case-insensitive http
+		match("http")
+		match("HTTP")
+		match("hTTp://foobar.com")
+
+		mismatch("")
+		mismatch("foobar")
+		mismatch("xhttp")
+		mismatch("xhttp://")
+	})
+	t.Run("negative-regexp-iflag-any-suffix", func(t *testing.T) {
+		value := "(?i)http.*"
+		isNegative := true
+		isRegexp := true
+		expectedPrefix := tvNoTrailingTagSeparator("")
+		init(value, isNegative, isRegexp, expectedPrefix)
+
+		// Mustn't match case-insensitive http
+		mismatch("http")
+		mismatch("HTTP")
+		mismatch("hTTp://foobar.com")
+
+		match("")
+		match("foobar")
+		match("xhttp")
+		match("xhttp://")
+	})
 	t.Run("non-empty-string-regexp-negative-match", func(t *testing.T) {
 		value := ".+"
 		isNegative := true
@@ -409,6 +485,8 @@ func TestGetOrValues(t *testing.T) {
 	f("foo(?:bar|baz)x(qwe|rt)", []string{"foobarxqwe", "foobarxrt", "foobazxqwe", "foobazxrt"})
 	f("foo(bar||baz)", []string{"foo", "foobar", "foobaz"})
 	f("(a|b|c)(d|e|f)(g|h|k)", nil)
+	f("(?i)foo", nil)
+	f("(?i)(foo|bar)", nil)
 }

 func TestGetRegexpPrefix(t *testing.T) {
@@ -463,6 +541,7 @@ func TestGetRegexpPrefix(t *testing.T) {
 	f(t, "a(b|c.*).+", "a", "(?:b|c(?-s:.)*)(?-s:.)+")
 	f(t, "ab|ac", "a", "[b-c]")
 	f(t, "(?i)xyz", "", "(?i:XYZ)")
+	f(t, "(?i)foo|bar", "", "(?i:FOO)|(?i:BAR)")
 	f(t, "(?i)up.+x", "", "(?i:UP)(?-s:.)+(?i:X)")
 	f(t, "(?smi)xy.*z$", "", "(?i:XY)(?s:.)*(?i:Z)(?m:$)")

--- a/lib/storage/tsid.go
+++ b/lib/storage/tsid.go
@@ -88,34 +88,16 @@ func (t *TSID) Unmarshal(src []byte) ([]byte, error) {

 // Less return true if t < b.
 func (t *TSID) Less(b *TSID) bool {
-	if t.MetricID == b.MetricID {
-		// Fast path - two TSID values are identical.
-		return false
+	// Do not compare MetricIDs here as fast path for determining identical TSIDs,
+	// since identical TSIDs aren't passed here in hot paths.
+	if t.MetricGroupID != b.MetricGroupID {
+		return t.MetricGroupID < b.MetricGroupID
 	}
-
-	if t.MetricGroupID < b.MetricGroupID {
-		return true
+	if t.JobID != b.JobID {
+		return t.JobID < b.JobID
 	}
-	if t.MetricGroupID > b.MetricGroupID {
-		return false
+	if t.InstanceID != b.InstanceID {
+		return t.InstanceID < b.InstanceID
 	}
-	if t.JobID < b.JobID {
-		return true
-	}
-	if t.JobID > b.JobID {
-		return false
-	}
-	if t.InstanceID < b.InstanceID {
-		return true
-	}
-	if t.InstanceID > b.InstanceID {
-		return false
-	}
-	if t.MetricID < b.MetricID {
-		return true
-	}
-	if t.MetricID > b.MetricID {
-		return false
-	}
-	return false
+	return t.MetricID < b.MetricID
 }
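Review note: the rewritten `Less` is a field-by-field lexicographic comparison. The sketch below shows the same pattern on a hypothetical `id` struct and checks it with `sort.Slice`. The struct name and field types are assumptions for the example and are not taken from `tsid.go`.

```go
package main

import (
	"fmt"
	"sort"
)

// id is a hypothetical stand-in for TSID with four ordering fields.
type id struct {
	MetricGroupID uint64
	JobID         uint32
	InstanceID    uint32
	MetricID      uint64
}

// Less compares fields in priority order; the first differing field decides.
func (a *id) Less(b *id) bool {
	if a.MetricGroupID != b.MetricGroupID {
		return a.MetricGroupID < b.MetricGroupID
	}
	if a.JobID != b.JobID {
		return a.JobID < b.JobID
	}
	if a.InstanceID != b.InstanceID {
		return a.InstanceID < b.InstanceID
	}
	// All higher-priority fields are equal, so MetricID decides.
	// Identical values yield false, which keeps Less irreflexive.
	return a.MetricID < b.MetricID
}

func main() {
	ids := []id{
		{MetricGroupID: 2, JobID: 1, InstanceID: 1, MetricID: 5},
		{MetricGroupID: 1, JobID: 9, InstanceID: 1, MetricID: 7},
		{MetricGroupID: 1, JobID: 9, InstanceID: 1, MetricID: 3},
	}
	sort.Slice(ids, func(i, j int) bool { return ids[i].Less(&ids[j]) })
	fmt.Println(ids)
}
```

Each field costs one inequality test in the common case, where the old version used two ordered comparisons. The final `return` also covers the equal case without a separate branch.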
--- a/lib/uint64set/uint64set.go
+++ b/lib/uint64set/uint64set.go
@@ -0,0 +1,341 @@
+package uint64set
+
+import (
+	"math/bits"
+	"sort"
+)
+
+// Set is a fast set for uint64.
+//
+// It should work faster than map[uint64]struct{} for semi-sparse uint64 values
+// such as MetricIDs generated by lib/storage.
+//
+// It is unsafe calling Set methods from concurrent goroutines.
+type Set struct {
+	itemsCount int
+	buckets    bucket32Sorter
+}
+
+type bucket32Sorter []*bucket32
+
+func (s *bucket32Sorter) Len() int { return len(*s) }
+func (s *bucket32Sorter) Less(i, j int) bool {
+	a := *s
+	return a[i].hi < a[j].hi
+}
+func (s *bucket32Sorter) Swap(i, j int) {
+	a := *s
+	a[i], a[j] = a[j], a[i]
+}
+
+// Clone returns an independent copy of s.
+func (s *Set) Clone() *Set {
+	if s == nil {
+		return nil
+	}
+	var dst Set
+	dst.itemsCount = s.itemsCount
+	dst.buckets = make([]*bucket32, len(s.buckets))
+	for i, b32 := range s.buckets {
+		dst.buckets[i] = b32.clone()
+	}
+	return &dst
+}
+
+// Len returns the number of distinct uint64 values in s.
+func (s *Set) Len() int {
+	if s == nil {
+		return 0
+	}
+	return s.itemsCount
+}
+
+// Add adds x to s.
+func (s *Set) Add(x uint64) {
+	hi := uint32(x >> 32)
+	lo := uint32(x)
+	for _, b32 := range s.buckets {
+		if b32.hi == hi {
+			if b32.add(lo) {
+				s.itemsCount++
+			}
+			return
+		}
+	}
+	s.addAlloc(hi, lo)
+}
+
+func (s *Set) addAlloc(hi, lo uint32) {
+	var b32 bucket32
+	b32.hi = hi
+	_ = b32.add(lo)
+	s.itemsCount++
+	s.buckets = append(s.buckets, &b32)
+}
+
+// Has verifies whether x exists in s.
+func (s *Set) Has(x uint64) bool {
+	hi := uint32(x >> 32)
+	lo := uint32(x)
+	if s == nil {
+		return false
+	}
+	for _, b32 := range s.buckets {
+		if b32.hi == hi {
+			return b32.has(lo)
+		}
+	}
+	return false
+}
+
+// Del deletes x from s.
+func (s *Set) Del(x uint64) {
+	hi := uint32(x >> 32)
+	lo := uint32(x)
+	for _, b32 := range s.buckets {
+		if b32.hi == hi {
+			if b32.del(lo) {
+				s.itemsCount--
+			}
+			return
+		}
+	}
+}
+
+// AppendTo appends all the items from the set to dst and returns the result.
+//
+// The returned items are sorted.
+func (s *Set) AppendTo(dst []uint64) []uint64 {
+	if s == nil {
+		return dst
+	}
+	// pre-allocate memory for dst
+	dstLen := len(dst)
+	if n := s.Len() - cap(dst) + dstLen; n > 0 {
+		dst = append(dst[:cap(dst)], make([]uint64, n)...)
+		dst = dst[:dstLen]
+	}
+	// sort s.buckets if it isn't sorted yet
+	if !sort.IsSorted(&s.buckets) {
+		sort.Sort(&s.buckets)
+	}
+	for _, b32 := range s.buckets {
+		dst = b32.appendTo(dst)
+	}
+	return dst
+}
+
+type bucket32 struct {
+	hi      uint32
+	b16his  []uint16
+	buckets []*bucket16
+}
+
+func (b *bucket32) clone() *bucket32 {
+	var dst bucket32
+	dst.hi = b.hi
+	dst.b16his = append(dst.b16his[:0], b.b16his...)
+	dst.buckets = make([]*bucket16, len(b.buckets))
+	for i, b16 := range b.buckets {
+		dst.buckets[i] = b16.clone()
+	}
+	return &dst
+}
+
+// This is for sort.Interface
+func (b *bucket32) Len() int           { return len(b.b16his) }
+func (b *bucket32) Less(i, j int) bool { return b.b16his[i] < b.b16his[j] }
+func (b *bucket32) Swap(i, j int) {
+	his := b.b16his
+	buckets := b.buckets
+	his[i], his[j] = his[j], his[i]
+	buckets[i], buckets[j] = buckets[j], buckets[i]
+}
+
+const maxUnsortedBuckets = 32
+
+func (b *bucket32) add(x uint32) bool {
+	hi := uint16(x >> 16)
+	lo := uint16(x)
+	if len(b.buckets) > maxUnsortedBuckets {
+		return b.addSlow(hi, lo)
+	}
+	for i, hi16 := range b.b16his {
+		if hi16 == hi {
+			return i < len(b.buckets) && b.buckets[i].add(lo)
+		}
+	}
+	b.addAllocSmall(hi, lo)
+	return true
+}
+
+func (b *bucket32) addAllocSmall(hi, lo uint16) {
+	var b16 bucket16
+	_ = b16.add(lo)
+	b.b16his = append(b.b16his, hi)
+	b.buckets = append(b.buckets, &b16)
+	if len(b.buckets) > maxUnsortedBuckets {
+		sort.Sort(b)
+	}
+}
+
+func (b *bucket32) addSlow(hi, lo uint16) bool {
+	n := binarySearch16(b.b16his, hi)
+	if n < 0 || n >= len(b.b16his) || b.b16his[n] != hi {
+		b.addAllocBig(hi, lo, n)
+		return true
+	}
+	return n < len(b.buckets) && b.buckets[n].add(lo)
+}
+
+func (b *bucket32) addAllocBig(hi, lo uint16, n int) {
+	if n < 0 {
+		return
+	}
+	var b16 bucket16
+	_ = b16.add(lo)
+	if n >= len(b.b16his) {
+		b.b16his = append(b.b16his, hi)
+		b.buckets = append(b.buckets, &b16)
+		return
+	}
+	b.b16his = append(b.b16his[:n+1], b.b16his[n:]...)
+	b.b16his[n] = hi
+	b.buckets = append(b.buckets[:n+1], b.buckets[n:]...)
+	b.buckets[n] = &b16
+}
+
+func (b *bucket32) has(x uint32) bool {
+	hi := uint16(x >> 16)
+	lo := uint16(x)
+	if len(b.buckets) > maxUnsortedBuckets {
+		return b.hasSlow(hi, lo)
+	}
+	for i, hi16 := range b.b16his {
+		if hi16 == hi {
+			return i < len(b.buckets) && b.buckets[i].has(lo)
+		}
+	}
+	return false
+}
+
+func (b *bucket32) hasSlow(hi, lo uint16) bool {
+	n := binarySearch16(b.b16his, hi)
+	if n < 0 || n >= len(b.b16his) || b.b16his[n] != hi {
+		return false
+	}
+	return n < len(b.buckets) && b.buckets[n].has(lo)
+}
+
+func (b *bucket32) del(x uint32) bool {
+	hi := uint16(x >> 16)
+	lo := uint16(x)
+	if len(b.buckets) > maxUnsortedBuckets {
+		return b.delSlow(hi, lo)
+	}
+	for i, hi16 := range b.b16his {
+		if hi16 == hi {
+			return i < len(b.buckets) && b.buckets[i].del(lo)
+		}
+	}
+	return false
+}
+
+func (b *bucket32) delSlow(hi, lo uint16) bool {
+	n := binarySearch16(b.b16his, hi)
+	if n < 0 || n >= len(b.b16his) || b.b16his[n] != hi {
+		return false
+	}
+	return n < len(b.buckets) && b.buckets[n].del(lo)
+}
+
+func (b *bucket32) appendTo(dst []uint64) []uint64 {
+	if len(b.buckets) <= maxUnsortedBuckets && !sort.IsSorted(b) {
+		sort.Sort(b)
+	}
+	for i, b16 := range b.buckets {
+		hi16 := b.b16his[i]
+		dst = b16.appendTo(dst, b.hi, hi16)
+	}
+	return dst
+}
+
+const (
+	bitsPerBucket  = 1 << 16
+	wordsPerBucket = bitsPerBucket / 64
+)
+
+type bucket16 struct {
+	bits [wordsPerBucket]uint64
+}
+
+func (b *bucket16) clone() *bucket16 {
+	var dst bucket16
+	copy(dst.bits[:], b.bits[:])
+	return &dst
+}
+
+func (b *bucket16) add(x uint16) bool {
+	wordNum, bitMask := getWordNumBitMask(x)
+	word := &b.bits[wordNum]
+	ok := *word&bitMask == 0
+	*word |= bitMask
+	return ok
+}
+
+func (b *bucket16) has(x uint16) bool {
+	wordNum, bitMask := getWordNumBitMask(x)
+	return b.bits[wordNum]&bitMask != 0
+}
+
+func (b *bucket16) del(x uint16) bool {
+	wordNum, bitMask := getWordNumBitMask(x)
+	word := &b.bits[wordNum]
+	ok := *word&bitMask != 0
+	*word &^= bitMask
+	return ok
+}
+
+func (b *bucket16) appendTo(dst []uint64, hi uint32, hi16 uint16) []uint64 {
+	hi64 := uint64(hi)<<32 | uint64(hi16)<<16
+	var wordNum uint64
+	for _, word := range b.bits {
+		if word == 0 {
+			wordNum++
+			continue
+		}
+		x64 := hi64 | (wordNum * 64)
+		for {
+			tzn := uint64(bits.TrailingZeros64(word))
+			if tzn >= 64 {
+				break
+			}
+			word &^= uint64(1) << tzn
+			x := x64 | tzn
+			dst = append(dst, x)
+		}
+		wordNum++
+	}
+	return dst
+}
+
+func getWordNumBitMask(x uint16) (uint16, uint64) {
+	wordNum := x / 64
+	bitMask := uint64(1) << (x & 63)
+	return wordNum, bitMask
+}
+
+func binarySearch16(u16 []uint16, x uint16) int {
+	// The code has been adapted from sort.Search.
+	n := len(u16)
+	i, j := 0, n
+	for i < j {
+		h := int(uint(i+j) >> 1)
+		if h >= 0 && h < len(u16) && u16[h] < x {
+			i = h + 1
+		} else {
+			j = h
+		}
+	}
+	return i
+}
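Review note: the lowest level of the new set (`bucket16`) is a fixed-size bitmap, and `appendTo` walks its set bits with `bits.TrailingZeros64`. The sketch below isolates that idea in a simplified, hypothetical `bitmap` type. It is not the package's implementation, and it clears bits with `word &= word - 1`, which is a substitute for the shift-and-mask used in the diff.

```go
package main

import (
	"fmt"
	"math/bits"
)

// bitmap is a hypothetical stand-in for bucket16: one bit per uint16 value.
type bitmap [(1 << 16) / 64]uint64

// add sets the bit for x and reports whether it was previously unset.
func (b *bitmap) add(x uint16) bool {
	word, mask := x/64, uint64(1)<<(x&63)
	isNew := b[word]&mask == 0
	b[word] |= mask
	return isNew
}

// values returns all the stored values in ascending order.
func (b *bitmap) values() []uint16 {
	var dst []uint16
	for i, word := range b {
		for word != 0 {
			// The index of the lowest set bit gives the low 6 bits of the value.
			tz := bits.TrailingZeros64(word)
			dst = append(dst, uint16(i*64+tz))
			word &= word - 1 // clear the lowest set bit
		}
	}
	return dst
}

func main() {
	var b bitmap
	for _, x := range []uint16{64, 3, 65535, 3} {
		fmt.Println(x, b.add(x))
	}
	fmt.Println(b.values())
}
```

Because the words are visited in index order and the bits from lowest to highest, the output is sorted without any explicit sort step. This is the property `Set.AppendTo` documents for its result.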
--- a/lib/uint64set/uint64set_test.go
+++ b/lib/uint64set/uint64set_test.go
@@ -0,0 +1,224 @@
+package uint64set
+
+import (
+	"fmt"
+	"math/rand"
+	"sort"
+	"testing"
+	"time"
+)
+
+func TestSetBasicOps(t *testing.T) {
+	for _, itemsCount := range []int{1e2, 1e3, 1e4, 1e5, 1e6, maxUnsortedBuckets * bitsPerBucket * 2} {
+		t.Run(fmt.Sprintf("items_%d", itemsCount), func(t *testing.T) {
+			testSetBasicOps(t, itemsCount)
+		})
+	}
+}
+
+func testSetBasicOps(t *testing.T, itemsCount int) {
+	var s Set
+
+	offset := uint64(time.Now().UnixNano())
+
+	// Verify forward Add
+	for i := 0; i < itemsCount/2; i++ {
+		s.Add(uint64(i) + offset)
+	}
+	if n := s.Len(); n != itemsCount/2 {
+		t.Fatalf("unexpected s.Len() after forward Add; got %d; want %d", n, itemsCount/2)
+	}
+
+	// Verify backward Add
+	for i := 0; i < itemsCount/2; i++ {
+		s.Add(uint64(itemsCount-i-1) + offset)
+	}
+	if n := s.Len(); n != itemsCount {
+		t.Fatalf("unexpected s.Len() after backward Add; got %d; want %d", n, itemsCount)
+	}
+
+	// Verify repeated Add
+	for i := 0; i < itemsCount/2; i++ {
+		s.Add(uint64(i) + offset)
+	}
+	if n := s.Len(); n != itemsCount {
+		t.Fatalf("unexpected s.Len() after repeated Add; got %d; want %d", n, itemsCount)
+	}
+
+	// Verify Has on existing bits
+	for i := 0; i < itemsCount; i++ {
+		if !s.Has(uint64(i) + offset) {
+			t.Fatalf("missing bit %d", uint64(i)+offset)
+		}
+	}
+
+	// Verify Has on missing bits
+	for i := itemsCount; i < 2*itemsCount; i++ {
+		if s.Has(uint64(i) + offset) {
+			t.Fatalf("unexpected bit found: %d", uint64(i)+offset)
+		}
+	}
+
+	// Verify Clone
+	sCopy := s.Clone()
+	if n := sCopy.Len(); n != itemsCount {
+		t.Fatalf("unexpected sCopy.Len(); got %d; want %d", n, itemsCount)
+	}
+	for i := 0; i < itemsCount; i++ {
+		if !sCopy.Has(uint64(i) + offset) {
+			t.Fatalf("missing bit %d on sCopy", uint64(i)+offset)
+		}
+	}
+
+	// Verify AppendTo
+	a := s.AppendTo(nil)
+	if len(a) != itemsCount {
+		t.Fatalf("unexpected len of exported array; got %d; want %d; array:\n%d", len(a), itemsCount, a)
+	}
+	if !sort.SliceIsSorted(a, func(i, j int) bool { return a[i] < a[j] }) {
+		t.Fatalf("unsorted result returned from AppendTo: %d", a)
+	}
+	m := make(map[uint64]bool)
+	for _, x := range a {
+		m[x] = true
+	}
+	for i := 0; i < itemsCount; i++ {
+		if !m[uint64(i)+offset] {
+			t.Fatalf("missing bit %d in the exported bits; array:\n%d", uint64(i)+offset, a)
+		}
+	}
+
+	// Verify Del
+	for i := itemsCount / 2; i < itemsCount-itemsCount/4; i++ {
+		s.Del(uint64(i) + offset)
+	}
+	if n := s.Len(); n != itemsCount-itemsCount/4 {
+		t.Fatalf("unexpected s.Len() after Del; got %d; want %d", n, itemsCount-itemsCount/4)
+	}
+	a = s.AppendTo(a[:0])
+	if len(a) != itemsCount-itemsCount/4 {
+		t.Fatalf("unexpected len of exported array; got %d; want %d", len(a), itemsCount-itemsCount/4)
+	}
+	m = make(map[uint64]bool)
+	for _, x := range a {
+		m[x] = true
+	}
+	for i := 0; i < itemsCount; i++ {
+		if i >= itemsCount/2 && i < itemsCount-itemsCount/4 {
+			if m[uint64(i)+offset] {
+				t.Fatalf("unexpected bit found after deleting: %d", uint64(i)+offset)
+			}
+		} else {
+			if !m[uint64(i)+offset] {
+				t.Fatalf("missing bit %d in the exported bits after deleting", uint64(i)+offset)
+			}
+		}
+	}
+
+	// Try Del for non-existing items
+	for i := itemsCount / 2; i < itemsCount-itemsCount/4; i++ {
+		s.Del(uint64(i) + offset)
+		s.Del(uint64(i) + offset)
+		s.Del(uint64(i) + offset + uint64(itemsCount))
+	}
+	if n := s.Len(); n != itemsCount-itemsCount/4 {
+		t.Fatalf("unexpected s.Len() after Del for non-existing items; got %d; want %d", n, itemsCount-itemsCount/4)
+	}
+
+	// Verify sCopy has the original data
+	if n := sCopy.Len(); n != itemsCount {
+		t.Fatalf("unexpected sCopy.Len(); got %d; want %d", n, itemsCount)
+	}
+	for i := 0; i < itemsCount; i++ {
+		if !sCopy.Has(uint64(i) + offset) {
+			t.Fatalf("missing bit %d on sCopy", uint64(i)+offset)
+		}
+	}
+}
+
+func TestSetSparseItems(t *testing.T) {
+	for _, itemsCount := range []int{1e2, 1e3, 1e4} {
+		t.Run(fmt.Sprintf("items_%d", itemsCount), func(t *testing.T) {
+			testSetSparseItems(t, itemsCount)
+		})
+	}
+}
+
+func testSetSparseItems(t *testing.T, itemsCount int) {
+	var s Set
+	m := make(map[uint64]bool)
+	for i := 0; i < itemsCount; i++ {
+		x := rand.Uint64()
+		s.Add(x)
+		m[x] = true
+	}
+	if n := s.Len(); n != len(m) {
+		t.Fatalf("unexpected Len(); got %d; want %d", n, len(m))
+	}
+
+	// Check Has
+	for x := range m {
+		if !s.Has(x) {
+			t.Fatalf("missing item %d", x)
+		}
+	}
+	for i := 0; i < itemsCount; i++ {
+		x := uint64(i)
+		if m[x] {
+			continue
+		}
+		if s.Has(x) {
+			t.Fatalf("unexpected item found %d", x)
+		}
+	}
+
+	// Check Clone
+	sCopy := s.Clone()
+	if n := sCopy.Len(); n != len(m) {
+		t.Fatalf("unexpected sCopy.Len(); got %d; want %d", n, len(m))
+	}
+	for x := range m {
+		if !sCopy.Has(x) {
+			t.Fatalf("missing item %d on sCopy", x)
+		}
+	}
+
+	// Check AppendTo
+	a := s.AppendTo(nil)
+	if len(a) != len(m) {
+		t.Fatalf("unexpected len for AppendTo result; got %d; want %d", len(a), len(m))
+	}
+	if !sort.SliceIsSorted(a, func(i, j int) bool { return a[i] < a[j] }) {
+		t.Fatalf("unsorted result returned from AppendTo: %d", a)
+	}
+	for _, x := range a {
+		if !m[x] {
+			t.Fatalf("unexpected item found in AppendTo result: %d", x)
+		}
+	}
+
+	// Check Del
+	for x := range m {
+		s.Del(x)
+		s.Del(x)
+		s.Del(x + 1)
+		s.Del(x - 1)
+	}
+	if n := s.Len(); n != 0 {
+		t.Fatalf("unexpected number of items left after Del; got %d; want 0", n)
+	}
+	a = s.AppendTo(a[:0])
+	if len(a) != 0 {
+		t.Fatalf("unexpected number of items returned from AppendTo after Del; got %d; want 0; items\n%d", len(a), a)
+	}
+
+	// Check items in sCopy
+	if n := sCopy.Len(); n != len(m) {
+		t.Fatalf("unexpected sCopy.Len() after Del; got %d; want %d", n, len(m))
+	}
+	for x := range m {
+		if !sCopy.Has(x) {
+			t.Fatalf("missing item %d on sCopy after Del", x)
+		}
+	}
+}
--- a/lib/uint64set/uint64set_timing_test.go
+++ b/lib/uint64set/uint64set_timing_test.go
@@ -0,0 +1,321 @@
+package uint64set
+
+import (
+	"fmt"
+	"testing"
+	"time"
+
+	"github.com/valyala/fastrand"
+)
+
+func BenchmarkSetAddRandomLastBits(b *testing.B) {
+	const itemsCount = 1e5
+	for _, lastBits := range []uint64{20, 24, 28, 32} {
+		mask := (uint64(1) << lastBits) - 1
+		b.Run(fmt.Sprintf("lastBits_%d", lastBits), func(b *testing.B) {
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					start := uint64(time.Now().UnixNano())
+					var s Set
+					var rng fastrand.RNG
+					for i := 0; i < itemsCount; i++ {
+						n := start | (uint64(rng.Uint32()) & mask)
+						s.Add(n)
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkMapAddRandomLastBits(b *testing.B) {
+	const itemsCount = 1e5
+	for _, lastBits := range []uint64{20, 24, 28, 32} {
+		mask := (uint64(1) << lastBits) - 1
+		b.Run(fmt.Sprintf("lastBits_%d", lastBits), func(b *testing.B) {
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					start := uint64(time.Now().UnixNano())
+					m := make(map[uint64]struct{})
+					var rng fastrand.RNG
+					for i := 0; i < itemsCount; i++ {
+						n := start | (uint64(rng.Uint32()) & mask)
+						m[n] = struct{}{}
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkSetAddWithAllocs(b *testing.B) {
+	for _, itemsCount := range []uint64{1e3, 1e4, 1e5, 1e6, 1e7} {
+		b.Run(fmt.Sprintf("items_%d", itemsCount), func(b *testing.B) {
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					start := uint64(time.Now().UnixNano())
+					end := start + itemsCount
+					var s Set
+					n := start
+					for n < end {
+						s.Add(n)
+						n++
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkMapAddWithAllocs(b *testing.B) {
+	for _, itemsCount := range []uint64{1e3, 1e4, 1e5, 1e6, 1e7} {
+		b.Run(fmt.Sprintf("items_%d", itemsCount), func(b *testing.B) {
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					start := uint64(time.Now().UnixNano())
+					end := start + itemsCount
+					m := make(map[uint64]struct{})
+					n := start
+					for n < end {
+						m[n] = struct{}{}
+						n++
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkMapAddNoAllocs(b *testing.B) {
+	for _, itemsCount := range []uint64{1e3, 1e4, 1e5, 1e6, 1e7} {
+		b.Run(fmt.Sprintf("items_%d", itemsCount), func(b *testing.B) {
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					start := uint64(time.Now().UnixNano())
+					end := start + itemsCount
+					m := make(map[uint64]struct{}, itemsCount)
+					n := start
+					for n < end {
+						m[n] = struct{}{}
+						n++
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkMapAddReuse(b *testing.B) {
+	for _, itemsCount := range []uint64{1e3, 1e4, 1e5, 1e6, 1e7} {
+		b.Run(fmt.Sprintf("items_%d", itemsCount), func(b *testing.B) {
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				m := make(map[uint64]struct{}, itemsCount)
+				for pb.Next() {
+					start := uint64(time.Now().UnixNano())
+					end := start + itemsCount
+					for k := range m {
+						delete(m, k)
+					}
+					n := start
+					for n < end {
+						m[n] = struct{}{}
+						n++
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkSetHasHitRandomLastBits(b *testing.B) {
+	const itemsCount = 1e5
+	for _, lastBits := range []uint64{20, 24, 28, 32} {
+		mask := (uint64(1) << lastBits) - 1
+		b.Run(fmt.Sprintf("lastBits_%d", lastBits), func(b *testing.B) {
+			start := uint64(time.Now().UnixNano())
+			var s Set
+			var rng fastrand.RNG
+			for i := 0; i < itemsCount; i++ {
+				n := start | (uint64(rng.Uint32()) & mask)
+				s.Add(n)
+			}
+			a := s.AppendTo(nil)
+
+			b.ResetTimer()
+			b.ReportAllocs()
+			b.SetBytes(int64(len(a)))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					for _, n := range a {
+						if !s.Has(n) {
+							panic("unexpected miss")
+						}
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkMapHasHitRandomLastBits(b *testing.B) {
+	const itemsCount = 1e5
+	for _, lastBits := range []uint64{20, 24, 28, 32} {
+		mask := (uint64(1) << lastBits) - 1
+		b.Run(fmt.Sprintf("lastBits_%d", lastBits), func(b *testing.B) {
+			start := uint64(time.Now().UnixNano())
+			m := make(map[uint64]struct{})
+			var rng fastrand.RNG
+			for i := 0; i < itemsCount; i++ {
+				n := start | (uint64(rng.Uint32()) & mask)
+				m[n] = struct{}{}
+			}
+
+			b.ResetTimer()
+			b.ReportAllocs()
+			b.SetBytes(int64(len(m)))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					for n := range m {
+						if _, ok := m[n]; !ok {
+							panic("unexpected miss")
+						}
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkSetHasHit(b *testing.B) {
+	for _, itemsCount := range []uint64{1e3, 1e4, 1e5, 1e6, 1e7} {
+		b.Run(fmt.Sprintf("items_%d", itemsCount), func(b *testing.B) {
+			start := uint64(time.Now().UnixNano())
+			end := start + itemsCount
+			var s Set
+			n := start
+			for n < end {
+				s.Add(n)
+				n++
+			}
+
+			b.ResetTimer()
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					n := start
+					for n < end {
+						if !s.Has(n) {
+							panic("unexpected miss")
+						}
+						n++
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkMapHasHit(b *testing.B) {
+	for _, itemsCount := range []uint64{1e3, 1e4, 1e5, 1e6, 1e7} {
+		b.Run(fmt.Sprintf("items_%d", itemsCount), func(b *testing.B) {
+			start := uint64(time.Now().UnixNano())
+			end := start + itemsCount
+			m := make(map[uint64]struct{}, itemsCount)
+			n := start
+			for n < end {
+				m[n] = struct{}{}
+				n++
+			}
+
+			b.ResetTimer()
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					n := start
+					for n < end {
+						if _, ok := m[n]; !ok {
+							panic("unexpected miss")
+						}
+						n++
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkSetHasMiss(b *testing.B) {
+	for _, itemsCount := range []uint64{1e3, 1e4, 1e5, 1e6, 1e7} {
+		b.Run(fmt.Sprintf("items_%d", itemsCount), func(b *testing.B) {
+			start := uint64(time.Now().UnixNano())
+			end := start + itemsCount
+			var s Set
+			n := start
+			for n < end {
+				s.Add(n)
+				n++
+			}
+
+			b.ResetTimer()
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					n := end
+					nEnd := end + itemsCount
+					for n < nEnd {
+						if s.Has(n) {
+							panic("unexpected hit")
+						}
+						n++
+					}
+				}
+			})
+		})
+	}
+}
+
+func BenchmarkMapHasMiss(b *testing.B) {
+	for _, itemsCount := range []uint64{1e3, 1e4, 1e5, 1e6, 1e7} {
+		b.Run(fmt.Sprintf("items_%d", itemsCount), func(b *testing.B) {
+			start := uint64(time.Now().UnixNano())
+			end := start + itemsCount
+			m := make(map[uint64]struct{}, itemsCount)
+			n := start
+			for n < end {
+				m[n] = struct{}{}
+				n++
+			}
+
+			b.ResetTimer()
+			b.ReportAllocs()
+			b.SetBytes(int64(itemsCount))
+			b.RunParallel(func(pb *testing.PB) {
+				for pb.Next() {
+					n := end
+					nEnd := end + itemsCount
+					for n < nEnd {
+						if _, ok := m[n]; ok {
+							panic("unexpected hit")
+						}
+						n++
+					}
+				}
+			})
+		})
+	}
+}
--- a/lib/workingsetcache/cache.go
+++ b/lib/workingsetcache/cache.go
@@ -9,6 +9,13 @@ import (
 	"github.com/VictoriaMetrics/fastcache"
 )

+// Cache modes.
+const (
+	split     = 0
+	switching = 1
+	whole     = 2
+)
+
 // Cache is a cache for working set entries.
 //
 // The cache evicts inactive entries after the given expireDuration.
@@ -20,14 +27,15 @@ type Cache struct {
 	curr atomic.Value
 	prev atomic.Value

-	// skipPrev indicates whether to use only curr and skip prev.
+	// mode indicates whether to use only curr and skip prev.
 	//
-	// This flag is set if curr is filled for more than 50% space.
+	// This flag is set to switching if curr is filled for more than 50% space.
 	// In this case using prev would result in RAM waste,
 	// it is better to use only curr cache with doubled size.
-	skipPrev uint64
+	// After the process of switching, this flag will be set to whole.
+	mode uint64

-	// mu serializes access to curr, prev and skipPrev
+	// mu serializes access to curr, prev and mode
 	// in expirationWorker and cacheSizeWatcher.
 	mu sync.Mutex

@@ -42,10 +50,28 @@ type Cache struct {
 //
 // Stop must be called on the returned cache when it is no longer needed.
 func Load(filePath string, maxBytes int, expireDuration time.Duration) *Cache {
-	// Split maxBytes between curr and prev caches.
-	maxBytes /= 2
 	curr := fastcache.LoadFromFileOrNew(filePath, maxBytes)
-	return newWorkingSetCache(curr, maxBytes, expireDuration)
+	var cs fastcache.Stats
+	curr.UpdateStats(&cs)
+	if cs.EntriesCount == 0 {
+		curr.Reset()
+		// The cache couldn't be loaded with maxBytes size.
+		// This may mean that the cache is split into curr and prev caches.
+		// Try loading it again with maxBytes / 2 size.
+		maxBytes /= 2
+		curr = fastcache.LoadFromFileOrNew(filePath, maxBytes)
+		return newWorkingSetCache(curr, maxBytes, expireDuration)
+	}
+
+	// The cache has been successfully loaded in full.
+	// Set its mode to `whole`.
+	// There is no need to start expirationWorker and cacheSizeWatcher.
+	var c Cache
+	c.curr.Store(curr)
+	c.prev.Store(fastcache.New(1024))
+	c.stopCh = make(chan struct{})
+	atomic.StoreUint64(&c.mode, whole)
+	return &c
 }

 // New creates new cache with the given maxBytes size and the given expireDuration
@@ -65,6 +91,7 @@ func newWorkingSetCache(curr *fastcache.Cache, maxBytes int, expireDuration time
 	c.curr.Store(curr)
 	c.prev.Store(prev)
 	c.stopCh = make(chan struct{})
+	atomic.StoreUint64(&c.mode, split)

 	c.wg.Add(1)
 	go func() {
@@ -90,7 +117,7 @@ func (c *Cache) expirationWorker(maxBytes int, expireDuration time.Duration) {
 		}

 		c.mu.Lock()
-		if atomic.LoadUint64(&c.skipPrev) != 0 {
+		if atomic.LoadUint64(&c.mode) == split {
 			// Expire prev cache and create fresh curr cache.
 			// Do not reuse prev cache, since it can have too big capacity.
 			prev := c.prev.Load().(*fastcache.Cache)
@@ -106,33 +133,63 @@ func (c *Cache) expirationWorker(maxBytes int, expireDuration time.Duration) {

 func (c *Cache) cacheSizeWatcher(maxBytes int) {
 	t := time.NewTicker(time.Minute)
+	defer t.Stop()
+
 	for {
 		select {
 		case <-c.stopCh:
-			t.Stop()
 			return
 		case <-t.C:
 		}
 		var cs fastcache.Stats
 		curr := c.curr.Load().(*fastcache.Cache)
 		curr.UpdateStats(&cs)
-		if cs.BytesSize < uint64(maxBytes)/2 {
-			continue
+		if cs.BytesSize >= uint64(maxBytes)/2 {
+			break
 		}
-
-		// curr cache size exceeds 50% of its capacity. It is better
-		// to double the size of curr cache and stop using prev cache,
-		// since this will result in higher summary cache capacity.
-		c.mu.Lock()
-		curr.Reset()
-		prev := c.prev.Load().(*fastcache.Cache)
-		prev.Reset()
-		curr = fastcache.New(maxBytes * 2)
-		c.curr.Store(curr)
-		atomic.StoreUint64(&c.skipPrev, 1)
-		c.mu.Unlock()
-		return
 	}
+
+	// curr cache size exceeds 50% of its capacity. It is better
+	// to double the size of curr cache and stop using prev cache,
+	// since this will result in higher total cache capacity.
+	//
+	// Do this in the following steps:
+	// 1) switch to mode=switching
+	// 2) move curr cache to prev
+	// 3) create curr with the double size
+	// 4) wait until curr size exceeds maxBytes/2, i.e. it is populated with new data
+	// 5) switch to mode=whole
+	// 6) drop prev
+
+	c.mu.Lock()
+	atomic.StoreUint64(&c.mode, switching)
+	prev := c.prev.Load().(*fastcache.Cache)
+	prev.Reset()
+	curr := c.curr.Load().(*fastcache.Cache)
+	c.prev.Store(curr)
+	c.curr.Store(fastcache.New(maxBytes * 2))
+	c.mu.Unlock()
+
+	for {
+		select {
+		case <-c.stopCh:
+			return
+		case <-t.C:
+		}
+		var cs fastcache.Stats
+		curr := c.curr.Load().(*fastcache.Cache)
+		curr.UpdateStats(&cs)
+		if cs.BytesSize >= uint64(maxBytes)/2 {
+			break
+		}
+	}
+
+	c.mu.Lock()
+	atomic.StoreUint64(&c.mode, whole)
+	prev = c.prev.Load().(*fastcache.Cache)
+	prev.Reset()
+	c.prev.Store(fastcache.New(1024))
+	c.mu.Unlock()
 }

 // Save safes the cache to filePath.
@@ -159,7 +216,7 @@ func (c *Cache) Reset() {
 	curr := c.curr.Load().(*fastcache.Cache)
 	curr.Reset()

-	c.misses = 0
+	atomic.StoreUint64(&c.misses, 0)
 }

 // UpdateStats updates fcs with cache stats.
@@ -167,7 +224,7 @@ func (c *Cache) UpdateStats(fcs *fastcache.Stats) {
 	curr := c.curr.Load().(*fastcache.Cache)
 	fcsOrig := *fcs
 	curr.UpdateStats(fcs)
-	if atomic.LoadUint64(&c.skipPrev) != 0 {
+	if atomic.LoadUint64(&c.mode) == whole {
 		return
 	}

@@ -187,7 +244,7 @@ func (c *Cache) Get(dst, key []byte) []byte {
 		// Fast path - the entry is found in the current cache.
 		return result
 	}
-	if atomic.LoadUint64(&c.skipPrev) != 0 {
+	if atomic.LoadUint64(&c.mode) == whole {
 		return result
 	}

@@ -210,7 +267,7 @@ func (c *Cache) Has(key []byte) bool {
 	if curr.Has(key) {
 		return true
 	}
-	if atomic.LoadUint64(&c.skipPrev) != 0 {
+	if atomic.LoadUint64(&c.mode) == whole {
 		return false
 	}
 	prev := c.prev.Load().(*fastcache.Cache)
@@ -231,7 +288,7 @@ func (c *Cache) GetBig(dst, key []byte) []byte {
 		// Fast path - the entry is found in the current cache.
 		return result
 	}
-	if atomic.LoadUint64(&c.skipPrev) != 0 {
+	if atomic.LoadUint64(&c.mode) == whole {
 		return result
 	}

--- a/vendor/github.com/VictoriaMetrics/fastcache/.travis.yml
+++ b/vendor/github.com/VictoriaMetrics/fastcache/.travis.yml
@@ -1,20 +0,0 @@
-language: go
-
-go:
-  - 1.11.x
-
-script:
-  # build test for supported platforms
-  - GOOS=linux go build
-  - GOOS=darwin go build
-  - GOOS=freebsd go build
-  - GOOS=windows go build
-  - GOARCH=386 go build
-
-  # run tests on a standard platform
-  - go test -v ./... -coverprofile=coverage.txt -covermode=atomic
-  - go test -v ./... -race
-
-after_success:
-  # Upload coverage results to codecov.io
-  - bash <(curl -s https://codecov.io/bash)
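
The `lib/workingsetcache` hunk above replaces the `skipPrev` flag with a three-state `mode` and lists six steps for moving from the split layout to a single doubled cache. The following is a minimal sketch of that state machine, not the code from the commit. It assumes plain maps in place of `fastcache.Cache`, arbitrary numeric values for the mode constants, and hypothetical names (`twoGenCache`, `beginSwitch`, `finishSwitch`). The timer loop that waits for `curr` to fill up is left out, so the caller decides when to finish the switch.

```go
package main

import (
	"fmt"
	"sync"
)

// Mode names follow the diff; the numeric values are assumptions.
const (
	split     = 0 // curr and prev are both consulted
	switching = 1 // curr was just replaced, prev holds the old curr
	whole     = 2 // only curr is consulted
)

// twoGenCache is a hypothetical stand-in for the working set cache.
// Plain maps replace fastcache.Cache to keep the sketch self-contained.
type twoGenCache struct {
	mu   sync.Mutex
	mode int
	curr map[string]string
	prev map[string]string
}

func newTwoGenCache() *twoGenCache {
	return &twoGenCache{curr: map[string]string{}, prev: map[string]string{}}
}

// Set always writes to the current generation.
func (c *twoGenCache) Set(k, v string) {
	c.mu.Lock()
	c.curr[k] = v
	c.mu.Unlock()
}

// Get checks curr first and falls back to prev unless mode is whole.
func (c *twoGenCache) Get(k string) (string, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if v, ok := c.curr[k]; ok {
		return v, true
	}
	if c.mode == whole {
		return "", false
	}
	v, ok := c.prev[k]
	return v, ok
}

// beginSwitch covers steps 1-3: enter switching mode, move curr to prev
// and start a fresh curr. Old entries stay readable through prev.
func (c *twoGenCache) beginSwitch() {
	c.mu.Lock()
	c.mode = switching
	c.prev = c.curr
	c.curr = map[string]string{}
	c.mu.Unlock()
}

// finishSwitch covers steps 5-6: enter whole mode and drop prev.
func (c *twoGenCache) finishSwitch() {
	c.mu.Lock()
	c.mode = whole
	c.prev = map[string]string{}
	c.mu.Unlock()
}

func main() {
	c := newTwoGenCache()
	c.Set("a", "1")
	c.beginSwitch()
	_, ok := c.Get("a")
	fmt.Println("old entry visible while switching:", ok)
	c.Set("b", "2")
	c.finishSwitch()
	_, okA := c.Get("a")
	_, okB := c.Get("b")
	fmt.Println("old entry visible in whole mode:", okA, "new entry visible:", okB)
}
```

The point of the intermediate `switching` state is that reads keep hitting the old data through `prev` while the new `curr` warms up, which the removed code could not do because it reset both caches at once.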
Author	SHA1	Message	Date
Aliaksandr Valialkin	d18ea0c95b	app/vmstorage: add `-bigMergeConcurrency` and `-smallMergeConcurrency` flags for tuning the maximum number of CPU cores used during merges	2019-10-31 16:19:13 +02:00
Aliaksandr Valialkin	e0b292c6de	lib/storage: small cleanup in Storage.add	2019-10-31 14:30:34 +02:00
Aliaksandr Valialkin	86f6be40db	README.md: update information about `vm_rows{type="indexdb"}` metric The previous information became outdated after v1.28.0, since now each row in the inverted index can refer to multiple time series.	2019-10-31 13:30:29 +02:00
Aliaksandr Valialkin	e76e21e4c7	lib/decimal: speed up FromFloat for common case with integers	2019-10-31 13:24:59 +02:00
Aliaksandr Valialkin	cfa5e279c2	lib/decimal: increase float64->decimal conversion precision a bit Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/213	2019-10-30 02:04:56 +02:00
Aliaksandr Valialkin	fa7c3ab93a	README.md: fix delimiter between {measurement} and {field_name} in the Influx line protocol example	2019-10-30 02:04:56 +02:00
Aliaksandr Valialkin	26d570bb3a	lib/storage: get parts to merge after applying the limit on the number of concurrent merges This should reduce write amplification under high ingestion rate.	2019-10-30 02:04:56 +02:00
Roman Khavronenko	62ed508546	Bump version requirements in description	2019-10-29 22:29:48 +00:00
Aliaksandr Valialkin	2e2eff90d5	lib/{mergeset,storage}: limit the maximum number of concurrent merges; leave smaller number of parts during final merge	2019-10-29 12:45:28 +02:00
Aliaksandr Valialkin	855e5c8963	vendor: update github.com/VictoriaMetrics/fastcache from v1.5.1 to v1.5.2	2019-10-29 11:31:29 +02:00
Aliaksandr Valialkin	04e48ef064	lib/fs: typo fix in comment to WriteFileAtomically	2019-10-29 11:31:26 +02:00
Roman Khavronenko	971206b514	update single-version dashboard with panels: (#219) * concurrent inserts * rows ignored	2019-10-28 13:54:10 +02:00
Aliaksandr Valialkin	d063bfaf83	vendor: `make vendor-update`	2019-10-28 13:39:05 +02:00
Roman Khavronenko	6ab48838bf	#215: update klauspost/compress lib (#217) * #215: update klauspost/compress lib * #215: bump klauspost/compress lib to 1.9.1	2019-10-28 13:36:35 +02:00
Aliaksandr Valialkin	a42b5db39f	lib/decimal: increase float->decimal conversion precision for big numbers Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/213	2019-10-28 13:23:44 +02:00
Aliaksandr Valialkin	b0295dbf2e	app/vmselect: add `-search.latencyOffset` flag for tuning the time after data collection when data points become visible in query results Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/218	2019-10-28 12:31:07 +02:00
Petr Mikusek	3cea200309	Fix typo s/telergam/telegram/ in README.md	2019-10-23 19:30:36 +03:00
Aliaksandr Valialkin	32600ba4fc	deployment/docker: upgrade Go builder from go1.13.1 to go1.13.3	2019-10-20 23:50:05 +03:00
hanzai	b3c946e35a	warns during rows addition (#214)	2019-10-20 23:41:07 +03:00
Aliaksandr Valialkin	e83fe938c8	all: `make fmt`	2019-10-17 20:04:34 +03:00
Aliaksandr Valialkin	f708aa7003	Makefile: disable `structcheck` in `golangci-lint`, since it gives false positive on embedded structs	2019-10-17 19:59:10 +03:00
Aliaksandr Valialkin	97ce4e03a5	all: add support for GOARCH=386 and fix all the issues related to 32-bit architectures such as GOARCH=arm Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212	2019-10-17 18:23:23 +03:00
Aliaksandr Valialkin	a398343bb6	vendor: update github.com/valyala/quicktemplate from v1.2.0 to v1.3.1	2019-10-17 18:23:19 +03:00
Aliaksandr Valialkin	6ebf537153	lib/memory: properly handle int overflow in sysTotalMemory This should fix builds on 32-bit architectures such as arm. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/212	2019-10-17 00:50:48 +03:00
Aliaksandr Valialkin	f752479cb8	app/victoria-metrics/test: add missing docs to public funcs PopulateTimeTplString and PopulateTimeTpl	2019-10-17 00:50:46 +03:00
Aliaksandr Valialkin	61e956e175	app/victoria-metrics: add a test for `max_lookback=<duration>` query arg Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209	2019-10-15 21:31:48 +03:00
Aliaksandr Valialkin	c66a691593	app/vmselect/prometheus: add `-search.maxLookback` command-line flag for overriding dynamic calculations for max lookback interval This flag is similar to `-search.lookback-delta` if set. The max lookback interval is determined dynamically from interval between datapoints for each input time series if the flag isn't set. The interval can be overridden on per-query basis by passing `max_lookback=<duration>` query arg to `/api/v1/query` and `/api/v1/query_range`. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/209	2019-10-15 21:31:48 +03:00
Aliaksandr Valialkin	cc21b31502	app/victoria-metrics/test: add a test for PopulateTimeTplString	2019-10-15 21:31:48 +03:00
Aliaksandr Valialkin	195cefd81a	lib/prompb: removed outdated README.md	2019-10-14 22:12:57 +03:00
Aliaksandr Valialkin	c1581c3810	vendor: `make vendor-update`	2019-10-13 23:17:47 +03:00
Aliaksandr Valialkin	16cae15c45	README.md: add `integrations` section	2019-10-11 19:14:28 +03:00
Aliaksandr Valialkin	f6334bffa1	lib/storage: harden the check that the original items are sorted after mergeTagToMetricIDsRows fails to preserve sort order	2019-10-09 12:13:17 +03:00
Aliaksandr Valialkin	2abd5154e0	lib/storage: typo fix in comment to maxRowsPerSmallPart.	2019-10-08 18:51:20 +03:00
Aliaksandr Valialkin	c1cf7d9f93	lib/storage: add tests for mergeTagToMetricIDsRows and return the original items if the function breaks items' ordering. This should save from data corruption issues revealed in the previous releases up to v1.28.0-beta5.	2019-10-08 16:27:35 +03:00
Aliaksandr Valialkin	956fdd89d3	app/vmselect/promql: take into account the previous point when calculating `max_over_time` and `min_over_time` This lines up with `first_over_time` function used in `rollup_candlestick`, so `rollup=low` always returns the minimum value. Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/204	2019-10-08 12:30:05 +03:00
Alexander Danilov	1bc6377863	Improve documentation a little bit	2019-10-07 22:18:40 +03:00
Artem Navoiev	1e2c511747	Add regression test for query apo Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/187 cover: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/153 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/150	2019-10-07 22:18:04 +03:00
Aliaksandr Valialkin	0eeffb910f	vendor: `make vendor-update`	2019-10-06 15:47:23 +03:00
Aliaksandr Valialkin	4ba86f501a	vendor: update github.com/VictoriaMetrics/metrics from v1.7.1 to v1.7.2	2019-10-06 11:20:45 +03:00
Aliaksandr Valialkin	fdc5cfd838	lib/mergeset: reduce the maximum number of cached blocks, since there are reports on OOMs due to too big caches Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/189 Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/195	2019-09-30 12:25:40 +03:00
Artem Navoiev	a116f5e7c1	Add regression test for query apo (#194) Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/187 cover: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184	2019-09-30 11:25:54 +03:00
Aliaksandr Valialkin	4e9e1ca0f7	app/vmselect/netstorage: hint the OS that tmpBlocksFile is read almost sequentially This became the case after `b7ee2e7af2`.	2019-09-30 00:11:14 +03:00
Aliaksandr Valialkin	c1d3705be0	app/vmselect/netstorage: marshal block outside tmpBlocksFile.WriteBlock This allows re-using the destination buffer for marshaling in the outer loop.	2019-09-28 21:07:13 +03:00
Aliaksandr Valialkin	b7ee2e7af2	app/vmselect/netstorage: reduce the number of disk seeks when the query processes big number of time series	2019-09-28 21:07:09 +03:00
Aliaksandr Valialkin	67d44b0845	app/vmselect/promql: do not generate timestamps for NaN values in `timestamp` function according to Prometheus logic	2019-09-27 18:54:43 +03:00
Artem Navoiev	1e6ae9eff4	Add regression test for duplicated labels and series Part of https://github.com/VictoriaMetrics/VictoriaMetrics/issues/187 cover: - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/155 - https://github.com/VictoriaMetrics/VictoriaMetrics/issues/172	2019-09-27 16:52:16 +03:00
Aliaksandr Valialkin	fa81f82714	deployment/docker: switch Go builder image from v1.13.0 to v1.13.1	2019-09-26 17:09:40 +03:00
Aliaksandr Valialkin	0fa6df94a2	lib/storage: optimize TSID comparison	2019-09-26 14:16:02 +03:00
Aliaksandr Valialkin	c39355921e	lib/storage: verify whether items are sorted in the end of call to mergeTagToMetricIDsRows This should prevent from inverted index corruption if bug in mergeTagToMetricIDsRows is discovered.	2019-09-26 13:13:41 +03:00
Artem Navoiev	cf4786f34a	add test for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161	2019-09-26 12:45:19 +03:00
Aliaksandr Valialkin	3e67862676	README.md: typo fix	2019-09-26 11:03:14 +03:00
Aliaksandr Valialkin	0db9fcedd5	lib/storage: properly match labels against regexp with `(?i)` flag Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/161	2019-09-26 11:03:10 +03:00
Aliaksandr Valialkin	391530bb74	README.md: mention recommended `ext4` options for `mkfs.ext4` when creating multi-TB partition	2019-09-25 23:52:43 +03:00
Aliaksandr Valialkin	60c5b368bc	README.md: tiny updates	2019-09-25 23:29:55 +03:00
Aliaksandr Valialkin	26dc21cf64	app/vmselect/promql: add `increases_over_time` and `decreases_over_time` functions `increases_over_time(q[d])` returns the number of `q` increases during the given duration `d`. `decreases_over_time(q[d])` returns the number of `q` decreases during the given duration `d`.	2019-09-25 20:38:44 +03:00
Aliaksandr Valialkin	2444433d83	lib/storage: add missing break in removeDuplicateMetricIDs	2019-09-25 18:23:43 +03:00
Aliaksandr Valialkin	ea4c828bae	lib/storage: remove duplicate MetricIDs in `tag->metricIDs` items before writing them into inverted index	2019-09-25 17:55:13 +03:00
Aliaksandr Valialkin	aebc45ad26	lib/{mergeset,storage}: do not cache inverted index blocks containing `tag->metricIDs` items This should reduce the amounts of used RAM during queries with filters over big number of time series.	2019-09-25 14:02:15 +03:00
Aliaksandr Valialkin	2cb811b42f	lib/uint64set: optimize Set.AppendTo	2019-09-25 00:34:17 +03:00
Aliaksandr Valialkin	b986516fbe	lib/storage: create and use `lib/uint64set` instead of `map[uint64]struct{}` This should improve inverted index search performance for filters matching big number of time series, since `lib/uint64set.Set` is faster than `map[uint64]struct{}` for both `Add` and `Has` calls. See the corresponding benchmarks in `lib/uint64set`.	2019-09-24 21:17:55 +03:00
Aliaksandr Valialkin	ef2296e420	lib/storage: typo fix: return dstData instead of data from mergeTagToMetricIDsRows	2019-09-24 19:32:34 +03:00
Aliaksandr Valialkin	a6086cde78	lib/storage: limit the number of metricIDs in tag->metricIDs row This reduces the overhead on index and metaindex in lib/mergeset	2019-09-24 00:49:51 +03:00
Aliaksandr Valialkin	c9063ece66	lib/storage: share tsids across all the partSearch instances This should reduce memory usage when big number of time series matches the given query.	2019-09-23 22:35:15 +03:00
Aliaksandr Valialkin	4e26ad869b	lib/{storage,mergeset}: verify PrepareBlock callback results Do not touch the first and the last item passed to PrepareBlock in order to preserve sort order of mergeset blocks.	2019-09-23 20:43:13 +03:00
Aliaksandr Valialkin	0772191975	lib/mergeset: detect whether we are in test by executable suffix	2019-09-22 23:12:15 +03:00
Aliaksandr Valialkin	48999e5396	lib/workingsetcache: remove data race when resetting c.misses	2019-09-22 19:36:49 +03:00
Aliaksandr Valialkin	0adebae1f8	lib/storage: generate the first tag->metricIDs item in a mergeset block with a single metricID The first item from each mergeset block goes into index (lib/mergeset.blockHeader), so it must be short in order to reduce index size.	2019-09-22 19:21:33 +03:00
Aliaksandr Valialkin	267efde5ae	README.md: update `troubleshooting` and `tuning` sections according to recent questions from our users	2019-09-22 19:12:24 +03:00
Aliaksandr Valialkin	0686ac52c3	lib/{storage,mergeset}: merge `tag->metricID` rows into `tag->metricIDs` rows for common `tag` values This should improve lookup performance if the same `label=value` pair exists in big number of time series. This should also reduce memory usage for mergeset data cache, since `tag->metricIDs` rows occupy less space than the original `tag->metricID` rows.	2019-09-20 22:06:41 +03:00
Aliaksandr Valialkin	68722c3c74	lib/encoding: optimize UnmarshalUint* and UnmarshalInt*	2019-09-20 13:08:16 +03:00
Aliaksandr Valialkin	a544f49c2b	lib/storage: optimize selecting all the metricIDs by scanning MetricID->TSID entries instead of tag->MetricID entries The number of MetricID->TSID entries is smaller than the number of tag->MetricID entries and MetricID->TSID entries are usually shorter than tag->MetricID entries. This should improve performance when selecting all the metricIDs.	2019-09-20 11:54:10 +03:00
Aliaksandr Valialkin	d32f88c378	app/vminsert/opentsdbhttp: remove FATAL prefix from logger.Fatalf errors for the sake of consistency with other logger.Fatalf calls	2019-09-19 22:15:59 +03:00
Aliaksandr Valialkin	00cfb2d2b9	lib/mergeset: rename misleading mergeSmallParts to mergeExistingParts	2019-09-19 21:48:20 +03:00
Aliaksandr Valialkin	37dc223e25	lib/mergeset: use sort.IsSorted instead of sort.SliceIsSorted in inmemoryBlock.isSorted in order to reduce memory allocations	2019-09-19 20:13:08 +03:00
Aliaksandr Valialkin	a84fe76677	lib/storage: use sort.Sort instead of sort.slice in getSortedMetricIDs	2019-09-19 20:07:22 +03:00
Aliaksandr Valialkin	3a697a935a	lib/storage: skip duplicate call to intersectMetricIDsWithTagFilter on zero successful intersects	2019-09-19 17:49:56 +03:00
Aliaksandr Valialkin	51a21c7d4b	lib/mergeset: fill partHeader.firstItem on first block flush	2019-09-19 17:48:09 +03:00
Aliaksandr Valialkin	3d83f5d334	lib/storage: mark tag filter returning errFallbackToMetricNameMatch as useless This will save CPU on subsequent calls for this filter	2019-09-18 19:10:32 +03:00
Aliaksandr Valialkin	6f3b2fd600	deployment/docker/docker-compose.yml: update Prometheus and Grafana image tags Prometheus: from v2.10.0 to v2.12.0 Grafana: from v6.2.1 to v6.3.5	2019-09-18 18:29:09 +03:00
Aliaksandr Valialkin	8d35718dc6	lib/storage: properly construct keys for uselessTagFiltersCache and register useless negative tag filters there	2019-09-17 23:20:27 +03:00
Aliaksandr Valialkin	33975513d0	vendor: update github.com/valyala/gozstd from v1.6.1 to v1.6.2	2019-09-16 21:50:49 +03:00
Aliaksandr Valialkin	63f2b539df	vendor: `make vendor-update`	2019-09-13 22:48:56 +03:00
Aliaksandr Valialkin	9428ec9c9f	deployment/docker: remove file system paths from the compiled binary	2019-09-13 22:45:59 +03:00
Aliaksandr Valialkin	0c8057924f	lib/mergeset: properly check for sorted block headers Fix a typo for https://github.com/VictoriaMetrics/VictoriaMetrics/issues/181	2019-09-13 21:59:29 +03:00
Aliaksandr Valialkin	d4218d27e6	app/vmselect/promql: properly handle subqueries like `aggr_func(rollup_func(metric[window:step]))` Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/184	2019-09-13 21:41:04 +03:00
hanzai	e2274714b1	lib/workingsetcache: adjust switching from mode=`split` to mode=`whole` smoothly and load cachefile successfully	2019-09-13 19:13:01 +03:00
Aliaksandr Valialkin	4d636c244d	app/vmselect/promql: binary operation fixes according to Prometheus behaviour The following issues were fixed: - VictoriaMetrics could leave superfluous labels when using `on` or `ignoring` modifiers - VictoriaMetrics could return `duplicate timeseries` error when using `group_left` or `group_right` with non-empty label list	2019-09-13 17:42:52 +03:00
Aliaksandr Valialkin	bad53e4207	lib/mergeset: dynamically calculate the maximum number of items per part, which can be cached in OS page cache	2019-09-11 14:53:45 +03:00
Artem Navoiev	3f581a9860	[ci] github actions - run pipeline on pull request. Fix running of test in external PR from forks	2019-09-11 09:30:11 +03:00
sundy-li	398e00aa54	README.md: fix ExtendedPromQL link url	2019-09-10 14:56:19 +03:00
Artem Navoiev	4fd741f40d	[tests] check timestamp in tests (#177)	2019-09-08 19:48:38 +03:00
Artem Navoiev	4a2cd85b92	[ci] bump version of go to 1.13 in github actions config	2019-09-08 14:02:23 +03:00
Aliaksandr Valialkin	6c46afb087	vendor: update github.com/klauspost/compress from v1.7.6 to v1.8.2	2019-09-06 00:47:31 +03:00
Aliaksandr Valialkin	7343e8b408	vendor: update golang.org/x/sys	2019-09-06 00:47:31 +03:00
Artem Navoiev	22e3fabefd	Add OpenTSDB and Prometheus integration tests (#168) * [WIP] open tsdb and prometheus integration tests * app/victoria-metrics: fix race condition on parallel tests	2019-09-05 17:55:38 +03:00