Compare commits

...

64 Commits

Author SHA1 Message Date
Aliaksandr Valialkin
0b488f1e37 lib/storage: do not change timestamps to constant rate if values are constant or have constant delta
This breaks the original timestamps, which results in issues like
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/120 and
https://github.com/VictoriaMetrics/VictoriaMetrics/issues/141 .
2019-08-06 15:40:07 +03:00
Aliaksandr Valialkin
b8bb74ffc6 app/vmstorage: add vm_concurrent_addrows_* metrics for tracking concurrency for Storage.AddRows calls
Track also the number of dropped rows due to the exceeded timeout
on concurrency limit for Storage.AddRows. This number is tracked in `vm_concurrent_addrows_dropped_rows_total`
2019-08-06 15:08:33 +03:00
Aliaksandr Valialkin
5c9e48417a vendor: update github.com/VictoriaMetrics/metrics to v1.7.1 2019-08-05 19:21:36 +03:00
Aliaksandr Valialkin
5c83f8e203 app: add vm_concurrent_ metrics for visibility in concurrency limiters for vminsert and vmselect 2019-08-05 18:30:57 +03:00
Aliaksandr Valialkin
05713469c3 vendor: make vendor-update 2019-08-05 10:33:21 +03:00
Aliaksandr Valialkin
8822079b77 lib/storage: properly reset partSearch.fetchData in partSearch.reset 2019-08-05 09:56:06 +03:00
Aliaksandr Valialkin
99e048c9df app/vmselect: allow passing match[], start and time to /api/v1/label/<label_name>/values
`/api/v1/label/<label_name>/values?match[]=q` emulates emulates `label_values(q, <label_name>)`
call in Grafana templating.
2019-08-04 23:09:21 +03:00
Aliaksandr Valialkin
47e4b50112 app/vmselect: optimize /api/v1/series by skipping storage data
Fetch and process only time series metainfo.
2019-08-04 23:01:28 +03:00
Aliaksandr Valialkin
241170dc05 app/vmselect/prometheus: prevent from fetching and scanning all the data on /api/v1/searies call by default 2019-08-04 19:42:36 +03:00
Aliaksandr Valialkin
1c69f4eadc app/vmselect/promql: tune automatic window adjustement
Increase the windows adjustement for small scrape intervals,
since they usually have higher jitter.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/139
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/134
2019-08-04 19:34:05 +03:00
Aliaksandr Valialkin
8d93b15b86 app/vmselect/promql: further increase the allowed jitter for scrape interval
Real-world production data shows higher jitter than 1/8 of scrape interval.
This may results in gaps on the graph. So increase the allowed jitter to 1/4
of scrape interval in order to reduce the probability of gaps on the graphs
over time series with high jitter for scrape_interval.
2019-08-02 20:10:23 +03:00
Aliaksandr Valialkin
fcc166622a README.md: mention that monitoring is recommended for VictoriaMetrics 2019-08-02 15:27:10 +03:00
Aliaksandr Valialkin
a9f39168d2 app/vminsert/influx: round automatically generated timestamp according to the given precision arg 2019-08-02 00:24:06 +03:00
Aliaksandr Valialkin
f090b2e917 app/vmselect/promql: tolerate higher jitter in scrape interval
Allow jitter for up to 1/8 instead of 1/16 for the scrape interval.
This should imrpove graphs when `step` is smaller than the `scrape_interval`.
2019-08-01 23:26:00 +03:00
Aliaksandr Valialkin
10caad4728 lib/decimal: modernize tests a bit 2019-07-31 21:10:03 +03:00
Aliaksandr Valialkin
3b90c2a99a Add CODE_OF_CONDUCT.md 2019-07-31 15:44:26 +03:00
Aliaksandr Valialkin
57ec4f5f92 Update issue templates
Add a template for feature request
2019-07-31 15:41:57 +03:00
Aliaksandr Valialkin
01cb15b6f5 Update issue templates
Add a template for bug report.
2019-07-31 15:39:41 +03:00
Aliaksandr Valialkin
b9256511e8 README.md: add join slack badge 2019-07-31 15:27:11 +03:00
Aliaksandr Valialkin
3a38b23fa3 app/vmselect/promql: add vm_slow_queries_total metric for counting slow queries
The query is slow if its execution time exceeds `-search.logSlowQueryDuration`
2019-07-31 03:36:37 +03:00
Aliaksandr Valialkin
8bd6f1f6df app/vmselect/promql: return NaN from histogram_quantile if at least a single bucket is broken 2019-07-31 01:18:07 +03:00
Aliaksandr Valialkin
4aaa5c2efc app/vmselect/promql: allow adjusting window for default rollup function
Default rollup function is `last_over_time`. It must support adjusting
the provided window in order to prevent from gaps on the graph
for window values smaller than scrape interval.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/134
2019-07-31 00:45:54 +03:00
Aliaksandr Valialkin
10f5a26bec app/vmselect/promql: return NaN values if invalid bucket counts are passed to histogram_quantile
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/136
2019-07-30 22:05:10 +03:00
Aliaksandr Valialkin
c14fd6c43f lib/storage: typo fixes after a77e88db7d 2019-07-30 15:38:52 +03:00
Aliaksandr Valialkin
a77e88db7d lib/storage: fix matching against tag filter with empty name
Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/137
2019-07-30 15:15:09 +03:00
Aliaksandr Valialkin
aad7236e5d README.md: formatting fixes 2019-07-28 22:02:42 +03:00
Artem Navoiev
5e5de6be9a Create CONTRIBUTING.md 2019-07-28 20:42:32 +03:00
Anton Patsev
90cf6f3fcb change /usr/bin/victoriametrics to /usr/bin/victoria-metrics-prod (#132) 2019-07-28 20:40:46 +03:00
Artem Navoiev
8e3d69219f Add roadmap (#130)
* Add roadmap
* Fix typos
2019-07-28 18:39:39 +01:00
Aliaksandr Valialkin
b842a2eccc README.md: mention that VictoriaMetrics needs free disk space for background merges 2019-07-28 12:26:16 +03:00
Aliaksandr Valialkin
afcc7fb167 app/vmselect/netstorage: improve error message when reading data blocks from storage
Mention the block number in the error. This should simplify troubleshooting in this code.
2019-07-28 12:12:35 +03:00
Aliaksandr Valialkin
57a57c711a package: changed the remaining /usr/local/bin to /usr/bin
This is a follo-up after 68f260d878
2019-07-28 11:08:07 +03:00
Anton Patsev
68f260d878 change /usr/local/bin to /usr/bin (#131) 2019-07-28 11:06:24 +03:00
Aliaksandr Valialkin
1eade9b358 app/vminsert: add vm_rows_per_insert summary metric
This metric should help tuning batch sizes on clients writing data to VictoriaMetrics
2019-07-27 13:21:46 +03:00
Aliaksandr Valialkin
7e8747f6ed README.md: add a section for production ARM build 2019-07-26 22:34:31 +03:00
Aliaksandr Valialkin
0168a1b658 package: various fixes
- Use `-prod` binaries instead of development binaries for both deb and rpm packages.
- Fix binary directory from /usr/sbin to /usr/local/bin as outlined in package/victoria-metrics.service
- Fix binary name from `victoriametrics` to `victoria-metrics-prod` in package/victoria-metrics.service
2019-07-26 22:31:04 +03:00
Aliaksandr Valialkin
bf6cbb762c app/vminsert: improve error messages for Influx, OpenTSDB and Graphite parsing
Include in the error message the line which failed to parse.

Updates https://github.com/VictoriaMetrics/VictoriaMetrics/issues/127
2019-07-26 22:08:52 +03:00
Kostya Vasilyev
6aeac37fc5 pick up .service file from ./rpm (#126)
* pick up .service file from ./rpm

* feedback from @patsevanton

* remove 'start' from ExecStart command
2019-07-26 21:56:30 +03:00
Aliaksandr Valialkin
c98725db55 app/vmstorage: consistency renaming for ignored rows metrics
vm_too_big_timestamp_rows_total -> vm_rows_ignored_total{reason="big_timestamp"}
  vm_too_small_timestamp_rows_total -> vm_rows_ignored_total{reason="small_timestamp"}
2019-07-26 20:02:06 +03:00
Anton Patsev
d8043f7161 Change default value storageDataPath (#125)
Fixes #124 .
2019-07-26 14:13:55 +03:00
Aliaksandr Valialkin
f586e1f83c lib/storage: add metrics for calculating skipped rows outside the retention
The metrics are:

    - vm_too_big_timestamp_rows_total
    - vm_too_small_timestamp_rows_total
2019-07-26 14:11:01 +03:00
Kostya Vasilyev
d1132bb188 deb packaging fixes: 1) stop the service in prerm 2) reload services in postrm (#123) 2019-07-26 12:38:59 +03:00
Aliaksandr Valialkin
915fb6df79 README.md: mention that arm builds can run on Raspberry Pi 2019-07-26 12:28:40 +03:00
Kostya Vasilyev
89eb6d78a4 RPM packaging (#122) 2019-07-25 23:47:41 +03:00
Aliaksandr Valialkin
17096b5750 app/vmselect/promql: return NaN from count() over zero time series
This aligns `count` behavior with Prometheus.
2019-07-25 22:02:30 +03:00
Aliaksandr Valialkin
66efa5745f app/vmselect/promql: properly calculate incremental aggregations grouped by __name__
Previously the following query may fail on multiple distinct metric names match:

    sum(count_over_time{__name__!=''}) by (__name__)
2019-07-25 21:53:20 +03:00
Anton Patsev
106ab78a47 Add package/rpm/ (#121) 2019-07-25 11:21:55 +03:00
Aliaksandr Valialkin
8aa474d685 README.md: move how to build VictoriaMetrics section to the bottom
This streamlines `getting started` experience
2019-07-25 11:17:30 +03:00
Aliaksandr Valialkin
9e059bb330 README.md: add links to ARM build and Pure Go build in TOC 2019-07-25 11:05:35 +03:00
Aliaksandr Valialkin
2346335ea6 README.md: moved advanced topics to the bottom, so they don't clutter getting started workflow 2019-07-25 11:00:41 +03:00
Aliaksandr Valialkin
b339890dca lib/encoding/zstd: go fmt 2019-07-25 01:37:16 +03:00
Aliaksandr Valialkin
6c4ca89d75 lib/encoding/zstd: disable CRC checks in pure Go build
This should give slightly better compression and decompressions performance.
Additionally this shaves off 4 bytes per each compressed block.
2019-07-24 19:17:16 +03:00
Roman Khavronenko
f0fe7b5ad6 fix typo (#117) 2019-07-24 07:48:28 +01:00
Aliaksandr Valialkin
22ed4e7fd4 vendor: make vendor-update 2019-07-23 20:00:19 +03:00
Aliaksandr Valialkin
162f1fb1b7 all: small updates after PR #114 2019-07-23 19:54:50 +03:00
Aliaksandr Valialkin
d07f616609 lib/encoding: small fixes in tests after the PR #114 2019-07-23 19:37:51 +03:00
Roman Khavronenko
5bf4e5ffb5 all: add Pure Go build (pull request #114)
Updates #94
2019-07-23 19:26:39 +03:00
Kostya Vasilyev
8c3629a892 Debian packaging (#116)
* initial commit of deb packaging

* Incorporated feedback from @valyala:
- Put data directory under /var/lib
- More beef in systemd file
- Packaging for arm64
- Package all target which builds and packages both amd64 and arm64

* Remove PIDFile from systemd unit, useless

* per PR feedback, move debian specific files into deb subdirectory

Updates #107 .
2019-07-22 17:12:48 +03:00
Aliaksandr Valialkin
ea07cf68ba README.md: add querying Graphite data section
Mention that Graphite data may be read either via Prometheus querying API
or via go-graphite/carbonapi. See https://github.com/go-graphite/carbonapi/blob/master/cmd/carbonapi/carbonapi.example.prometheus.yaml
2019-07-21 16:10:19 +03:00
Roman Khavronenko
4ee41bab43 add versioning to dashboard description (#113) 2019-07-21 14:34:50 +03:00
Roman Khavronenko
1273f31f19 Add CPU usage panel; rename Go runtime to Resource usage (#112)
* add CPU usage panel; rename `Go runtime` to `Resource usage`

* rm irate from CPU usage panel

Updates #92 .
2019-07-20 17:24:24 +03:00
Aliaksandr Valialkin
0f2ecde0e6 lib/encoding: improve gauge series detection
- Series with negative values are always gauges
- Counters may only have increasing values with possible counter resets

This should improve compression ratio for gauge series which
were previously mistakenly detected as counters.
2019-07-20 14:05:09 +03:00
Aliaksandr Valialkin
6cd77d4847 deployment: switch builder from go1.12.6 to go1.12.7 2019-07-20 12:15:05 +03:00
Roman Khavronenko
fb14f23532 mention docker-compose as option to spin up VM (#97) 2019-07-16 00:45:21 +03:00
222 changed files with 18733 additions and 560 deletions

30
.github/ISSUE_TEMPLATE/bug_report.md vendored Normal file
View File

@@ -0,0 +1,30 @@
---
name: Bug report
about: Create a report to help us improve
title: ''
labels: ''
assignees: ''
---
**Describe the bug**
A clear and concise description of what the bug is.
**To Reproduce**
Steps to reproduce the behavior
**Expected behavior**
A clear and concise description of what you expected to happen.
**Screenshots**
If applicable, add screenshots to help explain your problem.
**Version**
The line returned when passing `--version` command line flag to binary. For example:
```
$ ./victoria-metrics-prod --version
victoria-metrics-20190730-121249-heads-single-node-0-g671d9e55
```
**Additional context**
Add any other context about the problem here such as error logs, `/metrics` output, screenshots from [the official Grafana dashboard for VictoriaMetrics](https://grafana.com/dashboards/10229).

View File

@@ -0,0 +1,20 @@
---
name: Feature request
about: Suggest an idea for this project
title: ''
labels: ''
assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
**Describe the solution you'd like**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
Add any other context or screenshots about the feature request here.

4
.gitignore vendored
View File

@@ -9,3 +9,7 @@
/victoria-metrics-data
/vmstorage-data
/vmselect-cache
/package/temp-deb-*
/package/temp-rpm-*
/package/*.deb
/package/*.rpm

View File

@@ -15,8 +15,12 @@ before_install:
script:
- make check_all
- git diff --exit-code
- make test_full
- make test-full
- make test-pure
- make victoria-metrics
- make victoria-metrics-pure
- make victoria-metrics-arm
- make victoria-metrics-arm64
after_success:
- bash <(curl -s https://codecov.io/bash)
- bash <(curl -s https://codecov.io/bash)

76
CODE_OF_CONDUCT.md Normal file
View File

@@ -0,0 +1,76 @@
# Contributor Covenant Code of Conduct
## Our Pledge
In the interest of fostering an open and welcoming environment, we as
contributors and maintainers pledge to making participation in our project and
our community a harassment-free experience for everyone, regardless of age, body
size, disability, ethnicity, sex characteristics, gender identity and expression,
level of experience, education, socio-economic status, nationality, personal
appearance, race, religion, or sexual identity and orientation.
## Our Standards
Examples of behavior that contributes to creating a positive environment
include:
* Using welcoming and inclusive language
* Being respectful of differing viewpoints and experiences
* Gracefully accepting constructive criticism
* Focusing on what is best for the community
* Showing empathy towards other community members
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery and unwelcome sexual attention or
advances
* Trolling, insulting/derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or electronic
address, without explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Our Responsibilities
Project maintainers are responsible for clarifying the standards of acceptable
behavior and are expected to take appropriate and fair corrective action in
response to any instances of unacceptable behavior.
Project maintainers have the right and responsibility to remove, edit, or
reject comments, commits, code, wiki edits, issues, and other contributions
that are not aligned to this Code of Conduct, or to ban temporarily or
permanently any contributor for other behaviors that they deem inappropriate,
threatening, offensive, or harmful.
## Scope
This Code of Conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community. Examples of
representing a project or community include using an official project e-mail
address, posting via an official social media account, or acting as an appointed
representative at an online or offline event. Representation of a project may be
further defined and clarified by project maintainers.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported by contacting the project team at info@victoriametrics.com. All
complaints will be reviewed and investigated and will result in a response that
is deemed necessary and appropriate to the circumstances. The project team is
obligated to maintain confidentiality with regard to the reporter of an incident.
Further details of specific enforcement policies may be posted separately.
Project maintainers who do not follow or enforce the Code of Conduct in good
faith may face temporary or permanent repercussions as determined by other
members of the project's leadership.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
[homepage]: https://www.contributor-covenant.org
For answers to common questions about this code of conduct, see
https://www.contributor-covenant.org/faq

16
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,16 @@
If you like VictoriaMetrics and want to contribute, then we need the following:
- Filing issues and feature requests [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
- Spreading a word about VictoriaMetrics: conference talks, articles, comments, experience sharing with colleagues.
- Updating documentation.
We are open to third-party pull requests provided they follow [KISS design principle](https://en.wikipedia.org/wiki/KISS_principle):
- Prefer simple code and architecture.
- Avoid complex abstractions.
- Avoid magic code and fancy algorithms.
- Avoid [big external dependencies](https://medium.com/@valyala/stripping-dependency-bloat-in-victoriametrics-docker-image-983fb5912b0d).
- Minimize the number of moving parts in the distributed system.
- Avoid automated decisions, which may hurt cluster availability, consistency or performance.
Adhering `KISS` principle simplifies the resulting code and architecture, so it can be reviewed, understood and verified by many people.

View File

@@ -53,16 +53,22 @@ install-errcheck:
check_all: fmt vet lint errcheck golangci-lint
test:
GO111MODULE=on go test -mod=vendor ./lib/...
GO111MODULE=on go test -mod=vendor ./app/...
GO111MODULE=on go test -tags=integration -mod=vendor ./lib/... ./app/...
test_full:
test-pure:
GO111MODULE=on CGO_ENABLED=0 go test -tags=integration -mod=vendor ./lib/... ./app/...
test-full:
GO111MODULE=on go test -tags=integration -mod=vendor -coverprofile=coverage.txt -covermode=atomic ./lib/... ./app/...
benchmark:
GO111MODULE=on go test -mod=vendor -bench=. ./lib/...
GO111MODULE=on go test -mod=vendor -bench=. ./app/...
benchmark-pure:
GO111MODULE=on CGO_ENABLED=0 go test -mod=vendor -bench=. ./lib/...
GO111MODULE=on CGO_ENABLED=0 go test -mod=vendor -bench=. ./app/...
vendor-update:
GO111MODULE=on go get -u ./lib/...
GO111MODULE=on go get -u ./app/...

209
README.md
View File

@@ -1,4 +1,5 @@
[![Latest Release](https://img.shields.io/github/release/VictoriaMetrics/VictoriaMetrics.svg?style=flat-square)](https://github.com/VictoriaMetrics/VictoriaMetrics/releases/latest)
[![Slack](https://img.shields.io/badge/join%20slack-%23victoriametrics-brightgreen.svg)](http://slack.victoriametrics.com/)
[![GitHub license](https://img.shields.io/github/license/VictoriaMetrics/VictoriaMetrics.svg)](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/LICENSE)
[![Go Report](https://goreportcard.com/badge/github.com/VictoriaMetrics/VictoriaMetrics)](https://goreportcard.com/report/github.com/VictoriaMetrics/VictoriaMetrics)
[![Build Status](https://travis-ci.org/VictoriaMetrics/VictoriaMetrics.svg?branch=master)](https://travis-ci.org/VictoriaMetrics/VictoriaMetrics)
@@ -8,7 +9,7 @@
## Single-node VictoriaMetrics
VictoriaMetrics is fast, cost-effective and scalable time series database. It can be used as a long-term remote storage for Prometheus.
VictoriaMetrics is fast, cost-effective and scalable time-series database. It can be used as long-term remote storage for Prometheus.
It is available in [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases),
[docker images](https://hub.docker.com/r/victoriametrics/victoria-metrics/) and
in [source code](https://github.com/VictoriaMetrics/VictoriaMetrics).
@@ -26,8 +27,8 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
[Outperforms InfluxDB and TimescaleDB by up to 20x](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
* [Uses 10x less RAM than InfluxDB](https://medium.com/@valyala/insert-benchmarks-with-inch-influxdb-vs-victoriametrics-e31a41ae2893) when working with millions of unique time series (aka high cardinality).
* High data compression, so [up to 70x more data points](https://medium.com/@valyala/when-size-matters-benchmarking-victoriametrics-vs-timescale-and-influxdb-6035811952d4)
may be crammed into a limited storage comparing to TimescaleDB.
* Optimized for storage with high-latency IO and low iops (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc). See [graphs from these benchmarks](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b).
may be crammed into limited storage comparing to TimescaleDB.
* Optimized for storage with high-latency IO and low IOPS (HDD and network storage in AWS, Google Cloud, Microsoft Azure, etc). See [graphs from these benchmarks](https://medium.com/@valyala/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b).
* A single-node VictoriaMetrics may substitute moderately sized clusters built with competing solutions such as Thanos, Uber M3, Cortex, InfluxDB or TimescaleDB.
See [vertical scalability benchmarks](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae)
and [comparing Thanos to VictoriaMetrics cluster](https://medium.com/@valyala/comparing-thanos-to-victoriametrics-cluster-b193bea1683).
@@ -52,20 +53,24 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
### Table of contents
- [How to build from sources](#how-to-build-from-sources)
- [Development build](#development-build)
- [Production build](#production-build)
- [Building docker images](#building-docker-images)
- [How to start VictoriaMetrics](#how-to-start-victoriametrics)
- [Setting up service](#setting-up-service)
- [Third-party contributions](#third-party-contributions)
- [Prometheus setup](#prometheus-setup)
- [Grafana setup](#grafana-setup)
- [How to upgrade VictoriaMetrics?](#how-to-upgrade-victoriametrics)
- [How to apply new config to VictoriaMetrics?](#how-to-apply-new-config-to-victoriametrics)
- [How to send data from InfluxDB-compatible agents such as Telegraf?](#how-to-send-data-from-influxdb-compatible-agents-such-as-telegraf)
- [How to send data from Graphite-compatible agents such as StatsD?](#how-to-send-data-from-graphite-compatible-agents-such-as-statsd)
- [Querying Graphite data](#querying-graphite-data)
- [How to send data from OpenTSDB-compatible agents?](#how-to-send-data-from-opentsdb-compatible-agents)
- [How to build from sources](#how-to-build-from-sources)
- [Development build](#development-build)
- [Production build](#production-build)
- [ARM build](#arm-build)
- [Pure Go build (CGO_ENABLED=0)](#pure-go-build-cgo_enabled0)
- [Building docker images](#building-docker-images)
- [Start with docker-compose](#start-with-docker-compose)
- [Setting up service](#setting-up-service)
- [Third-party contributions](#third-party-contributions)
- [How to work with snapshots?](#how-to-work-with-snapshots)
- [How to delete time series?](#how-to-delete-time-series)
- [How to export time series?](#how-to-export-time-series)
@@ -81,6 +86,7 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
- [Tuning](#tuning)
- [Monitoring](#monitoring)
- [Troubleshooting](#troubleshooting)
- [Roadmap](#roadmap)
- [Contacts](#contacts)
- [Community and contributions](#community-and-contributions)
- [Reporting bugs](#reporting-bugs)
@@ -91,56 +97,22 @@ Cluster version is available [here](https://github.com/VictoriaMetrics/VictoriaM
- [We kindly ask:](#we-kindly-ask)
### How to build from sources
We recommend using either [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) or
[docker images](https://hub.docker.com/r/victoriametrics/victoria-metrics/) instead of building VictoriaMetrics
from sources. Building from sources is reasonable when developing an additional features specific
to your needs.
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make victoria-metrics` from the root folder of the repository.
It will build `victoria-metrics` binary and put it into the `bin` folder.
#### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make victoria-metrics-prod` from the root folder of the repository.
It will build `victoria-metrics-prod` binary and put it into the `bin` folder.
#### Building docker images
Run `make package-victoria-metrics`. It will build `victoriametrics/victoria-metrics:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-victoria-metrics`.
### How to start VictoriaMetrics
Just start VictoriaMetrics executable or docker image with the desired command-line flags.
Just start VictoriaMetrics [executable](https://github.com/VictoriaMetrics/VictoriaMetrics/releases)
or [docker image](https://hub.docker.com/r/victoriametrics/victoria-metrics/) with the desired command-line flags.
The following command line flags are used the most:
The following command-line flags are used the most:
* `-storageDataPath` - path to data directory. VictoriaMetrics stores all the data in this directory.
* `-retentionPeriod` - retention period in months for the data. Older data is automatically deleted.
* `-httpListenAddr` - TCP address to listen to for http requests. By default it listens port `8428` on all the network interfaces.
* `-graphiteListenAddr` - TCP and UDP address to listen to for Graphite data. By default it is disabled.
* `-opentsdbListenAddr` - TCP and UDP address to listen to for OpenTSDB data. By default it is disabled.
* `-httpListenAddr` - TCP address to listen to for http requests. By default, it listens port `8428` on all the network interfaces.
* `-graphiteListenAddr` - TCP and UDP address to listen to for Graphite data. By default, it is disabled.
* `-opentsdbListenAddr` - TCP and UDP address to listen to for OpenTSDB data. By default, it is disabled.
Pass `-help` to see all the available flags with description and default values.
### Setting up service
Read [these instructions](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/43) on how to set up VictoriaMetrics as a service in your OS.
### Third-party contributions
* [Unofficial yum repository](https://copr.fedorainfracloud.org/coprs/antonpatsev/VictoriaMetrics/) ([source code](https://github.com/patsevanton/victoriametrics-rpm))
It is recommended setting up [monitoring](#monitoring) for VictoriaMetrics.
### Prometheus setup
@@ -166,7 +138,7 @@ Prometheus writes incoming data to local storage and replicates it to remote sto
This means the data remains available in local storage for `--storage.tsdb.retention.time` duration
even if remote storage is unavailable.
If you plan sending data to VictoriaMetrics from multiple Prometheus instances, then add the following lines into `global` section
If you plan to send data to VictoriaMetrics from multiple Prometheus instances, then add the following lines into `global` section
of [Prometheus config](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#configuration-file):
```yml
@@ -177,7 +149,7 @@ global:
This instructs Prometheus to add `datacenter=dc-123` label to each time series sent to remote storage.
The label name may be arbitrary - `datacenter` is just an example. The label value must be unique
across Prometheus instances, so time series may be filtered and grouped by this label.
across Prometheus instances, so those time series may be filtered and grouped by this label.
It is recommended upgrading Prometheus to [v2.10.0](https://github.com/prometheus/prometheus/releases) or newer,
@@ -217,7 +189,7 @@ VictoriaMetrics must be restarted for applying new config:
1) Send `SIGINT` signal to VictoriaMetrics process in order to gracefully stop it.
2) Wait until the process stops. This can take a few seconds.
3) Start VictoriaMetrics with new config.
3) Start VictoriaMetrics with the new config.
### How to send data from InfluxDB-compatible agents such as [Telegraf](https://www.influxdata.com/time-series-platform/telegraf/)?
@@ -260,7 +232,7 @@ to local VictoriaMetrics using `curl`:
curl -d 'measurement,tag1=value1,tag2=value2 field1=123,field2=1.23' -X POST 'http://localhost:8428/write'
```
Arbitrary number of lines delimited by '\n' may be sent in a single request.
An arbitrary number of lines delimited by '\n' may be sent in a single request.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
@@ -294,8 +266,8 @@ Example for writing data with Graphite plaintext protocol to local VictoriaMetri
echo "foo.bar.baz;tag1=value1;tag2=value2 123 `date +%s`" | nc -N localhost 2003
```
VictoriaMetrics sets the current time if timestamp is omitted.
Arbitrary number of lines delimited by `\n` may be sent in one go.
VictoriaMetrics sets the current time if the timestamp is omitted.
An arbitrary number of lines delimited by `\n` may be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
@@ -309,6 +281,14 @@ The `/api/v1/export` endpoint should return the following response:
```
### Querying Graphite data
Data sent to VictoriaMetrics via `Graphite plaintext protocol` may be read either via
[Prometheus querying API](https://prometheus.io/docs/prometheus/latest/querying/api/)
or via [go-graphite/carbonapi](https://github.com/go-graphite/carbonapi/blob/master/cmd/carbonapi/carbonapi.example.prometheus.yaml).
### How to send data from OpenTSDB-compatible agents?
1) Enable OpenTSDB receiver in VictoriaMetrics by setting `-opentsdbListenAddr` command line flag. For instance,
@@ -327,7 +307,7 @@ Example for writing data with OpenTSDB protocol to local VictoriaMetrics using `
echo "put foo.bar.baz `date +%s` 123 tag1=value1 tag2=value2" | nc -N localhost 4242
```
Arbitrary number of lines delimited by `\n` may be sent in one go.
An arbitrary number of lines delimited by `\n` may be sent in one go.
After that the data may be read via [/api/v1/export](#how-to-export-time-series) endpoint:
```
@@ -341,9 +321,79 @@ The `/api/v1/export` endpoint should return the following response:
```
### How to build from sources
We recommend using either [binary releases](https://github.com/VictoriaMetrics/VictoriaMetrics/releases) or
[docker images](https://hub.docker.com/r/victoriametrics/victoria-metrics/) instead of building VictoriaMetrics
from sources. Building from sources is reasonable when developing additional features specific
to your needs.
#### Development build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make victoria-metrics` from the root folder of the repository.
It builds `victoria-metrics` binary and puts it into the `bin` folder.
#### Production build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make victoria-metrics-prod` from the root folder of the repository.
It builds `victoria-metrics-prod` binary and puts it into the `bin` folder.
#### ARM build
ARM build may run on Raspberry Pi or on [energy-efficient ARM servers](https://blog.cloudflare.com/arm-takes-wing/).
#### Development ARM build
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make victoria-metrics-arm` or `make victoria-metrics-arm64` from the root folder of the repository.
It builds `victoria-metrics-arm` or `victoria-metrics-arm64` binary respectively and puts it into the `bin` folder.
#### Production ARM build
1. [Install docker](https://docs.docker.com/install/).
2. Run `make victoria-metrics-arm-prod` or `make victoria-metrics-arm64-prod` from the root folder of the repository.
It builds `victoria-metrics-arm-prod` or `victoria-metrics-arm64-prod` binary respectively and puts it into the `bin` folder.
#### Pure Go build (CGO_ENABLED=0)
`Pure Go` mode builds only Go code without [cgo](https://golang.org/cmd/cgo/) dependencies.
This is an experimental mode, which may result in a lower compression ratio and slower decompression performance.
Use it with caution!
1. [Install Go](https://golang.org/doc/install). The minimum supported version is Go 1.12.
2. Run `make victoria-metrics-pure` from the root folder of the repository.
It builds `victoria-metrics-pure` binary and puts it into the `bin` folder.
#### Building docker images
Run `make package-victoria-metrics`. It builds `victoriametrics/victoria-metrics:<PKG_TAG>` docker image locally.
`<PKG_TAG>` is auto-generated image tag, which depends on source code in the repository.
The `<PKG_TAG>` may be manually set via `PKG_TAG=foobar make package-victoria-metrics`.
### Start with docker-compose
[Docker-compose](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/deployment/docker/docker-compose.yml)
helps to spin up VictoriaMetrics, Prometheus and Grafana with one command.
More details may be found [here](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/master/deployment/docker#folder-contains-basic-images-and-tools-for-building-and-running-victoria-metrics-in-docker).
### Setting up service
Read [these instructions](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/43) on how to set up VictoriaMetrics as a service in your OS.
### Third-party contributions
* [Unofficial yum repository](https://copr.fedorainfracloud.org/coprs/antonpatsev/VictoriaMetrics/) ([source code](https://github.com/patsevanton/victoriametrics-rpm))
### How to work with snapshots?
VictoriaMetrics is able to create [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
VictoriaMetrics can create [instant snapshots](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282)
for all the data stored under `-storageDataPath` directory.
Navigate to `http://<victoriametrics-addr>:8428/snapshot/create` in order to create an instant snapshot.
The page will return the following JSON response:
@@ -400,15 +450,15 @@ VictoriaMetrics exports [Prometheus-compatible federation data](https://promethe
at `http://<victoriametrics-addr>:8428/federate?match[]=<timeseries_selector_for_federation>`.
Optional `start` and `end` args may be added to the request in order to scrape the last point for each selected time series on the `[start ... end]` interval.
`start` and `end` may contain either unix timestamp in seconds or [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) values. By default the last point
on the interval `[now - max_lookback ... now]` is scraped for each time series. Default value for `max_lookback` is `5m` (5 minutes), but can be overridden.
`start` and `end` may contain either unix timestamp in seconds or [RFC3339](https://www.ietf.org/rfc/rfc3339.txt) values. By default, the last point
on the interval `[now - max_lookback ... now]` is scraped for each time series. The default value for `max_lookback` is `5m` (5 minutes), but can be overridden.
For instance, `/federate?match[]=up&max_lookback=1h` would return last points on the `[now - 1h ... now]` interval. This may be useful for time series federation
with scrape intervals exceeding `5m`.
### Capacity planning
Rough estimation of the required resources for ingestion path:
A rough estimation of the required resources for ingestion path:
* RAM size: less than 1KB per active time series. So, ~1GB of RAM is required for 1M active time series.
Time series is considered active if new data points have been added to it recently or if it has been recently queried.
@@ -430,15 +480,15 @@ Rough estimation of the required resources for ingestion path:
* Network usage: outbound traffic is negligible. Ingress traffic is ~100 bytes per ingested data point via
[Prometheus remote_write API](https://prometheus.io/docs/prometheus/latest/configuration/configuration/#remote_write).
The actual ingress bandwitdh usage depends on the average number of labels per ingested metric and the average size
of label values. Higher number of per-metric lables and longer label values mean higher ingress bandwidth.
The actual ingress bandwidth usage depends on the average number of labels per ingested metric and the average size
of label values. The higher number of per-metric labels and longer label values mean the higher ingress bandwidth.
The required resources for query path:
* RAM size: depends on the number of time series to scan in each query and the `step`
argument passed to [/api/v1/query_range](https://prometheus.io/docs/prometheus/latest/querying/api/#range-queries).
Higher number of scanned time series and lower `step` argument results in higher RAM usage.
The higher number of scanned time series and lower `step` argument results in the higher RAM usage.
* CPU cores: a CPU core per 30 millions of scanned data points per second.
@@ -494,7 +544,7 @@ There is no downsampling support at the moment, but:
- VictoriaMetrics has good compression for on-disk data. See [this article](https://medium.com/@valyala/victoriametrics-achieving-better-compression-for-time-series-data-than-gorilla-317bc1f95932)
for details.
These properties reduce the need in downsampling. We plan implementing downsampling in the future.
These properties reduce the need in downsampling. We plan to implement downsampling in the future.
See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/36) for details.
@@ -506,7 +556,7 @@ Single-node VictoriaMetrics doesn't support multi-tenancy. Use [cluster version]
### Scalability and cluster version
Though single-node VictoriaMetrics cannot scale to multiple nodes, it is optimized for resource usage - storage size / bandwidth / IOPS, RAM, CPU.
This means that a single-node VictoriaMetrics may scale vertically and substitute moderately sized cluster built with competing solutions
This means that a single-node VictoriaMetrics may scale vertically and substitute a moderately sized cluster built with competing solutions
such as Thanos, Uber M3, InfluxDB or TimescaleDB. See [vertical scalability benchmarks](https://medium.com/@valyala/measuring-vertical-scalability-for-time-series-databases-in-google-cloud-92550d78d8ae).
So try single-node VictoriaMetrics at first and then [switch to cluster version](https://github.com/VictoriaMetrics/VictoriaMetrics/tree/cluster) if you still need
@@ -522,7 +572,7 @@ on [Prometheus side](https://prometheus.io/docs/alerting/overview/) or on [Grafa
### Security
Do not forget protecting sensitive endpoints in VictoriaMetrics when exposing it to untrusted networks such as internet.
Do not forget protecting sensitive endpoints in VictoriaMetrics when exposing it to untrusted networks such as the internet.
Consider setting the following command-line flags:
* `-tls`, `-tlsCertFile` and `-tlsKeyFile` for switching from HTTP to HTTPS.
@@ -537,10 +587,10 @@ For example, substitute `-graphiteListenAddr=:2003` with `-graphiteListenAddr=<i
### Tuning
* There is no need in VictoriaMetrics tuning, since it uses reasonable defaults for command-line flags,
* There is no need in VictoriaMetrics tuning since it uses reasonable defaults for command-line flags,
which are automatically adjusted for the available CPU and RAM resources.
* There is no need in Operating System tuning, since VictoriaMetrics is optimized for default OS settings.
The only option is increasing the limit on [the number open files in the OS](https://medium.com/@muhammadtriwibowo/set-permanently-ulimit-n-open-files-in-ubuntu-4d61064429a),
* There is no need in Operating System tuning since VictoriaMetrics is optimized for default OS settings.
The only option is increasing the limit on [the number of open files in the OS](https://medium.com/@muhammadtriwibowo/set-permanently-ulimit-n-open-files-in-ubuntu-4d61064429a),
so Prometheus instances could establish more connections to VictoriaMetrics.
@@ -575,11 +625,26 @@ The most interesting metrics are:
Another option is to increase `-memory.allowedPercent` command-line flag value. Be careful with this
option, since too big value for `-memory.allowedPercent` may result in high I/O usage.
* VictoriaMetrics requires free disk space for [merging data files to bigger ones](https://medium.com/@valyala/how-victoriametrics-makes-instant-snapshots-for-multi-terabyte-time-series-data-e1f3fb0e0282).
It may slow down when there is no enough free space left. So make sure `-storageDataPath` directory
has at least 20% of free space comparing to disk size.
* If VictoriaMetrics doesn't work because of certain parts are corrupted due to disk errors,
then just remove directoreis with broken parts. This will recover VictoriaMetrics at the cost
of data loss stored in the broken parts. In the future `vmrecover` tool will be created
of data loss stored in the broken parts. In the future, `vmrecover` tool will be created
for automatic recovering from such errors.
## Roadmap
- [ ] Replication [#118](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/118)
- [ ] Support of Object Storages (GCS, S3, Azure Storage) [#38](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/38)
- [ ] Data downsampling [#36](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/36)
- [ ] Alert Manager Integration [#119](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/119)
- [ ] CLI tool for data migration, re-balancing and adding/removing nodes [#103](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/103)
The discussion happens [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/129). Feel free to comment any item or add own one.
## Contacts
@@ -596,7 +661,7 @@ Feel free asking any questions regarding VictoriaMetrics:
- [google groups](https://groups.google.com/forum/#!forum/victorametrics-users)
If you like VictoriaMetrics and want contributing, then we need the following:
If you like VictoriaMetrics and want to contribute, then we need the following:
- Filing issues and feature requests [here](https://github.com/VictoriaMetrics/VictoriaMetrics/issues).
- Spreading a word about VictoriaMetrics: conference talks, articles, comments, experience sharing with colleagues.

View File

@@ -21,7 +21,52 @@ run-victoria-metrics:
$(MAKE) run-via-docker
victoria-metrics-arm:
CC=arm-linux-gnueabi-gcc CGO_ENABLED=1 GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm ./app/victoria-metrics
CGO_ENABLED=0 GOOS=linux GOARCH=arm GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm ./app/victoria-metrics
victoria-metrics-arm-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-arm' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm' $(MAKE) app-via-docker
victoria-metrics-arm64:
CC=aarch64-linux-gnu-gcc CGO_ENABLED=1 GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm64 ./app/victoria-metrics
CGO_ENABLED=0 GOOS=linux GOARCH=arm64 GO111MODULE=on go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-arm64 ./app/victoria-metrics
victoria-metrics-arm64-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-arm64' DOCKER_OPTS='--env CGO_ENABLED=0 --env GOARCH=arm64' $(MAKE) app-via-docker
victoria-metrics-pure:
GO111MODULE=on CGO_ENABLED=0 go build -mod=vendor -ldflags "$(GO_BUILDINFO)" -o bin/victoria-metrics-pure ./app/victoria-metrics
victoria-metrics-pure-prod:
APP_NAME=victoria-metrics APP_SUFFIX='-pure' DOCKER_OPTS='--env CGO_ENABLED=0' $(MAKE) app-via-docker
### Packaging as DEB - amd64
victoria-metrics-package-deb: victoria-metrics-prod
./package/package_deb.sh amd64
### Packaging as DEB - arm64
victoria-metrics-package-deb-arm64: victoria-metrics-arm64-prod
./package/package_deb.sh arm64
### Packaging as DEB - all
victoria-metrics-package-deb-all: \
victoria-metrics-package-deb \
victoria-metrics-package-deb-arm64
### Packaging as RPM - amd64
victoria-metrics-package-rpm: victoria-metrics-prod
./package/package_rpm.sh amd64
### Packaging as RPM - arm64
victoria-metrics-package-rpm-arm64: victoria-metrics-arm64-prod
./package/package_rpm.sh arm64
### Packaging as RPM - all
victoria-metrics-package-rpm-all: \
victoria-metrics-package-rpm \
victoria-metrics-package-rpm-arm64
### Packaging as both DEB and RPM - all
victoria-metrics-package-deb-rpm-all: \
victoria-metrics-package-deb \
victoria-metrics-package-deb-arm64 \
victoria-metrics-package-rpm \
victoria-metrics-package-rpm-arm64

View File

@@ -32,6 +32,17 @@ func Init() {
func Do(f func() error) error {
// Limit the number of conurrent f calls in order to prevent from excess
// memory usage and CPU trashing.
select {
case ch <- struct{}{}:
err := f()
<-ch
return err
default:
}
// All the workers are busy.
// Sleep for up to waitDuration.
concurrencyLimitReached.Inc()
t := timerpool.Get(waitDuration)
select {
case ch <- struct{}{}:
@@ -41,9 +52,19 @@ func Do(f func() error) error {
return err
case <-t.C:
timerpool.Put(t)
concurrencyLimitErrors.Inc()
concurrencyLimitTimeout.Inc()
return fmt.Errorf("the server is overloaded with %d concurrent inserts; either increase -maxConcurrentInserts or reduce the load", cap(ch))
}
}
var concurrencyLimitErrors = metrics.NewCounter(`vm_concurrency_limit_errors_total`)
var (
concurrencyLimitReached = metrics.NewCounter(`vm_concurrent_insert_limit_reached_total`)
concurrencyLimitTimeout = metrics.NewCounter(`vm_concurrent_insert_limit_timeout_total`)
_ = metrics.NewGauge(`vm_concurrent_insert_capacity`, func() float64 {
return float64(cap(ch))
})
_ = metrics.NewGauge(`vm_concurrent_insert_current`, func() float64 {
return float64(len(ch))
})
)

View File

@@ -114,6 +114,7 @@ func unmarshalRows(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag, error) {
var err error
tagsPool, err = r.unmarshal(s, tagsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal Graphite line %q: %s", s, err)
return dst, tagsPool, err
}
return dst, tagsPool, nil
@@ -121,6 +122,7 @@ func unmarshalRows(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag, error) {
var err error
tagsPool, err = r.unmarshal(s[:n], tagsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal Graphite line %q: %s", s[:n], err)
return dst, tagsPool, err
}
s = s[n+1:]

View File

@@ -14,7 +14,10 @@ import (
"github.com/VictoriaMetrics/metrics"
)
var rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="graphite"}`)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="graphite"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="graphite"}`)
)
// insertHandler processes remote write for graphite plaintext protocol.
//
@@ -51,6 +54,7 @@ func (ctx *pushCtx) InsertRows() error {
ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ic.FlushBufs()
}

View File

@@ -196,6 +196,7 @@ func unmarshalRows(dst []Row, s string, tagsPool []Tag, fieldsPool []Field) ([]R
var err error
tagsPool, fieldsPool, err = r.unmarshal(s, tagsPool, fieldsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal Influx line %q: %s", s, err)
return dst, tagsPool, fieldsPool, err
}
return dst, tagsPool, fieldsPool, nil
@@ -203,6 +204,7 @@ func unmarshalRows(dst []Row, s string, tagsPool []Tag, fieldsPool []Field) ([]R
var err error
tagsPool, fieldsPool, err = r.unmarshal(s[:n], tagsPool, fieldsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal Influx line %q: %s", s[:n], err)
return dst, tagsPool, fieldsPool, err
}
s = s[n+1:]

View File

@@ -22,7 +22,10 @@ var (
skipSingleField = flag.Bool("influxSkipSingleField", false, "Uses `{measurement}` instead of `{measurement}{separator}{field_name}` for metic name if Influx line contains only a single field")
)
var rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="influx"}`)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="influx"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="influx"}`)
)
// InsertHandler processes remote write for influx line protocol.
//
@@ -84,6 +87,7 @@ func (ctx *pushCtx) InsertRows(db string) error {
}
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
for i := range rows {
r := &rows[i]
ic.Labels = ic.Labels[:0]
@@ -109,8 +113,10 @@ func (ctx *pushCtx) InsertRows(db string) error {
ic.AddLabel("", metricGroup)
ic.WriteDataPoint(ctx.metricNameBuf, ic.Labels[:1], r.Timestamp, f.Value)
}
rowsInserted.Add(len(r.Fields))
rowsTotal += len(r.Fields)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ic.FlushBufs()
}
@@ -164,6 +170,7 @@ func (ctx *pushCtx) Read(r io.Reader, tsMultiplier int64) bool {
}
} else if tsMultiplier < 0 {
tsMultiplier = -tsMultiplier
currentTs -= currentTs % tsMultiplier
for i := range ctx.Rows.Rows {
row := &ctx.Rows.Rows[i]
if row.Timestamp == 0 {

View File

@@ -111,6 +111,7 @@ func unmarshalRows(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag, error) {
var err error
tagsPool, err = r.unmarshal(s, tagsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal OpenTSDB line %q: %s", s, err)
return dst, tagsPool, err
}
return dst, tagsPool, nil
@@ -118,6 +119,7 @@ func unmarshalRows(dst []Row, s string, tagsPool []Tag) ([]Row, []Tag, error) {
var err error
tagsPool, err = r.unmarshal(s[:n], tagsPool)
if err != nil {
err = fmt.Errorf("cannot unmarshal OpenTSDB line %q: %s", s[:n], err)
return dst, tagsPool, err
}
s = s[n+1:]

View File

@@ -14,7 +14,10 @@ import (
"github.com/VictoriaMetrics/metrics"
)
var rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentsdb"}`)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="opentsdb"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="opentsdb"}`)
)
// insertHandler processes remote write for OpenTSDB put protocol.
//
@@ -51,6 +54,7 @@ func (ctx *pushCtx) InsertRows() error {
ic.WriteDataPoint(nil, ic.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(rows))
rowsPerInsert.Update(float64(len(rows)))
return ic.FlushBufs()
}

View File

@@ -12,7 +12,10 @@ import (
"github.com/VictoriaMetrics/metrics"
)
var rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="prometheus"}`)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="prometheus"}`)
rowsPerInsert = metrics.NewSummary(`vm_rows_per_insert{type="prometheus"}`)
)
// InsertHandler processes remote write for prometheus.
func InsertHandler(r *http.Request, maxSize int64) error {
@@ -34,6 +37,7 @@ func insertHandlerInternal(r *http.Request, maxSize int64) error {
}
ic := &ctx.Common
ic.Reset(rowsLen)
rowsTotal := 0
for i := range timeseries {
ts := &timeseries[i]
var metricNameRaw []byte
@@ -41,8 +45,10 @@ func insertHandlerInternal(r *http.Request, maxSize int64) error {
r := &ts.Samples[i]
metricNameRaw = ic.WriteDataPointExt(metricNameRaw, ts.Labels, r.Timestamp, r.Value)
}
rowsInserted.Add(len(ts.Samples))
rowsTotal += len(ts.Samples)
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ic.FlushBufs()
}

View File

@@ -30,29 +30,49 @@ func Init() {
fs.RemoveDirContents(tmpDirPath)
netstorage.InitTmpBlocksDir(tmpDirPath)
promql.InitRollupResultCache(*vmstorage.DataPath + "/cache/rollupResult")
concurrencyCh = make(chan struct{}, *maxConcurrentRequests)
}
var concurrencyCh chan struct{}
// Stop stops vmselect
func Stop() {
promql.StopRollupResultCache()
}
var concurrencyCh chan struct{}
var (
concurrencyLimitReached = metrics.NewCounter(`vm_concurrent_select_limit_reached_total`)
concurrencyLimitTimeout = metrics.NewCounter(`vm_concurrent_select_limit_timeout_total`)
_ = metrics.NewGauge(`vm_concurrent_select_capacity`, func() float64 {
return float64(cap(concurrencyCh))
})
_ = metrics.NewGauge(`vm_concurrent_select_current`, func() float64 {
return float64(len(concurrencyCh))
})
)
// RequestHandler handles remote read API requests for Prometheus
func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
// Limit the number of concurrent queries.
// Sleep for a while until giving up. This should resolve short bursts in requests.
t := timerpool.Get(*maxQueueDuration)
select {
case concurrencyCh <- struct{}{}:
timerpool.Put(t)
defer func() { <-concurrencyCh }()
case <-t.C:
timerpool.Put(t)
httpserver.Errorf(w, "cannot handle more than %d concurrent requests", cap(concurrencyCh))
return true
default:
// Sleep for a while until giving up. This should resolve short bursts in requests.
concurrencyLimitReached.Inc()
t := timerpool.Get(*maxQueueDuration)
select {
case concurrencyCh <- struct{}{}:
timerpool.Put(t)
defer func() { <-concurrencyCh }()
case <-t.C:
timerpool.Put(t)
concurrencyLimitTimeout.Inc()
httpserver.Errorf(w, "cannot handle more than %d concurrent requests", cap(concurrencyCh))
return true
}
}
path := strings.Replace(r.URL.Path, "//", "/", -1)

View File

@@ -49,8 +49,9 @@ func (r *Result) reset() {
// Results holds results returned from ProcessSearchQuery.
type Results struct {
tr storage.TimeRange
deadline Deadline
tr storage.TimeRange
fetchData bool
deadline Deadline
tbf *tmpBlocksFile
@@ -103,10 +104,10 @@ func (rss *Results) RunParallel(f func(rs *Result, workerID uint)) error {
err = fmt.Errorf("timeout exceeded during query execution: %s", rss.deadline.Timeout)
break
}
if err = pts.Unpack(rss.tbf, rs, rss.tr, maxWorkersCount); err != nil {
if err = pts.Unpack(rss.tbf, rs, rss.tr, rss.fetchData, maxWorkersCount); err != nil {
break
}
if len(rs.Timestamps) == 0 {
if len(rs.Timestamps) == 0 && rss.fetchData {
// Skip empty blocks.
continue
}
@@ -149,7 +150,7 @@ type packedTimeseries struct {
}
// Unpack unpacks pts to dst.
func (pts *packedTimeseries) Unpack(tbf *tmpBlocksFile, dst *Result, tr storage.TimeRange, maxWorkersCount int) error {
func (pts *packedTimeseries) Unpack(tbf *tmpBlocksFile, dst *Result, tr storage.TimeRange, fetchData bool, maxWorkersCount int) error {
dst.reset()
if err := dst.MetricName.Unmarshal(bytesutil.ToUnsafeBytes(pts.metricName)); err != nil {
@@ -176,7 +177,7 @@ func (pts *packedTimeseries) Unpack(tbf *tmpBlocksFile, dst *Result, tr storage.
var err error
for addr := range workCh {
sb := getSortBlock()
if err = sb.unpackFrom(tbf, addr, tr); err != nil {
if err = sb.unpackFrom(tbf, addr, tr, fetchData); err != nil {
break
}
@@ -295,10 +296,12 @@ func (sb *sortBlock) reset() {
sb.NextIdx = 0
}
func (sb *sortBlock) unpackFrom(tbf *tmpBlocksFile, addr tmpBlockAddr, tr storage.TimeRange) error {
func (sb *sortBlock) unpackFrom(tbf *tmpBlocksFile, addr tmpBlockAddr, tr storage.TimeRange, fetchData bool) error {
tbf.MustReadBlockAt(&sb.b, addr)
if err := sb.b.UnmarshalData(); err != nil {
return fmt.Errorf("cannot unmarshal block: %s", err)
if fetchData {
if err := sb.b.UnmarshalData(); err != nil {
return fmt.Errorf("cannot unmarshal block: %s", err)
}
}
timestamps := sb.b.Timestamps()
@@ -460,7 +463,7 @@ var ssPool sync.Pool
var missingMetricNamesForMetricID = metrics.NewCounter(`vm_missing_metric_names_for_metric_id_total`)
// ProcessSearchQuery performs sq on storage nodes until the given deadline.
func ProcessSearchQuery(sq *storage.SearchQuery, deadline Deadline) (*Results, error) {
func ProcessSearchQuery(sq *storage.SearchQuery, fetchData bool, deadline Deadline) (*Results, error) {
// Setup search.
tfss, err := setupTfss(sq.TagFilterss)
if err != nil {
@@ -476,35 +479,38 @@ func ProcessSearchQuery(sq *storage.SearchQuery, deadline Deadline) (*Results, e
sr := getStorageSearch()
defer putStorageSearch(sr)
sr.Init(vmstorage.Storage, tfss, tr, *maxMetricsPerSearch)
sr.Init(vmstorage.Storage, tfss, tr, fetchData, *maxMetricsPerSearch)
tbf := getTmpBlocksFile()
m := make(map[string][]tmpBlockAddr)
blocksRead := 0
for sr.NextMetricBlock() {
blocksRead++
addr, err := tbf.WriteBlock(sr.MetricBlock.Block)
if err != nil {
putTmpBlocksFile(tbf)
return nil, fmt.Errorf("cannot write data to temporary blocks file: %s", err)
return nil, fmt.Errorf("cannot write data block #%d to temporary blocks file: %s", blocksRead, err)
}
if time.Until(deadline.Deadline) < 0 {
putTmpBlocksFile(tbf)
return nil, fmt.Errorf("timeout exceeded while fetching data from storage: %s", deadline.Timeout)
return nil, fmt.Errorf("timeout exceeded while fetching data block #%d from storage: %s", blocksRead, deadline.Timeout)
}
metricName := sr.MetricBlock.MetricName
m[string(metricName)] = append(m[string(metricName)], addr)
}
if err := sr.Error(); err != nil {
putTmpBlocksFile(tbf)
return nil, fmt.Errorf("search error: %s", err)
return nil, fmt.Errorf("search error after reading %d data blocks: %s", blocksRead, err)
}
if err := tbf.Finalize(); err != nil {
putTmpBlocksFile(tbf)
return nil, fmt.Errorf("cannot finalize temporary blocks file: %s", err)
return nil, fmt.Errorf("cannot finalize temporary blocks file with %d blocks: %s", blocksRead, err)
}
var rss Results
rss.packedTimeseries = make([]packedTimeseries, len(m))
rss.tr = tr
rss.fetchData = fetchData
rss.deadline = deadline
rss.tbf = tbf
i := 0

View File

@@ -6,12 +6,15 @@ import (
"math"
"net/http"
"runtime"
"sort"
"strconv"
"strings"
"sync"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/netstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/quicktemplate"
@@ -65,7 +68,7 @@ func FederateHandler(w http.ResponseWriter, r *http.Request) error {
MaxTimestamp: end,
TagFilterss: tagFilterss,
}
rss, err := netstorage.ProcessSearchQuery(sq, deadline)
rss, err := netstorage.ProcessSearchQuery(sq, true, deadline)
if err != nil {
return fmt.Errorf("cannot fetch data for %q: %s", sq, err)
}
@@ -157,7 +160,7 @@ func exportHandler(w http.ResponseWriter, matches []string, start, end int64, fo
MaxTimestamp: end,
TagFilterss: tagFilterss,
}
rss, err := netstorage.ProcessSearchQuery(sq, deadline)
rss, err := netstorage.ProcessSearchQuery(sq, true, deadline)
if err != nil {
return fmt.Errorf("cannot fetch data for %q: %s", sq, err)
}
@@ -230,9 +233,39 @@ var deleteDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/
func LabelValuesHandler(labelName string, w http.ResponseWriter, r *http.Request) error {
startTime := time.Now()
deadline := getDeadline(r)
labelValues, err := netstorage.GetLabelValues(labelName, deadline)
if err != nil {
return fmt.Errorf(`cannot obtain label values for %q: %s`, labelName, err)
if err := r.ParseForm(); err != nil {
return fmt.Errorf("cannot parse form values: %s", err)
}
var labelValues []string
if len(r.Form["match[]"]) == 0 && len(r.Form["start"]) == 0 && len(r.Form["end"]) == 0 {
var err error
labelValues, err = netstorage.GetLabelValues(labelName, deadline)
if err != nil {
return fmt.Errorf(`cannot obtain label values for %q: %s`, labelName, err)
}
} else {
// Extended functionality that allows filtering by label filters and time range
// i.e. /api/v1/label/foo/values?match[]=foobar{baz="abc"}&start=...&end=...
// is equivalent to `label_values(foobar{baz="abc"}, foo)` call on the selected
// time range in Grafana templating.
matches := r.Form["match[]"]
if len(matches) == 0 {
matches = []string{fmt.Sprintf("{%s!=''}", labelName)}
}
ct := currentTime()
end, err := getTime(r, "end", ct)
if err != nil {
return err
}
start, err := getTime(r, "start", end-defaultStep)
if err != nil {
return err
}
labelValues, err = labelValuesWithMatches(labelName, matches, start, end, deadline)
if err != nil {
return fmt.Errorf("cannot obtain label values for %q, match[]=%q, start=%d, end=%d: %s", labelName, matches, start, end, err)
}
}
w.Header().Set("Content-Type", "application/json")
@@ -241,6 +274,50 @@ func LabelValuesHandler(labelName string, w http.ResponseWriter, r *http.Request
return nil
}
func labelValuesWithMatches(labelName string, matches []string, start, end int64, deadline netstorage.Deadline) ([]string, error) {
if len(matches) == 0 {
logger.Panicf("BUG: matches must be non-empty")
}
tagFilterss, err := getTagFilterssFromMatches(matches)
if err != nil {
return nil, err
}
if start >= end {
start = end - defaultStep
}
sq := &storage.SearchQuery{
MinTimestamp: start,
MaxTimestamp: end,
TagFilterss: tagFilterss,
}
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
if err != nil {
return nil, fmt.Errorf("cannot fetch data for %q: %s", sq, err)
}
m := make(map[string]struct{})
var mLock sync.Mutex
err = rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
labelValue := rs.MetricName.GetTagValue(labelName)
if len(labelValue) == 0 {
return
}
mLock.Lock()
m[string(labelValue)] = struct{}{}
mLock.Unlock()
})
if err != nil {
return nil, fmt.Errorf("error when data fetching: %s", err)
}
labelValues := make([]string, 0, len(m))
for labelValue := range m {
labelValues = append(labelValues, labelValue)
}
sort.Strings(labelValues)
return labelValues, nil
}
var labelValuesDuration = metrics.NewSummary(`vm_request_duration_seconds{path="/api/v1/label/{}/values"}`)
// LabelsCountHandler processes /api/v1/labels/count request.
@@ -309,13 +386,16 @@ func SeriesHandler(w http.ResponseWriter, r *http.Request) error {
if len(matches) == 0 {
return fmt.Errorf("missing `match[]` arg")
}
// Set start to minTimeMsecs by default as Prometheus does.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/91
start, err := getTime(r, "start", minTimeMsecs)
end, err := getTime(r, "end", ct)
if err != nil {
return err
}
end, err := getTime(r, "end", ct)
// Do not set start to minTimeMsecs by default as Prometheus does,
// since this leads to fetching and scanning all the data from the storage,
// which can take a lot of time for big storages.
// It is better setting start as end-defaultStep by default.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/91
start, err := getTime(r, "start", end-defaultStep)
if err != nil {
return err
}
@@ -333,7 +413,7 @@ func SeriesHandler(w http.ResponseWriter, r *http.Request) error {
MaxTimestamp: end,
TagFilterss: tagFilterss,
}
rss, err := netstorage.ProcessSearchQuery(sq, deadline)
rss, err := netstorage.ProcessSearchQuery(sq, false, deadline)
if err != nil {
return fmt.Errorf("cannot fetch data for %q: %s", sq, err)
}

View File

@@ -312,7 +312,11 @@ func aggrFuncCount(tss []*timeseries) []*timeseries {
}
count++
}
dst.Values[i] = float64(count)
v := float64(count)
if count == 0 {
v = nan
}
dst.Values[i] = v
}
return tss[:1]
}

View File

@@ -348,7 +348,12 @@ func mergeAggrCount(dst, src *incrementalAggrContext) {
}
func finalizeAggrCount(iac *incrementalAggrContext) {
// Nothing to do
dstValues := iac.ts.Values
for i, v := range dstValues {
if v == 0 {
dstValues[i] = nan
}
}
}
func updateAggrSum2(iac *incrementalAggrContext, values []float64) {

View File

@@ -81,7 +81,7 @@ func TestIncrementalAggr(t *testing.T) {
})
t.Run("count", func(t *testing.T) {
t.Parallel()
valuesExpected := []float64{6, 0, 5, 5}
valuesExpected := []float64{6, nan, 5, 5}
f("count", valuesExpected)
})
t.Run("sum2", func(t *testing.T) {

View File

@@ -466,31 +466,19 @@ func evalRollupFuncWithSubquery(ec *EvalConfig, name string, rf rollupFunc, re *
preFunc, rcs := getRollupConfigs(name, rf, ec.Start, ec.End, ec.Step, window, sharedTimestamps)
tss := make([]*timeseries, 0, len(tssSQ)*len(rcs))
var tssLock sync.Mutex
removeMetricGroup := !rollupFuncsKeepMetricGroup[name]
doParallel(tssSQ, func(tsSQ *timeseries, values []float64, timestamps []int64) ([]float64, []int64) {
values, timestamps = removeNanValues(values[:0], timestamps[:0], tsSQ.Values, tsSQ.Timestamps)
preFunc(values, timestamps)
for _, rc := range rcs {
var ts timeseries
ts.MetricName.CopyFrom(&tsSQ.MetricName)
if len(rc.TagValue) > 0 {
ts.MetricName.AddTag("rollup", rc.TagValue)
}
ts.Values = rc.Do(ts.Values[:0], values, timestamps)
ts.Timestamps = sharedTimestamps
ts.denyReuse = true
doRollupForTimeseries(rc, &ts, &tsSQ.MetricName, values, timestamps, sharedTimestamps, removeMetricGroup)
tssLock.Lock()
tss = append(tss, &ts)
tssLock.Unlock()
}
return values, timestamps
})
if !rollupFuncsKeepMetricGroup[name] {
tss = copyTimeseriesMetricNames(tss)
for _, ts := range tss {
ts.MetricName.ResetMetricGroup()
}
}
return tss, nil
}
@@ -582,7 +570,7 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
MaxTimestamp: ec.End + ec.Step,
TagFilterss: [][]storage.TagFilter{me.TagFilters},
}
rss, err := netstorage.ProcessSearchQuery(sq, ec.Deadline)
rss, err := netstorage.ProcessSearchQuery(sq, true, ec.Deadline)
if err != nil {
return nil, err
}
@@ -614,22 +602,16 @@ func evalRollupFuncWithMetricExpr(ec *EvalConfig, name string, rf rollupFunc, me
defer rml.Put(uint64(rollupMemorySize))
// Evaluate rollup
removeMetricGroup := !rollupFuncsKeepMetricGroup[name]
var tss []*timeseries
if iafc != nil {
tss, err = evalRollupWithIncrementalAggregate(iafc, rss, rcs, preFunc, sharedTimestamps)
tss, err = evalRollupWithIncrementalAggregate(iafc, rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
} else {
tss, err = evalRollupNoIncrementalAggregate(rss, rcs, preFunc, sharedTimestamps)
tss, err = evalRollupNoIncrementalAggregate(rss, rcs, preFunc, sharedTimestamps, removeMetricGroup)
}
if err != nil {
return nil, err
}
if !rollupFuncsKeepMetricGroup[name] {
tss = copyTimeseriesMetricNames(tss)
for _, ts := range tss {
ts.MetricName.ResetMetricGroup()
}
}
tss = mergeTimeseries(tssCached, tss, start, ec)
rollupResultCacheV.Put(name, ec, me, iafc, window, tss)
@@ -649,21 +631,19 @@ func getRollupMemoryLimiter() *memoryLimiter {
}
func evalRollupWithIncrementalAggregate(iafc *incrementalAggrFuncContext, rss *netstorage.Results, rcs []*rollupConfig,
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64) ([]*timeseries, error) {
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64, removeMetricGroup bool) ([]*timeseries, error) {
err := rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
preFunc(rs.Values, rs.Timestamps)
ts := getTimeseries()
defer putTimeseries(ts)
for _, rc := range rcs {
ts.Reset()
ts.MetricName.CopyFrom(&rs.MetricName)
if len(rc.TagValue) > 0 {
ts.MetricName.AddTag("rollup", rc.TagValue)
}
ts.Values = rc.Do(ts.Values[:0], rs.Values, rs.Timestamps)
ts.Timestamps = sharedTimestamps
doRollupForTimeseries(rc, ts, &rs.MetricName, rs.Values, rs.Timestamps, sharedTimestamps, removeMetricGroup)
iafc.updateTimeseries(ts, workerID)
// ts.Timestamps points to sharedTimestamps. Zero it, so it can be re-used.
ts.Timestamps = nil
ts.denyReuse = false
}
})
if err != nil {
@@ -674,21 +654,14 @@ func evalRollupWithIncrementalAggregate(iafc *incrementalAggrFuncContext, rss *n
}
func evalRollupNoIncrementalAggregate(rss *netstorage.Results, rcs []*rollupConfig,
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64) ([]*timeseries, error) {
preFunc func(values []float64, timestamps []int64), sharedTimestamps []int64, removeMetricGroup bool) ([]*timeseries, error) {
tss := make([]*timeseries, 0, rss.Len()*len(rcs))
var tssLock sync.Mutex
err := rss.RunParallel(func(rs *netstorage.Result, workerID uint) {
preFunc(rs.Values, rs.Timestamps)
for _, rc := range rcs {
var ts timeseries
ts.MetricName.CopyFrom(&rs.MetricName)
if len(rc.TagValue) > 0 {
ts.MetricName.AddTag("rollup", rc.TagValue)
}
ts.Values = rc.Do(ts.Values[:0], rs.Values, rs.Timestamps)
ts.Timestamps = sharedTimestamps
ts.denyReuse = true
doRollupForTimeseries(rc, &ts, &rs.MetricName, rs.Values, rs.Timestamps, sharedTimestamps, removeMetricGroup)
tssLock.Lock()
tss = append(tss, &ts)
tssLock.Unlock()
@@ -700,6 +673,20 @@ func evalRollupNoIncrementalAggregate(rss *netstorage.Results, rcs []*rollupConf
return tss, nil
}
func doRollupForTimeseries(rc *rollupConfig, tsDst *timeseries, mnSrc *storage.MetricName, valuesSrc []float64, timestampsSrc []int64,
sharedTimestamps []int64, removeMetricGroup bool) {
tsDst.MetricName.CopyFrom(mnSrc)
if len(rc.TagValue) > 0 {
tsDst.MetricName.AddTag("rollup", rc.TagValue)
}
if removeMetricGroup {
tsDst.MetricName.ResetMetricGroup()
}
tsDst.Values = rc.Do(tsDst.Values[:0], valuesSrc, timestampsSrc)
tsDst.Timestamps = sharedTimestamps
tsDst.denyReuse = true
}
func getRollupConfigs(name string, rf rollupFunc, start, end, step, window int64, sharedTimestamps []int64) (func(values []float64, timestamps []int64), []*rollupConfig) {
preFunc := func(values []float64, timestamps []int64) {}
if rollupFuncsRemoveCounterResets[name] {

View File

@@ -16,6 +16,8 @@ import (
var logSlowQueryDuration = flag.Duration("search.logSlowQueryDuration", 5*time.Second, "Log queries with execution time exceeding this value. Zero disables slow query logging")
var slowQueries = metrics.NewCounter(`vm_slow_queries_total`)
// ExpandWithExprs expands WITH expressions inside q and returns the resulting
// PromQL without WITH expressions.
func ExpandWithExprs(q string) (string, error) {
@@ -36,6 +38,7 @@ func Exec(ec *EvalConfig, q string, isFirstPointOnly bool) ([]netstorage.Result,
if d >= *logSlowQueryDuration {
logger.Infof("slow query according to -search.logSlowQueryDuration=%s: duration=%s, start=%d, end=%d, step=%d, query=%q",
*logSlowQueryDuration, d, ec.Start/1000, ec.End/1000, ec.Step/1000, q)
slowQueries.Inc()
}
}()
}

View File

@@ -2158,6 +2158,26 @@ func TestExecSuccess(t *testing.T) {
resultExpected := []netstorage.Result{r1, r2}
f(q, resultExpected)
})
t.Run(`histogram_quantile(negative-bucket-count)`, func(t *testing.T) {
t.Parallel()
q := `sort(histogram_quantile(0.6,
label_set(90, "foo", "bar", "le", "10")
or label_set(-100, "foo", "bar", "le", "30")
or label_set(300, "foo", "bar", "le", "+Inf")
))`
resultExpected := []netstorage.Result{}
f(q, resultExpected)
})
t.Run(`histogram_quantile(nan-bucket-count)`, func(t *testing.T) {
t.Parallel()
q := `sort(histogram_quantile(0.6,
label_set(90, "foo", "bar", "le", "10")
or label_set(NaN, "foo", "bar", "le", "30")
or label_set(300, "foo", "bar", "le", "+Inf")
))`
resultExpected := []netstorage.Result{}
f(q, resultExpected)
})
t.Run(`median_over_time()`, func(t *testing.T) {
t.Parallel()
q := `median_over_time({})`
@@ -2343,10 +2363,10 @@ func TestExecSuccess(t *testing.T) {
})
t.Run(`count(multi-vector)`, func(t *testing.T) {
t.Parallel()
q := `count(label_set(10, "foo", "bar") or label_set((15-time()/100)^0.5, "baz", "sss"))`
q := `count(label_set(time()<1500, "foo", "bar") or label_set(time()<1800, "baz", "sss"))`
r := netstorage.Result{
MetricName: metricNameExpected,
Values: []float64{2, 2, 2, 1, 1, 1},
Values: []float64{2, 2, 2, 1, nan, nan},
Timestamps: timestampsExpected,
}
resultExpected := []netstorage.Result{r}

View File

@@ -54,10 +54,13 @@ var rollupFuncs = map[string]newRollupFunc{
}
var rollupFuncsMayAdjustWindow = map[string]bool{
"deriv": true,
"deriv_fast": true,
"irate": true,
"rate": true,
"default_rollup": true,
"first_over_time": true,
"last_over_time": true,
"deriv": true,
"deriv_fast": true,
"irate": true,
"rate": true,
}
var rollupFuncsRemoveCounterResets = map[string]bool{
@@ -228,10 +231,27 @@ func getMaxPrevInterval(timestamps []int64) int64 {
}
d := (timestamps[len(timestamps)-1] - timestamps[0]) / int64(len(timestamps)-1)
if d <= 0 {
return 1
return int64(maxSilenceInterval)
}
// Slightly increase d in order to handle possible jitter in scrape interval.
return d + (d / 16)
// Increase d more for smaller scrape intervals in order to hide possible gaps
// when high jitter is present.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/139 .
if d <= 2*1000 {
return d + 4*d
}
if d <= 4*1000 {
return d + 2*d
}
if d <= 8*1000 {
return d + d
}
if d <= 16*1000 {
return d + d/2
}
if d <= 32*1000 {
return d + d/4
}
return d + d/8
}
func removeCounterResets(values []float64) {

View File

@@ -347,7 +347,7 @@ func TestRollupNoWindowNoPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{2, 0, 0, 0, 0, 0, 0, nan}
valuesExpected := []float64{2, 0, 0, 0, 0, 0, 0, 0}
timestampsExpected := []int64{120, 124, 128, 132, 136, 140, 144, 148}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
@@ -371,15 +371,15 @@ func TestRollupWindowNoPoints(t *testing.T) {
t.Run("afterEnd", func(t *testing.T) {
rc := rollupConfig{
Func: rollupFirst,
Start: 141,
End: 171,
Start: 161,
End: 191,
Step: 10,
Window: 3,
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{34, nan, nan, nan}
timestampsExpected := []int64{141, 151, 161, 171}
valuesExpected := []float64{34, 34, 34, nan}
timestampsExpected := []int64{161, 171, 181, 191}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})
}
@@ -454,7 +454,7 @@ func TestRollupWindowPartialPoints(t *testing.T) {
}
rc.Timestamps = getTimestamps(rc.Start, rc.End, rc.Step)
values := rc.Do(nil, testValues, testTimestamps)
valuesExpected := []float64{44, 34, 34, nan}
valuesExpected := []float64{44, 34, 34, 34}
timestampsExpected := []int64{100, 120, 140, 160}
testRowsEqual(t, values, rc.Timestamps, valuesExpected, timestampsExpected)
})

View File

@@ -324,6 +324,14 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
if math.IsNaN(phi) {
return nan
}
// Verify for broken buckets with NaN or negative values.
for _, xs := range xss {
v := xs.ts.Values[i]
if math.IsNaN(v) || v < 0 {
// Broken bucket.
return nan
}
}
if phi < 0 {
return -inf
}
@@ -334,10 +342,6 @@ func transformHistogramQuantile(tfa *transformFuncArg) ([]*timeseries, error) {
for _, xs := range xss {
v := xs.ts.Values[i]
le := xs.le
if v <= vPrev {
v = vPrev
le = lePrev
}
if v < vReq {
vPrev = v
lePrev = le

View File

@@ -358,6 +358,29 @@ func registerStorageMetrics() {
return float64(idbm().SizeBytes)
})
metrics.NewGauge(`vm_rows_ignored_total{reason="big_timestamp"}`, func() float64 {
return float64(m().TooBigTimestampRows)
})
metrics.NewGauge(`vm_rows_ignored_total{reason="small_timestamp"}`, func() float64 {
return float64(m().TooSmallTimestampRows)
})
metrics.NewGauge(`vm_concurrent_addrows_limit_reached_total`, func() float64 {
return float64(m().AddRowsConcurrencyLimitReached)
})
metrics.NewGauge(`vm_concurrent_addrows_limit_timeout_total`, func() float64 {
return float64(m().AddRowsConcurrencyLimitTimeout)
})
metrics.NewGauge(`vm_concurrent_addrows_dropped_rows_total`, func() float64 {
return float64(m().AddRowsConcurrencyDroppedRows)
})
metrics.NewGauge(`vm_concurrent_addrows_capacity`, func() float64 {
return float64(m().AddRowsConcurrencyCapacity)
})
metrics.NewGauge(`vm_concurrent_addrows_current`, func() float64 {
return float64(m().AddRowsConcurrencyCurrent)
})
metrics.NewGauge(`vm_rows{type="storage/big"}`, func() float64 {
return float64(tm().BigRowsCount)
})

View File

@@ -60,12 +60,12 @@
}
]
},
"description": "Overview for single node VictoriaMetrics",
"description": "Overview for single node VictoriaMetrics v1.22.2 or higher",
"editable": true,
"gnetId": 10229,
"graphTooltip": 0,
"id": null,
"iteration": 1562261153865,
"iteration": 1563651131627,
"links": [
{
"icon": "doc",
@@ -1693,7 +1693,7 @@
},
"id": 46,
"panels": [],
"title": "Go runtime",
"title": "Resource usage",
"type": "row"
},
{
@@ -1805,7 +1805,6 @@
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"fill": 1,
"gridPos": {
"h": 8,
@@ -1813,13 +1812,13 @@
"x": 12,
"y": 69
},
"id": 42,
"id": 57,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": false,
"show": true,
"total": false,
"values": false
},
@@ -1838,10 +1837,10 @@
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(go_gc_duration_seconds_sum{job=\"$job\"}[2m]))",
"expr": "rate(process_cpu_seconds_total{job=\"$job\"}[5m])",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "gc duration",
"intervalFactor": 1,
"legendFormat": "Rate 5m",
"refId": "A"
}
],
@@ -1849,7 +1848,7 @@
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "GC duration",
"title": "CPU",
"tooltip": {
"shared": true,
"sort": 0,
@@ -1865,7 +1864,7 @@
},
"yaxes": [
{
"format": "s",
"format": "short",
"label": null,
"logBase": 1,
"max": null,
@@ -1986,6 +1985,92 @@
"x": 12,
"y": 77
},
"id": 42,
"legend": {
"avg": false,
"current": false,
"max": false,
"min": false,
"show": false,
"total": false,
"values": false
},
"lines": true,
"linewidth": 1,
"links": [],
"nullPointMode": "null",
"options": {},
"percentage": false,
"pointradius": 2,
"points": false,
"renderer": "flot",
"seriesOverrides": [],
"spaceLength": 10,
"stack": false,
"steppedLine": false,
"targets": [
{
"expr": "sum(rate(go_gc_duration_seconds_sum{job=\"$job\"}[2m]))",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "gc duration",
"refId": "A"
}
],
"thresholds": [],
"timeFrom": null,
"timeRegions": [],
"timeShift": null,
"title": "GC duration",
"tooltip": {
"shared": true,
"sort": 0,
"value_type": "individual"
},
"type": "graph",
"xaxis": {
"buckets": null,
"mode": "time",
"name": null,
"show": true,
"values": []
},
"yaxes": [
{
"format": "s",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
},
{
"format": "short",
"label": null,
"logBase": 1,
"max": null,
"min": null,
"show": true
}
],
"yaxis": {
"align": false,
"alignLevel": null
}
},
{
"aliasColors": {},
"bars": false,
"dashLength": 10,
"dashes": false,
"datasource": "${DS_PROMETHEUS}",
"fill": 1,
"gridPos": {
"h": 8,
"w": 12,
"x": 0,
"y": 85
},
"id": 48,
"legend": {
"avg": false,
@@ -2011,7 +2096,7 @@
"steppedLine": false,
"targets": [
{
"expr": "sum(go_threads{job=\"$job\"})",
"expr": "sum(process_num_threads{job=\"$job\"})",
"format": "time_series",
"intervalFactor": 2,
"legendFormat": "threads",
@@ -2145,5 +2230,5 @@
"timezone": "",
"title": "VictoriaMetrics",
"uid": "wNf0q_kZk",
"version": 1
"version": 2
}

View File

@@ -1,5 +1,5 @@
DOCKER_NAMESPACE := victoriametrics
BUILDER_IMAGE := local/builder:go1.12.6
BUILDER_IMAGE := local/builder:go1.12.7
CERTS_IMAGE := local/certs:1.0.2
package-certs:
@@ -19,8 +19,9 @@ app-via-docker: package-certs package-builder
--mount type=bind,src="$(shell pwd)/gocache-for-docker",dst=/gocache \
--env GOCACHE=/gocache \
--env GO111MODULE=on \
$(DOCKER_OPTS) \
$(BUILDER_IMAGE) \
go build $(RACE) -mod=vendor -ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" -tags 'netgo osusergo' -o bin/$(APP_NAME)-prod $(PKG_PREFIX)/app/$(APP_NAME)
go build $(RACE) -mod=vendor -ldflags "-s -w -extldflags '-static' $(GO_BUILDINFO)" -tags 'netgo osusergo' -o bin/$(APP_NAME)$(APP_SUFFIX)-prod $(PKG_PREFIX)/app/$(APP_NAME)
package-via-docker:
(docker image ls --format '{{.Repository}}:{{.Tag}}' | grep -q '$(DOCKER_NAMESPACE)/$(APP_NAME):$(PKG_TAG)$(RACE)') || (\

View File

@@ -1,2 +1,2 @@
FROM golang:1.12.6
FROM golang:1.12.7
STOPSIGNAL SIGINT

6
go.mod
View File

@@ -2,15 +2,17 @@ module github.com/VictoriaMetrics/VictoriaMetrics
require (
github.com/VictoriaMetrics/fastcache v1.5.1
github.com/VictoriaMetrics/metrics v1.7.0
github.com/VictoriaMetrics/metrics v1.7.1
github.com/cespare/xxhash/v2 v2.0.1-0.20190104013014-3767db7a7e18
github.com/golang/snappy v0.0.1
github.com/google/go-cmp v0.3.0 // indirect
github.com/klauspost/compress v1.7.5
github.com/spaolacci/murmur3 v1.1.0 // indirect
github.com/valyala/fastjson v1.4.1
github.com/valyala/gozstd v1.5.1
github.com/valyala/histogram v1.0.1
github.com/valyala/quicktemplate v1.1.1
golang.org/x/sys v0.0.0-20190616124812-15dcb6c0061f
golang.org/x/sys v0.0.0-20190804053845-51ab0e2deafa
)
go 1.12

13
go.sum
View File

@@ -3,8 +3,8 @@ github.com/OneOfOne/xxhash v1.2.5 h1:zl/OfRA6nftbBK9qTohYBJ5xvw6C/oNKizR7cZGl3cI
github.com/OneOfOne/xxhash v1.2.5/go.mod h1:eZbhyaAYD41SGSSsnmcpxVoRiQ/MPUTjUdIIOT9Um7Q=
github.com/VictoriaMetrics/fastcache v1.5.1 h1:qHgHjyoNFV7jgucU8QZUuU4gcdhfs8QW1kw68OD2Lag=
github.com/VictoriaMetrics/fastcache v1.5.1/go.mod h1:+jv9Ckb+za/P1ZRg/sulP5Ni1v49daAVERr0H3CuscE=
github.com/VictoriaMetrics/metrics v1.7.0 h1:+bdBpPEMOSgOwoQFf4KHqgeAy6xiXn/uzlrKx2YSCT8=
github.com/VictoriaMetrics/metrics v1.7.0/go.mod h1:LU2j9qq7xqZYXz8tF3/RQnB2z2MbZms5TDiIg9/NHiQ=
github.com/VictoriaMetrics/metrics v1.7.1 h1:g2qrY6Upn8rvlvR40cGHFY0crwi4hpqF0n9vJMNsCSg=
github.com/VictoriaMetrics/metrics v1.7.1/go.mod h1:LU2j9qq7xqZYXz8tF3/RQnB2z2MbZms5TDiIg9/NHiQ=
github.com/allegro/bigcache v1.2.1-0.20190218064605-e24eb225f156 h1:eMwmnE/GDgah4HI848JfFxHt+iPb26b4zyfspmqY0/8=
github.com/allegro/bigcache v1.2.1-0.20190218064605-e24eb225f156/go.mod h1:Cb/ax3seSYIx7SuZdm2G2xzfwmv3TPSk2ucNfQESPXM=
github.com/cespare/xxhash v1.1.0 h1:a6HrQnmkObjyL+Gs60czilIUGqrzKutQD6XZog3p+ko=
@@ -16,9 +16,14 @@ github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c
github.com/davecgh/go-spew v1.1.1/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/golang/snappy v0.0.1 h1:Qgr9rKW7uDUkrbSmQeiDsGa8SjGyCOGtuasMWwvp2P4=
github.com/golang/snappy v0.0.1/go.mod h1:/XxbfmMg8lxefKM7IXC3fBNl/7bRcc72aCRzEWrmP2Q=
github.com/google/go-cmp v0.3.0 h1:crn/baboCvb5fXaQ0IJ1SGTsTVrWpDsCWC8EGETZijY=
github.com/google/go-cmp v0.3.0/go.mod h1:8QqcDgzrUqlUb/G2PQTWiueGozuR1884gddMywk6iLU=
github.com/klauspost/compress v1.4.0/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
github.com/klauspost/compress v1.4.1/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
github.com/klauspost/compress v1.7.5 h1:NMapGoDIKPKpk2hpcgAU6XHfsREHG2p8PIg7C3f/jpI=
github.com/klauspost/compress v1.7.5/go.mod h1:RyIbtBH6LamlWaDj8nUwkbUhJ87Yi3uG0guNDohfE1A=
github.com/klauspost/cpuid v0.0.0-20180405133222-e7e905edc00e/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
github.com/klauspost/cpuid v1.2.0 h1:NMpwD2G9JSFOE1/TJjGSo5zG7Yb2bTe7eq1jH+irmeE=
github.com/klauspost/cpuid v1.2.0/go.mod h1:Pj4uuM528wm8OyEC2QMXAi2YiTZ96dNQPGgoMS4s3ek=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
@@ -44,5 +49,5 @@ github.com/valyala/quicktemplate v1.1.1 h1:C58y/wN0FMTi2PR0n3onltemfFabany53j7M6
github.com/valyala/quicktemplate v1.1.1/go.mod h1:EH+4AkTd43SvgIbQHYu59/cJyxDoOVRUAfrukLPuGJ4=
github.com/valyala/tcplisten v0.0.0-20161114210144-ceec8f93295a/go.mod h1:v3UYOV9WzVtRmSR+PDvWpU/qWl4Wa5LApYYX4ZtKbio=
golang.org/x/net v0.0.0-20180911220305-26e67e76b6c3/go.mod h1:mL1N/T3taQHkDXs73rZJwtUhF3w3ftmwwsq0BUmARs4=
golang.org/x/sys v0.0.0-20190616124812-15dcb6c0061f h1:25KHgbfyiSm6vwQLbM3zZIe1v9p/3ea4Rz+nnM5K/i4=
golang.org/x/sys v0.0.0-20190616124812-15dcb6c0061f/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20190804053845-51ab0e2deafa h1:KIDDMLT1O0Nr7TSxp8xM5tJcdn8tgyAONntO829og1M=
golang.org/x/sys v0.0.0-20190804053845-51ab0e2deafa/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=

View File

@@ -108,60 +108,60 @@ func testCalibrateScale(t *testing.T, a, b []int64, ae, be int16, aExpected, bEx
}
func TestMaxUpExponent(t *testing.T) {
testMaxUpExponent(t, 0, 1024)
testMaxUpExponent(t, -1<<63, 0)
testMaxUpExponent(t, (-1<<63)+1, 0)
testMaxUpExponent(t, (1<<63)-1, 0)
testMaxUpExponent(t, 1, 18)
testMaxUpExponent(t, 12, 17)
testMaxUpExponent(t, 123, 16)
testMaxUpExponent(t, 1234, 15)
testMaxUpExponent(t, 12345, 14)
testMaxUpExponent(t, 123456, 13)
testMaxUpExponent(t, 1234567, 12)
testMaxUpExponent(t, 12345678, 11)
testMaxUpExponent(t, 123456789, 10)
testMaxUpExponent(t, 1234567890, 9)
testMaxUpExponent(t, 12345678901, 8)
testMaxUpExponent(t, 123456789012, 7)
testMaxUpExponent(t, 1234567890123, 6)
testMaxUpExponent(t, 12345678901234, 5)
testMaxUpExponent(t, 123456789012345, 4)
testMaxUpExponent(t, 1234567890123456, 3)
testMaxUpExponent(t, 12345678901234567, 2)
testMaxUpExponent(t, 123456789012345678, 1)
testMaxUpExponent(t, 1234567890123456789, 0)
testMaxUpExponent(t, 923456789012345678, 0)
testMaxUpExponent(t, 92345678901234567, 1)
testMaxUpExponent(t, 9234567890123456, 2)
testMaxUpExponent(t, 923456789012345, 3)
testMaxUpExponent(t, 92345678901234, 4)
testMaxUpExponent(t, 9234567890123, 5)
testMaxUpExponent(t, 923456789012, 6)
testMaxUpExponent(t, 92345678901, 7)
testMaxUpExponent(t, 9234567890, 8)
testMaxUpExponent(t, 923456789, 9)
testMaxUpExponent(t, 92345678, 10)
testMaxUpExponent(t, 9234567, 11)
testMaxUpExponent(t, 923456, 12)
testMaxUpExponent(t, 92345, 13)
testMaxUpExponent(t, 9234, 14)
testMaxUpExponent(t, 923, 15)
testMaxUpExponent(t, 92, 17)
testMaxUpExponent(t, 9, 18)
}
f := func(v int64, eExpected int16) {
t.Helper()
func testMaxUpExponent(t *testing.T, v int64, eExpected int16) {
t.Helper()
e := maxUpExponent(v)
if e != eExpected {
t.Fatalf("unexpected e for v=%d; got %d; epxecting %d", v, e, eExpected)
}
e = maxUpExponent(-v)
if e != eExpected {
t.Fatalf("unexpected e for v=%d; got %d; expecting %d", -v, e, eExpected)
}
}
e := maxUpExponent(v)
if e != eExpected {
t.Fatalf("unexpected e for v=%d; got %d; epxecting %d", v, e, eExpected)
}
e = maxUpExponent(-v)
if e != eExpected {
t.Fatalf("unexpected e for v=%d; got %d; expecting %d", -v, e, eExpected)
}
f(0, 1024)
f(-1<<63, 0)
f((-1<<63)+1, 0)
f((1<<63)-1, 0)
f(1, 18)
f(12, 17)
f(123, 16)
f(1234, 15)
f(12345, 14)
f(123456, 13)
f(1234567, 12)
f(12345678, 11)
f(123456789, 10)
f(1234567890, 9)
f(12345678901, 8)
f(123456789012, 7)
f(1234567890123, 6)
f(12345678901234, 5)
f(123456789012345, 4)
f(1234567890123456, 3)
f(12345678901234567, 2)
f(123456789012345678, 1)
f(1234567890123456789, 0)
f(923456789012345678, 0)
f(92345678901234567, 1)
f(9234567890123456, 2)
f(923456789012345, 3)
f(92345678901234, 4)
f(9234567890123, 5)
f(923456789012, 6)
f(92345678901, 7)
f(9234567890, 8)
f(923456789, 9)
f(92345678, 10)
f(9234567, 11)
f(923456, 12)
f(92345, 13)
f(9234, 14)
f(923, 15)
f(92, 17)
f(9, 18)
}
func TestAppendFloatToDecimal(t *testing.T) {
@@ -207,83 +207,103 @@ func testAppendFloatToDecimal(t *testing.T, fa []float64, daExpected []int64, eE
}
func TestFloatToDecimal(t *testing.T) {
testFloatToDecimal(t, 0, 0, 0)
testFloatToDecimal(t, 1, 1, 0)
testFloatToDecimal(t, -1, -1, 0)
testFloatToDecimal(t, 0.9, 9, -1)
testFloatToDecimal(t, 0.99, 99, -2)
testFloatToDecimal(t, 9, 9, 0)
testFloatToDecimal(t, 99, 99, 0)
testFloatToDecimal(t, 20, 2, 1)
testFloatToDecimal(t, 100, 1, 2)
testFloatToDecimal(t, 3000, 3, 3)
testFloatToDecimal(t, 0.123, 123, -3)
testFloatToDecimal(t, -0.123, -123, -3)
testFloatToDecimal(t, 1.2345, 12345, -4)
testFloatToDecimal(t, -1.2345, -12345, -4)
testFloatToDecimal(t, 12000, 12, 3)
testFloatToDecimal(t, -12000, -12, 3)
testFloatToDecimal(t, 1e-30, 1, -30)
testFloatToDecimal(t, -1e-30, -1, -30)
testFloatToDecimal(t, 1e-260, 1, -260)
testFloatToDecimal(t, -1e-260, -1, -260)
testFloatToDecimal(t, 321e260, 321, 260)
testFloatToDecimal(t, -321e260, -321, 260)
testFloatToDecimal(t, 1234567890123, 1234567890123, 0)
testFloatToDecimal(t, -1234567890123, -1234567890123, 0)
testFloatToDecimal(t, 123e5, 123, 5)
testFloatToDecimal(t, 15e18, 15, 18)
testFloatToDecimal(t, math.Inf(1), vInfPos, 0)
testFloatToDecimal(t, math.Inf(-1), vInfNeg, 0)
testFloatToDecimal(t, 1<<63-1, 922337203685, 7)
testFloatToDecimal(t, -1<<63, -922337203685, 7)
}
func testFloatToDecimal(t *testing.T, f float64, vExpected int64, eExpected int16) {
t.Helper()
v, e := FromFloat(f)
if v != vExpected {
t.Fatalf("unexpected v for f=%e; got %d; expecting %d", f, v, vExpected)
}
if e != eExpected {
t.Fatalf("unexpected e for f=%e; got %d; expecting %d", f, e, eExpected)
f := func(f float64, vExpected int64, eExpected int16) {
t.Helper()
v, e := FromFloat(f)
if v != vExpected {
t.Fatalf("unexpected v for f=%e; got %d; expecting %d", f, v, vExpected)
}
if e != eExpected {
t.Fatalf("unexpected e for f=%e; got %d; expecting %d", f, e, eExpected)
}
}
f(0, 0, 0)
f(1, 1, 0)
f(-1, -1, 0)
f(0.9, 9, -1)
f(0.99, 99, -2)
f(9, 9, 0)
f(99, 99, 0)
f(20, 2, 1)
f(100, 1, 2)
f(3000, 3, 3)
f(0.123, 123, -3)
f(-0.123, -123, -3)
f(1.2345, 12345, -4)
f(-1.2345, -12345, -4)
f(12000, 12, 3)
f(-12000, -12, 3)
f(1e-30, 1, -30)
f(-1e-30, -1, -30)
f(1e-260, 1, -260)
f(-1e-260, -1, -260)
f(321e260, 321, 260)
f(-321e260, -321, 260)
f(1234567890123, 1234567890123, 0)
f(-1234567890123, -1234567890123, 0)
f(123e5, 123, 5)
f(15e18, 15, 18)
f(math.Inf(1), vInfPos, 0)
f(math.Inf(-1), vInfNeg, 0)
f(1<<63-1, 922337203685, 7)
f(-1<<63, -922337203685, 7)
// Test precision loss due to conversionPrecision.
f(0.1234567890123456, 12345678901234, -14)
f(-123456.7890123456, -12345678901234, -8)
}
func TestFloatToDecimalRoundtrip(t *testing.T) {
testFloatToDecimalRoundtrip(t, 0)
testFloatToDecimalRoundtrip(t, 1)
testFloatToDecimalRoundtrip(t, 0.123)
testFloatToDecimalRoundtrip(t, 1.2345)
testFloatToDecimalRoundtrip(t, 12000)
testFloatToDecimalRoundtrip(t, 1e-30)
testFloatToDecimalRoundtrip(t, 1e-260)
testFloatToDecimalRoundtrip(t, 321e260)
testFloatToDecimalRoundtrip(t, 1234567890123)
testFloatToDecimalRoundtrip(t, 12.34567890125)
testFloatToDecimalRoundtrip(t, 15e18)
f := func(f float64) {
t.Helper()
testFloatToDecimalRoundtrip(t, math.Inf(1))
testFloatToDecimalRoundtrip(t, math.Inf(-1))
testFloatToDecimalRoundtrip(t, 1<<63-1)
testFloatToDecimalRoundtrip(t, -1<<63)
v, e := FromFloat(f)
fNew := ToFloat(v, e)
if !equalFloat(fNew, f) {
t.Fatalf("unexpected fNew for v=%d, e=%d; got %g; expecting %g", v, e, fNew, f)
}
v, e = FromFloat(-f)
fNew = ToFloat(v, e)
if !equalFloat(fNew, -f) {
t.Fatalf("unexepcted fNew for v=%d, e=%d; got %g; expecting %g", v, e, fNew, -f)
}
}
f(0)
f(1)
f(0.123)
f(1.2345)
f(12000)
f(1e-30)
f(1e-260)
f(321e260)
f(1234567890123)
f(12.34567890125)
f(-1234567.8901256789)
f(15e18)
f(math.Inf(1))
f(math.Inf(-1))
f(1<<63 - 1)
f(-1 << 63)
for i := 0; i < 1e4; i++ {
f := rand.NormFloat64()
testFloatToDecimalRoundtrip(t, f)
testFloatToDecimalRoundtrip(t, f*1e-6)
testFloatToDecimalRoundtrip(t, f*1e6)
v := rand.NormFloat64()
f(v)
f(v * 1e-6)
f(v * 1e6)
testFloatToDecimalRoundtrip(t, roundFloat(f, 20))
testFloatToDecimalRoundtrip(t, roundFloat(f, 10))
testFloatToDecimalRoundtrip(t, roundFloat(f, 5))
testFloatToDecimalRoundtrip(t, roundFloat(f, 0))
testFloatToDecimalRoundtrip(t, roundFloat(f, -5))
testFloatToDecimalRoundtrip(t, roundFloat(f, -10))
testFloatToDecimalRoundtrip(t, roundFloat(f, -20))
f(roundFloat(v, 20))
f(roundFloat(v, 10))
f(roundFloat(v, 5))
f(roundFloat(v, 0))
f(roundFloat(v, -5))
f(roundFloat(v, -10))
f(roundFloat(v, -20))
}
}
@@ -292,22 +312,6 @@ func roundFloat(f float64, exp int) float64 {
return math.Trunc(f) * math.Pow10(exp)
}
func testFloatToDecimalRoundtrip(t *testing.T, f float64) {
t.Helper()
v, e := FromFloat(f)
fNew := ToFloat(v, e)
if !equalFloat(fNew, f) {
t.Fatalf("unexpected fNew for v=%d, e=%d; got %g; expecting %g", v, e, fNew, f)
}
v, e = FromFloat(-f)
fNew = ToFloat(v, e)
if !equalFloat(fNew, -f) {
t.Fatalf("unexepcted fNew for v=%d, e=%d; got %g; expecting %g", v, e, fNew, -f)
}
}
func equalFloat(f1, f2 float64) bool {
if math.IsInf(f1, 0) {
return math.IsInf(f1, 1) == math.IsInf(f2, 1) || math.IsInf(f1, -1) == math.IsInf(f2, -1)

View File

@@ -1,8 +1,8 @@
package encoding
import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding/zstd"
"github.com/VictoriaMetrics/metrics"
"github.com/valyala/gozstd"
)
// CompressZSTDLevel appends compressed src to dst and returns
@@ -13,7 +13,7 @@ func CompressZSTDLevel(dst, src []byte, compressLevel int) []byte {
compressCalls.Inc()
originalBytes.Add(len(src))
dstLen := len(dst)
dst = gozstd.CompressLevel(dst, src, compressLevel)
dst = zstd.CompressLevel(dst, src, compressLevel)
compressedBytes.Add(len(dst) - dstLen)
return dst
}
@@ -22,7 +22,7 @@ func CompressZSTDLevel(dst, src []byte, compressLevel int) []byte {
// the appended dst.
func DecompressZSTD(dst, src []byte) ([]byte, error) {
decompressCalls.Inc()
return gozstd.Decompress(dst, src)
return zstd.Decompress(dst, src)
}
var (

View File

@@ -296,7 +296,7 @@ func isDeltaConst(a []int64) bool {
// i.e. arbitrary changing values.
//
// It is OK if a few gauges aren't detected (i.e. detected as counters),
// since misdetected counters as gauges are much worse condition.
// since misdetected counters as gauges leads to worser compression ratio.
func isGauge(a []int64) bool {
// Check all the items in a, since a part of items may lead
// to incorrect gauge detection.
@@ -305,32 +305,36 @@ func isGauge(a []int64) bool {
return false
}
extremes := 0
plus := a[0] <= a[1]
v1 := a[1]
for _, v2 := range a[2:] {
if plus {
if v2 < v1 {
extremes++
plus = false
}
} else {
if v2 > v1 {
extremes++
plus = true
}
}
v1 = v2
resets := 0
vPrev := a[0]
if vPrev < 0 {
// Counter values cannot be negative.
return true
}
if extremes <= 2 {
// Probably counter reset.
for _, v := range a[1:] {
if v < vPrev {
if v < 0 {
// Counter values cannot be negative.
return true
}
if v > (vPrev >> 3) {
// Decreasing sequence detected.
// This is a gauge.
return true
}
// Possible counter reset.
resets++
}
vPrev = v
}
if resets <= 2 {
// Counter with a few resets.
return false
}
// A few extremes may indicate counter resets.
// Let it be a gauge if extremes exceed len(a)/32,
// otherwise assume counter reset.
return extremes > (len(a) >> 5)
// Let it be a gauge if resets exceeds len(a)/8,
// otherwise assume counter.
return resets > (len(a) >> 3)
}
func getCompressLevel(itemsCount int) int {

View File

@@ -0,0 +1,83 @@
// +build cgo
package encoding
import (
"math/rand"
"testing"
)
func TestMarshalUnmarshalInt64Array(t *testing.T) {
var va []int64
var v int64
// Verify nearest delta encoding.
va = va[:0]
v = 0
for i := 0; i < 8*1024; i++ {
v += int64(rand.NormFloat64() * 1e6)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 23; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeZSTDNearestDelta)
}
for precisionBits := uint8(23); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta)
}
// Verify nearest delta2 encoding.
va = va[:0]
v = 0
for i := 0; i < 8*1024; i++ {
v += 30e6 + int64(rand.NormFloat64()*1e6)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 24; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeZSTDNearestDelta2)
}
for precisionBits := uint8(24); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta2)
}
// Verify nearest delta encoding.
va = va[:0]
v = 1000
for i := 0; i < 6; i++ {
v += int64(rand.NormFloat64() * 100)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta)
}
// Verify nearest delta2 encoding.
va = va[:0]
v = 0
for i := 0; i < 6; i++ {
v += 3000 + int64(rand.NormFloat64()*100)
va = append(va, v)
}
for precisionBits := uint8(5); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta2)
}
}
func TestMarshalInt64ArraySize(t *testing.T) {
var va []int64
v := int64(rand.Float64() * 1e9)
for i := 0; i < 8*1024; i++ {
va = append(va, v)
v += 30e3 + int64(rand.NormFloat64()*1e3)
}
testMarshalInt64ArraySize(t, va, 1, 500, 1300)
testMarshalInt64ArraySize(t, va, 2, 500, 1400)
testMarshalInt64ArraySize(t, va, 3, 800, 1800)
testMarshalInt64ArraySize(t, va, 4, 1300, 2100)
testMarshalInt64ArraySize(t, va, 5, 2000, 3200)
testMarshalInt64ArraySize(t, va, 6, 3000, 4800)
testMarshalInt64ArraySize(t, va, 7, 4000, 6400)
testMarshalInt64ArraySize(t, va, 8, 6000, 8000)
testMarshalInt64ArraySize(t, va, 9, 7000, 8800)
testMarshalInt64ArraySize(t, va, 10, 8000, 10000)
}

View File

@@ -0,0 +1,83 @@
// +build !cgo
package encoding
import (
"math/rand"
"testing"
)
func TestMarshalUnmarshalInt64Array(t *testing.T) {
var va []int64
var v int64
// Verify nearest delta encoding.
va = va[:0]
v = 0
for i := 0; i < 8*1024; i++ {
v += int64(rand.NormFloat64() * 1e6)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 17; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeZSTDNearestDelta)
}
for precisionBits := uint8(23); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta)
}
// Verify nearest delta2 encoding.
va = va[:0]
v = 0
for i := 0; i < 8*1024; i++ {
v += 30e6 + int64(rand.NormFloat64()*1e6)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 15; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeZSTDNearestDelta2)
}
for precisionBits := uint8(24); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta2)
}
// Verify nearest delta encoding.
va = va[:0]
v = 1000
for i := 0; i < 6; i++ {
v += int64(rand.NormFloat64() * 100)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta)
}
// Verify nearest delta2 encoding.
va = va[:0]
v = 0
for i := 0; i < 6; i++ {
v += 3000 + int64(rand.NormFloat64()*100)
va = append(va, v)
}
for precisionBits := uint8(5); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta2)
}
}
func TestMarshalInt64ArraySize(t *testing.T) {
var va []int64
v := int64(rand.Float64() * 1e9)
for i := 0; i < 8*1024; i++ {
va = append(va, v)
v += 30e3 + int64(rand.NormFloat64()*1e3)
}
testMarshalInt64ArraySize(t, va, 1, 500, 1700)
testMarshalInt64ArraySize(t, va, 2, 600, 1800)
testMarshalInt64ArraySize(t, va, 3, 900, 2100)
testMarshalInt64ArraySize(t, va, 4, 1300, 2200)
testMarshalInt64ArraySize(t, va, 5, 2000, 3300)
testMarshalInt64ArraySize(t, va, 6, 3000, 5000)
testMarshalInt64ArraySize(t, va, 7, 4000, 6500)
testMarshalInt64ArraySize(t, va, 8, 6000, 8000)
testMarshalInt64ArraySize(t, va, 9, 7000, 8800)
testMarshalInt64ArraySize(t, va, 10, 8000, 17000)
}

View File

@@ -43,27 +43,29 @@ func TestIsDeltaConst(t *testing.T) {
}
func TestIsGauge(t *testing.T) {
testIsGauge(t, []int64{}, false)
testIsGauge(t, []int64{0}, false)
testIsGauge(t, []int64{1, 2}, false)
testIsGauge(t, []int64{0, 1, 2, 3, 4, 5}, false)
testIsGauge(t, []int64{0, -1, -2, -3, -4}, false)
testIsGauge(t, []int64{0, 0, 0, 0, 0, 0, 0}, false)
testIsGauge(t, []int64{1, 1, 1, 1, 1}, false)
testIsGauge(t, []int64{1, 1, 2, 2, 2, 2}, false)
testIsGauge(t, []int64{1, 5, 2, 3}, false) // a single counter reset
testIsGauge(t, []int64{1, 5, 2, 3, 2}, true)
testIsGauge(t, []int64{-1, -5, -2, -3}, false) // a single counter reset
testIsGauge(t, []int64{-1, -5, -2, -3, -2}, true)
}
func testIsGauge(t *testing.T, a []int64, okExpected bool) {
t.Helper()
ok := isGauge(a)
if ok != okExpected {
t.Fatalf("unexpected result for isGauge(%d); got %v; expecting %v", a, ok, okExpected)
f := func(a []int64, okExpected bool) {
t.Helper()
ok := isGauge(a)
if ok != okExpected {
t.Fatalf("unexpected result for isGauge(%d); got %v; expecting %v", a, ok, okExpected)
}
}
f([]int64{}, false)
f([]int64{0}, false)
f([]int64{1, 2}, false)
f([]int64{0, 1, 2, 3, 4, 5}, false)
f([]int64{0, -1, -2, -3, -4}, true)
f([]int64{0, 0, 0, 0, 0, 0, 0}, false)
f([]int64{1, 1, 1, 1, 1}, false)
f([]int64{1, 1, 2, 2, 2, 2}, false)
f([]int64{1, 17, 2, 3}, false) // a single counter reset
f([]int64{1, 5, 2, 3}, true)
f([]int64{1, 5, 2, 3, 2}, true)
f([]int64{-1, -5, -2, -3}, true)
f([]int64{-1, -5, -2, -3, -2}, true)
f([]int64{5, 6, 4, 3, 2}, true)
f([]int64{4, 5, 6, 5, 4, 3, 2}, true)
f([]int64{1064, 1132, 1083, 1062, 856, 747}, true)
}
func TestEnsureNonDecreasingSequence(t *testing.T) {
@@ -87,73 +89,6 @@ func testEnsureNonDecreasingSequence(t *testing.T, a []int64, vMin, vMax int64,
}
}
func TestMarshalUnmarshalInt64Array(t *testing.T) {
testMarshalUnmarshalInt64Array(t, []int64{1, 20, 234}, 4, MarshalTypeNearestDelta2)
testMarshalUnmarshalInt64Array(t, []int64{1, 20, -2345, 678934, 342}, 4, MarshalTypeNearestDelta)
testMarshalUnmarshalInt64Array(t, []int64{1, 20, 2345, 6789, 12342}, 4, MarshalTypeNearestDelta2)
// Constant encoding
testMarshalUnmarshalInt64Array(t, []int64{1}, 4, MarshalTypeConst)
testMarshalUnmarshalInt64Array(t, []int64{1, 2}, 4, MarshalTypeDeltaConst)
testMarshalUnmarshalInt64Array(t, []int64{-1, 0, 1, 2, 3, 4, 5}, 4, MarshalTypeDeltaConst)
testMarshalUnmarshalInt64Array(t, []int64{-10, -1, 8, 17, 26}, 4, MarshalTypeDeltaConst)
testMarshalUnmarshalInt64Array(t, []int64{0, 0, 0, 0, 0, 0}, 4, MarshalTypeConst)
testMarshalUnmarshalInt64Array(t, []int64{100, 100, 100, 100}, 4, MarshalTypeConst)
var va []int64
var v int64
// Verify nearest delta encoding.
va = va[:0]
v = 0
for i := 0; i < 8*1024; i++ {
v += int64(rand.NormFloat64() * 1e6)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 23; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeZSTDNearestDelta)
}
for precisionBits := uint8(23); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta)
}
// Verify nearest delta2 encoding.
va = va[:0]
v = 0
for i := 0; i < 8*1024; i++ {
v += 30e6 + int64(rand.NormFloat64()*1e6)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 24; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeZSTDNearestDelta2)
}
for precisionBits := uint8(24); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta2)
}
// Verify nearest delta encoding.
va = va[:0]
v = 1000
for i := 0; i < 6; i++ {
v += int64(rand.NormFloat64() * 100)
va = append(va, v)
}
for precisionBits := uint8(1); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta)
}
// Verify nearest delta2 encoding.
va = va[:0]
v = 0
for i := 0; i < 6; i++ {
v += 3000 + int64(rand.NormFloat64()*100)
va = append(va, v)
}
for precisionBits := uint8(5); precisionBits < 65; precisionBits++ {
testMarshalUnmarshalInt64Array(t, va, precisionBits, MarshalTypeNearestDelta2)
}
}
func testMarshalUnmarshalInt64Array(t *testing.T, va []int64, precisionBits uint8, mtExpected MarshalType) {
t.Helper()
@@ -257,24 +192,18 @@ func TestMarshalUnmarshalValues(t *testing.T) {
}
}
func TestMarshalInt64ArraySize(t *testing.T) {
var va []int64
v := int64(rand.Float64() * 1e9)
for i := 0; i < 8*1024; i++ {
va = append(va, v)
v += 30e3 + int64(rand.NormFloat64()*1e3)
}
func TestMarshalUnmarshalInt64ArrayGeneric(t *testing.T) {
testMarshalUnmarshalInt64Array(t, []int64{1, 20, 234}, 4, MarshalTypeNearestDelta2)
testMarshalUnmarshalInt64Array(t, []int64{1, 20, -2345, 678934, 342}, 4, MarshalTypeNearestDelta)
testMarshalUnmarshalInt64Array(t, []int64{1, 20, 2345, 6789, 12342}, 4, MarshalTypeNearestDelta2)
testMarshalInt64ArraySize(t, va, 1, 500, 1300)
testMarshalInt64ArraySize(t, va, 2, 600, 1400)
testMarshalInt64ArraySize(t, va, 3, 900, 1800)
testMarshalInt64ArraySize(t, va, 4, 1300, 2100)
testMarshalInt64ArraySize(t, va, 5, 2000, 3200)
testMarshalInt64ArraySize(t, va, 6, 3000, 4800)
testMarshalInt64ArraySize(t, va, 7, 4000, 6400)
testMarshalInt64ArraySize(t, va, 8, 6000, 8000)
testMarshalInt64ArraySize(t, va, 9, 7000, 8800)
testMarshalInt64ArraySize(t, va, 10, 8000, 10000)
// Constant encoding
testMarshalUnmarshalInt64Array(t, []int64{1}, 4, MarshalTypeConst)
testMarshalUnmarshalInt64Array(t, []int64{1, 2}, 4, MarshalTypeDeltaConst)
testMarshalUnmarshalInt64Array(t, []int64{-1, 0, 1, 2, 3, 4, 5}, 4, MarshalTypeDeltaConst)
testMarshalUnmarshalInt64Array(t, []int64{-10, -1, 8, 17, 26}, 4, MarshalTypeDeltaConst)
testMarshalUnmarshalInt64Array(t, []int64{0, 0, 0, 0, 0, 0}, 4, MarshalTypeConst)
testMarshalUnmarshalInt64Array(t, []int64{100, 100, 100, 100}, 4, MarshalTypeConst)
}
func testMarshalInt64ArraySize(t *testing.T, va []int64, precisionBits uint8, minSizeExpected, maxSizeExpected int) {

View File

@@ -0,0 +1,19 @@
// +build cgo
package zstd
import (
"github.com/valyala/gozstd"
)
// Decompress appends decompressed src to dst and returns the result.
func Decompress(dst, src []byte) ([]byte, error) {
return gozstd.Decompress(dst, src)
}
// CompressLevel appends compressed src to dst and returns the result.
//
// The given compressionLevel is used for the compression.
func CompressLevel(dst, src []byte, compressionLevel int) []byte {
return gozstd.CompressLevel(dst, src, compressionLevel)
}

View File

@@ -0,0 +1,78 @@
// +build !cgo
package zstd
import (
"sync"
"sync/atomic"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/klauspost/compress/zstd"
)
var (
decoder *zstd.Decoder
mu sync.Mutex
av atomic.Value
)
type registry map[int]*zstd.Encoder
func init() {
r := make(registry)
av.Store(r)
var err error
decoder, err = zstd.NewReader(nil)
if err != nil {
logger.Panicf("BUG: failed to create ZSTD reader: %s", err)
}
}
// Decompress appends decompressed src to dst and returns the result.
func Decompress(dst, src []byte) ([]byte, error) {
return decoder.DecodeAll(src, dst)
}
// CompressLevel appends compressed src to dst and returns the result.
//
// The given compressionLevel is used for the compression.
func CompressLevel(dst, src []byte, compressionLevel int) []byte {
e := getEncoder(compressionLevel)
return e.EncodeAll(src, dst)
}
func getEncoder(compressionLevel int) *zstd.Encoder {
r := av.Load().(registry)
e := r[compressionLevel]
if e != nil {
return e
}
mu.Lock()
// Create the encoder under lock in order to prevent from wasted work
// when concurrent goroutines create encoder for the same compressionLevel.
e = newEncoder(compressionLevel)
r1 := av.Load().(registry)
r2 := make(registry)
for k, v := range r1 {
r2[k] = v
}
r2[compressionLevel] = e
av.Store(r2)
mu.Unlock()
return e
}
func newEncoder(compressionLevel int) *zstd.Encoder {
level := zstd.EncoderLevelFromZstd(compressionLevel)
e, err := zstd.NewWriter(nil,
zstd.WithEncoderCRC(false), // Disable CRC for performance reasons.
zstd.WithEncoderLevel(level))
if err != nil {
logger.Panicf("BUG: failed to create ZSTD writer: %s", err)
}
return e
}

View File

@@ -0,0 +1,96 @@
// +build cgo
package zstd
import (
"math/rand"
"testing"
pure "github.com/klauspost/compress/zstd"
cgo "github.com/valyala/gozstd"
)
func TestCompressDecompress(t *testing.T) {
testCrossCompressDecompress(t, []byte("a"))
testCrossCompressDecompress(t, []byte("foobarbaz"))
var b []byte
for i := 0; i < 64*1024; i++ {
b = append(b, byte(rand.Int31n(256)))
}
testCrossCompressDecompress(t, b)
}
func testCrossCompressDecompress(t *testing.T, b []byte) {
testCompressDecompress(t, pureCompress, pureDecompress, b)
testCompressDecompress(t, cgoCompress, cgoDecompress, b)
testCompressDecompress(t, pureCompress, cgoDecompress, b)
testCompressDecompress(t, cgoCompress, pureDecompress, b)
}
func testCompressDecompress(t *testing.T, compress compressFn, decompress decompressFn, b []byte) {
bc, err := compress(nil, b, 5)
if err != nil {
t.Fatalf("unexpected error when compressing b=%x: %s", b, err)
}
bNew, err := decompress(nil, bc)
if err != nil {
t.Fatalf("unexpected error when decompressing b=%x from bc=%x: %s", b, bc, err)
}
if string(bNew) != string(b) {
t.Fatalf("invalid bNew; got\n%x; expecting\n%x", bNew, b)
}
prefix := []byte{1, 2, 33}
bcNew, err := compress(prefix, b, 5)
if err != nil {
t.Fatalf("unexpected error when compressing b=%x: %s", bcNew, err)
}
if string(bcNew[:len(prefix)]) != string(prefix) {
t.Fatalf("invalid prefix for b=%x; got\n%x; expecting\n%x", b, bcNew[:len(prefix)], prefix)
}
if string(bcNew[len(prefix):]) != string(bc) {
t.Fatalf("invalid prefixed bcNew for b=%x; got\n%x; expecting\n%x", b, bcNew[len(prefix):], bc)
}
bNew, err = decompress(prefix, bc)
if err != nil {
t.Fatalf("unexpected error when decompressing b=%x from bc=%x with prefix: %s", b, bc, err)
}
if string(bNew[:len(prefix)]) != string(prefix) {
t.Fatalf("invalid bNew prefix when decompressing bc=%x; got\n%x; expecting\n%x", bc, bNew[:len(prefix)], prefix)
}
if string(bNew[len(prefix):]) != string(b) {
t.Fatalf("invalid prefixed bNew; got\n%x; expecting\n%x", bNew[len(prefix):], b)
}
}
type compressFn func(dst, src []byte, compressionLevel int) ([]byte, error)
func pureCompress(dst, src []byte, _ int) ([]byte, error) {
w, err := pure.NewWriter(nil,
pure.WithEncoderCRC(false), // Disable CRC for performance reasons.
pure.WithEncoderLevel(pure.SpeedBestCompression))
if err != nil {
return nil, err
}
return w.EncodeAll(src, dst), nil
}
func cgoCompress(dst, src []byte, compressionLevel int) ([]byte, error) {
return cgo.CompressLevel(dst, src, compressionLevel), nil
}
type decompressFn func(dst, src []byte) ([]byte, error)
func pureDecompress(dst, src []byte) ([]byte, error) {
decoder, err := pure.NewReader(nil)
if err != nil {
return nil, err
}
return decoder.DecodeAll(src, dst)
}
func cgoDecompress(dst, src []byte) ([]byte, error) {
return cgo.Decompress(dst, src)
}

View File

@@ -205,19 +205,6 @@ func (b *Block) MarshalData(timestampsBlockOffset, valuesBlockOffset uint64) ([]
b.bh.ValuesBlockSize = uint32(len(b.valuesData))
b.values = b.values[:0]
if len(timestamps) > 1 && (b.bh.ValuesMarshalType == encoding.MarshalTypeConst || b.bh.ValuesMarshalType == encoding.MarshalTypeDeltaConst) {
// Special case - values are constant or are changed with constant rate.
// In this case we may 'cheat' by assuming timestamps are changed
// at ideal constant rate. This improves timestamps' compression rate.
minTimestamp := timestamps[0]
maxTimestamp := timestamps[len(timestamps)-1]
delta := (maxTimestamp - minTimestamp) / int64(len(timestamps)-1)
ts := minTimestamp
for i := 1; i < len(timestamps); i++ {
ts += delta
timestamps[i] = ts
}
}
b.timestampsData, b.bh.TimestampsMarshalType, b.bh.MinTimestamp = encoding.MarshalTimestamps(b.timestampsData[:0], timestamps, b.bh.PrecisionBits)
b.bh.TimestampsBlockOffset = timestampsBlockOffset
b.bh.TimestampsBlockSize = uint32(len(b.timestampsData))

View File

@@ -1381,7 +1381,7 @@ func matchTagFilters(mn *MetricName, tfs []*tagFilter, kb *bytesutil.ByteBuffer)
continue
}
// Found the matching tag name. Match for the value.
// Found the matching tag name. Match the value.
b := tag.Marshal(kb.B)
kb.B = b[:len(kb.B)]
ok, err := matchTagFilter(b, tf)
@@ -1394,7 +1394,7 @@ func matchTagFilters(mn *MetricName, tfs []*tagFilter, kb *bytesutil.ByteBuffer)
tagMatched = true
break
}
if !tagMatched {
if !tagMatched && !tf.isNegative {
// Matching tag name wasn't found.
return false, nil
}

View File

@@ -753,8 +753,8 @@ func TestMatchTagFilters(t *testing.T) {
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if ok {
t.Fatalf("Shouldn't match")
if !ok {
t.Fatalf("Should match")
}
tfs.Reset()
if err := tfs.Add([]byte("non-existing-tag"), []byte("foob.+metric"), true, true); err != nil {
@@ -764,8 +764,19 @@ func TestMatchTagFilters(t *testing.T) {
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if ok {
t.Fatalf("Shouldn't match")
if !ok {
t.Fatalf("Should match")
}
tfs.Reset()
if err := tfs.Add([]byte("non-existing-tag"), []byte(".+"), true, true); err != nil {
t.Fatalf("cannot add regexp, negative filter: %s", err)
}
ok, err = matchTagFilters(&mn, toTFPointers(tfs.tfs), &bb)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if !ok {
t.Fatalf("Should match")
}
// Negative match by existing tag
@@ -859,6 +870,17 @@ func TestMatchTagFilters(t *testing.T) {
if !ok {
t.Fatalf("Should match")
}
tfs.Reset()
if err := tfs.Add([]byte("key 3"), []byte(""), true, false); err != nil {
t.Fatalf("cannot add regexp, negative filter: %s", err)
}
ok, err = matchTagFilters(&mn, toTFPointers(tfs.tfs), &bb)
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
if !ok {
t.Fatalf("Should match")
}
// Positive match by multiple tags and MetricGroup
tfs.Reset()

View File

@@ -28,6 +28,9 @@ type partSearch struct {
// tr is a time range to search.
tr TimeRange
// Skip populating timestampsData and valuesData in Block if fetchData=false.
fetchData bool
metaindex []metaindexRow
ibCache *indexBlockCache
@@ -48,6 +51,7 @@ func (ps *partSearch) reset() {
ps.p = nil
ps.tsids = ps.tsids[:0]
ps.tsidIdx = 0
ps.fetchData = true
ps.metaindex = nil
ps.ibCache = nil
ps.bhs = nil
@@ -61,7 +65,7 @@ func (ps *partSearch) reset() {
}
// Init initializes the ps with the given p, tsids and tr.
func (ps *partSearch) Init(p *part, tsids []TSID, tr TimeRange) {
func (ps *partSearch) Init(p *part, tsids []TSID, tr TimeRange, fetchData bool) {
ps.reset()
ps.p = p
@@ -72,6 +76,7 @@ func (ps *partSearch) Init(p *part, tsids []TSID, tr TimeRange) {
ps.tsids = append(ps.tsids[:0], tsids...)
}
ps.tr = tr
ps.fetchData = fetchData
ps.metaindex = p.metaindex
ps.ibCache = &p.ibCache
@@ -281,11 +286,14 @@ func (ps *partSearch) searchBHS() bool {
func (ps *partSearch) readBlock(bh *blockHeader) {
ps.Block.Reset()
ps.Block.bh = *bh
if !ps.fetchData {
return
}
ps.Block.timestampsData = bytesutil.Resize(ps.Block.timestampsData[:0], int(bh.TimestampsBlockSize))
ps.p.timestampsFile.ReadAt(ps.Block.timestampsData, int64(bh.TimestampsBlockOffset))
ps.Block.valuesData = bytesutil.Resize(ps.Block.valuesData[:0], int(bh.ValuesBlockSize))
ps.p.valuesFile.ReadAt(ps.Block.valuesData, int64(bh.ValuesBlockOffset))
ps.Block.bh = *bh
}

View File

@@ -1247,7 +1247,7 @@ func testPartSearch(t *testing.T, p *part, tsids []TSID, tr TimeRange, expectedR
func testPartSearchSerial(p *part, tsids []TSID, tr TimeRange, expectedRawBlocks []rawBlock) error {
var ps partSearch
ps.Init(p, tsids, tr)
ps.Init(p, tsids, tr, true)
var bs []Block
for ps.NextBlock() {
var b Block

View File

@@ -56,7 +56,7 @@ func (pts *partitionSearch) reset() {
// Init initializes the search in the given partition for the given tsid and tr.
//
// MustClose must be called when partition search is done.
func (pts *partitionSearch) Init(pt *partition, tsids []TSID, tr TimeRange) {
func (pts *partitionSearch) Init(pt *partition, tsids []TSID, tr TimeRange, fetchData bool) {
if pts.needClosing {
logger.Panicf("BUG: missing partitionSearch.MustClose call before the next call to Init")
}
@@ -85,7 +85,7 @@ func (pts *partitionSearch) Init(pt *partition, tsids []TSID, tr TimeRange) {
}
pts.psPool = pts.psPool[:len(pts.pws)]
for i, pw := range pts.pws {
pts.psPool[i].Init(pw.p, tsids, tr)
pts.psPool[i].Init(pw.p, tsids, tr, fetchData)
}
// Initialize the psHeap.

View File

@@ -238,7 +238,7 @@ func testPartitionSearchSerial(pt *partition, tsids []TSID, tr TimeRange, rbsExp
bs := []Block{}
var pts partitionSearch
pts.Init(pt, tsids, tr)
pts.Init(pt, tsids, tr, true)
for pts.NextBlock() {
var b Block
b.CopyFrom(pts.Block)
@@ -263,7 +263,7 @@ func testPartitionSearchSerial(pt *partition, tsids []TSID, tr TimeRange, rbsExp
}
// verify that empty tsids returns empty result
pts.Init(pt, []TSID{}, tr)
pts.Init(pt, []TSID{}, tr, true)
if pts.NextBlock() {
return fmt.Errorf("unexpected block got for an empty tsids list: %+v", pts.Block)
}
@@ -271,6 +271,16 @@ func testPartitionSearchSerial(pt *partition, tsids []TSID, tr TimeRange, rbsExp
return fmt.Errorf("unexpected error on empty tsids list: %s", err)
}
pts.MustClose()
pts.Init(pt, []TSID{}, tr, false)
if pts.NextBlock() {
return fmt.Errorf("unexpected block got for an empty tsids list: %+v", pts.Block)
}
if err := pts.Error(); err != nil {
return fmt.Errorf("unexpected error on empty tsids list: %s", err)
}
pts.MustClose()
return nil
}

View File

@@ -110,7 +110,7 @@ func (s *Search) reset() {
// Init initializes s from the given storage, tfss and tr.
//
// MustClose must be called when the search is done.
func (s *Search) Init(storage *Storage, tfss []*TagFilters, tr TimeRange, maxMetrics int) {
func (s *Search) Init(storage *Storage, tfss []*TagFilters, tr TimeRange, fetchData bool, maxMetrics int) {
if s.needClosing {
logger.Panicf("BUG: missing MustClose call before the next call to Init")
}
@@ -123,7 +123,7 @@ func (s *Search) Init(storage *Storage, tfss []*TagFilters, tr TimeRange, maxMet
// It is ok to call Init on error from storage.searchTSIDs.
// Init must be called before returning because it will fail
// on Seach.MustClose otherwise.
s.ts.Init(storage.tb, tsids, tr)
s.ts.Init(storage.tb, tsids, tr, fetchData)
if err != nil {
s.err = err

View File

@@ -193,7 +193,7 @@ func testSearch(st *Storage, tr TimeRange, mrs []MetricRow, accountsCount int) e
}
// Search
s.Init(st, []*TagFilters{tfs}, tr, 1e5)
s.Init(st, []*TagFilters{tfs}, tr, true, 1e5)
var mbs []MetricBlock
for s.NextMetricBlock() {
var b Block

View File

@@ -65,6 +65,13 @@ type Storage struct {
currHourMetricIDsUpdaterWG sync.WaitGroup
retentionWatcherWG sync.WaitGroup
tooSmallTimestampRows uint64
tooBigTimestampRows uint64
addRowsConcurrencyLimitReached uint64
addRowsConcurrencyLimitTimeout uint64
addRowsConcurrencyDroppedRows uint64
}
// OpenStorage opens storage on the given path with the given number of retention months.
@@ -271,6 +278,15 @@ func (s *Storage) idb() *indexDB {
// Metrics contains essential metrics for the Storage.
type Metrics struct {
TooSmallTimestampRows uint64
TooBigTimestampRows uint64
AddRowsConcurrencyLimitReached uint64
AddRowsConcurrencyLimitTimeout uint64
AddRowsConcurrencyDroppedRows uint64
AddRowsConcurrencyCapacity uint64
AddRowsConcurrencyCurrent uint64
TSIDCacheSize uint64
TSIDCacheSizeBytes uint64
TSIDCacheRequests uint64
@@ -308,6 +324,15 @@ func (m *Metrics) Reset() {
// UpdateMetrics updates m with metrics from s.
func (s *Storage) UpdateMetrics(m *Metrics) {
m.TooSmallTimestampRows += atomic.LoadUint64(&s.tooSmallTimestampRows)
m.TooBigTimestampRows += atomic.LoadUint64(&s.tooBigTimestampRows)
m.AddRowsConcurrencyLimitReached += atomic.LoadUint64(&s.addRowsConcurrencyLimitReached)
m.AddRowsConcurrencyLimitTimeout += atomic.LoadUint64(&s.addRowsConcurrencyLimitTimeout)
m.AddRowsConcurrencyDroppedRows += atomic.LoadUint64(&s.addRowsConcurrencyDroppedRows)
m.AddRowsConcurrencyCapacity = uint64(cap(addRowsConcurrencyCh))
m.AddRowsConcurrencyCurrent = uint64(len(addRowsConcurrencyCh))
var cs fastcache.Stats
s.tsidCache.UpdateStats(&cs)
m.TSIDCacheSize += cs.EntriesCount
@@ -713,15 +738,24 @@ func (s *Storage) AddRows(mrs []MetricRow, precisionBits uint8) error {
// Limit the number of concurrent goroutines that may add rows to the storage.
// This should prevent from out of memory errors and CPU trashing when too many
// goroutines call AddRows.
t := timerpool.Get(addRowsTimeout)
select {
case addRowsConcurrencyCh <- struct{}{}:
timerpool.Put(t)
defer func() { <-addRowsConcurrencyCh }()
case <-t.C:
timerpool.Put(t)
return fmt.Errorf("Cannot add %d rows to storage in %s, since it is overloaded with %d concurrent writers. Add more CPUs or reduce load",
len(mrs), addRowsTimeout, cap(addRowsConcurrencyCh))
default:
// Sleep for a while until giving up
atomic.AddUint64(&s.addRowsConcurrencyLimitReached, 1)
t := timerpool.Get(addRowsTimeout)
select {
case addRowsConcurrencyCh <- struct{}{}:
timerpool.Put(t)
defer func() { <-addRowsConcurrencyCh }()
case <-t.C:
timerpool.Put(t)
atomic.AddUint64(&s.addRowsConcurrencyLimitTimeout, 1)
atomic.AddUint64(&s.addRowsConcurrencyDroppedRows, uint64(len(mrs)))
return fmt.Errorf("Cannot add %d rows to storage in %s, since it is overloaded with %d concurrent writers. Add more CPUs or reduce load",
len(mrs), addRowsTimeout, cap(addRowsConcurrencyCh))
}
}
// Add rows to the storage.
@@ -760,8 +794,14 @@ func (s *Storage) add(rows []rawRow, mrs []MetricRow, precisionBits uint8) ([]ra
// doesn't know how to work with them.
continue
}
if mr.Timestamp < minTimestamp || mr.Timestamp > maxTimestamp {
// Skip rows with timestamps outside the retention.
if mr.Timestamp < minTimestamp {
// Skip rows with too small timestamps outside the retention.
atomic.AddUint64(&s.tooSmallTimestampRows, 1)
continue
}
if mr.Timestamp > maxTimestamp {
// Skip rows with too big timestamps significantly exceeding the current time.
atomic.AddUint64(&s.tooBigTimestampRows, 1)
continue
}
r := &rows[rowsLen+j]

View File

@@ -502,12 +502,24 @@ func testStorageDeleteMetrics(s *Storage, workerNum int) error {
MaxTimestamp: 2e10,
}
metricBlocksCount := func(tfs *TagFilters) int {
// Verify the number of blocks with fetchData=true
n := 0
sr.Init(s, []*TagFilters{tfs}, tr, 1e5)
sr.Init(s, []*TagFilters{tfs}, tr, true, 1e5)
for sr.NextMetricBlock() {
n++
}
sr.MustClose()
// Make sure the number of blocks with fetchData=false is the same.
m := 0
sr.Init(s, []*TagFilters{tfs}, tr, false, 1e5)
for sr.NextMetricBlock() {
m++
}
sr.MustClose()
if n != m {
return -1
}
return n
}
for i := 0; i < metricsCount; i++ {

View File

@@ -55,7 +55,7 @@ func (ts *tableSearch) reset() {
// Init initializes the ts.
//
// MustClose must be called then the tableSearch is done.
func (ts *tableSearch) Init(tb *table, tsids []TSID, tr TimeRange) {
func (ts *tableSearch) Init(tb *table, tsids []TSID, tr TimeRange, fetchData bool) {
if ts.needClosing {
logger.Panicf("BUG: missing MustClose call before the next call to Init")
}
@@ -86,7 +86,7 @@ func (ts *tableSearch) Init(tb *table, tsids []TSID, tr TimeRange) {
}
ts.ptsPool = ts.ptsPool[:len(ts.ptws)]
for i, ptw := range ts.ptws {
ts.ptsPool[i].Init(ptw.pt, tsids, tr)
ts.ptsPool[i].Init(ptw.pt, tsids, tr, fetchData)
}
// Initialize the ptsHeap.

View File

@@ -251,7 +251,7 @@ func testTableSearchSerial(tb *table, tsids []TSID, tr TimeRange, rbsExpected []
bs := []Block{}
var ts tableSearch
ts.Init(tb, tsids, tr)
ts.Init(tb, tsids, tr, true)
for ts.NextBlock() {
var b Block
b.CopyFrom(ts.Block)
@@ -276,7 +276,7 @@ func testTableSearchSerial(tb *table, tsids []TSID, tr TimeRange, rbsExpected []
}
// verify that empty tsids returns empty result
ts.Init(tb, []TSID{}, tr)
ts.Init(tb, []TSID{}, tr, true)
if ts.NextBlock() {
return fmt.Errorf("unexpected block got for an empty tsids list: %+v", ts.Block)
}
@@ -284,5 +284,15 @@ func testTableSearchSerial(tb *table, tsids []TSID, tr TimeRange, rbsExpected []
return fmt.Errorf("unexpected error on empty tsids list: %s", err)
}
ts.MustClose()
ts.Init(tb, []TSID{}, tr, false)
if ts.NextBlock() {
return fmt.Errorf("unexpected block got for an empty tsids list with fetchData=false: %+v", ts.Block)
}
if err := ts.Error(); err != nil {
return fmt.Errorf("unexpected error on empty tsids list with fetchData=false: %s", err)
}
ts.MustClose()
return nil
}

View File

@@ -26,7 +26,11 @@ func BenchmarkTableSearch(b *testing.B) {
b.Run(fmt.Sprintf("tsidsCount_%d", tsidsCount), func(b *testing.B) {
for _, tsidsSearch := range []int{1, 1e1, 1e2, 1e3, 1e4} {
b.Run(fmt.Sprintf("tsidsSearch_%d", tsidsSearch), func(b *testing.B) {
benchmarkTableSearch(b, rowsCount, tsidsCount, tsidsSearch)
for _, fetchData := range []bool{true, false} {
b.Run(fmt.Sprintf("fetchData_%v", fetchData), func(b *testing.B) {
benchmarkTableSearch(b, rowsCount, tsidsCount, tsidsSearch, fetchData)
})
}
})
}
})
@@ -103,9 +107,9 @@ func createBenchTable(b *testing.B, path string, startTimestamp int64, rowsPerIn
tb.MustClose()
}
func benchmarkTableSearch(b *testing.B, rowsCount, tsidsCount, tsidsSearch int) {
func benchmarkTableSearch(b *testing.B, rowsCount, tsidsCount, tsidsSearch int, fetchData bool) {
startTimestamp := timestampFromTime(time.Now()) - 365*24*3600*1000
rowsPerInsert := int(maxRawRowsPerPartition)
rowsPerInsert := getMaxRawRowsPerPartition()
tb := openBenchTable(b, startTimestamp, rowsPerInsert, rowsCount, tsidsCount)
tr := TimeRange{
@@ -127,7 +131,7 @@ func benchmarkTableSearch(b *testing.B, rowsCount, tsidsCount, tsidsSearch int)
for i := range tsids {
tsids[i].MetricID = 1 + uint64(i)
}
ts.Init(tb, tsids, tr)
ts.Init(tb, tsids, tr, fetchData)
for ts.NextBlock() {
}
ts.MustClose()

View File

@@ -308,6 +308,62 @@ func TestTagFilterMatchSuffix(t *testing.T) {
mismatch("bar")
match("xhttpbar")
})
t.Run("non-empty-string-regexp-negative-match", func(t *testing.T) {
value := ".+"
isNegative := true
isRegexp := true
expectedPrefix := tvNoTrailingTagSeparator("")
init(value, isNegative, isRegexp, expectedPrefix)
if len(tf.orSuffixes) != 0 {
t.Fatalf("unexpected non-zero number of or suffixes: %d; %q", len(tf.orSuffixes), tf.orSuffixes)
}
match("")
mismatch("x")
mismatch("foo")
})
t.Run("non-empty-string-regexp-match", func(t *testing.T) {
value := ".+"
isNegative := false
isRegexp := true
expectedPrefix := tvNoTrailingTagSeparator("")
init(value, isNegative, isRegexp, expectedPrefix)
if len(tf.orSuffixes) != 0 {
t.Fatalf("unexpected non-zero number of or suffixes: %d; %q", len(tf.orSuffixes), tf.orSuffixes)
}
mismatch("")
match("x")
match("foo")
})
t.Run("match-all-regexp-negative-match", func(t *testing.T) {
value := ".*"
isNegative := true
isRegexp := true
expectedPrefix := tvNoTrailingTagSeparator("")
init(value, isNegative, isRegexp, expectedPrefix)
if len(tf.orSuffixes) != 0 {
t.Fatalf("unexpected non-zero number of or suffixes: %d; %q", len(tf.orSuffixes), tf.orSuffixes)
}
mismatch("")
mismatch("x")
mismatch("foo")
})
t.Run("match-all-regexp-match", func(t *testing.T) {
value := ".*"
isNegative := false
isRegexp := true
expectedPrefix := tvNoTrailingTagSeparator("")
init(value, isNegative, isRegexp, expectedPrefix)
if len(tf.orSuffixes) != 0 {
t.Fatalf("unexpected non-zero number of or suffixes: %d; %q", len(tf.orSuffixes), tf.orSuffixes)
}
match("")
match("x")
match("foo")
})
}
func TestGetOrValues(t *testing.T) {

1
package/VAR_BUILD Normal file
View File

@@ -0,0 +1 @@
1

1
package/VAR_VERSION Normal file
View File

@@ -0,0 +1 @@
1.7.0

0
package/deb/conffile Normal file
View File

7
package/deb/control Normal file
View File

@@ -0,0 +1,7 @@
Package: victoria-metrics
Maintainer: Aliaksandr Valialkin
Depends: libc6 (>= 2.7-1)
Section: net
Priority: optional
Homepage: https://github.com/VictoriaMetrics/VictoriaMetrics
Description: VictoriaMetrics is fast, cost-effective and scalable time series database. It can be used as a long-term remote storage for Prometheus.

5
package/deb/postinst Normal file
View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -e
# Reload systemd unit
systemctl daemon-reload

5
package/deb/postrm Normal file
View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -e
# Reload systemd unit
systemctl daemon-reload

5
package/deb/prerm Normal file
View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -e
# Reload systemd unit
systemctl stop victoria-metrics

103
package/package_deb.sh Executable file
View File

@@ -0,0 +1,103 @@
#!/bin/bash
ARCH="amd64"
if [[ $# -ge 1 ]]
then
ARCH="$1"
fi
# Map to Debian architecture
if [[ "$ARCH" == "amd64" ]]; then
DEB_ARCH=amd64
EXENAME_SRC="victoria-metrics-prod"
elif [[ "$ARCH" == "arm64" ]]; then
DEB_ARCH=arm64
EXENAME_SRC="victoria-metrics-arm64-prod"
else
echo "*** Unknown arch $ARCH"
exit 1
fi
PACKDIR="./package"
TEMPDIR="${PACKDIR}/temp-deb-${DEB_ARCH}"
EXENAME_DST="victoria-metrics-prod"
# Pull in version info
VERSION=`cat ${PACKDIR}/VAR_VERSION | perl -ne 'chomp and print'`
BUILD=`cat ${PACKDIR}/VAR_BUILD | perl -ne 'chomp and print'`
# Create directories
[[ -d "${TEMPDIR}" ]] && rm -rf "${TEMPDIR}"
mkdir -p "${TEMPDIR}" && echo "*** Created : ${TEMPDIR}"
mkdir -p "${TEMPDIR}/usr/bin/"
mkdir -p "${TEMPDIR}/lib/systemd/system/"
echo "*** Version : ${VERSION}-${BUILD}"
echo "*** Arch : ${DEB_ARCH}"
OUT_DEB="victoria-metrics_${VERSION}-${BUILD}_$DEB_ARCH.deb"
echo "*** Out .deb : ${OUT_DEB}"
# Copy the binary
cp "./bin/${EXENAME_SRC}" "${TEMPDIR}/usr/bin/${EXENAME_DST}"
# Copy supporting files
cp "${PACKDIR}/victoria-metrics.service" "${TEMPDIR}/lib/systemd/system/"
# Generate debian-binary
echo "2.0" > "${TEMPDIR}/debian-binary"
# Generate control
echo "Version: $VERSION-$BUILD" > "${TEMPDIR}/control"
echo "Installed-Size:" `du -sb "${TEMPDIR}" | awk '{print int($1/1024)}'` >> "${TEMPDIR}/control"
echo "Architecture: $DEB_ARCH" >> "${TEMPDIR}/control"
cat "${PACKDIR}/deb/control" >> "${TEMPDIR}/control"
# Copy conffile
cp "${PACKDIR}/deb/conffile" "${TEMPDIR}/conffile"
# Copy postinst and postrm
cp "${PACKDIR}/deb/postinst" "${TEMPDIR}/postinst"
cp "${PACKDIR}/deb/prerm" "${TEMPDIR}/prerm"
cp "${PACKDIR}/deb/postrm" "${TEMPDIR}/postrm"
(
# Generate md5 sums
cd "${TEMPDIR}"
find ./usr ./lib -type f | while read i ; do
md5sum "$i" | sed 's/\.\///g' >> md5sums
done
# Archive control
chmod 644 control md5sums
chmod 755 postinst postrm prerm
fakeroot -- tar -c --xz -f ./control.tar.xz ./control ./md5sums ./postinst ./prerm ./postrm
# Archive data
fakeroot -- tar -c --xz -f ./data.tar.xz ./usr ./lib
# Make final archive
fakeroot -- ar -cr "${OUT_DEB}" debian-binary control.tar.xz data.tar.xz
)
ls -lh "${TEMPDIR}/${OUT_DEB}"
cp "${TEMPDIR}/${OUT_DEB}" "${PACKDIR}"
echo "*** Created : ${PACKDIR}/${OUT_DEB}"

90
package/package_rpm.sh Executable file
View File

@@ -0,0 +1,90 @@
#!/bin/bash
if ! which rpmbuild 2> /dev/null
then
echo "*** Fatal: please install rpmbuild"
exit 1
fi
ARCH="amd64"
if [[ $# -ge 1 ]]
then
ARCH="$1"
fi
# Map to Debian architecture
if [[ "$ARCH" == "amd64" ]]; then
RPM_ARCH=x86_64
EXENAME_SRC="victoria-metrics-prod"
elif [[ "$ARCH" == "arm64" ]]; then
RPM_ARCH=aarch64
EXENAME_SRC="victoria-metrics-arm64-prod"
else
echo "*** Unknown arch $ARCH"
exit 1
fi
PACKDIR="./package"
TEMPDIR="${PACKDIR}/temp-rpm-${RPM_ARCH}"
EXENAME_DST="victoria-metrics-prod"
# Pull in version info
VERSION=`cat ${PACKDIR}/VAR_VERSION | perl -ne 'chomp and print'`
BUILD=`cat ${PACKDIR}/VAR_BUILD | perl -ne 'chomp and print'`
# Create directories
[[ -d "${TEMPDIR}" ]] && rm -rf "${TEMPDIR}"
mkdir -p "${TEMPDIR}" && echo "*** Created : ${TEMPDIR}"
echo "*** Version : ${VERSION}-${BUILD}"
echo "*** Arch : ${RPM_ARCH}"
OUT_RPM="victoria-metrics-${VERSION}-${BUILD}.${RPM_ARCH}.rpm"
echo "*** Out .rpm : ${OUT_RPM}"
cat > "${TEMPDIR}/victoria-metrics.spec" <<EOF
Summary: The best long-term remote storage for Prometheus
Name: victoria-metrics
Version: ${VERSION}
Release: ${BUILD}
License: Apache License 2.0
URL: https://victoriametrics.com/
Group: System
Packager: Aliaksandr Valialkin
Requires: libpthread
Requires: libc
%description
VictoriaMetrics is fast, cost-effective and scalable time series database. It can be used as a long-term remote storage for Prometheus.
%files
%attr(0744, root, root) /usr/bin/*
%attr(0644, root, root) /lib/systemd/system/*
%prep
mkdir -p \$RPM_BUILD_ROOT/usr/bin/
mkdir -p \$RPM_BUILD_ROOT/lib/systemd/system/
cp ${PWD}/bin/${EXENAME_SRC} \$RPM_BUILD_ROOT/usr/bin/${EXENAME_DST}
cp ${PWD}/package/victoria-metrics.service \$RPM_BUILD_ROOT/lib/systemd/system/
%post
/usr/bin/systemctl daemon-reload
%preun
/usr/bin/systemctl stop victoria-metrics
%postun
/usr/bin/systemctl daemon-reload
EOF
rpmbuild -bb --target "${RPM_ARCH}" \
"${TEMPDIR}/victoria-metrics.spec"
cp "${HOME}/rpmbuild/RPMS/${RPM_ARCH}/${OUT_RPM}" "${PACKDIR}"
echo "*** Created : ${PACKDIR}/${OUT_RPM}"

16
package/rpm/README.md Normal file
View File

@@ -0,0 +1,16 @@
# victoriametrics-rpm
RPM for VictoriaMetrics - the best long-term remote storage for Prometheus
*Get and started*
```
yum -y install yum-plugin-copr
yum copr enable antonpatsev/VictoriaMetrics
yum makecache
yum -y install victoriametrics
systemctl start victoriametrics
```

View File

@@ -0,0 +1,53 @@
Name: victoriametrics
Version: 1.23.0
Release: 3
Summary: The best long-term remote storage for Prometheus
Group: Development Tools
License: ASL 2.0
URL: https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v%{version}/victoria-metrics-v%{version}.tar.gz
Source0: %{name}.service
# Use systemd for fedora >= 18, rhel >=7, SUSE >= 12 SP1 and openSUSE >= 42.1
%define use_systemd (0%{?fedora} && 0%{?fedora} >= 18) || (0%{?rhel} && 0%{?rhel} >= 7) || (!0%{?is_opensuse} && 0%{?suse_version} >=1210) || (0%{?is_opensuse} && 0%{?sle_version} >= 120100)
%description
VictoriaMetrics - the best long-term remote storage for Prometheus
%prep
curl -L %{url} > victoria-metrics.tar.gz
tar -zxf victoria-metrics.tar.gz
%install
%{__install} -m 0755 -d %{buildroot}%{_bindir}
cp victoria-metrics-prod %{buildroot}%{_bindir}/victoria-metrics-prod
%{__install} -m 0755 -d %{buildroot}/var/lib/victoria-metrics-data
%if %{use_systemd}
%{__mkdir} -p %{buildroot}%{_unitdir}
%{__install} -m644 %{SOURCE0} \
%{buildroot}%{_unitdir}/%{name}.service
%endif
%post
%if %use_systemd
/usr/bin/systemctl daemon-reload
%endif
%preun
%if %use_systemd
/usr/bin/systemctl stop %{name}
%endif
%postun
%if %use_systemd
/usr/bin/systemctl daemon-reload
%endif
%files
%{_bindir}/victoria-metrics-prod
/var/lib/victoria-metrics-data
%if %{use_systemd}
%{_unitdir}/%{name}.service
%endif

View File

@@ -0,0 +1,17 @@
[Unit]
Description=High-performance, cost-effective and scalable time series database, long-term remote storage for Prometheus
After=network.target
[Service]
Type=simple
StartLimitBurst=5
StartLimitInterval=0
Restart=on-failure
RestartSec=1
ExecStart=/usr/bin/victoria-metrics-prod -storageDataPath=/var/lib/victoria-metrics-data
ExecStop=/bin/kill -s SIGTERM $MAINPID
LimitNOFILE=65536
LimitNPROC=32000
[Install]
WantedBy=multi-user.target

View File

@@ -0,0 +1,17 @@
[Unit]
Description=High-performance, cost-effective and scalable time series database, long-term remote storage for Prometheus
After=network.target
[Service]
Type=simple
StartLimitBurst=5
StartLimitInterval=0
Restart=on-failure
RestartSec=1
ExecStart=/usr/bin/victoria-metrics-prod -storageDataPath=/var/lib/victoria-metrics-data
ExecStop=/bin/kill -s SIGTERM $MAINPID
LimitNOFILE=65536
LimitNPROC=32000
[Install]
WantedBy=multi-user.target

View File

@@ -67,7 +67,11 @@ func writeProcessMetrics(w io.Writer) {
// It is expensive obtaining `process_open_fds` when big number of file descriptors is opened,
// don't do it here.
fmt.Fprintf(w, "process_cpu_seconds_total %g\n", float64(p.Utime+p.Stime)/userHZ)
utime := float64(p.Utime) / userHZ
stime := float64(p.Stime) / userHZ
fmt.Fprintf(w, "process_cpu_seconds_system_total %g\n", stime)
fmt.Fprintf(w, "process_cpu_seconds_total %g\n", utime+stime)
fmt.Fprintf(w, "process_cpu_seconds_user_total %g\n", utime)
fmt.Fprintf(w, "process_major_pagefaults_total %d\n", p.Majflt)
fmt.Fprintf(w, "process_minor_pagefaults_total %d\n", p.Minflt)
fmt.Fprintf(w, "process_num_threads %d\n", p.NumThreads)

28
vendor/github.com/klauspost/compress/LICENSE generated vendored Normal file
View File

@@ -0,0 +1,28 @@
Copyright (c) 2012 The Go Authors. All rights reserved.
Copyright (c) 2019 Klaus Post. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Google Inc. nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

79
vendor/github.com/klauspost/compress/fse/README.md generated vendored Normal file
View File

@@ -0,0 +1,79 @@
# Finite State Entropy
This package provides Finite State Entropy encoding and decoding.
Finite State Entropy (also referenced as [tANS](https://en.wikipedia.org/wiki/Asymmetric_numeral_systems#tANS))
encoding provides a fast near-optimal symbol encoding/decoding
for byte blocks as implemented in [zstandard](https://github.com/facebook/zstd).
This can be used for compressing input with a lot of similar input values to the smallest number of bytes.
This does not perform any multi-byte [dictionary coding](https://en.wikipedia.org/wiki/Dictionary_coder) as LZ coders,
but it can be used as a secondary step to compressors (like Snappy) that does not do entropy encoding.
* [Godoc documentation](https://godoc.org/github.com/klauspost/compress/fse)
## News
* Feb 2018: First implementation released. Consider this beta software for now.
# Usage
This package provides a low level interface that allows to compress single independent blocks.
Each block is separate, and there is no built in integrity checks.
This means that the caller should keep track of block sizes and also do checksums if needed.
Compressing a block is done via the [`Compress`](https://godoc.org/github.com/klauspost/compress/fse#Compress) function.
You must provide input and will receive the output and maybe an error.
These error values can be returned:
| Error | Description |
|---------------------|-----------------------------------------------------------------------------|
| `<nil>` | Everything ok, output is returned |
| `ErrIncompressible` | Returned when input is judged to be too hard to compress |
| `ErrUseRLE` | Returned from the compressor when the input is a single byte value repeated |
| `(error)` | An internal error occurred. |
As can be seen above there are errors that will be returned even under normal operation so it is important to handle these.
To reduce allocations you can provide a [`Scratch`](https://godoc.org/github.com/klauspost/compress/fse#Scratch) object
that can be re-used for successive calls. Both compression and decompression accepts a `Scratch` object, and the same
object can be used for both.
Be aware, that when re-using a `Scratch` object that the *output* buffer is also re-used, so if you are still using this
you must set the `Out` field in the scratch to nil. The same buffer is used for compression and decompression output.
Decompressing is done by calling the [`Decompress`](https://godoc.org/github.com/klauspost/compress/fse#Decompress) function.
You must provide the output from the compression stage, at exactly the size you got back. If you receive an error back
your input was likely corrupted.
It is important to note that a successful decoding does *not* mean your output matches your original input.
There are no integrity checks, so relying on errors from the decompressor does not assure your data is valid.
For more detailed usage, see examples in the [godoc documentation](https://godoc.org/github.com/klauspost/compress/fse#pkg-examples).
# Performance
A lot of factors are affecting speed. Block sizes and compressibility of the material are primary factors.
All compression functions are currently only running on the calling goroutine so only one core will be used per block.
The compressor is significantly faster if symbols are kept as small as possible. The highest byte value of the input
is used to reduce some of the processing, so if all your input is above byte value 64 for instance, it may be
beneficial to transpose all your input values down by 64.
With moderate block sizes around 64k speed are typically 200MB/s per core for compression and
around 300MB/s decompression speed.
The same hardware typically does Huffman (deflate) encoding at 125MB/s and decompression at 100MB/s.
# Plans
At one point, more internals will be exposed to facilitate more "expert" usage of the components.
A streaming interface is also likely to be implemented. Likely compatible with [FSE stream format](https://github.com/Cyan4973/FiniteStateEntropy/blob/dev/programs/fileio.c#L261).
# Contributing
Contributions are always welcome. Be aware that adding public functions will require good justification and breaking
changes will likely not be accepted. If in doubt open an issue before writing the PR.

107
vendor/github.com/klauspost/compress/fse/bitreader.go generated vendored Normal file
View File

@@ -0,0 +1,107 @@
// Copyright 2018 Klaus Post. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Based on work Copyright (c) 2013, Yann Collet, released under BSD License.
package fse
import (
"errors"
"io"
)
// bitReader reads a bitstream in reverse.
// The last set bit indicates the start of the stream and is used
// for aligning the input.
type bitReader struct {
in []byte
off uint // next byte to read is at in[off - 1]
value uint64
bitsRead uint8
}
// init initializes and resets the bit reader.
func (b *bitReader) init(in []byte) error {
if len(in) < 1 {
return errors.New("corrupt stream: too short")
}
b.in = in
b.off = uint(len(in))
// The highest bit of the last byte indicates where to start
v := in[len(in)-1]
if v == 0 {
return errors.New("corrupt stream, did not find end of stream")
}
b.bitsRead = 64
b.value = 0
b.fill()
b.fill()
b.bitsRead += 8 - uint8(highBits(uint32(v)))
return nil
}
// getBits will return n bits. n can be 0.
func (b *bitReader) getBits(n uint8) uint16 {
if n == 0 || b.bitsRead >= 64 {
return 0
}
return b.getBitsFast(n)
}
// getBitsFast requires that at least one bit is requested every time.
// There are no checks if the buffer is filled.
func (b *bitReader) getBitsFast(n uint8) uint16 {
const regMask = 64 - 1
v := uint16((b.value << (b.bitsRead & regMask)) >> ((regMask + 1 - n) & regMask))
b.bitsRead += n
return v
}
// fillFast() will make sure at least 32 bits are available.
// There must be at least 4 bytes available.
func (b *bitReader) fillFast() {
if b.bitsRead < 32 {
return
}
// Do single re-slice to avoid bounds checks.
v := b.in[b.off-4 : b.off]
low := (uint32(v[0])) | (uint32(v[1]) << 8) | (uint32(v[2]) << 16) | (uint32(v[3]) << 24)
b.value = (b.value << 32) | uint64(low)
b.bitsRead -= 32
b.off -= 4
}
// fill() will make sure at least 32 bits are available.
func (b *bitReader) fill() {
if b.bitsRead < 32 {
return
}
if b.off > 4 {
v := b.in[b.off-4 : b.off]
low := (uint32(v[0])) | (uint32(v[1]) << 8) | (uint32(v[2]) << 16) | (uint32(v[3]) << 24)
b.value = (b.value << 32) | uint64(low)
b.bitsRead -= 32
b.off -= 4
return
}
for b.off > 0 {
b.value = (b.value << 8) | uint64(b.in[b.off-1])
b.bitsRead -= 8
b.off--
}
}
// finished returns true if all bits have been read from the bit stream.
func (b *bitReader) finished() bool {
return b.off == 0 && b.bitsRead >= 64
}
// close the bitstream and returns an error if out-of-buffer reads occurred.
func (b *bitReader) close() error {
// Release reference.
b.in = nil
if b.bitsRead > 64 {
return io.ErrUnexpectedEOF
}
return nil
}

168
vendor/github.com/klauspost/compress/fse/bitwriter.go generated vendored Normal file
View File

@@ -0,0 +1,168 @@
// Copyright 2018 Klaus Post. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Based on work Copyright (c) 2013, Yann Collet, released under BSD License.
package fse
import "fmt"
// bitWriter will write bits.
// First bit will be LSB of the first byte of output.
type bitWriter struct {
bitContainer uint64
nBits uint8
out []byte
}
// bitMask16 is bitmasks. Has extra to avoid bounds check.
var bitMask16 = [32]uint16{
0, 1, 3, 7, 0xF, 0x1F,
0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF,
0xFFF, 0x1FFF, 0x3FFF, 0x7FFF, 0xFFFF, 0xFFFF,
0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
0xFFFF, 0xFFFF} /* up to 16 bits */
// addBits16NC will add up to 16 bits.
// It will not check if there is space for them,
// so the caller must ensure that it has flushed recently.
func (b *bitWriter) addBits16NC(value uint16, bits uint8) {
b.bitContainer |= uint64(value&bitMask16[bits&31]) << (b.nBits & 63)
b.nBits += bits
}
// addBits16Clean will add up to 16 bits. value may not contain more set bits than indicated.
// It will not check if there is space for them, so the caller must ensure that it has flushed recently.
func (b *bitWriter) addBits16Clean(value uint16, bits uint8) {
b.bitContainer |= uint64(value) << (b.nBits & 63)
b.nBits += bits
}
// addBits16ZeroNC will add up to 16 bits.
// It will not check if there is space for them,
// so the caller must ensure that it has flushed recently.
// This is fastest if bits can be zero.
func (b *bitWriter) addBits16ZeroNC(value uint16, bits uint8) {
if bits == 0 {
return
}
value <<= (16 - bits) & 15
value >>= (16 - bits) & 15
b.bitContainer |= uint64(value) << (b.nBits & 63)
b.nBits += bits
}
// flush will flush all pending full bytes.
// There will be at least 56 bits available for writing when this has been called.
// Using flush32 is faster, but leaves less space for writing.
func (b *bitWriter) flush() {
v := b.nBits >> 3
switch v {
case 0:
case 1:
b.out = append(b.out,
byte(b.bitContainer),
)
case 2:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
)
case 3:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
)
case 4:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
)
case 5:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
byte(b.bitContainer>>32),
)
case 6:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
byte(b.bitContainer>>32),
byte(b.bitContainer>>40),
)
case 7:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
byte(b.bitContainer>>32),
byte(b.bitContainer>>40),
byte(b.bitContainer>>48),
)
case 8:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
byte(b.bitContainer>>32),
byte(b.bitContainer>>40),
byte(b.bitContainer>>48),
byte(b.bitContainer>>56),
)
default:
panic(fmt.Errorf("bits (%d) > 64", b.nBits))
}
b.bitContainer >>= v << 3
b.nBits &= 7
}
// flush32 will flush out, so there are at least 32 bits available for writing.
func (b *bitWriter) flush32() {
if b.nBits < 32 {
return
}
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24))
b.nBits -= 32
b.bitContainer >>= 32
}
// flushAlign will flush remaining full bytes and align to next byte boundary.
func (b *bitWriter) flushAlign() {
nbBytes := (b.nBits + 7) >> 3
for i := uint8(0); i < nbBytes; i++ {
b.out = append(b.out, byte(b.bitContainer>>(i*8)))
}
b.nBits = 0
b.bitContainer = 0
}
// close will write the alignment bit and write the final byte(s)
// to the output.
func (b *bitWriter) close() error {
// End mark
b.addBits16Clean(1, 1)
// flush until next byte.
b.flushAlign()
return nil
}
// reset and continue writing by appending to out.
func (b *bitWriter) reset(out []byte) {
b.bitContainer = 0
b.nBits = 0
b.out = out
}

56
vendor/github.com/klauspost/compress/fse/bytereader.go generated vendored Normal file
View File

@@ -0,0 +1,56 @@
// Copyright 2018 Klaus Post. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Based on work Copyright (c) 2013, Yann Collet, released under BSD License.
package fse
// byteReader provides a byte reader that reads
// little endian values from a byte stream.
// The input stream is manually advanced.
// The reader performs no bounds checks.
type byteReader struct {
b []byte
off int
}
// init will initialize the reader and set the input.
func (b *byteReader) init(in []byte) {
b.b = in
b.off = 0
}
// advance the stream b n bytes.
func (b *byteReader) advance(n uint) {
b.off += int(n)
}
// Int32 returns a little endian int32 starting at current offset.
func (b byteReader) Int32() int32 {
b2 := b.b[b.off : b.off+4 : b.off+4]
v3 := int32(b2[3])
v2 := int32(b2[2])
v1 := int32(b2[1])
v0 := int32(b2[0])
return v0 | (v1 << 8) | (v2 << 16) | (v3 << 24)
}
// Uint32 returns a little endian uint32 starting at current offset.
func (b byteReader) Uint32() uint32 {
b2 := b.b[b.off : b.off+4 : b.off+4]
v3 := uint32(b2[3])
v2 := uint32(b2[2])
v1 := uint32(b2[1])
v0 := uint32(b2[0])
return v0 | (v1 << 8) | (v2 << 16) | (v3 << 24)
}
// unread returns the unread portion of the input.
func (b byteReader) unread() []byte {
return b.b[b.off:]
}
// remain will return the number of bytes remaining.
func (b byteReader) remain() int {
return len(b.b) - b.off
}

684
vendor/github.com/klauspost/compress/fse/compress.go generated vendored Normal file
View File

@@ -0,0 +1,684 @@
// Copyright 2018 Klaus Post. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Based on work Copyright (c) 2013, Yann Collet, released under BSD License.
package fse
import (
"errors"
"fmt"
)
// Compress the input bytes. Input must be < 2GB.
// Provide a Scratch buffer to avoid memory allocations.
// Note that the output is also kept in the scratch buffer.
// If input is too hard to compress, ErrIncompressible is returned.
// If input is a single byte value repeated ErrUseRLE is returned.
func Compress(in []byte, s *Scratch) ([]byte, error) {
if len(in) <= 1 {
return nil, ErrIncompressible
}
if len(in) > (2<<30)-1 {
return nil, errors.New("input too big, must be < 2GB")
}
s, err := s.prepare(in)
if err != nil {
return nil, err
}
// Create histogram, if none was provided.
maxCount := s.maxCount
if maxCount == 0 {
maxCount = s.countSimple(in)
}
// Reset for next run.
s.clearCount = true
s.maxCount = 0
if maxCount == len(in) {
// One symbol, use RLE
return nil, ErrUseRLE
}
if maxCount == 1 || maxCount < (len(in)>>7) {
// Each symbol present maximum once or too well distributed.
return nil, ErrIncompressible
}
s.optimalTableLog()
err = s.normalizeCount()
if err != nil {
return nil, err
}
err = s.writeCount()
if err != nil {
return nil, err
}
if false {
err = s.validateNorm()
if err != nil {
return nil, err
}
}
err = s.buildCTable()
if err != nil {
return nil, err
}
err = s.compress(in)
if err != nil {
return nil, err
}
s.Out = s.bw.out
// Check if we compressed.
if len(s.Out) >= len(in) {
return nil, ErrIncompressible
}
return s.Out, nil
}
// cState contains the compression state of a stream.
type cState struct {
bw *bitWriter
stateTable []uint16
state uint16
}
// init will initialize the compression state to the first symbol of the stream.
func (c *cState) init(bw *bitWriter, ct *cTable, tableLog uint8, first symbolTransform) {
c.bw = bw
c.stateTable = ct.stateTable
nbBitsOut := (first.deltaNbBits + (1 << 15)) >> 16
im := int32((nbBitsOut << 16) - first.deltaNbBits)
lu := (im >> nbBitsOut) + first.deltaFindState
c.state = c.stateTable[lu]
return
}
// encode the output symbol provided and write it to the bitstream.
func (c *cState) encode(symbolTT symbolTransform) {
nbBitsOut := (uint32(c.state) + symbolTT.deltaNbBits) >> 16
dstState := int32(c.state>>(nbBitsOut&15)) + symbolTT.deltaFindState
c.bw.addBits16NC(c.state, uint8(nbBitsOut))
c.state = c.stateTable[dstState]
}
// encode the output symbol provided and write it to the bitstream.
func (c *cState) encodeZero(symbolTT symbolTransform) {
nbBitsOut := (uint32(c.state) + symbolTT.deltaNbBits) >> 16
dstState := int32(c.state>>(nbBitsOut&15)) + symbolTT.deltaFindState
c.bw.addBits16ZeroNC(c.state, uint8(nbBitsOut))
c.state = c.stateTable[dstState]
}
// flush will write the tablelog to the output and flush the remaining full bytes.
func (c *cState) flush(tableLog uint8) {
c.bw.flush32()
c.bw.addBits16NC(c.state, tableLog)
c.bw.flush()
}
// compress is the main compression loop that will encode the input from the last byte to the first.
func (s *Scratch) compress(src []byte) error {
if len(src) <= 2 {
return errors.New("compress: src too small")
}
tt := s.ct.symbolTT[:256]
s.bw.reset(s.Out)
// Our two states each encodes every second byte.
// Last byte encoded (first byte decoded) will always be encoded by c1.
var c1, c2 cState
// Encode so remaining size is divisible by 4.
ip := len(src)
if ip&1 == 1 {
c1.init(&s.bw, &s.ct, s.actualTableLog, tt[src[ip-1]])
c2.init(&s.bw, &s.ct, s.actualTableLog, tt[src[ip-2]])
c1.encodeZero(tt[src[ip-3]])
ip -= 3
} else {
c2.init(&s.bw, &s.ct, s.actualTableLog, tt[src[ip-1]])
c1.init(&s.bw, &s.ct, s.actualTableLog, tt[src[ip-2]])
ip -= 2
}
if ip&2 != 0 {
c2.encodeZero(tt[src[ip-1]])
c1.encodeZero(tt[src[ip-2]])
ip -= 2
}
// Main compression loop.
switch {
case !s.zeroBits && s.actualTableLog <= 8:
// We can encode 4 symbols without requiring a flush.
// We do not need to check if any output is 0 bits.
for ip >= 4 {
s.bw.flush32()
v3, v2, v1, v0 := src[ip-4], src[ip-3], src[ip-2], src[ip-1]
c2.encode(tt[v0])
c1.encode(tt[v1])
c2.encode(tt[v2])
c1.encode(tt[v3])
ip -= 4
}
case !s.zeroBits:
// We do not need to check if any output is 0 bits.
for ip >= 4 {
s.bw.flush32()
v3, v2, v1, v0 := src[ip-4], src[ip-3], src[ip-2], src[ip-1]
c2.encode(tt[v0])
c1.encode(tt[v1])
s.bw.flush32()
c2.encode(tt[v2])
c1.encode(tt[v3])
ip -= 4
}
case s.actualTableLog <= 8:
// We can encode 4 symbols without requiring a flush
for ip >= 4 {
s.bw.flush32()
v3, v2, v1, v0 := src[ip-4], src[ip-3], src[ip-2], src[ip-1]
c2.encodeZero(tt[v0])
c1.encodeZero(tt[v1])
c2.encodeZero(tt[v2])
c1.encodeZero(tt[v3])
ip -= 4
}
default:
for ip >= 4 {
s.bw.flush32()
v3, v2, v1, v0 := src[ip-4], src[ip-3], src[ip-2], src[ip-1]
c2.encodeZero(tt[v0])
c1.encodeZero(tt[v1])
s.bw.flush32()
c2.encodeZero(tt[v2])
c1.encodeZero(tt[v3])
ip -= 4
}
}
// Flush final state.
// Used to initialize state when decoding.
c2.flush(s.actualTableLog)
c1.flush(s.actualTableLog)
return s.bw.close()
}
// writeCount will write the normalized histogram count to header.
// This is read back by readNCount.
func (s *Scratch) writeCount() error {
var (
tableLog = s.actualTableLog
tableSize = 1 << tableLog
previous0 bool
charnum uint16
maxHeaderSize = ((int(s.symbolLen) * int(tableLog)) >> 3) + 3
// Write Table Size
bitStream = uint32(tableLog - minTablelog)
bitCount = uint(4)
remaining = int16(tableSize + 1) /* +1 for extra accuracy */
threshold = int16(tableSize)
nbBits = uint(tableLog + 1)
)
if cap(s.Out) < maxHeaderSize {
s.Out = make([]byte, 0, s.br.remain()+maxHeaderSize)
}
outP := uint(0)
out := s.Out[:maxHeaderSize]
// stops at 1
for remaining > 1 {
if previous0 {
start := charnum
for s.norm[charnum] == 0 {
charnum++
}
for charnum >= start+24 {
start += 24
bitStream += uint32(0xFFFF) << bitCount
out[outP] = byte(bitStream)
out[outP+1] = byte(bitStream >> 8)
outP += 2
bitStream >>= 16
}
for charnum >= start+3 {
start += 3
bitStream += 3 << bitCount
bitCount += 2
}
bitStream += uint32(charnum-start) << bitCount
bitCount += 2
if bitCount > 16 {
out[outP] = byte(bitStream)
out[outP+1] = byte(bitStream >> 8)
outP += 2
bitStream >>= 16
bitCount -= 16
}
}
count := s.norm[charnum]
charnum++
max := (2*threshold - 1) - remaining
if count < 0 {
remaining += count
} else {
remaining -= count
}
count++ // +1 for extra accuracy
if count >= threshold {
count += max // [0..max[ [max..threshold[ (...) [threshold+max 2*threshold[
}
bitStream += uint32(count) << bitCount
bitCount += nbBits
if count < max {
bitCount--
}
previous0 = count == 1
if remaining < 1 {
return errors.New("internal error: remaining<1")
}
for remaining < threshold {
nbBits--
threshold >>= 1
}
if bitCount > 16 {
out[outP] = byte(bitStream)
out[outP+1] = byte(bitStream >> 8)
outP += 2
bitStream >>= 16
bitCount -= 16
}
}
out[outP] = byte(bitStream)
out[outP+1] = byte(bitStream >> 8)
outP += (bitCount + 7) / 8
if uint16(charnum) > s.symbolLen {
return errors.New("internal error: charnum > s.symbolLen")
}
s.Out = out[:outP]
return nil
}
// symbolTransform contains the state transform for a symbol.
type symbolTransform struct {
deltaFindState int32
deltaNbBits uint32
}
// String prints values as a human readable string.
func (s symbolTransform) String() string {
return fmt.Sprintf("dnbits: %08x, fs:%d", s.deltaNbBits, s.deltaFindState)
}
// cTable contains tables used for compression.
type cTable struct {
tableSymbol []byte
stateTable []uint16
symbolTT []symbolTransform
}
// allocCtable will allocate tables needed for compression.
// If existing tables a re big enough, they are simply re-used.
func (s *Scratch) allocCtable() {
tableSize := 1 << s.actualTableLog
// get tableSymbol that is big enough.
if cap(s.ct.tableSymbol) < int(tableSize) {
s.ct.tableSymbol = make([]byte, tableSize)
}
s.ct.tableSymbol = s.ct.tableSymbol[:tableSize]
ctSize := tableSize
if cap(s.ct.stateTable) < ctSize {
s.ct.stateTable = make([]uint16, ctSize)
}
s.ct.stateTable = s.ct.stateTable[:ctSize]
if cap(s.ct.symbolTT) < 256 {
s.ct.symbolTT = make([]symbolTransform, 256)
}
s.ct.symbolTT = s.ct.symbolTT[:256]
}
// buildCTable will populate the compression table so it is ready to be used.
func (s *Scratch) buildCTable() error {
tableSize := uint32(1 << s.actualTableLog)
highThreshold := tableSize - 1
var cumul [maxSymbolValue + 2]int16
s.allocCtable()
tableSymbol := s.ct.tableSymbol[:tableSize]
// symbol start positions
{
cumul[0] = 0
for ui, v := range s.norm[:s.symbolLen-1] {
u := byte(ui) // one less than reference
if v == -1 {
// Low proba symbol
cumul[u+1] = cumul[u] + 1
tableSymbol[highThreshold] = u
highThreshold--
} else {
cumul[u+1] = cumul[u] + v
}
}
// Encode last symbol separately to avoid overflowing u
u := int(s.symbolLen - 1)
v := s.norm[s.symbolLen-1]
if v == -1 {
// Low proba symbol
cumul[u+1] = cumul[u] + 1
tableSymbol[highThreshold] = byte(u)
highThreshold--
} else {
cumul[u+1] = cumul[u] + v
}
if uint32(cumul[s.symbolLen]) != tableSize {
return fmt.Errorf("internal error: expected cumul[s.symbolLen] (%d) == tableSize (%d)", cumul[s.symbolLen], tableSize)
}
cumul[s.symbolLen] = int16(tableSize) + 1
}
// Spread symbols
s.zeroBits = false
{
step := tableStep(tableSize)
tableMask := tableSize - 1
var position uint32
// if any symbol > largeLimit, we may have 0 bits output.
largeLimit := int16(1 << (s.actualTableLog - 1))
for ui, v := range s.norm[:s.symbolLen] {
symbol := byte(ui)
if v > largeLimit {
s.zeroBits = true
}
for nbOccurrences := int16(0); nbOccurrences < v; nbOccurrences++ {
tableSymbol[position] = symbol
position = (position + step) & tableMask
for position > highThreshold {
position = (position + step) & tableMask
} /* Low proba area */
}
}
// Check if we have gone through all positions
if position != 0 {
return errors.New("position!=0")
}
}
// Build table
table := s.ct.stateTable
{
tsi := int(tableSize)
for u, v := range tableSymbol {
// TableU16 : sorted by symbol order; gives next state value
table[cumul[v]] = uint16(tsi + u)
cumul[v]++
}
}
// Build Symbol Transformation Table
{
total := int16(0)
symbolTT := s.ct.symbolTT[:s.symbolLen]
tableLog := s.actualTableLog
tl := (uint32(tableLog) << 16) - (1 << tableLog)
for i, v := range s.norm[:s.symbolLen] {
switch v {
case 0:
case -1, 1:
symbolTT[i].deltaNbBits = tl
symbolTT[i].deltaFindState = int32(total - 1)
total++
default:
maxBitsOut := uint32(tableLog) - highBits(uint32(v-1))
minStatePlus := uint32(v) << maxBitsOut
symbolTT[i].deltaNbBits = (maxBitsOut << 16) - minStatePlus
symbolTT[i].deltaFindState = int32(total - v)
total += v
}
}
if total != int16(tableSize) {
return fmt.Errorf("total mismatch %d (got) != %d (want)", total, tableSize)
}
}
return nil
}
// countSimple will create a simple histogram in s.count.
// Returns the biggest count.
// Does not update s.clearCount.
func (s *Scratch) countSimple(in []byte) (max int) {
for _, v := range in {
s.count[v]++
}
m := uint32(0)
for i, v := range s.count[:] {
if v > m {
m = v
}
if v > 0 {
s.symbolLen = uint16(i) + 1
}
}
return int(m)
}
// minTableLog provides the minimum logSize to safely represent a distribution.
func (s *Scratch) minTableLog() uint8 {
minBitsSrc := highBits(uint32(s.br.remain()-1)) + 1
minBitsSymbols := highBits(uint32(s.symbolLen-1)) + 2
if minBitsSrc < minBitsSymbols {
return uint8(minBitsSrc)
}
return uint8(minBitsSymbols)
}
// optimalTableLog calculates and sets the optimal tableLog in s.actualTableLog
func (s *Scratch) optimalTableLog() {
tableLog := s.TableLog
minBits := s.minTableLog()
maxBitsSrc := uint8(highBits(uint32(s.br.remain()-1))) - 2
if maxBitsSrc < tableLog {
// Accuracy can be reduced
tableLog = maxBitsSrc
}
if minBits > tableLog {
tableLog = minBits
}
// Need a minimum to safely represent all symbol values
if tableLog < minTablelog {
tableLog = minTablelog
}
if tableLog > maxTableLog {
tableLog = maxTableLog
}
s.actualTableLog = tableLog
}
var rtbTable = [...]uint32{0, 473195, 504333, 520860, 550000, 700000, 750000, 830000}
// normalizeCount will normalize the count of the symbols so
// the total is equal to the table size.
func (s *Scratch) normalizeCount() error {
var (
tableLog = s.actualTableLog
scale = 62 - uint64(tableLog)
step = (1 << 62) / uint64(s.br.remain())
vStep = uint64(1) << (scale - 20)
stillToDistribute = int16(1 << tableLog)
largest int
largestP int16
lowThreshold = (uint32)(s.br.remain() >> tableLog)
)
for i, cnt := range s.count[:s.symbolLen] {
// already handled
// if (count[s] == s.length) return 0; /* rle special case */
if cnt == 0 {
s.norm[i] = 0
continue
}
if cnt <= lowThreshold {
s.norm[i] = -1
stillToDistribute--
} else {
proba := (int16)((uint64(cnt) * step) >> scale)
if proba < 8 {
restToBeat := vStep * uint64(rtbTable[proba])
v := uint64(cnt)*step - (uint64(proba) << scale)
if v > restToBeat {
proba++
}
}
if proba > largestP {
largestP = proba
largest = i
}
s.norm[i] = proba
stillToDistribute -= proba
}
}
if -stillToDistribute >= (s.norm[largest] >> 1) {
// corner case, need another normalization method
return s.normalizeCount2()
}
s.norm[largest] += stillToDistribute
return nil
}
// Secondary normalization method.
// To be used when primary method fails.
func (s *Scratch) normalizeCount2() error {
const notYetAssigned = -2
var (
distributed uint32
total = uint32(s.br.remain())
tableLog = s.actualTableLog
lowThreshold = uint32(total >> tableLog)
lowOne = uint32((total * 3) >> (tableLog + 1))
)
for i, cnt := range s.count[:s.symbolLen] {
if cnt == 0 {
s.norm[i] = 0
continue
}
if cnt <= lowThreshold {
s.norm[i] = -1
distributed++
total -= cnt
continue
}
if cnt <= lowOne {
s.norm[i] = 1
distributed++
total -= cnt
continue
}
s.norm[i] = notYetAssigned
}
toDistribute := (1 << tableLog) - distributed
if (total / toDistribute) > lowOne {
// risk of rounding to zero
lowOne = uint32((total * 3) / (toDistribute * 2))
for i, cnt := range s.count[:s.symbolLen] {
if (s.norm[i] == notYetAssigned) && (cnt <= lowOne) {
s.norm[i] = 1
distributed++
total -= cnt
continue
}
}
toDistribute = (1 << tableLog) - distributed
}
if distributed == uint32(s.symbolLen)+1 {
// all values are pretty poor;
// probably incompressible data (should have already been detected);
// find max, then give all remaining points to max
var maxV int
var maxC uint32
for i, cnt := range s.count[:s.symbolLen] {
if cnt > maxC {
maxV = i
maxC = cnt
}
}
s.norm[maxV] += int16(toDistribute)
return nil
}
if total == 0 {
// all of the symbols were low enough for the lowOne or lowThreshold
for i := uint32(0); toDistribute > 0; i = (i + 1) % (uint32(s.symbolLen)) {
if s.norm[i] > 0 {
toDistribute--
s.norm[i]++
}
}
return nil
}
var (
vStepLog = 62 - uint64(tableLog)
mid = uint64((1 << (vStepLog - 1)) - 1)
rStep = (((1 << vStepLog) * uint64(toDistribute)) + mid) / uint64(total) // scale on remaining
tmpTotal = mid
)
for i, cnt := range s.count[:s.symbolLen] {
if s.norm[i] == notYetAssigned {
var (
end = tmpTotal + uint64(cnt)*rStep
sStart = uint32(tmpTotal >> vStepLog)
sEnd = uint32(end >> vStepLog)
weight = sEnd - sStart
)
if weight < 1 {
return errors.New("weight < 1")
}
s.norm[i] = int16(weight)
tmpTotal = end
}
}
return nil
}
// validateNorm validates the normalized histogram table.
func (s *Scratch) validateNorm() (err error) {
var total int
for _, v := range s.norm[:s.symbolLen] {
if v >= 0 {
total += int(v)
} else {
total -= int(v)
}
}
defer func() {
if err == nil {
return
}
fmt.Printf("selected TableLog: %d, Symbol length: %d\n", s.actualTableLog, s.symbolLen)
for i, v := range s.norm[:s.symbolLen] {
fmt.Printf("%3d: %5d -> %4d \n", i, s.count[i], v)
}
}()
if total != (1 << s.actualTableLog) {
return fmt.Errorf("warning: Total == %d != %d", total, 1<<s.actualTableLog)
}
for i, v := range s.count[s.symbolLen:] {
if v != 0 {
return fmt.Errorf("warning: Found symbol out of range, %d after cut", i)
}
}
return nil
}

370
vendor/github.com/klauspost/compress/fse/decompress.go generated vendored Normal file
View File

@@ -0,0 +1,370 @@
package fse
import (
"errors"
"fmt"
)
const (
tablelogAbsoluteMax = 15
)
// Decompress a block of data.
// You can provide a scratch buffer to avoid allocations.
// If nil is provided a temporary one will be allocated.
// It is possible, but by no way guaranteed that corrupt data will
// return an error.
// It is up to the caller to verify integrity of the returned data.
// Use a predefined Scrach to set maximum acceptable output size.
func Decompress(b []byte, s *Scratch) ([]byte, error) {
s, err := s.prepare(b)
if err != nil {
return nil, err
}
s.Out = s.Out[:0]
err = s.readNCount()
if err != nil {
return nil, err
}
err = s.buildDtable()
if err != nil {
return nil, err
}
err = s.decompress()
if err != nil {
return nil, err
}
return s.Out, nil
}
// readNCount will read the symbol distribution so decoding tables can be constructed.
func (s *Scratch) readNCount() error {
var (
charnum uint16
previous0 bool
b = &s.br
)
iend := b.remain()
if iend < 4 {
return errors.New("input too small")
}
bitStream := b.Uint32()
nbBits := uint((bitStream & 0xF) + minTablelog) // extract tableLog
if nbBits > tablelogAbsoluteMax {
return errors.New("tableLog too large")
}
bitStream >>= 4
bitCount := uint(4)
s.actualTableLog = uint8(nbBits)
remaining := int32((1 << nbBits) + 1)
threshold := int32(1 << nbBits)
gotTotal := int32(0)
nbBits++
for remaining > 1 {
if previous0 {
n0 := charnum
for (bitStream & 0xFFFF) == 0xFFFF {
n0 += 24
if b.off < iend-5 {
b.advance(2)
bitStream = b.Uint32() >> bitCount
} else {
bitStream >>= 16
bitCount += 16
}
}
for (bitStream & 3) == 3 {
n0 += 3
bitStream >>= 2
bitCount += 2
}
n0 += uint16(bitStream & 3)
bitCount += 2
if n0 > maxSymbolValue {
return errors.New("maxSymbolValue too small")
}
for charnum < n0 {
s.norm[charnum&0xff] = 0
charnum++
}
if b.off <= iend-7 || b.off+int(bitCount>>3) <= iend-4 {
b.advance(bitCount >> 3)
bitCount &= 7
bitStream = b.Uint32() >> bitCount
} else {
bitStream >>= 2
}
}
max := (2*(threshold) - 1) - (remaining)
var count int32
if (int32(bitStream) & (threshold - 1)) < max {
count = int32(bitStream) & (threshold - 1)
bitCount += nbBits - 1
} else {
count = int32(bitStream) & (2*threshold - 1)
if count >= threshold {
count -= max
}
bitCount += nbBits
}
count-- // extra accuracy
if count < 0 {
// -1 means +1
remaining += count
gotTotal -= count
} else {
remaining -= count
gotTotal += count
}
s.norm[charnum&0xff] = int16(count)
charnum++
previous0 = count == 0
for remaining < threshold {
nbBits--
threshold >>= 1
}
if b.off <= iend-7 || b.off+int(bitCount>>3) <= iend-4 {
b.advance(bitCount >> 3)
bitCount &= 7
} else {
bitCount -= (uint)(8 * (len(b.b) - 4 - b.off))
b.off = len(b.b) - 4
}
bitStream = b.Uint32() >> (bitCount & 31)
}
s.symbolLen = charnum
if s.symbolLen <= 1 {
return fmt.Errorf("symbolLen (%d) too small", s.symbolLen)
}
if s.symbolLen > maxSymbolValue+1 {
return fmt.Errorf("symbolLen (%d) too big", s.symbolLen)
}
if remaining != 1 {
return fmt.Errorf("corruption detected (remaining %d != 1)", remaining)
}
if bitCount > 32 {
return fmt.Errorf("corruption detected (bitCount %d > 32)", bitCount)
}
if gotTotal != 1<<s.actualTableLog {
return fmt.Errorf("corruption detected (total %d != %d)", gotTotal, 1<<s.actualTableLog)
}
b.advance((bitCount + 7) >> 3)
return nil
}
// decSymbol contains information about a state entry,
// Including the state offset base, the output symbol and
// the number of bits to read for the low part of the destination state.
type decSymbol struct {
newState uint16
symbol uint8
nbBits uint8
}
// allocDtable will allocate decoding tables if they are not big enough.
func (s *Scratch) allocDtable() {
tableSize := 1 << s.actualTableLog
if cap(s.decTable) < int(tableSize) {
s.decTable = make([]decSymbol, tableSize)
}
s.decTable = s.decTable[:tableSize]
if cap(s.ct.tableSymbol) < 256 {
s.ct.tableSymbol = make([]byte, 256)
}
s.ct.tableSymbol = s.ct.tableSymbol[:256]
if cap(s.ct.stateTable) < 256 {
s.ct.stateTable = make([]uint16, 256)
}
s.ct.stateTable = s.ct.stateTable[:256]
}
// buildDtable will build the decoding table.
func (s *Scratch) buildDtable() error {
tableSize := uint32(1 << s.actualTableLog)
highThreshold := tableSize - 1
s.allocDtable()
symbolNext := s.ct.stateTable[:256]
// Init, lay down lowprob symbols
s.zeroBits = false
{
largeLimit := int16(1 << (s.actualTableLog - 1))
for i, v := range s.norm[:s.symbolLen] {
if v == -1 {
s.decTable[highThreshold].symbol = uint8(i)
highThreshold--
symbolNext[i] = 1
} else {
if v >= largeLimit {
s.zeroBits = true
}
symbolNext[i] = uint16(v)
}
}
}
// Spread symbols
{
tableMask := tableSize - 1
step := tableStep(tableSize)
position := uint32(0)
for ss, v := range s.norm[:s.symbolLen] {
for i := 0; i < int(v); i++ {
s.decTable[position].symbol = uint8(ss)
position = (position + step) & tableMask
for position > highThreshold {
// lowprob area
position = (position + step) & tableMask
}
}
}
if position != 0 {
// position must reach all cells once, otherwise normalizedCounter is incorrect
return errors.New("corrupted input (position != 0)")
}
}
// Build Decoding table
{
tableSize := uint16(1 << s.actualTableLog)
for u, v := range s.decTable {
symbol := v.symbol
nextState := symbolNext[symbol]
symbolNext[symbol] = nextState + 1
nBits := s.actualTableLog - byte(highBits(uint32(nextState)))
s.decTable[u].nbBits = nBits
newState := (nextState << nBits) - tableSize
if newState > tableSize {
return fmt.Errorf("newState (%d) outside table size (%d)", newState, tableSize)
}
if newState == uint16(u) && nBits == 0 {
// Seems weird that this is possible with nbits > 0.
return fmt.Errorf("newState (%d) == oldState (%d) and no bits", newState, u)
}
s.decTable[u].newState = newState
}
}
return nil
}
// decompress will decompress the bitstream.
// If the buffer is over-read an error is returned.
func (s *Scratch) decompress() error {
br := &s.bits
br.init(s.br.unread())
var s1, s2 decoder
// Initialize and decode first state and symbol.
s1.init(br, s.decTable, s.actualTableLog)
s2.init(br, s.decTable, s.actualTableLog)
// Use temp table to avoid bound checks/append penalty.
var tmp = s.ct.tableSymbol[:256]
var off uint8
// Main part
if !s.zeroBits {
for br.off >= 8 {
br.fillFast()
tmp[off+0] = s1.nextFast()
tmp[off+1] = s2.nextFast()
br.fillFast()
tmp[off+2] = s1.nextFast()
tmp[off+3] = s2.nextFast()
off += 4
if off == 0 {
s.Out = append(s.Out, tmp...)
}
}
} else {
for br.off >= 8 {
br.fillFast()
tmp[off+0] = s1.next()
tmp[off+1] = s2.next()
br.fillFast()
tmp[off+2] = s1.next()
tmp[off+3] = s2.next()
off += 4
if off == 0 {
s.Out = append(s.Out, tmp...)
off = 0
if len(s.Out) >= s.DecompressLimit {
return fmt.Errorf("output size (%d) > DecompressLimit (%d)", len(s.Out), s.DecompressLimit)
}
}
}
}
s.Out = append(s.Out, tmp[:off]...)
// Final bits, a bit more expensive check
for {
if s1.finished() {
s.Out = append(s.Out, s1.final(), s2.final())
break
}
br.fill()
s.Out = append(s.Out, s1.next())
if s2.finished() {
s.Out = append(s.Out, s2.final(), s1.final())
break
}
s.Out = append(s.Out, s2.next())
if len(s.Out) >= s.DecompressLimit {
return fmt.Errorf("output size (%d) > DecompressLimit (%d)", len(s.Out), s.DecompressLimit)
}
}
return br.close()
}
// decoder keeps track of the current state and updates it from the bitstream.
type decoder struct {
state uint16
br *bitReader
dt []decSymbol
}
// init will initialize the decoder and read the first state from the stream.
func (d *decoder) init(in *bitReader, dt []decSymbol, tableLog uint8) {
d.dt = dt
d.br = in
d.state = uint16(in.getBits(tableLog))
}
// next returns the next symbol and sets the next state.
// At least tablelog bits must be available in the bit reader.
func (d *decoder) next() uint8 {
n := &d.dt[d.state]
lowBits := d.br.getBits(n.nbBits)
d.state = n.newState + lowBits
return n.symbol
}
// finished returns true if all bits have been read from the bitstream
// and the next state would require reading bits from the input.
func (d *decoder) finished() bool {
return d.br.finished() && d.dt[d.state].nbBits > 0
}
// final returns the current state symbol without decoding the next.
func (d *decoder) final() uint8 {
return d.dt[d.state].symbol
}
// nextFast returns the next symbol and sets the next state.
// This can only be used if no symbols are 0 bits.
// At least tablelog bits must be available in the bit reader.
func (d *decoder) nextFast() uint8 {
n := d.dt[d.state]
lowBits := d.br.getBitsFast(n.nbBits)
d.state = n.newState + lowBits
return n.symbol
}

143
vendor/github.com/klauspost/compress/fse/fse.go generated vendored Normal file
View File

@@ -0,0 +1,143 @@
// Copyright 2018 Klaus Post. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Based on work Copyright (c) 2013, Yann Collet, released under BSD License.
// Package fse provides Finite State Entropy encoding and decoding.
//
// Finite State Entropy encoding provides a fast near-optimal symbol encoding/decoding
// for byte blocks as implemented in zstd.
//
// See https://github.com/klauspost/compress/tree/master/fse for more information.
package fse
import (
"errors"
"fmt"
"math/bits"
)
const (
/*!MEMORY_USAGE :
* Memory usage formula : N->2^N Bytes (examples : 10 -> 1KB; 12 -> 4KB ; 16 -> 64KB; 20 -> 1MB; etc.)
* Increasing memory usage improves compression ratio
* Reduced memory usage can improve speed, due to cache effect
* Recommended max value is 14, for 16KB, which nicely fits into Intel x86 L1 cache */
maxMemoryUsage = 14
defaultMemoryUsage = 13
maxTableLog = maxMemoryUsage - 2
maxTablesize = 1 << maxTableLog
defaultTablelog = defaultMemoryUsage - 2
minTablelog = 5
maxSymbolValue = 255
)
var (
// ErrIncompressible is returned when input is judged to be too hard to compress.
ErrIncompressible = errors.New("input is not compressible")
// ErrUseRLE is returned from the compressor when the input is a single byte value repeated.
ErrUseRLE = errors.New("input is single value repeated")
)
// Scratch provides temporary storage for compression and decompression.
type Scratch struct {
// Private
count [maxSymbolValue + 1]uint32
norm [maxSymbolValue + 1]int16
symbolLen uint16 // Length of active part of the symbol table.
actualTableLog uint8 // Selected tablelog.
br byteReader
bits bitReader
bw bitWriter
ct cTable // Compression tables.
decTable []decSymbol // Decompression table.
zeroBits bool // no bits has prob > 50%.
clearCount bool // clear count
maxCount int // count of the most probable symbol
// Per block parameters.
// These can be used to override compression parameters of the block.
// Do not touch, unless you know what you are doing.
// Out is output buffer.
// If the scratch is re-used before the caller is done processing the output,
// set this field to nil.
// Otherwise the output buffer will be re-used for next Compression/Decompression step
// and allocation will be avoided.
Out []byte
// MaxSymbolValue will override the maximum symbol value of the next block.
MaxSymbolValue uint8
// TableLog will attempt to override the tablelog for the next block.
TableLog uint8
// DecompressLimit limits the maximum decoded size acceptable.
// If > 0 decompression will stop when approximately this many bytes
// has been decoded.
// If 0, maximum size will be 2GB.
DecompressLimit int
}
// Histogram allows to populate the histogram and skip that step in the compression,
// It otherwise allows to inspect the histogram when compression is done.
// To indicate that you have populated the histogram call HistogramFinished
// with the value of the highest populated symbol, as well as the number of entries
// in the most populated entry. These are accepted at face value.
// The returned slice will always be length 256.
func (s *Scratch) Histogram() []uint32 {
return s.count[:]
}
// HistogramFinished can be called to indicate that the histogram has been populated.
// maxSymbol is the index of the highest set symbol of the next data segment.
// maxCount is the number of entries in the most populated entry.
// These are accepted at face value.
func (s *Scratch) HistogramFinished(maxSymbol uint8, maxCount int) {
s.maxCount = maxCount
s.symbolLen = uint16(maxSymbol) + 1
s.clearCount = maxCount != 0
}
// prepare will prepare and allocate scratch tables used for both compression and decompression.
func (s *Scratch) prepare(in []byte) (*Scratch, error) {
if s == nil {
s = &Scratch{}
}
if s.MaxSymbolValue == 0 {
s.MaxSymbolValue = 255
}
if s.TableLog == 0 {
s.TableLog = defaultTablelog
}
if s.TableLog > maxTableLog {
return nil, fmt.Errorf("tableLog (%d) > maxTableLog (%d)", s.TableLog, maxTableLog)
}
if cap(s.Out) == 0 {
s.Out = make([]byte, 0, len(in))
}
if s.clearCount && s.maxCount == 0 {
for i := range s.count {
s.count[i] = 0
}
s.clearCount = false
}
s.br.init(in)
if s.DecompressLimit == 0 {
// Max size 2GB.
s.DecompressLimit = (2 << 30) - 1
}
return s, nil
}
// tableStep returns the next table index.
func tableStep(tableSize uint32) uint32 {
return (tableSize >> 1) + (tableSize >> 3) + 3
}
func highBits(val uint32) (n uint32) {
return uint32(bits.Len32(val) - 1)
}

View File

@@ -0,0 +1 @@
/huff0-fuzz.zip

87
vendor/github.com/klauspost/compress/huff0/README.md generated vendored Normal file
View File

@@ -0,0 +1,87 @@
# Huff0 entropy compression
This package provides Huff0 encoding and decoding as used in zstd.
[Huff0](https://github.com/Cyan4973/FiniteStateEntropy#new-generation-entropy-coders),
a Huffman codec designed for modern CPU, featuring OoO (Out of Order) operations on multiple ALU
(Arithmetic Logic Unit), achieving extremely fast compression and decompression speeds.
This can be used for compressing input with a lot of similar input values to the smallest number of bytes.
This does not perform any multi-byte [dictionary coding](https://en.wikipedia.org/wiki/Dictionary_coder) as LZ coders,
but it can be used as a secondary step to compressors (like Snappy) that does not do entropy encoding.
* [Godoc documentation](https://godoc.org/github.com/klauspost/compress/huff0)
THIS PACKAGE IS NOT CONSIDERED STABLE AND API OR ENCODING MAY CHANGE IN THE FUTURE.
## News
* Mar 2018: First implementation released. Consider this beta software for now.
# Usage
This package provides a low level interface that allows to compress single independent blocks.
Each block is separate, and there is no built in integrity checks.
This means that the caller should keep track of block sizes and also do checksums if needed.
Compressing a block is done via the [`Compress1X`](https://godoc.org/github.com/klauspost/compress/huff0#Compress1X) and
[`Compress4X`](https://godoc.org/github.com/klauspost/compress/huff0#Compress4X) functions.
You must provide input and will receive the output and maybe an error.
These error values can be returned:
| Error | Description |
|---------------------|-----------------------------------------------------------------------------|
| `<nil>` | Everything ok, output is returned |
| `ErrIncompressible` | Returned when input is judged to be too hard to compress |
| `ErrUseRLE` | Returned from the compressor when the input is a single byte value repeated |
| `ErrTooBig` | Returned if the input block exceeds the maximum allowed size (128 Kib) |
| `(error)` | An internal error occurred. |
As can be seen above some of there are errors that will be returned even under normal operation so it is important to handle these.
To reduce allocations you can provide a [`Scratch`](https://godoc.org/github.com/klauspost/compress/huff0#Scratch) object
that can be re-used for successive calls. Both compression and decompression accepts a `Scratch` object, and the same
object can be used for both.
Be aware, that when re-using a `Scratch` object that the *output* buffer is also re-used, so if you are still using this
you must set the `Out` field in the scratch to nil. The same buffer is used for compression and decompression output.
The `Scratch` object will retain state that allows to re-use previous tables for encoding and decoding.
## Tables and re-use
Huff0 allows for reusing tables from the previous block to save space if that is expected to give better/faster results.
The Scratch object allows you to set a [`ReusePolicy`](https://godoc.org/github.com/klauspost/compress/huff0#ReusePolicy)
that controls this behaviour. See the documentation for details. This can be altered between each block.
Do however note that this information is *not* stored in the output block and it is up to the users of the package to
record whether [`ReadTable`](https://godoc.org/github.com/klauspost/compress/huff0#ReadTable) should be called,
based on the boolean reported back from the CompressXX call.
If you want to store the table separate from the data, you can access them as `OutData` and `OutTable` on the
[`Scratch`](https://godoc.org/github.com/klauspost/compress/huff0#Scratch) object.
## Decompressing
The first part of decoding is to initialize the decoding table through [`ReadTable`](https://godoc.org/github.com/klauspost/compress/huff0#ReadTable).
This will initialize the decoding tables.
You can supply the complete block to `ReadTable` and it will return the data part of the block
which can be given to the decompressor.
Decompressing is done by calling the [`Decompress1X`](https://godoc.org/github.com/klauspost/compress/huff0#Scratch.Decompress1X)
or [`Decompress4X`](https://godoc.org/github.com/klauspost/compress/huff0#Scratch.Decompress4X) function.
You must provide the output from the compression stage, at exactly the size you got back. If you receive an error back
your input was likely corrupted.
It is important to note that a successful decoding does *not* mean your output matches your original input.
There are no integrity checks, so relying on errors from the decompressor does not assure your data is valid.
# Contributing
Contributions are always welcome. Be aware that adding public functions will require good justification and breaking
changes will likely not be accepted. If in doubt open an issue before writing the PR.

115
vendor/github.com/klauspost/compress/huff0/bitreader.go generated vendored Normal file
View File

@@ -0,0 +1,115 @@
// Copyright 2018 Klaus Post. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Based on work Copyright (c) 2013, Yann Collet, released under BSD License.
package huff0
import (
"errors"
"io"
)
// bitReader reads a bitstream in reverse.
// The last set bit indicates the start of the stream and is used
// for aligning the input.
type bitReader struct {
in []byte
off uint // next byte to read is at in[off - 1]
value uint64
bitsRead uint8
}
// init initializes and resets the bit reader.
func (b *bitReader) init(in []byte) error {
if len(in) < 1 {
return errors.New("corrupt stream: too short")
}
b.in = in
b.off = uint(len(in))
// The highest bit of the last byte indicates where to start
v := in[len(in)-1]
if v == 0 {
return errors.New("corrupt stream, did not find end of stream")
}
b.bitsRead = 64
b.value = 0
b.fill()
b.fill()
b.bitsRead += 8 - uint8(highBit32(uint32(v)))
return nil
}
// getBits will return n bits. n can be 0.
func (b *bitReader) getBits(n uint8) uint16 {
if n == 0 || b.bitsRead >= 64 {
return 0
}
return b.getBitsFast(n)
}
// getBitsFast requires that at least one bit is requested every time.
// There are no checks if the buffer is filled.
func (b *bitReader) getBitsFast(n uint8) uint16 {
const regMask = 64 - 1
v := uint16((b.value << (b.bitsRead & regMask)) >> ((regMask + 1 - n) & regMask))
b.bitsRead += n
return v
}
// peekBitsFast requires that at least one bit is requested every time.
// There are no checks if the buffer is filled.
func (b *bitReader) peekBitsFast(n uint8) uint16 {
const regMask = 64 - 1
v := uint16((b.value << (b.bitsRead & regMask)) >> ((regMask + 1 - n) & regMask))
return v
}
// fillFast() will make sure at least 32 bits are available.
// There must be at least 4 bytes available.
func (b *bitReader) fillFast() {
if b.bitsRead < 32 {
return
}
// Do single re-slice to avoid bounds checks.
v := b.in[b.off-4 : b.off]
low := (uint32(v[0])) | (uint32(v[1]) << 8) | (uint32(v[2]) << 16) | (uint32(v[3]) << 24)
b.value = (b.value << 32) | uint64(low)
b.bitsRead -= 32
b.off -= 4
}
// fill() will make sure at least 32 bits are available.
func (b *bitReader) fill() {
if b.bitsRead < 32 {
return
}
if b.off > 4 {
v := b.in[b.off-4 : b.off]
low := (uint32(v[0])) | (uint32(v[1]) << 8) | (uint32(v[2]) << 16) | (uint32(v[3]) << 24)
b.value = (b.value << 32) | uint64(low)
b.bitsRead -= 32
b.off -= 4
return
}
for b.off > 0 {
b.value = (b.value << 8) | uint64(b.in[b.off-1])
b.bitsRead -= 8
b.off--
}
}
// finished returns true if all bits have been read from the bit stream.
func (b *bitReader) finished() bool {
return b.off == 0 && b.bitsRead >= 64
}
// close the bitstream and returns an error if out-of-buffer reads occurred.
func (b *bitReader) close() error {
// Release reference.
b.in = nil
if b.bitsRead > 64 {
return io.ErrUnexpectedEOF
}
return nil
}

186
vendor/github.com/klauspost/compress/huff0/bitwriter.go generated vendored Normal file
View File

@@ -0,0 +1,186 @@
// Copyright 2018 Klaus Post. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Based on work Copyright (c) 2013, Yann Collet, released under BSD License.
package huff0
import "fmt"
// bitWriter will write bits.
// First bit will be LSB of the first byte of output.
type bitWriter struct {
bitContainer uint64
nBits uint8
out []byte
}
// bitMask16 is bitmasks. Has extra to avoid bounds check.
var bitMask16 = [32]uint16{
0, 1, 3, 7, 0xF, 0x1F,
0x3F, 0x7F, 0xFF, 0x1FF, 0x3FF, 0x7FF,
0xFFF, 0x1FFF, 0x3FFF, 0x7FFF, 0xFFFF, 0xFFFF,
0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
0xFFFF, 0xFFFF} /* up to 16 bits */
// addBits16NC will add up to 16 bits.
// It will not check if there is space for them,
// so the caller must ensure that it has flushed recently.
func (b *bitWriter) addBits16NC(value uint16, bits uint8) {
b.bitContainer |= uint64(value&bitMask16[bits&31]) << (b.nBits & 63)
b.nBits += bits
}
// addBits16Clean will add up to 16 bits. value may not contain more set bits than indicated.
// It will not check if there is space for them, so the caller must ensure that it has flushed recently.
func (b *bitWriter) addBits16Clean(value uint16, bits uint8) {
b.bitContainer |= uint64(value) << (b.nBits & 63)
b.nBits += bits
}
// addBits16Clean will add up to 16 bits. value may not contain more set bits than indicated.
// It will not check if there is space for them, so the caller must ensure that it has flushed recently.
func (b *bitWriter) encSymbol(ct cTable, symbol byte) {
enc := ct[symbol]
b.bitContainer |= uint64(enc.val) << (b.nBits & 63)
b.nBits += enc.nBits
}
// addBits16ZeroNC will add up to 16 bits.
// It will not check if there is space for them,
// so the caller must ensure that it has flushed recently.
// This is fastest if bits can be zero.
func (b *bitWriter) addBits16ZeroNC(value uint16, bits uint8) {
if bits == 0 {
return
}
value <<= (16 - bits) & 15
value >>= (16 - bits) & 15
b.bitContainer |= uint64(value) << (b.nBits & 63)
b.nBits += bits
}
// flush will flush all pending full bytes.
// There will be at least 56 bits available for writing when this has been called.
// Using flush32 is faster, but leaves less space for writing.
func (b *bitWriter) flush() {
v := b.nBits >> 3
switch v {
case 0:
return
case 1:
b.out = append(b.out,
byte(b.bitContainer),
)
b.bitContainer >>= 1 << 3
case 2:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
)
b.bitContainer >>= 2 << 3
case 3:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
)
b.bitContainer >>= 3 << 3
case 4:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
)
b.bitContainer >>= 4 << 3
case 5:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
byte(b.bitContainer>>32),
)
b.bitContainer >>= 5 << 3
case 6:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
byte(b.bitContainer>>32),
byte(b.bitContainer>>40),
)
b.bitContainer >>= 6 << 3
case 7:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
byte(b.bitContainer>>32),
byte(b.bitContainer>>40),
byte(b.bitContainer>>48),
)
b.bitContainer >>= 7 << 3
case 8:
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24),
byte(b.bitContainer>>32),
byte(b.bitContainer>>40),
byte(b.bitContainer>>48),
byte(b.bitContainer>>56),
)
b.bitContainer = 0
b.nBits = 0
return
default:
panic(fmt.Errorf("bits (%d) > 64", b.nBits))
}
b.nBits &= 7
}
// flush32 will flush out, so there are at least 32 bits available for writing.
func (b *bitWriter) flush32() {
if b.nBits < 32 {
return
}
b.out = append(b.out,
byte(b.bitContainer),
byte(b.bitContainer>>8),
byte(b.bitContainer>>16),
byte(b.bitContainer>>24))
b.nBits -= 32
b.bitContainer >>= 32
}
// flushAlign will flush remaining full bytes and align to next byte boundary.
func (b *bitWriter) flushAlign() {
nbBytes := (b.nBits + 7) >> 3
for i := uint8(0); i < nbBytes; i++ {
b.out = append(b.out, byte(b.bitContainer>>(i*8)))
}
b.nBits = 0
b.bitContainer = 0
}
// close will write the alignment bit and write the final byte(s)
// to the output.
func (b *bitWriter) close() error {
// End mark
b.addBits16Clean(1, 1)
// flush until next byte.
b.flushAlign()
return nil
}
// reset and continue writing by appending to out.
func (b *bitWriter) reset(out []byte) {
b.bitContainer = 0
b.nBits = 0
b.out = out
}

View File

@@ -0,0 +1,54 @@
// Copyright 2018 Klaus Post. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// Based on work Copyright (c) 2013, Yann Collet, released under BSD License.
package huff0
// byteReader provides a byte reader that reads
// little endian values from a byte stream.
// The input stream is manually advanced.
// The reader performs no bounds checks.
type byteReader struct {
b []byte
off int
}
// init will initialize the reader and set the input.
func (b *byteReader) init(in []byte) {
b.b = in
b.off = 0
}
// advance the stream b n bytes.
func (b *byteReader) advance(n uint) {
b.off += int(n)
}
// Int32 returns a little endian int32 starting at current offset.
func (b byteReader) Int32() int32 {
v3 := int32(b.b[b.off+3])
v2 := int32(b.b[b.off+2])
v1 := int32(b.b[b.off+1])
v0 := int32(b.b[b.off])
return (v3 << 24) | (v2 << 16) | (v1 << 8) | v0
}
// Uint32 returns a little endian uint32 starting at current offset.
func (b byteReader) Uint32() uint32 {
v3 := uint32(b.b[b.off+3])
v2 := uint32(b.b[b.off+2])
v1 := uint32(b.b[b.off+1])
v0 := uint32(b.b[b.off])
return (v3 << 24) | (v2 << 16) | (v1 << 8) | v0
}
// unread returns the unread portion of the input.
func (b byteReader) unread() []byte {
return b.b[b.off:]
}
// remain will return the number of bytes remaining.
func (b byteReader) remain() int {
return len(b.b) - b.off
}

618
vendor/github.com/klauspost/compress/huff0/compress.go generated vendored Normal file
View File

@@ -0,0 +1,618 @@
package huff0
import (
"fmt"
"runtime"
"sync"
)
// Compress1X will compress the input.
// The output can be decoded using Decompress1X.
// Supply a Scratch object. The scratch object contains state about re-use,
// So when sharing across independent encodes, be sure to set the re-use policy.
func Compress1X(in []byte, s *Scratch) (out []byte, reUsed bool, err error) {
s, err = s.prepare(in)
if err != nil {
return nil, false, err
}
return compress(in, s, s.compress1X)
}
// Compress4X will compress the input. The input is split into 4 independent blocks
// and compressed similar to Compress1X.
// The output can be decoded using Decompress4X.
// Supply a Scratch object. The scratch object contains state about re-use,
// So when sharing across independent encodes, be sure to set the re-use policy.
func Compress4X(in []byte, s *Scratch) (out []byte, reUsed bool, err error) {
s, err = s.prepare(in)
if err != nil {
return nil, false, err
}
if false {
// TODO: compress4Xp only slightly faster.
const parallelThreshold = 8 << 10
if len(in) < parallelThreshold || runtime.GOMAXPROCS(0) == 1 {
return compress(in, s, s.compress4X)
}
return compress(in, s, s.compress4Xp)
}
return compress(in, s, s.compress4X)
}
func compress(in []byte, s *Scratch, compressor func(src []byte) ([]byte, error)) (out []byte, reUsed bool, err error) {
// Nuke previous table if we cannot reuse anyway.
if s.Reuse == ReusePolicyNone {
s.prevTable = s.prevTable[:0]
}
// Create histogram, if none was provided.
maxCount := s.maxCount
var canReuse = false
if maxCount == 0 {
maxCount, canReuse = s.countSimple(in)
} else {
canReuse = s.canUseTable(s.prevTable)
}
// Reset for next run.
s.clearCount = true
s.maxCount = 0
if maxCount >= len(in) {
if maxCount > len(in) {
return nil, false, fmt.Errorf("maxCount (%d) > length (%d)", maxCount, len(in))
}
if len(in) == 1 {
return nil, false, ErrIncompressible
}
// One symbol, use RLE
return nil, false, ErrUseRLE
}
if maxCount == 1 || maxCount < (len(in)>>7) {
// Each symbol present maximum once or too well distributed.
return nil, false, ErrIncompressible
}
if s.Reuse == ReusePolicyPrefer && canReuse {
keepTable := s.cTable
s.cTable = s.prevTable
s.Out, err = compressor(in)
s.cTable = keepTable
if err == nil && len(s.Out) < len(in) {
s.OutData = s.Out
return s.Out, true, nil
}
// Do not attempt to re-use later.
s.prevTable = s.prevTable[:0]
}
// Calculate new table.
s.optimalTableLog()
err = s.buildCTable()
if err != nil {
return nil, false, err
}
if false && !s.canUseTable(s.cTable) {
panic("invalid table generated")
}
if s.Reuse == ReusePolicyAllow && canReuse {
hSize := len(s.Out)
oldSize := s.prevTable.estimateSize(s.count[:s.symbolLen])
newSize := s.cTable.estimateSize(s.count[:s.symbolLen])
if oldSize <= hSize+newSize || hSize+12 >= len(in) {
// Retain cTable even if we re-use.
keepTable := s.cTable
s.cTable = s.prevTable
s.Out, err = compressor(in)
s.cTable = keepTable
if len(s.Out) >= len(in) {
return nil, false, ErrIncompressible
}
s.OutData = s.Out
return s.Out, true, nil
}
}
// Use new table
err = s.cTable.write(s)
if err != nil {
s.OutTable = nil
return nil, false, err
}
s.OutTable = s.Out
// Compress using new table
s.Out, err = compressor(in)
if err != nil {
s.OutTable = nil
return nil, false, err
}
if len(s.Out) >= len(in) {
s.OutTable = nil
return nil, false, ErrIncompressible
}
// Move current table into previous.
s.prevTable, s.cTable = s.cTable, s.prevTable[:0]
s.OutData = s.Out[len(s.OutTable):]
return s.Out, false, nil
}
func (s *Scratch) compress1X(src []byte) ([]byte, error) {
return s.compress1xDo(s.Out, src)
}
func (s *Scratch) compress1xDo(dst, src []byte) ([]byte, error) {
var bw = bitWriter{out: dst}
// N is length divisible by 4.
n := len(src)
n -= n & 3
cTable := s.cTable[:256]
// Encode last bytes.
for i := len(src) & 3; i > 0; i-- {
bw.encSymbol(cTable, src[n+i-1])
}
if s.actualTableLog <= 8 {
n -= 4
for ; n >= 0; n -= 4 {
tmp := src[n : n+4]
// tmp should be len 4
bw.flush32()
bw.encSymbol(cTable, tmp[3])
bw.encSymbol(cTable, tmp[2])
bw.encSymbol(cTable, tmp[1])
bw.encSymbol(cTable, tmp[0])
}
} else {
n -= 4
for ; n >= 0; n -= 4 {
tmp := src[n : n+4]
// tmp should be len 4
bw.flush32()
bw.encSymbol(cTable, tmp[3])
bw.encSymbol(cTable, tmp[2])
bw.flush32()
bw.encSymbol(cTable, tmp[1])
bw.encSymbol(cTable, tmp[0])
}
}
err := bw.close()
return bw.out, err
}
var sixZeros [6]byte
func (s *Scratch) compress4X(src []byte) ([]byte, error) {
if len(src) < 12 {
return nil, ErrIncompressible
}
segmentSize := (len(src) + 3) / 4
// Add placeholder for output length
offsetIdx := len(s.Out)
s.Out = append(s.Out, sixZeros[:]...)
for i := 0; i < 4; i++ {
toDo := src
if len(toDo) > segmentSize {
toDo = toDo[:segmentSize]
}
src = src[len(toDo):]
var err error
idx := len(s.Out)
s.Out, err = s.compress1xDo(s.Out, toDo)
if err != nil {
return nil, err
}
// Write compressed length as little endian before block.
if i < 3 {
// Last length is not written.
length := len(s.Out) - idx
s.Out[i*2+offsetIdx] = byte(length)
s.Out[i*2+offsetIdx+1] = byte(length >> 8)
}
}
return s.Out, nil
}
// compress4Xp will compress 4 streams using separate goroutines.
func (s *Scratch) compress4Xp(src []byte) ([]byte, error) {
if len(src) < 12 {
return nil, ErrIncompressible
}
// Add placeholder for output length
s.Out = s.Out[:6]
segmentSize := (len(src) + 3) / 4
var wg sync.WaitGroup
var errs [4]error
wg.Add(4)
for i := 0; i < 4; i++ {
toDo := src
if len(toDo) > segmentSize {
toDo = toDo[:segmentSize]
}
src = src[len(toDo):]
// Separate goroutine for each block.
go func(i int) {
s.tmpOut[i], errs[i] = s.compress1xDo(s.tmpOut[i][:0], toDo)
wg.Done()
}(i)
}
wg.Wait()
for i := 0; i < 4; i++ {
if errs[i] != nil {
return nil, errs[i]
}
o := s.tmpOut[i]
// Write compressed length as little endian before block.
if i < 3 {
// Last length is not written.
s.Out[i*2] = byte(len(o))
s.Out[i*2+1] = byte(len(o) >> 8)
}
// Write output.
s.Out = append(s.Out, o...)
}
return s.Out, nil
}
// countSimple will create a simple histogram in s.count.
// Returns the biggest count.
// Does not update s.clearCount.
func (s *Scratch) countSimple(in []byte) (max int, reuse bool) {
reuse = true
for _, v := range in {
s.count[v]++
}
m := uint32(0)
if len(s.prevTable) > 0 {
for i, v := range s.count[:] {
if v > m {
m = v
}
if v > 0 {
s.symbolLen = uint16(i) + 1
if i >= len(s.prevTable) {
reuse = false
} else {
if s.prevTable[i].nBits == 0 {
reuse = false
}
}
}
}
return int(m), reuse
}
for i, v := range s.count[:] {
if v > m {
m = v
}
if v > 0 {
s.symbolLen = uint16(i) + 1
}
}
return int(m), false
}
func (s *Scratch) canUseTable(c cTable) bool {
if len(c) < int(s.symbolLen) {
return false
}
for i, v := range s.count[:s.symbolLen] {
if v != 0 && c[i].nBits == 0 {
return false
}
}
return true
}
// minTableLog provides the minimum logSize to safely represent a distribution.
func (s *Scratch) minTableLog() uint8 {
minBitsSrc := highBit32(uint32(s.br.remain()-1)) + 1
minBitsSymbols := highBit32(uint32(s.symbolLen-1)) + 2
if minBitsSrc < minBitsSymbols {
return uint8(minBitsSrc)
}
return uint8(minBitsSymbols)
}
// optimalTableLog calculates and sets the optimal tableLog in s.actualTableLog
func (s *Scratch) optimalTableLog() {
tableLog := s.TableLog
minBits := s.minTableLog()
maxBitsSrc := uint8(highBit32(uint32(s.br.remain()-1))) - 2
if maxBitsSrc < tableLog {
// Accuracy can be reduced
tableLog = maxBitsSrc
}
if minBits > tableLog {
tableLog = minBits
}
// Need a minimum to safely represent all symbol values
if tableLog < minTablelog {
tableLog = minTablelog
}
if tableLog > tableLogMax {
tableLog = tableLogMax
}
s.actualTableLog = tableLog
}
type cTableEntry struct {
val uint16
nBits uint8
// We have 8 bits extra
}
const huffNodesMask = huffNodesLen - 1
func (s *Scratch) buildCTable() error {
s.huffSort()
if cap(s.cTable) < maxSymbolValue+1 {
s.cTable = make([]cTableEntry, s.symbolLen, maxSymbolValue+1)
} else {
s.cTable = s.cTable[:s.symbolLen]
for i := range s.cTable {
s.cTable[i] = cTableEntry{}
}
}
var startNode = int16(s.symbolLen)
nonNullRank := s.symbolLen - 1
nodeNb := int16(startNode)
huffNode := s.nodes[1 : huffNodesLen+1]
// This overlays the slice above, but allows "-1" index lookups.
// Different from reference implementation.
huffNode0 := s.nodes[0 : huffNodesLen+1]
for huffNode[nonNullRank].count == 0 {
nonNullRank--
}
lowS := int16(nonNullRank)
nodeRoot := nodeNb + lowS - 1
lowN := nodeNb
huffNode[nodeNb].count = huffNode[lowS].count + huffNode[lowS-1].count
huffNode[lowS].parent, huffNode[lowS-1].parent = uint16(nodeNb), uint16(nodeNb)
nodeNb++
lowS -= 2
for n := nodeNb; n <= nodeRoot; n++ {
huffNode[n].count = 1 << 30
}
// fake entry, strong barrier
huffNode0[0].count = 1 << 31
// create parents
for nodeNb <= nodeRoot {
var n1, n2 int16
if huffNode0[lowS+1].count < huffNode0[lowN+1].count {
n1 = lowS
lowS--
} else {
n1 = lowN
lowN++
}
if huffNode0[lowS+1].count < huffNode0[lowN+1].count {
n2 = lowS
lowS--
} else {
n2 = lowN
lowN++
}
huffNode[nodeNb].count = huffNode0[n1+1].count + huffNode0[n2+1].count
huffNode0[n1+1].parent, huffNode0[n2+1].parent = uint16(nodeNb), uint16(nodeNb)
nodeNb++
}
// distribute weights (unlimited tree height)
huffNode[nodeRoot].nbBits = 0
for n := nodeRoot - 1; n >= startNode; n-- {
huffNode[n].nbBits = huffNode[huffNode[n].parent].nbBits + 1
}
for n := uint16(0); n <= nonNullRank; n++ {
huffNode[n].nbBits = huffNode[huffNode[n].parent].nbBits + 1
}
s.actualTableLog = s.setMaxHeight(int(nonNullRank))
maxNbBits := s.actualTableLog
// fill result into tree (val, nbBits)
if maxNbBits > tableLogMax {
return fmt.Errorf("internal error: maxNbBits (%d) > tableLogMax (%d)", maxNbBits, tableLogMax)
}
var nbPerRank [tableLogMax + 1]uint16
var valPerRank [tableLogMax + 1]uint16
for _, v := range huffNode[:nonNullRank+1] {
nbPerRank[v.nbBits]++
}
// determine stating value per rank
{
min := uint16(0)
for n := maxNbBits; n > 0; n-- {
// get starting value within each rank
valPerRank[n] = min
min += nbPerRank[n]
min >>= 1
}
}
// push nbBits per symbol, symbol order
// TODO: changed `s.symbolLen` -> `nonNullRank+1` (micro-opt)
for _, v := range huffNode[:nonNullRank+1] {
s.cTable[v.symbol].nBits = v.nbBits
}
// assign value within rank, symbol order
for n, val := range s.cTable[:s.symbolLen] {
v := valPerRank[val.nBits]
s.cTable[n].val = v
valPerRank[val.nBits] = v + 1
}
return nil
}
// huffSort will sort symbols, decreasing order.
func (s *Scratch) huffSort() {
type rankPos struct {
base uint32
current uint32
}
// Clear nodes
nodes := s.nodes[:huffNodesLen+1]
s.nodes = nodes
nodes = nodes[1 : huffNodesLen+1]
// Sort into buckets based on length of symbol count.
var rank [32]rankPos
for _, v := range s.count[:s.symbolLen] {
r := highBit32(v+1) & 31
rank[r].base++
}
for n := 30; n > 0; n-- {
rank[n-1].base += rank[n].base
}
for n := range rank[:] {
rank[n].current = rank[n].base
}
for n, c := range s.count[:s.symbolLen] {
r := (highBit32(c+1) + 1) & 31
pos := rank[r].current
rank[r].current++
prev := nodes[(pos-1)&huffNodesMask]
for pos > rank[r].base && c > prev.count {
nodes[pos&huffNodesMask] = prev
pos--
prev = nodes[(pos-1)&huffNodesMask]
}
nodes[pos&huffNodesMask] = nodeElt{count: c, symbol: byte(n)}
}
return
}
func (s *Scratch) setMaxHeight(lastNonNull int) uint8 {
maxNbBits := s.TableLog
huffNode := s.nodes[1 : huffNodesLen+1]
//huffNode = huffNode[: huffNodesLen]
largestBits := huffNode[lastNonNull].nbBits
// early exit : no elt > maxNbBits
if largestBits <= maxNbBits {
return largestBits
}
totalCost := int(0)
baseCost := int(1) << (largestBits - maxNbBits)
n := uint32(lastNonNull)
for huffNode[n].nbBits > maxNbBits {
totalCost += baseCost - (1 << (largestBits - huffNode[n].nbBits))
huffNode[n].nbBits = maxNbBits
n--
}
// n stops at huffNode[n].nbBits <= maxNbBits
for huffNode[n].nbBits == maxNbBits {
n--
}
// n end at index of smallest symbol using < maxNbBits
// renorm totalCost
totalCost >>= largestBits - maxNbBits /* note : totalCost is necessarily a multiple of baseCost */
// repay normalized cost
{
const noSymbol = 0xF0F0F0F0
var rankLast [tableLogMax + 2]uint32
for i := range rankLast[:] {
rankLast[i] = noSymbol
}
// Get pos of last (smallest) symbol per rank
{
currentNbBits := uint8(maxNbBits)
for pos := int(n); pos >= 0; pos-- {
if huffNode[pos].nbBits >= currentNbBits {
continue
}
currentNbBits = huffNode[pos].nbBits // < maxNbBits
rankLast[maxNbBits-currentNbBits] = uint32(pos)
}
}
for totalCost > 0 {
nBitsToDecrease := uint8(highBit32(uint32(totalCost))) + 1
for ; nBitsToDecrease > 1; nBitsToDecrease-- {
highPos := rankLast[nBitsToDecrease]
lowPos := rankLast[nBitsToDecrease-1]
if highPos == noSymbol {
continue
}
if lowPos == noSymbol {
break
}
highTotal := huffNode[highPos].count
lowTotal := 2 * huffNode[lowPos].count
if highTotal <= lowTotal {
break
}
}
// only triggered when no more rank 1 symbol left => find closest one (note : there is necessarily at least one !)
// HUF_MAX_TABLELOG test just to please gcc 5+; but it should not be necessary
// FIXME: try to remove
for (nBitsToDecrease <= tableLogMax) && (rankLast[nBitsToDecrease] == noSymbol) {
nBitsToDecrease++
}
totalCost -= 1 << (nBitsToDecrease - 1)
if rankLast[nBitsToDecrease-1] == noSymbol {
// this rank is no longer empty
rankLast[nBitsToDecrease-1] = rankLast[nBitsToDecrease]
}
huffNode[rankLast[nBitsToDecrease]].nbBits++
if rankLast[nBitsToDecrease] == 0 {
/* special case, reached largest symbol */
rankLast[nBitsToDecrease] = noSymbol
} else {
rankLast[nBitsToDecrease]--
if huffNode[rankLast[nBitsToDecrease]].nbBits != maxNbBits-nBitsToDecrease {
rankLast[nBitsToDecrease] = noSymbol /* this rank is now empty */
}
}
}
for totalCost < 0 { /* Sometimes, cost correction overshoot */
if rankLast[1] == noSymbol { /* special case : no rank 1 symbol (using maxNbBits-1); let's create one from largest rank 0 (using maxNbBits) */
for huffNode[n].nbBits == maxNbBits {
n--
}
huffNode[n+1].nbBits--
rankLast[1] = n + 1
totalCost++
continue
}
huffNode[rankLast[1]+1].nbBits--
rankLast[1]++
totalCost++
}
}
return maxNbBits
}
type nodeElt struct {
count uint32
parent uint16
symbol byte
nbBits uint8
}

View File

@@ -0,0 +1,401 @@
package huff0
import (
"errors"
"fmt"
"io"
"github.com/klauspost/compress/fse"
)
type dTable struct {
single []dEntrySingle
double []dEntryDouble
}
// single-symbols decoding
type dEntrySingle struct {
byte uint8
nBits uint8
}
// double-symbols decoding
type dEntryDouble struct {
seq uint16
nBits uint8
len uint8
}
// ReadTable will read a table from the input.
// The size of the input may be larger than the table definition.
// Any content remaining after the table definition will be returned.
// If no Scratch is provided a new one is allocated.
// The returned Scratch can be used for decoding input using this table.
func ReadTable(in []byte, s *Scratch) (s2 *Scratch, remain []byte, err error) {
s, err = s.prepare(in)
if err != nil {
return s, nil, err
}
if len(in) <= 1 {
return s, nil, errors.New("input too small for table")
}
iSize := in[0]
in = in[1:]
if iSize >= 128 {
// Uncompressed
oSize := iSize - 127
iSize = (oSize + 1) / 2
if int(iSize) > len(in) {
return s, nil, errors.New("input too small for table")
}
for n := uint8(0); n < oSize; n += 2 {
v := in[n/2]
s.huffWeight[n] = v >> 4
s.huffWeight[n+1] = v & 15
}
s.symbolLen = uint16(oSize)
in = in[iSize:]
} else {
if len(in) <= int(iSize) {
return s, nil, errors.New("input too small for table")
}
// FSE compressed weights
s.fse.DecompressLimit = 255
hw := s.huffWeight[:]
s.fse.Out = hw
b, err := fse.Decompress(in[:iSize], s.fse)
s.fse.Out = nil
if err != nil {
return s, nil, err
}
if len(b) > 255 {
return s, nil, errors.New("corrupt input: output table too large")
}
s.symbolLen = uint16(len(b))
in = in[iSize:]
}
// collect weight stats
var rankStats [tableLogMax + 1]uint32
weightTotal := uint32(0)
for _, v := range s.huffWeight[:s.symbolLen] {
if v > tableLogMax {
return s, nil, errors.New("corrupt input: weight too large")
}
rankStats[v]++
weightTotal += (1 << (v & 15)) >> 1
}
if weightTotal == 0 {
return s, nil, errors.New("corrupt input: weights zero")
}
// get last non-null symbol weight (implied, total must be 2^n)
{
tableLog := highBit32(weightTotal) + 1
if tableLog > tableLogMax {
return s, nil, errors.New("corrupt input: tableLog too big")
}
s.actualTableLog = uint8(tableLog)
// determine last weight
{
total := uint32(1) << tableLog
rest := total - weightTotal
verif := uint32(1) << highBit32(rest)
lastWeight := highBit32(rest) + 1
if verif != rest {
// last value must be a clean power of 2
return s, nil, errors.New("corrupt input: last value not power of two")
}
s.huffWeight[s.symbolLen] = uint8(lastWeight)
s.symbolLen++
rankStats[lastWeight]++
}
}
if (rankStats[1] < 2) || (rankStats[1]&1 != 0) {
// by construction : at least 2 elts of rank 1, must be even
return s, nil, errors.New("corrupt input: min elt size, even check failed ")
}
// TODO: Choose between single/double symbol decoding
// Calculate starting value for each rank
{
var nextRankStart uint32
for n := uint8(1); n < s.actualTableLog+1; n++ {
current := nextRankStart
nextRankStart += rankStats[n] << (n - 1)
rankStats[n] = current
}
}
// fill DTable (always full size)
tSize := 1 << tableLogMax
if len(s.dt.single) != tSize {
s.dt.single = make([]dEntrySingle, tSize)
}
for n, w := range s.huffWeight[:s.symbolLen] {
length := (uint32(1) << w) >> 1
d := dEntrySingle{
byte: uint8(n),
nBits: s.actualTableLog + 1 - w,
}
for u := rankStats[w]; u < rankStats[w]+length; u++ {
s.dt.single[u] = d
}
rankStats[w] += length
}
return s, in, nil
}
// Decompress1X will decompress a 1X encoded stream.
// The length of the supplied input must match the end of a block exactly.
// Before this is called, the table must be initialized with ReadTable unless
// the encoder re-used the table.
func (s *Scratch) Decompress1X(in []byte) (out []byte, err error) {
if len(s.dt.single) == 0 {
return nil, errors.New("no table loaded")
}
var br bitReader
err = br.init(in)
if err != nil {
return nil, err
}
s.Out = s.Out[:0]
decode := func() byte {
val := br.peekBitsFast(s.actualTableLog) /* note : actualTableLog >= 1 */
v := s.dt.single[val]
br.bitsRead += v.nBits
return v.byte
}
hasDec := func(v dEntrySingle) byte {
br.bitsRead += v.nBits
return v.byte
}
// Avoid bounds check by always having full sized table.
const tlSize = 1 << tableLogMax
const tlMask = tlSize - 1
dt := s.dt.single[:tlSize]
// Use temp table to avoid bound checks/append penalty.
var tmp = s.huffWeight[:256]
var off uint8
for br.off >= 8 {
br.fillFast()
tmp[off+0] = hasDec(dt[br.peekBitsFast(s.actualTableLog)&tlMask])
tmp[off+1] = hasDec(dt[br.peekBitsFast(s.actualTableLog)&tlMask])
br.fillFast()
tmp[off+2] = hasDec(dt[br.peekBitsFast(s.actualTableLog)&tlMask])
tmp[off+3] = hasDec(dt[br.peekBitsFast(s.actualTableLog)&tlMask])
off += 4
if off == 0 {
s.Out = append(s.Out, tmp...)
}
}
s.Out = append(s.Out, tmp[:off]...)
for !br.finished() {
br.fill()
s.Out = append(s.Out, decode())
}
return s.Out, br.close()
}
// Decompress4X will decompress a 4X encoded stream.
// Before this is called, the table must be initialized with ReadTable unless
// the encoder re-used the table.
// The length of the supplied input must match the end of a block exactly.
// The destination size of the uncompressed data must be known and provided.
func (s *Scratch) Decompress4X(in []byte, dstSize int) (out []byte, err error) {
if len(s.dt.single) == 0 {
return nil, errors.New("no table loaded")
}
if len(in) < 6+(4*1) {
return nil, errors.New("input too small")
}
// TODO: We do not detect when we overrun a buffer, except if the last one does.
var br [4]bitReader
start := 6
for i := 0; i < 3; i++ {
length := int(in[i*2]) | (int(in[i*2+1]) << 8)
if start+length >= len(in) {
return nil, errors.New("truncated input (or invalid offset)")
}
err = br[i].init(in[start : start+length])
if err != nil {
return nil, err
}
start += length
}
err = br[3].init(in[start:])
if err != nil {
return nil, err
}
// Prepare output
if cap(s.Out) < dstSize {
s.Out = make([]byte, 0, dstSize)
}
s.Out = s.Out[:dstSize]
// destination, offset to match first output
dstOut := s.Out
dstEvery := (dstSize + 3) / 4
const tlSize = 1 << tableLogMax
const tlMask = tlSize - 1
single := s.dt.single[:tlSize]
decode := func(br *bitReader) byte {
val := br.peekBitsFast(s.actualTableLog) /* note : actualTableLog >= 1 */
v := single[val&tlMask]
br.bitsRead += v.nBits
return v.byte
}
// Use temp table to avoid bound checks/append penalty.
var tmp = s.huffWeight[:256]
var off uint8
// Decode 2 values from each decoder/loop.
const bufoff = 256 / 4
bigloop:
for {
for i := range br {
if br[i].off < 4 {
break bigloop
}
br[i].fillFast()
}
tmp[off] = decode(&br[0])
tmp[off+bufoff] = decode(&br[1])
tmp[off+bufoff*2] = decode(&br[2])
tmp[off+bufoff*3] = decode(&br[3])
tmp[off+1] = decode(&br[0])
tmp[off+1+bufoff] = decode(&br[1])
tmp[off+1+bufoff*2] = decode(&br[2])
tmp[off+1+bufoff*3] = decode(&br[3])
off += 2
if off == bufoff {
if bufoff > dstEvery {
return nil, errors.New("corruption detected: stream overrun")
}
copy(dstOut, tmp[:bufoff])
copy(dstOut[dstEvery:], tmp[bufoff:bufoff*2])
copy(dstOut[dstEvery*2:], tmp[bufoff*2:bufoff*3])
copy(dstOut[dstEvery*3:], tmp[bufoff*3:bufoff*4])
off = 0
dstOut = dstOut[bufoff:]
// There must at least be 3 buffers left.
if len(dstOut) < dstEvery*3+3 {
return nil, errors.New("corruption detected: stream overrun")
}
}
}
if off > 0 {
ioff := int(off)
if len(dstOut) < dstEvery*3+ioff {
return nil, errors.New("corruption detected: stream overrun")
}
copy(dstOut, tmp[:off])
copy(dstOut[dstEvery:dstEvery+ioff], tmp[bufoff:bufoff*2])
copy(dstOut[dstEvery*2:dstEvery*2+ioff], tmp[bufoff*2:bufoff*3])
copy(dstOut[dstEvery*3:dstEvery*3+ioff], tmp[bufoff*3:bufoff*4])
dstOut = dstOut[off:]
}
for i := range br {
offset := dstEvery * i
br := &br[i]
for !br.finished() {
br.fill()
if offset >= len(dstOut) {
return nil, errors.New("corruption detected: stream overrun")
}
dstOut[offset] = decode(br)
offset++
}
err = br.close()
if err != nil {
return nil, err
}
}
return s.Out, nil
}
// matches will compare a decoding table to a coding table.
// Errors are written to the writer.
// Nothing will be written if table is ok.
func (s *Scratch) matches(ct cTable, w io.Writer) {
if s == nil || len(s.dt.single) == 0 {
return
}
dt := s.dt.single[:1<<s.actualTableLog]
tablelog := s.actualTableLog
ok := 0
broken := 0
for sym, enc := range ct {
errs := 0
broken++
if enc.nBits == 0 {
for _, dec := range dt {
if dec.byte == byte(sym) {
fmt.Fprintf(w, "symbol %x has decoder, but no encoder\n", sym)
errs++
break
}
}
if errs == 0 {
broken--
}
continue
}
// Unused bits in input
ub := tablelog - enc.nBits
top := enc.val << ub
// decoder looks at top bits.
dec := dt[top]
if dec.nBits != enc.nBits {
fmt.Fprintf(w, "symbol 0x%x bit size mismatch (enc: %d, dec:%d).\n", sym, enc.nBits, dec.nBits)
errs++
}
if dec.byte != uint8(sym) {
fmt.Fprintf(w, "symbol 0x%x decoder output mismatch (enc: %d, dec:%d).\n", sym, sym, dec.byte)
errs++
}
if errs > 0 {
fmt.Fprintf(w, "%d errros in base, stopping\n", errs)
continue
}
// Ensure that all combinations are covered.
for i := uint16(0); i < (1 << ub); i++ {
vval := top | i
dec := dt[vval]
if dec.nBits != enc.nBits {
fmt.Fprintf(w, "symbol 0x%x bit size mismatch (enc: %d, dec:%d).\n", vval, enc.nBits, dec.nBits)
errs++
}
if dec.byte != uint8(sym) {
fmt.Fprintf(w, "symbol 0x%x decoder output mismatch (enc: %d, dec:%d).\n", vval, sym, dec.byte)
errs++
}
if errs > 20 {
fmt.Fprintf(w, "%d errros, stopping\n", errs)
break
}
}
if errs == 0 {
ok++
broken--
}
}
if broken > 0 {
fmt.Fprintf(w, "%d broken, %d ok\n", broken, ok)
}
}

241
vendor/github.com/klauspost/compress/huff0/huff0.go generated vendored Normal file
View File

@@ -0,0 +1,241 @@
// Package huff0 provides fast huffman encoding as used in zstd.
//
// See README.md at https://github.com/klauspost/compress/tree/master/huff0 for details.
package huff0
import (
"errors"
"fmt"
"math"
"math/bits"
"github.com/klauspost/compress/fse"
)
const (
maxSymbolValue = 255
// zstandard limits tablelog to 11, see:
// https://github.com/facebook/zstd/blob/dev/doc/zstd_compression_format.md#huffman-tree-description
tableLogMax = 11
tableLogDefault = 11
minTablelog = 5
huffNodesLen = 512
// BlockSizeMax is maximum input size for a single block uncompressed.
BlockSizeMax = 1<<18 - 1
)
var (
// ErrIncompressible is returned when input is judged to be too hard to compress.
ErrIncompressible = errors.New("input is not compressible")
// ErrUseRLE is returned from the compressor when the input is a single byte value repeated.
ErrUseRLE = errors.New("input is single value repeated")
// ErrTooBig is return if input is too large for a single block.
ErrTooBig = errors.New("input too big")
)
type ReusePolicy uint8
const (
// ReusePolicyAllow will allow reuse if it produces smaller output.
ReusePolicyAllow ReusePolicy = iota
// ReusePolicyPrefer will re-use aggressively if possible.
// This will not check if a new table will produce smaller output,
// except if the current table is impossible to use or
// compressed output is bigger than input.
ReusePolicyPrefer
// ReusePolicyNone will disable re-use of tables.
// This is slightly faster than ReusePolicyAllow but may produce larger output.
ReusePolicyNone
)
type Scratch struct {
count [maxSymbolValue + 1]uint32
// Per block parameters.
// These can be used to override compression parameters of the block.
// Do not touch, unless you know what you are doing.
// Out is output buffer.
// If the scratch is re-used before the caller is done processing the output,
// set this field to nil.
// Otherwise the output buffer will be re-used for next Compression/Decompression step
// and allocation will be avoided.
Out []byte
// OutTable will contain the table data only, if a new table has been generated.
// Slice of the returned data.
OutTable []byte
// OutData will contain the compressed data.
// Slice of the returned data.
OutData []byte
// MaxSymbolValue will override the maximum symbol value of the next block.
MaxSymbolValue uint8
// TableLog will attempt to override the tablelog for the next block.
// Must be <= 11.
TableLog uint8
// Reuse will specify the reuse policy
Reuse ReusePolicy
br byteReader
symbolLen uint16 // Length of active part of the symbol table.
maxCount int // count of the most probable symbol
clearCount bool // clear count
actualTableLog uint8 // Selected tablelog.
prevTable cTable // Table used for previous compression.
cTable cTable // compression table
dt dTable // decompression table
nodes []nodeElt
tmpOut [4][]byte
fse *fse.Scratch
huffWeight [maxSymbolValue + 1]byte
}
func (s *Scratch) prepare(in []byte) (*Scratch, error) {
if len(in) > BlockSizeMax {
return nil, ErrTooBig
}
if s == nil {
s = &Scratch{}
}
if s.MaxSymbolValue == 0 {
s.MaxSymbolValue = maxSymbolValue
}
if s.TableLog == 0 {
s.TableLog = tableLogDefault
}
if s.TableLog > tableLogMax {
return nil, fmt.Errorf("tableLog (%d) > maxTableLog (%d)", s.TableLog, tableLogMax)
}
if s.clearCount && s.maxCount == 0 {
for i := range s.count {
s.count[i] = 0
}
s.clearCount = false
}
if cap(s.Out) == 0 {
s.Out = make([]byte, 0, len(in))
}
s.Out = s.Out[:0]
s.OutTable = nil
s.OutData = nil
if cap(s.nodes) < huffNodesLen+1 {
s.nodes = make([]nodeElt, 0, huffNodesLen+1)
}
s.nodes = s.nodes[:0]
if s.fse == nil {
s.fse = &fse.Scratch{}
}
s.br.init(in)
return s, nil
}
type cTable []cTableEntry
func (c cTable) write(s *Scratch) error {
var (
// precomputed conversion table
bitsToWeight [tableLogMax + 1]byte
huffLog = s.actualTableLog
// last weight is not saved.
maxSymbolValue = uint8(s.symbolLen - 1)
huffWeight = s.huffWeight[:256]
)
const (
maxFSETableLog = 6
)
// convert to weight
bitsToWeight[0] = 0
for n := uint8(1); n < huffLog+1; n++ {
bitsToWeight[n] = huffLog + 1 - n
}
// Acquire histogram for FSE.
hist := s.fse.Histogram()
hist = hist[:256]
for i := range hist[:16] {
hist[i] = 0
}
for n := uint8(0); n < maxSymbolValue; n++ {
v := bitsToWeight[c[n].nBits] & 15
huffWeight[n] = v
hist[v]++
}
// FSE compress if feasible.
if maxSymbolValue >= 2 {
huffMaxCnt := uint32(0)
huffMax := uint8(0)
for i, v := range hist[:16] {
if v == 0 {
continue
}
huffMax = byte(i)
if v > huffMaxCnt {
huffMaxCnt = v
}
}
s.fse.HistogramFinished(huffMax, int(huffMaxCnt))
s.fse.TableLog = maxFSETableLog
b, err := fse.Compress(huffWeight[:maxSymbolValue], s.fse)
if err == nil && len(b) < int(s.symbolLen>>1) {
s.Out = append(s.Out, uint8(len(b)))
s.Out = append(s.Out, b...)
return nil
}
// Unable to compress (RLE/uncompressible)
}
// write raw values as 4-bits (max : 15)
if maxSymbolValue > (256 - 128) {
// should not happen : likely means source cannot be compressed
return ErrIncompressible
}
op := s.Out
// special case, pack weights 4 bits/weight.
op = append(op, 128|(maxSymbolValue-1))
// be sure it doesn't cause msan issue in final combination
huffWeight[maxSymbolValue] = 0
for n := uint16(0); n < uint16(maxSymbolValue); n += 2 {
op = append(op, (huffWeight[n]<<4)|huffWeight[n+1])
}
s.Out = op
return nil
}
// estimateSize returns the estimated size in bytes of the input represented in the
// histogram supplied.
func (c cTable) estimateSize(hist []uint32) int {
nbBits := uint32(7)
for i, v := range c[:len(hist)] {
nbBits += uint32(v.nBits) * hist[i]
}
return int(nbBits >> 3)
}
// minSize returns the minimum possible size considering the shannon limit.
func (s *Scratch) minSize(total int) int {
nbBits := float64(7)
fTotal := float64(total)
for _, v := range s.count[:s.symbolLen] {
n := float64(v)
if n > 0 {
nbBits += math.Log2(fTotal/n) * n
}
}
return int(nbBits) >> 3
}
func highBit32(val uint32) (n uint32) {
return uint32(bits.Len32(val) - 1)
}

16
vendor/github.com/klauspost/compress/snappy/.gitignore generated vendored Normal file
View File

@@ -0,0 +1,16 @@
cmd/snappytool/snappytool
testdata/bench
# These explicitly listed benchmark data files are for an obsolete version of
# snappy_test.go.
testdata/alice29.txt
testdata/asyoulik.txt
testdata/fireworks.jpeg
testdata/geo.protodata
testdata/html
testdata/html_x_4
testdata/kppkn.gtb
testdata/lcet10.txt
testdata/paper-100k.pdf
testdata/plrabn12.txt
testdata/urls.10K

15
vendor/github.com/klauspost/compress/snappy/AUTHORS generated vendored Normal file
View File

@@ -0,0 +1,15 @@
# This is the official list of Snappy-Go authors for copyright purposes.
# This file is distinct from the CONTRIBUTORS files.
# See the latter for an explanation.
# Names should be added to this file as
# Name or Organization <email address>
# The email address is not required for organizations.
# Please keep the list sorted.
Damian Gryski <dgryski@gmail.com>
Google Inc.
Jan Mercl <0xjnml@gmail.com>
Rodolfo Carvalho <rhcarvalho@gmail.com>
Sebastien Binet <seb.binet@gmail.com>

View File

@@ -0,0 +1,37 @@
# This is the official list of people who can contribute
# (and typically have contributed) code to the Snappy-Go repository.
# The AUTHORS file lists the copyright holders; this file
# lists people. For example, Google employees are listed here
# but not in AUTHORS, because Google holds the copyright.
#
# The submission process automatically checks to make sure
# that people submitting code are listed in this file (by email address).
#
# Names should be added to this file only after verifying that
# the individual or the individual's organization has agreed to
# the appropriate Contributor License Agreement, found here:
#
# http://code.google.com/legal/individual-cla-v1.0.html
# http://code.google.com/legal/corporate-cla-v1.0.html
#
# The agreement for individuals can be filled out on the web.
#
# When adding J Random Contributor's name to this file,
# either J's name or J's organization's name should be
# added to the AUTHORS file, depending on whether the
# individual or corporate CLA was used.
# Names should be added to this file like so:
# Name <email address>
# Please keep the list sorted.
Damian Gryski <dgryski@gmail.com>
Jan Mercl <0xjnml@gmail.com>
Kai Backman <kaib@golang.org>
Marc-Antoine Ruel <maruel@chromium.org>
Nigel Tao <nigeltao@golang.org>
Rob Pike <r@golang.org>
Rodolfo Carvalho <rhcarvalho@gmail.com>
Russ Cox <rsc@golang.org>
Sebastien Binet <seb.binet@gmail.com>

27
vendor/github.com/klauspost/compress/snappy/LICENSE generated vendored Normal file
View File

@@ -0,0 +1,27 @@
Copyright (c) 2011 The Snappy-Go Authors. All rights reserved.
Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are
met:
* Redistributions of source code must retain the above copyright
notice, this list of conditions and the following disclaimer.
* Redistributions in binary form must reproduce the above
copyright notice, this list of conditions and the following disclaimer
in the documentation and/or other materials provided with the
distribution.
* Neither the name of Google Inc. nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.
THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

107
vendor/github.com/klauspost/compress/snappy/README generated vendored Normal file
View File

@@ -0,0 +1,107 @@
The Snappy compression format in the Go programming language.
To download and install from source:
$ go get github.com/golang/snappy
Unless otherwise noted, the Snappy-Go source files are distributed
under the BSD-style license found in the LICENSE file.
Benchmarks.
The golang/snappy benchmarks include compressing (Z) and decompressing (U) ten
or so files, the same set used by the C++ Snappy code (github.com/google/snappy
and note the "google", not "golang"). On an "Intel(R) Core(TM) i7-3770 CPU @
3.40GHz", Go's GOARCH=amd64 numbers as of 2016-05-29:
"go test -test.bench=."
_UFlat0-8 2.19GB/s ± 0% html
_UFlat1-8 1.41GB/s ± 0% urls
_UFlat2-8 23.5GB/s ± 2% jpg
_UFlat3-8 1.91GB/s ± 0% jpg_200
_UFlat4-8 14.0GB/s ± 1% pdf
_UFlat5-8 1.97GB/s ± 0% html4
_UFlat6-8 814MB/s ± 0% txt1
_UFlat7-8 785MB/s ± 0% txt2
_UFlat8-8 857MB/s ± 0% txt3
_UFlat9-8 719MB/s ± 1% txt4
_UFlat10-8 2.84GB/s ± 0% pb
_UFlat11-8 1.05GB/s ± 0% gaviota
_ZFlat0-8 1.04GB/s ± 0% html
_ZFlat1-8 534MB/s ± 0% urls
_ZFlat2-8 15.7GB/s ± 1% jpg
_ZFlat3-8 740MB/s ± 3% jpg_200
_ZFlat4-8 9.20GB/s ± 1% pdf
_ZFlat5-8 991MB/s ± 0% html4
_ZFlat6-8 379MB/s ± 0% txt1
_ZFlat7-8 352MB/s ± 0% txt2
_ZFlat8-8 396MB/s ± 1% txt3
_ZFlat9-8 327MB/s ± 1% txt4
_ZFlat10-8 1.33GB/s ± 1% pb
_ZFlat11-8 605MB/s ± 1% gaviota
"go test -test.bench=. -tags=noasm"
_UFlat0-8 621MB/s ± 2% html
_UFlat1-8 494MB/s ± 1% urls
_UFlat2-8 23.2GB/s ± 1% jpg
_UFlat3-8 1.12GB/s ± 1% jpg_200
_UFlat4-8 4.35GB/s ± 1% pdf
_UFlat5-8 609MB/s ± 0% html4
_UFlat6-8 296MB/s ± 0% txt1
_UFlat7-8 288MB/s ± 0% txt2
_UFlat8-8 309MB/s ± 1% txt3
_UFlat9-8 280MB/s ± 1% txt4
_UFlat10-8 753MB/s ± 0% pb
_UFlat11-8 400MB/s ± 0% gaviota
_ZFlat0-8 409MB/s ± 1% html
_ZFlat1-8 250MB/s ± 1% urls
_ZFlat2-8 12.3GB/s ± 1% jpg
_ZFlat3-8 132MB/s ± 0% jpg_200
_ZFlat4-8 2.92GB/s ± 0% pdf
_ZFlat5-8 405MB/s ± 1% html4
_ZFlat6-8 179MB/s ± 1% txt1
_ZFlat7-8 170MB/s ± 1% txt2
_ZFlat8-8 189MB/s ± 1% txt3
_ZFlat9-8 164MB/s ± 1% txt4
_ZFlat10-8 479MB/s ± 1% pb
_ZFlat11-8 270MB/s ± 1% gaviota
For comparison (Go's encoded output is byte-for-byte identical to C++'s), here
are the numbers from C++ Snappy's
make CXXFLAGS="-O2 -DNDEBUG -g" clean snappy_unittest.log && cat snappy_unittest.log
BM_UFlat/0 2.4GB/s html
BM_UFlat/1 1.4GB/s urls
BM_UFlat/2 21.8GB/s jpg
BM_UFlat/3 1.5GB/s jpg_200
BM_UFlat/4 13.3GB/s pdf
BM_UFlat/5 2.1GB/s html4
BM_UFlat/6 1.0GB/s txt1
BM_UFlat/7 959.4MB/s txt2
BM_UFlat/8 1.0GB/s txt3
BM_UFlat/9 864.5MB/s txt4
BM_UFlat/10 2.9GB/s pb
BM_UFlat/11 1.2GB/s gaviota
BM_ZFlat/0 944.3MB/s html (22.31 %)
BM_ZFlat/1 501.6MB/s urls (47.78 %)
BM_ZFlat/2 14.3GB/s jpg (99.95 %)
BM_ZFlat/3 538.3MB/s jpg_200 (73.00 %)
BM_ZFlat/4 8.3GB/s pdf (83.30 %)
BM_ZFlat/5 903.5MB/s html4 (22.52 %)
BM_ZFlat/6 336.0MB/s txt1 (57.88 %)
BM_ZFlat/7 312.3MB/s txt2 (61.91 %)
BM_ZFlat/8 353.1MB/s txt3 (54.99 %)
BM_ZFlat/9 289.9MB/s txt4 (66.26 %)
BM_ZFlat/10 1.2GB/s pb (19.68 %)
BM_ZFlat/11 527.4MB/s gaviota (37.72 %)

237
vendor/github.com/klauspost/compress/snappy/decode.go generated vendored Normal file
View File

@@ -0,0 +1,237 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package snappy
import (
"encoding/binary"
"errors"
"io"
)
var (
// ErrCorrupt reports that the input is invalid.
ErrCorrupt = errors.New("snappy: corrupt input")
// ErrTooLarge reports that the uncompressed length is too large.
ErrTooLarge = errors.New("snappy: decoded block is too large")
// ErrUnsupported reports that the input isn't supported.
ErrUnsupported = errors.New("snappy: unsupported input")
errUnsupportedLiteralLength = errors.New("snappy: unsupported literal length")
)
// DecodedLen returns the length of the decoded block.
func DecodedLen(src []byte) (int, error) {
v, _, err := decodedLen(src)
return v, err
}
// decodedLen returns the length of the decoded block and the number of bytes
// that the length header occupied.
func decodedLen(src []byte) (blockLen, headerLen int, err error) {
v, n := binary.Uvarint(src)
if n <= 0 || v > 0xffffffff {
return 0, 0, ErrCorrupt
}
const wordSize = 32 << (^uint(0) >> 32 & 1)
if wordSize == 32 && v > 0x7fffffff {
return 0, 0, ErrTooLarge
}
return int(v), n, nil
}
const (
decodeErrCodeCorrupt = 1
decodeErrCodeUnsupportedLiteralLength = 2
)
// Decode returns the decoded form of src. The returned slice may be a sub-
// slice of dst if dst was large enough to hold the entire decoded block.
// Otherwise, a newly allocated slice will be returned.
//
// The dst and src must not overlap. It is valid to pass a nil dst.
func Decode(dst, src []byte) ([]byte, error) {
dLen, s, err := decodedLen(src)
if err != nil {
return nil, err
}
if dLen <= len(dst) {
dst = dst[:dLen]
} else {
dst = make([]byte, dLen)
}
switch decode(dst, src[s:]) {
case 0:
return dst, nil
case decodeErrCodeUnsupportedLiteralLength:
return nil, errUnsupportedLiteralLength
}
return nil, ErrCorrupt
}
// NewReader returns a new Reader that decompresses from r, using the framing
// format described at
// https://github.com/google/snappy/blob/master/framing_format.txt
func NewReader(r io.Reader) *Reader {
return &Reader{
r: r,
decoded: make([]byte, maxBlockSize),
buf: make([]byte, maxEncodedLenOfMaxBlockSize+checksumSize),
}
}
// Reader is an io.Reader that can read Snappy-compressed bytes.
type Reader struct {
r io.Reader
err error
decoded []byte
buf []byte
// decoded[i:j] contains decoded bytes that have not yet been passed on.
i, j int
readHeader bool
}
// Reset discards any buffered data, resets all state, and switches the Snappy
// reader to read from r. This permits reusing a Reader rather than allocating
// a new one.
func (r *Reader) Reset(reader io.Reader) {
r.r = reader
r.err = nil
r.i = 0
r.j = 0
r.readHeader = false
}
func (r *Reader) readFull(p []byte, allowEOF bool) (ok bool) {
if _, r.err = io.ReadFull(r.r, p); r.err != nil {
if r.err == io.ErrUnexpectedEOF || (r.err == io.EOF && !allowEOF) {
r.err = ErrCorrupt
}
return false
}
return true
}
// Read satisfies the io.Reader interface.
func (r *Reader) Read(p []byte) (int, error) {
if r.err != nil {
return 0, r.err
}
for {
if r.i < r.j {
n := copy(p, r.decoded[r.i:r.j])
r.i += n
return n, nil
}
if !r.readFull(r.buf[:4], true) {
return 0, r.err
}
chunkType := r.buf[0]
if !r.readHeader {
if chunkType != chunkTypeStreamIdentifier {
r.err = ErrCorrupt
return 0, r.err
}
r.readHeader = true
}
chunkLen := int(r.buf[1]) | int(r.buf[2])<<8 | int(r.buf[3])<<16
if chunkLen > len(r.buf) {
r.err = ErrUnsupported
return 0, r.err
}
// The chunk types are specified at
// https://github.com/google/snappy/blob/master/framing_format.txt
switch chunkType {
case chunkTypeCompressedData:
// Section 4.2. Compressed data (chunk type 0x00).
if chunkLen < checksumSize {
r.err = ErrCorrupt
return 0, r.err
}
buf := r.buf[:chunkLen]
if !r.readFull(buf, false) {
return 0, r.err
}
checksum := uint32(buf[0]) | uint32(buf[1])<<8 | uint32(buf[2])<<16 | uint32(buf[3])<<24
buf = buf[checksumSize:]
n, err := DecodedLen(buf)
if err != nil {
r.err = err
return 0, r.err
}
if n > len(r.decoded) {
r.err = ErrCorrupt
return 0, r.err
}
if _, err := Decode(r.decoded, buf); err != nil {
r.err = err
return 0, r.err
}
if crc(r.decoded[:n]) != checksum {
r.err = ErrCorrupt
return 0, r.err
}
r.i, r.j = 0, n
continue
case chunkTypeUncompressedData:
// Section 4.3. Uncompressed data (chunk type 0x01).
if chunkLen < checksumSize {
r.err = ErrCorrupt
return 0, r.err
}
buf := r.buf[:checksumSize]
if !r.readFull(buf, false) {
return 0, r.err
}
checksum := uint32(buf[0]) | uint32(buf[1])<<8 | uint32(buf[2])<<16 | uint32(buf[3])<<24
// Read directly into r.decoded instead of via r.buf.
n := chunkLen - checksumSize
if n > len(r.decoded) {
r.err = ErrCorrupt
return 0, r.err
}
if !r.readFull(r.decoded[:n], false) {
return 0, r.err
}
if crc(r.decoded[:n]) != checksum {
r.err = ErrCorrupt
return 0, r.err
}
r.i, r.j = 0, n
continue
case chunkTypeStreamIdentifier:
// Section 4.1. Stream identifier (chunk type 0xff).
if chunkLen != len(magicBody) {
r.err = ErrCorrupt
return 0, r.err
}
if !r.readFull(r.buf[:len(magicBody)], false) {
return 0, r.err
}
for i := 0; i < len(magicBody); i++ {
if r.buf[i] != magicBody[i] {
r.err = ErrCorrupt
return 0, r.err
}
}
continue
}
if chunkType <= 0x7f {
// Section 4.5. Reserved unskippable chunks (chunk types 0x02-0x7f).
r.err = ErrUnsupported
return 0, r.err
}
// Section 4.4 Padding (chunk type 0xfe).
// Section 4.6. Reserved skippable chunks (chunk types 0x80-0xfd).
if !r.readFull(r.buf[:chunkLen], false) {
return 0, r.err
}
}
}

View File

@@ -0,0 +1,14 @@
// Copyright 2016 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build !appengine
// +build gc
// +build !noasm
package snappy
// decode has the same semantics as in decode_other.go.
//
//go:noescape
func decode(dst, src []byte) int

View File

@@ -0,0 +1,490 @@
// Copyright 2016 The Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build !appengine
// +build gc
// +build !noasm
#include "textflag.h"
// The asm code generally follows the pure Go code in decode_other.go, except
// where marked with a "!!!".
// func decode(dst, src []byte) int
//
// All local variables fit into registers. The non-zero stack size is only to
// spill registers and push args when issuing a CALL. The register allocation:
// - AX scratch
// - BX scratch
// - CX length or x
// - DX offset
// - SI &src[s]
// - DI &dst[d]
// + R8 dst_base
// + R9 dst_len
// + R10 dst_base + dst_len
// + R11 src_base
// + R12 src_len
// + R13 src_base + src_len
// - R14 used by doCopy
// - R15 used by doCopy
//
// The registers R8-R13 (marked with a "+") are set at the start of the
// function, and after a CALL returns, and are not otherwise modified.
//
// The d variable is implicitly DI - R8, and len(dst)-d is R10 - DI.
// The s variable is implicitly SI - R11, and len(src)-s is R13 - SI.
TEXT ·decode(SB), NOSPLIT, $48-56
// Initialize SI, DI and R8-R13.
MOVQ dst_base+0(FP), R8
MOVQ dst_len+8(FP), R9
MOVQ R8, DI
MOVQ R8, R10
ADDQ R9, R10
MOVQ src_base+24(FP), R11
MOVQ src_len+32(FP), R12
MOVQ R11, SI
MOVQ R11, R13
ADDQ R12, R13
loop:
// for s < len(src)
CMPQ SI, R13
JEQ end
// CX = uint32(src[s])
//
// switch src[s] & 0x03
MOVBLZX (SI), CX
MOVL CX, BX
ANDL $3, BX
CMPL BX, $1
JAE tagCopy
// ----------------------------------------
// The code below handles literal tags.
// case tagLiteral:
// x := uint32(src[s] >> 2)
// switch
SHRL $2, CX
CMPL CX, $60
JAE tagLit60Plus
// case x < 60:
// s++
INCQ SI
doLit:
// This is the end of the inner "switch", when we have a literal tag.
//
// We assume that CX == x and x fits in a uint32, where x is the variable
// used in the pure Go decode_other.go code.
// length = int(x) + 1
//
// Unlike the pure Go code, we don't need to check if length <= 0 because
// CX can hold 64 bits, so the increment cannot overflow.
INCQ CX
// Prepare to check if copying length bytes will run past the end of dst or
// src.
//
// AX = len(dst) - d
// BX = len(src) - s
MOVQ R10, AX
SUBQ DI, AX
MOVQ R13, BX
SUBQ SI, BX
// !!! Try a faster technique for short (16 or fewer bytes) copies.
//
// if length > 16 || len(dst)-d < 16 || len(src)-s < 16 {
// goto callMemmove // Fall back on calling runtime·memmove.
// }
//
// The C++ snappy code calls this TryFastAppend. It also checks len(src)-s
// against 21 instead of 16, because it cannot assume that all of its input
// is contiguous in memory and so it needs to leave enough source bytes to
// read the next tag without refilling buffers, but Go's Decode assumes
// contiguousness (the src argument is a []byte).
CMPQ CX, $16
JGT callMemmove
CMPQ AX, $16
JLT callMemmove
CMPQ BX, $16
JLT callMemmove
// !!! Implement the copy from src to dst as a 16-byte load and store.
// (Decode's documentation says that dst and src must not overlap.)
//
// This always copies 16 bytes, instead of only length bytes, but that's
// OK. If the input is a valid Snappy encoding then subsequent iterations
// will fix up the overrun. Otherwise, Decode returns a nil []byte (and a
// non-nil error), so the overrun will be ignored.
//
// Note that on amd64, it is legal and cheap to issue unaligned 8-byte or
// 16-byte loads and stores. This technique probably wouldn't be as
// effective on architectures that are fussier about alignment.
MOVOU 0(SI), X0
MOVOU X0, 0(DI)
// d += length
// s += length
ADDQ CX, DI
ADDQ CX, SI
JMP loop
callMemmove:
// if length > len(dst)-d || length > len(src)-s { etc }
CMPQ CX, AX
JGT errCorrupt
CMPQ CX, BX
JGT errCorrupt
// copy(dst[d:], src[s:s+length])
//
// This means calling runtime·memmove(&dst[d], &src[s], length), so we push
// DI, SI and CX as arguments. Coincidentally, we also need to spill those
// three registers to the stack, to save local variables across the CALL.
MOVQ DI, 0(SP)
MOVQ SI, 8(SP)
MOVQ CX, 16(SP)
MOVQ DI, 24(SP)
MOVQ SI, 32(SP)
MOVQ CX, 40(SP)
CALL runtime·memmove(SB)
// Restore local variables: unspill registers from the stack and
// re-calculate R8-R13.
MOVQ 24(SP), DI
MOVQ 32(SP), SI
MOVQ 40(SP), CX
MOVQ dst_base+0(FP), R8
MOVQ dst_len+8(FP), R9
MOVQ R8, R10
ADDQ R9, R10
MOVQ src_base+24(FP), R11
MOVQ src_len+32(FP), R12
MOVQ R11, R13
ADDQ R12, R13
// d += length
// s += length
ADDQ CX, DI
ADDQ CX, SI
JMP loop
tagLit60Plus:
// !!! This fragment does the
//
// s += x - 58; if uint(s) > uint(len(src)) { etc }
//
// checks. In the asm version, we code it once instead of once per switch case.
ADDQ CX, SI
SUBQ $58, SI
MOVQ SI, BX
SUBQ R11, BX
CMPQ BX, R12
JA errCorrupt
// case x == 60:
CMPL CX, $61
JEQ tagLit61
JA tagLit62Plus
// x = uint32(src[s-1])
MOVBLZX -1(SI), CX
JMP doLit
tagLit61:
// case x == 61:
// x = uint32(src[s-2]) | uint32(src[s-1])<<8
MOVWLZX -2(SI), CX
JMP doLit
tagLit62Plus:
CMPL CX, $62
JA tagLit63
// case x == 62:
// x = uint32(src[s-3]) | uint32(src[s-2])<<8 | uint32(src[s-1])<<16
MOVWLZX -3(SI), CX
MOVBLZX -1(SI), BX
SHLL $16, BX
ORL BX, CX
JMP doLit
tagLit63:
// case x == 63:
// x = uint32(src[s-4]) | uint32(src[s-3])<<8 | uint32(src[s-2])<<16 | uint32(src[s-1])<<24
MOVL -4(SI), CX
JMP doLit
// The code above handles literal tags.
// ----------------------------------------
// The code below handles copy tags.
tagCopy4:
// case tagCopy4:
// s += 5
ADDQ $5, SI
// if uint(s) > uint(len(src)) { etc }
MOVQ SI, BX
SUBQ R11, BX
CMPQ BX, R12
JA errCorrupt
// length = 1 + int(src[s-5])>>2
SHRQ $2, CX
INCQ CX
// offset = int(uint32(src[s-4]) | uint32(src[s-3])<<8 | uint32(src[s-2])<<16 | uint32(src[s-1])<<24)
MOVLQZX -4(SI), DX
JMP doCopy
tagCopy2:
// case tagCopy2:
// s += 3
ADDQ $3, SI
// if uint(s) > uint(len(src)) { etc }
MOVQ SI, BX
SUBQ R11, BX
CMPQ BX, R12
JA errCorrupt
// length = 1 + int(src[s-3])>>2
SHRQ $2, CX
INCQ CX
// offset = int(uint32(src[s-2]) | uint32(src[s-1])<<8)
MOVWQZX -2(SI), DX
JMP doCopy
tagCopy:
// We have a copy tag. We assume that:
// - BX == src[s] & 0x03
// - CX == src[s]
CMPQ BX, $2
JEQ tagCopy2
JA tagCopy4
// case tagCopy1:
// s += 2
ADDQ $2, SI
// if uint(s) > uint(len(src)) { etc }
MOVQ SI, BX
SUBQ R11, BX
CMPQ BX, R12
JA errCorrupt
// offset = int(uint32(src[s-2])&0xe0<<3 | uint32(src[s-1]))
MOVQ CX, DX
ANDQ $0xe0, DX
SHLQ $3, DX
MOVBQZX -1(SI), BX
ORQ BX, DX
// length = 4 + int(src[s-2])>>2&0x7
SHRQ $2, CX
ANDQ $7, CX
ADDQ $4, CX
doCopy:
// This is the end of the outer "switch", when we have a copy tag.
//
// We assume that:
// - CX == length && CX > 0
// - DX == offset
// if offset <= 0 { etc }
CMPQ DX, $0
JLE errCorrupt
// if d < offset { etc }
MOVQ DI, BX
SUBQ R8, BX
CMPQ BX, DX
JLT errCorrupt
// if length > len(dst)-d { etc }
MOVQ R10, BX
SUBQ DI, BX
CMPQ CX, BX
JGT errCorrupt
// forwardCopy(dst[d:d+length], dst[d-offset:]); d += length
//
// Set:
// - R14 = len(dst)-d
// - R15 = &dst[d-offset]
MOVQ R10, R14
SUBQ DI, R14
MOVQ DI, R15
SUBQ DX, R15
// !!! Try a faster technique for short (16 or fewer bytes) forward copies.
//
// First, try using two 8-byte load/stores, similar to the doLit technique
// above. Even if dst[d:d+length] and dst[d-offset:] can overlap, this is
// still OK if offset >= 8. Note that this has to be two 8-byte load/stores
// and not one 16-byte load/store, and the first store has to be before the
// second load, due to the overlap if offset is in the range [8, 16).
//
// if length > 16 || offset < 8 || len(dst)-d < 16 {
// goto slowForwardCopy
// }
// copy 16 bytes
// d += length
CMPQ CX, $16
JGT slowForwardCopy
CMPQ DX, $8
JLT slowForwardCopy
CMPQ R14, $16
JLT slowForwardCopy
MOVQ 0(R15), AX
MOVQ AX, 0(DI)
MOVQ 8(R15), BX
MOVQ BX, 8(DI)
ADDQ CX, DI
JMP loop
slowForwardCopy:
// !!! If the forward copy is longer than 16 bytes, or if offset < 8, we
// can still try 8-byte load stores, provided we can overrun up to 10 extra
// bytes. As above, the overrun will be fixed up by subsequent iterations
// of the outermost loop.
//
// The C++ snappy code calls this technique IncrementalCopyFastPath. Its
// commentary says:
//
// ----
//
// The main part of this loop is a simple copy of eight bytes at a time
// until we've copied (at least) the requested amount of bytes. However,
// if d and d-offset are less than eight bytes apart (indicating a
// repeating pattern of length < 8), we first need to expand the pattern in
// order to get the correct results. For instance, if the buffer looks like
// this, with the eight-byte <d-offset> and <d> patterns marked as
// intervals:
//
// abxxxxxxxxxxxx
// [------] d-offset
// [------] d
//
// a single eight-byte copy from <d-offset> to <d> will repeat the pattern
// once, after which we can move <d> two bytes without moving <d-offset>:
//
// ababxxxxxxxxxx
// [------] d-offset
// [------] d
//
// and repeat the exercise until the two no longer overlap.
//
// This allows us to do very well in the special case of one single byte
// repeated many times, without taking a big hit for more general cases.
//
// The worst case of extra writing past the end of the match occurs when
// offset == 1 and length == 1; the last copy will read from byte positions
// [0..7] and write to [4..11], whereas it was only supposed to write to
// position 1. Thus, ten excess bytes.
//
// ----
//
// That "10 byte overrun" worst case is confirmed by Go's
// TestSlowForwardCopyOverrun, which also tests the fixUpSlowForwardCopy
// and finishSlowForwardCopy algorithm.
//
// if length > len(dst)-d-10 {
// goto verySlowForwardCopy
// }
SUBQ $10, R14
CMPQ CX, R14
JGT verySlowForwardCopy
makeOffsetAtLeast8:
// !!! As above, expand the pattern so that offset >= 8 and we can use
// 8-byte load/stores.
//
// for offset < 8 {
// copy 8 bytes from dst[d-offset:] to dst[d:]
// length -= offset
// d += offset
// offset += offset
// // The two previous lines together means that d-offset, and therefore
// // R15, is unchanged.
// }
CMPQ DX, $8
JGE fixUpSlowForwardCopy
MOVQ (R15), BX
MOVQ BX, (DI)
SUBQ DX, CX
ADDQ DX, DI
ADDQ DX, DX
JMP makeOffsetAtLeast8
fixUpSlowForwardCopy:
// !!! Add length (which might be negative now) to d (implied by DI being
// &dst[d]) so that d ends up at the right place when we jump back to the
// top of the loop. Before we do that, though, we save DI to AX so that, if
// length is positive, copying the remaining length bytes will write to the
// right place.
MOVQ DI, AX
ADDQ CX, DI
finishSlowForwardCopy:
// !!! Repeat 8-byte load/stores until length <= 0. Ending with a negative
// length means that we overrun, but as above, that will be fixed up by
// subsequent iterations of the outermost loop.
CMPQ CX, $0
JLE loop
MOVQ (R15), BX
MOVQ BX, (AX)
ADDQ $8, R15
ADDQ $8, AX
SUBQ $8, CX
JMP finishSlowForwardCopy
verySlowForwardCopy:
// verySlowForwardCopy is a simple implementation of forward copy. In C
// parlance, this is a do/while loop instead of a while loop, since we know
// that length > 0. In Go syntax:
//
// for {
// dst[d] = dst[d - offset]
// d++
// length--
// if length == 0 {
// break
// }
// }
MOVB (R15), BX
MOVB BX, (DI)
INCQ R15
INCQ DI
DECQ CX
JNZ verySlowForwardCopy
JMP loop
// The code above handles copy tags.
// ----------------------------------------
end:
// This is the end of the "for s < len(src)".
//
// if d != len(dst) { etc }
CMPQ DI, R10
JNE errCorrupt
// return 0
MOVQ $0, ret+48(FP)
RET
errCorrupt:
// return decodeErrCodeCorrupt
MOVQ $1, ret+48(FP)
RET

View File

@@ -0,0 +1,101 @@
// Copyright 2016 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build !amd64 appengine !gc noasm
package snappy
// decode writes the decoding of src to dst. It assumes that the varint-encoded
// length of the decompressed bytes has already been read, and that len(dst)
// equals that length.
//
// It returns 0 on success or a decodeErrCodeXxx error code on failure.
func decode(dst, src []byte) int {
var d, s, offset, length int
for s < len(src) {
switch src[s] & 0x03 {
case tagLiteral:
x := uint32(src[s] >> 2)
switch {
case x < 60:
s++
case x == 60:
s += 2
if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
return decodeErrCodeCorrupt
}
x = uint32(src[s-1])
case x == 61:
s += 3
if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
return decodeErrCodeCorrupt
}
x = uint32(src[s-2]) | uint32(src[s-1])<<8
case x == 62:
s += 4
if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
return decodeErrCodeCorrupt
}
x = uint32(src[s-3]) | uint32(src[s-2])<<8 | uint32(src[s-1])<<16
case x == 63:
s += 5
if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
return decodeErrCodeCorrupt
}
x = uint32(src[s-4]) | uint32(src[s-3])<<8 | uint32(src[s-2])<<16 | uint32(src[s-1])<<24
}
length = int(x) + 1
if length <= 0 {
return decodeErrCodeUnsupportedLiteralLength
}
if length > len(dst)-d || length > len(src)-s {
return decodeErrCodeCorrupt
}
copy(dst[d:], src[s:s+length])
d += length
s += length
continue
case tagCopy1:
s += 2
if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
return decodeErrCodeCorrupt
}
length = 4 + int(src[s-2])>>2&0x7
offset = int(uint32(src[s-2])&0xe0<<3 | uint32(src[s-1]))
case tagCopy2:
s += 3
if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
return decodeErrCodeCorrupt
}
length = 1 + int(src[s-3])>>2
offset = int(uint32(src[s-2]) | uint32(src[s-1])<<8)
case tagCopy4:
s += 5
if uint(s) > uint(len(src)) { // The uint conversions catch overflow from the previous line.
return decodeErrCodeCorrupt
}
length = 1 + int(src[s-5])>>2
offset = int(uint32(src[s-4]) | uint32(src[s-3])<<8 | uint32(src[s-2])<<16 | uint32(src[s-1])<<24)
}
if offset <= 0 || d < offset || length > len(dst)-d {
return decodeErrCodeCorrupt
}
// Copy from an earlier sub-slice of dst to a later sub-slice. Unlike
// the built-in copy function, this byte-by-byte copy always runs
// forwards, even if the slices overlap. Conceptually, this is:
//
// d += forwardCopy(dst[d:d+length], dst[d-offset:])
for end := d + length; d != end; d++ {
dst[d] = dst[d-offset]
}
}
if d != len(dst) {
return decodeErrCodeCorrupt
}
return 0
}

285
vendor/github.com/klauspost/compress/snappy/encode.go generated vendored Normal file
View File

@@ -0,0 +1,285 @@
// Copyright 2011 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
package snappy
import (
"encoding/binary"
"errors"
"io"
)
// Encode returns the encoded form of src. The returned slice may be a sub-
// slice of dst if dst was large enough to hold the entire encoded block.
// Otherwise, a newly allocated slice will be returned.
//
// The dst and src must not overlap. It is valid to pass a nil dst.
func Encode(dst, src []byte) []byte {
if n := MaxEncodedLen(len(src)); n < 0 {
panic(ErrTooLarge)
} else if len(dst) < n {
dst = make([]byte, n)
}
// The block starts with the varint-encoded length of the decompressed bytes.
d := binary.PutUvarint(dst, uint64(len(src)))
for len(src) > 0 {
p := src
src = nil
if len(p) > maxBlockSize {
p, src = p[:maxBlockSize], p[maxBlockSize:]
}
if len(p) < minNonLiteralBlockSize {
d += emitLiteral(dst[d:], p)
} else {
d += encodeBlock(dst[d:], p)
}
}
return dst[:d]
}
// inputMargin is the minimum number of extra input bytes to keep, inside
// encodeBlock's inner loop. On some architectures, this margin lets us
// implement a fast path for emitLiteral, where the copy of short (<= 16 byte)
// literals can be implemented as a single load to and store from a 16-byte
// register. That literal's actual length can be as short as 1 byte, so this
// can copy up to 15 bytes too much, but that's OK as subsequent iterations of
// the encoding loop will fix up the copy overrun, and this inputMargin ensures
// that we don't overrun the dst and src buffers.
const inputMargin = 16 - 1
// minNonLiteralBlockSize is the minimum size of the input to encodeBlock that
// could be encoded with a copy tag. This is the minimum with respect to the
// algorithm used by encodeBlock, not a minimum enforced by the file format.
//
// The encoded output must start with at least a 1 byte literal, as there are
// no previous bytes to copy. A minimal (1 byte) copy after that, generated
// from an emitCopy call in encodeBlock's main loop, would require at least
// another inputMargin bytes, for the reason above: we want any emitLiteral
// calls inside encodeBlock's main loop to use the fast path if possible, which
// requires being able to overrun by inputMargin bytes. Thus,
// minNonLiteralBlockSize equals 1 + 1 + inputMargin.
//
// The C++ code doesn't use this exact threshold, but it could, as discussed at
// https://groups.google.com/d/topic/snappy-compression/oGbhsdIJSJ8/discussion
// The difference between Go (2+inputMargin) and C++ (inputMargin) is purely an
// optimization. It should not affect the encoded form. This is tested by
// TestSameEncodingAsCppShortCopies.
const minNonLiteralBlockSize = 1 + 1 + inputMargin
// MaxEncodedLen returns the maximum length of a snappy block, given its
// uncompressed length.
//
// It will return a negative value if srcLen is too large to encode.
func MaxEncodedLen(srcLen int) int {
n := uint64(srcLen)
if n > 0xffffffff {
return -1
}
// Compressed data can be defined as:
// compressed := item* literal*
// item := literal* copy
//
// The trailing literal sequence has a space blowup of at most 62/60
// since a literal of length 60 needs one tag byte + one extra byte
// for length information.
//
// Item blowup is trickier to measure. Suppose the "copy" op copies
// 4 bytes of data. Because of a special check in the encoding code,
// we produce a 4-byte copy only if the offset is < 65536. Therefore
// the copy op takes 3 bytes to encode, and this type of item leads
// to at most the 62/60 blowup for representing literals.
//
// Suppose the "copy" op copies 5 bytes of data. If the offset is big
// enough, it will take 5 bytes to encode the copy op. Therefore the
// worst case here is a one-byte literal followed by a five-byte copy.
// That is, 6 bytes of input turn into 7 bytes of "compressed" data.
//
// This last factor dominates the blowup, so the final estimate is:
n = 32 + n + n/6
if n > 0xffffffff {
return -1
}
return int(n)
}
var errClosed = errors.New("snappy: Writer is closed")
// NewWriter returns a new Writer that compresses to w.
//
// The Writer returned does not buffer writes. There is no need to Flush or
// Close such a Writer.
//
// Deprecated: the Writer returned is not suitable for many small writes, only
// for few large writes. Use NewBufferedWriter instead, which is efficient
// regardless of the frequency and shape of the writes, and remember to Close
// that Writer when done.
func NewWriter(w io.Writer) *Writer {
return &Writer{
w: w,
obuf: make([]byte, obufLen),
}
}
// NewBufferedWriter returns a new Writer that compresses to w, using the
// framing format described at
// https://github.com/google/snappy/blob/master/framing_format.txt
//
// The Writer returned buffers writes. Users must call Close to guarantee all
// data has been forwarded to the underlying io.Writer. They may also call
// Flush zero or more times before calling Close.
func NewBufferedWriter(w io.Writer) *Writer {
return &Writer{
w: w,
ibuf: make([]byte, 0, maxBlockSize),
obuf: make([]byte, obufLen),
}
}
// Writer is an io.Writer that can write Snappy-compressed bytes.
type Writer struct {
w io.Writer
err error
// ibuf is a buffer for the incoming (uncompressed) bytes.
//
// Its use is optional. For backwards compatibility, Writers created by the
// NewWriter function have ibuf == nil, do not buffer incoming bytes, and
// therefore do not need to be Flush'ed or Close'd.
ibuf []byte
// obuf is a buffer for the outgoing (compressed) bytes.
obuf []byte
// wroteStreamHeader is whether we have written the stream header.
wroteStreamHeader bool
}
// Reset discards the writer's state and switches the Snappy writer to write to
// w. This permits reusing a Writer rather than allocating a new one.
func (w *Writer) Reset(writer io.Writer) {
w.w = writer
w.err = nil
if w.ibuf != nil {
w.ibuf = w.ibuf[:0]
}
w.wroteStreamHeader = false
}
// Write satisfies the io.Writer interface.
func (w *Writer) Write(p []byte) (nRet int, errRet error) {
if w.ibuf == nil {
// Do not buffer incoming bytes. This does not perform or compress well
// if the caller of Writer.Write writes many small slices. This
// behavior is therefore deprecated, but still supported for backwards
// compatibility with code that doesn't explicitly Flush or Close.
return w.write(p)
}
// The remainder of this method is based on bufio.Writer.Write from the
// standard library.
for len(p) > (cap(w.ibuf)-len(w.ibuf)) && w.err == nil {
var n int
if len(w.ibuf) == 0 {
// Large write, empty buffer.
// Write directly from p to avoid copy.
n, _ = w.write(p)
} else {
n = copy(w.ibuf[len(w.ibuf):cap(w.ibuf)], p)
w.ibuf = w.ibuf[:len(w.ibuf)+n]
w.Flush()
}
nRet += n
p = p[n:]
}
if w.err != nil {
return nRet, w.err
}
n := copy(w.ibuf[len(w.ibuf):cap(w.ibuf)], p)
w.ibuf = w.ibuf[:len(w.ibuf)+n]
nRet += n
return nRet, nil
}
func (w *Writer) write(p []byte) (nRet int, errRet error) {
if w.err != nil {
return 0, w.err
}
for len(p) > 0 {
obufStart := len(magicChunk)
if !w.wroteStreamHeader {
w.wroteStreamHeader = true
copy(w.obuf, magicChunk)
obufStart = 0
}
var uncompressed []byte
if len(p) > maxBlockSize {
uncompressed, p = p[:maxBlockSize], p[maxBlockSize:]
} else {
uncompressed, p = p, nil
}
checksum := crc(uncompressed)
// Compress the buffer, discarding the result if the improvement
// isn't at least 12.5%.
compressed := Encode(w.obuf[obufHeaderLen:], uncompressed)
chunkType := uint8(chunkTypeCompressedData)
chunkLen := 4 + len(compressed)
obufEnd := obufHeaderLen + len(compressed)
if len(compressed) >= len(uncompressed)-len(uncompressed)/8 {
chunkType = chunkTypeUncompressedData
chunkLen = 4 + len(uncompressed)
obufEnd = obufHeaderLen
}
// Fill in the per-chunk header that comes before the body.
w.obuf[len(magicChunk)+0] = chunkType
w.obuf[len(magicChunk)+1] = uint8(chunkLen >> 0)
w.obuf[len(magicChunk)+2] = uint8(chunkLen >> 8)
w.obuf[len(magicChunk)+3] = uint8(chunkLen >> 16)
w.obuf[len(magicChunk)+4] = uint8(checksum >> 0)
w.obuf[len(magicChunk)+5] = uint8(checksum >> 8)
w.obuf[len(magicChunk)+6] = uint8(checksum >> 16)
w.obuf[len(magicChunk)+7] = uint8(checksum >> 24)
if _, err := w.w.Write(w.obuf[obufStart:obufEnd]); err != nil {
w.err = err
return nRet, err
}
if chunkType == chunkTypeUncompressedData {
if _, err := w.w.Write(uncompressed); err != nil {
w.err = err
return nRet, err
}
}
nRet += len(uncompressed)
}
return nRet, nil
}
// Flush flushes the Writer to its underlying io.Writer.
func (w *Writer) Flush() error {
if w.err != nil {
return w.err
}
if len(w.ibuf) == 0 {
return nil
}
w.write(w.ibuf)
w.ibuf = w.ibuf[:0]
return w.err
}
// Close calls Flush and then closes the Writer.
func (w *Writer) Close() error {
w.Flush()
ret := w.err
if w.err == nil {
w.err = errClosed
}
return ret
}

View File

@@ -0,0 +1,29 @@
// Copyright 2016 The Snappy-Go Authors. All rights reserved.
// Use of this source code is governed by a BSD-style
// license that can be found in the LICENSE file.
// +build !appengine
// +build gc
// +build !noasm
package snappy
// emitLiteral has the same semantics as in encode_other.go.
//
//go:noescape
func emitLiteral(dst, lit []byte) int
// emitCopy has the same semantics as in encode_other.go.
//
//go:noescape
func emitCopy(dst []byte, offset, length int) int
// extendMatch has the same semantics as in encode_other.go.
//
//go:noescape
func extendMatch(src []byte, i, j int) int
// encodeBlock has the same semantics as in encode_other.go.
//
//go:noescape
func encodeBlock(dst, src []byte) (d int)

Some files were not shown because too many files have changed in this diff Show More