Compare commits


15 Commits

Author               SHA1        Message                                               Date
Alexander Marshalov  26144f6d87  Merge branch 'master' into streaming-aggregation-ui  2024-01-18 10:33:14 +01:00
Alexander Marshalov  1fa619170b  Merge branch 'master' into streaming-aggregation-ui  2023-12-19 14:58:36 +01:00
Alexander Marshalov  cecee0e05f  Merge branch 'master' into streaming-aggregation-ui  2023-12-18 18:12:03 +01:00
Alexander Marshalov  3cb5f88325  Merge branch 'master' into streaming-aggregation-ui  2023-12-13 08:01:36 +01:00
Alexander Marshalov  99384fce4f  Fix linter comments                                   2023-12-12 16:59:31 +01:00
Alexander Marshalov  f7ef3d4aa1  Fix linter comments                                   2023-12-12 16:46:52 +01:00
Alexander Marshalov  92cadfff65  Fix for PR                                            2023-12-12 16:34:07 +01:00
Alexander Marshalov  8f946a927c  Fix image format                                      2023-12-12 16:16:22 +01:00
Alexander Marshalov  075bb8748f  Merge branch 'master' into streaming-aggregation-ui (conflicts: app/vmagent/main.go)  2023-12-12 16:05:00 +01:00
Alexander Marshalov  fb9a9d1463  Prepare for PR - refactoring                          2023-12-12 16:02:12 +01:00
Alexander Marshalov  715bf3af82  Prepare for PR - refactoring                          2023-12-11 19:53:28 +01:00
Alexander Marshalov  8725b5e049  Streaming aggregation branch clean                    2023-11-22 15:11:13 +01:00
Alexander Marshalov  5bc3488538  Merge branch 'master' into streaming-aggregation      2023-10-26 16:54:09 +02:00
Alexander Marshalov  1cd6232537  WIP                                                   2023-10-23 13:14:48 +02:00
Alexander Marshalov  ed1bef0e2d  WIP                                                   2023-10-18 14:48:49 +02:00
1671 changed files with 62352 additions and 84922 deletions


@@ -60,8 +60,8 @@ body:
For VictoriaMetrics health-state issues please provide full-length screenshots
of Grafana dashboards if possible:
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229/)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176/)
* [Grafana dashboard for single-node VictoriaMetrics](https://grafana.com/grafana/dashboards/10229-victoriametrics-single-node/)
* [Grafana dashboard for VictoriaMetrics cluster](https://grafana.com/grafana/dashboards/11176-victoriametrics-cluster/)
See how to set up monitoring here:
* [monitoring for single-node VictoriaMetrics](https://docs.victoriametrics.com/#monitoring)


@@ -1,35 +0,0 @@
### Describe Your Changes
Please provide a brief description of the changes you made. Be as specific as possible to help others understand the purpose and impact of your modifications.
### Checklist
The following checks are mandatory:
- [ ] I have read the [Contributing Guidelines](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/CONTRIBUTING.md)
- [ ] All commits are signed and include the `Signed-off-by` line. Use `git commit -s` to add the `Signed-off-by` line to your commits. See this [doc](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work) about how to sign your commits.
- [ ] Tests are passing locally. Use `make test` to run all tests locally.
- [ ] Linting is passing locally. Use `make check-all` to run all linters locally.
Further checks are optional for External Contributions:
- [ ] Include a link to the GitHub issue in the commit message, if issue exists.
- [ ] Mention the change in the [Changelog](https://github.com/VictoriaMetrics/VictoriaMetrics/blob/master/docs/CHANGELOG.md). Explain what has changed and why. If there is a related issue or documentation change - link them as well.
Tips for writing a good changelog message:
* Write a human-readable changelog message that describes the problem and solution.
* Include a link to the issue or pull request in your changelog message.
* Use specific language identifying the fix, such as an error message, metric name, or flag name.
* Provide a link to the relevant documentation for any new features you add or modify.
- [ ] After your pull request is merged, please add a message to the issue with instructions for how to test the fix or try the feature you added. Here is an [example](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4048#issuecomment-1546453726)
- [ ] Do not close the original issue before the change is released. Note that in some cases GitHub automatically closes the issue once the PR is merged. Re-open the issue in that case.
- [ ] If the change affects public interfaces (a new flag was added or updated, or some behavior has changed), add the corresponding change to the documentation.
Examples of good changelog messages:
1. FEATURE: [vmagent](https://docs.victoriametrics.com/vmagent.html): add support for [VictoriaMetrics remote write protocol](https://docs.victoriametrics.com/vmagent.html#victoriametrics-remote-write-protocol) when [sending / receiving data to / from Kafka](https://docs.victoriametrics.com/vmagent.html#kafka-integration). This protocol allows saving egress network bandwidth costs when sending data from `vmagent` to `Kafka` located in another datacenter or availability zone. See [this feature request](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1225).
2. BUGFIX: [stream aggregation](https://docs.victoriametrics.com/stream-aggregation.html): suppress `series after dedup` error message in logs when `-remoteWrite.streamAggr.dedupInterval` command-line flag is set at [vmagent](https://docs.victoriametrics.com/vmagent.html) or when `-streamAggr.dedupInterval` command-line flag is set at [single-node VictoriaMetrics](https://docs.victoriametrics.com/).


@@ -25,7 +25,7 @@ jobs:
cache: false
- name: Cache Go artifacts
uses: actions/cache@v4
uses: actions/cache@v3
with:
path: |
~/.cache/go-build


@@ -36,11 +36,11 @@ jobs:
uses: actions/checkout@v4
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@v2
with:
languages: ${{ matrix.language }}
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@v2
with:
category: "javascript"


@@ -63,7 +63,7 @@ jobs:
if: ${{ matrix.language == 'go' }}
- name: Cache Go artifacts
uses: actions/cache@v4
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
@@ -75,7 +75,7 @@ jobs:
# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v3
uses: github/codeql-action/init@v2
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
@@ -86,7 +86,7 @@ jobs:
# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
- name: Autobuild
uses: github/codeql-action/autobuild@v3
uses: github/codeql-action/autobuild@v2
# Command-line programs to run using the OS shell.
# 📚 https://git.io/JvXDl
@@ -100,4 +100,4 @@ jobs:
# make release
- name: Perform CodeQL Analysis
uses: github/codeql-action/analyze@v3
uses: github/codeql-action/analyze@v2


@@ -41,7 +41,7 @@ jobs:
cache: false
- name: Cache Go artifacts
uses: actions/cache@v4
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
@@ -71,7 +71,7 @@ jobs:
cache: false
- name: Cache Go artifacts
uses: actions/cache@v4
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
@@ -102,7 +102,7 @@ jobs:
cache: false
- name: Cache Go artifacts
uses: actions/cache@v4
uses: actions/cache@v3
with:
path: |
~/.cache/go-build
@@ -115,6 +115,6 @@ jobs:
run: make ${{ matrix.scenario}}
- name: Publish coverage
uses: codecov/codecov-action@v4
uses: codecov/codecov-action@v3
with:
file: ./coverage.txt


@@ -6,6 +6,9 @@ on:
paths:
- 'docs/**'
workflow_dispatch: {}
env:
PAGEFIND_VERSION: "1.0.4"
HUGO_VERSION: "latest"
permissions:
contents: read # This is required for actions/checkout and to commit back image update
deployments: write
@@ -24,6 +27,16 @@ jobs:
repository: VictoriaMetrics/vmdocs
token: ${{ secrets.VM_BOT_GH_TOKEN }}
path: docs
- uses: peaceiris/actions-hugo@v2
with:
hugo-version: ${{env.HUGO_VERSION}}
extended: true
- name: Install PageFind #install the static search engine for index build
uses: supplypike/setup-bin@v3
with:
uri: "https://github.com/CloudCannon/pagefind/releases/download/v${{env.PAGEFIND_VERSION}}/pagefind-v${{env.PAGEFIND_VERSION}}-x86_64-unknown-linux-musl.tar.gz"
name: "pagefind"
version: ${{env.PAGEFIND_VERSION}}
- name: Import GPG key
uses: crazy-max/ghaction-import-gpg@v5
with:
@@ -38,11 +51,13 @@ jobs:
calculatedSha=$(git rev-parse --short ${{ github.sha }})
echo "short_sha=$calculatedSha" >> $GITHUB_OUTPUT
working-directory: main
- name: update code and commit
run: |
rm -rf content
cp -r ../main/docs content
make clean-after-copy
make build-search-index
git config --global user.name "${{ steps.import-gpg.outputs.email }}"
git config --global user.email "${{ steps.import-gpg.outputs.email }}"
git add .

.github/workflows/wiki.yml (new file, 33 lines)

@@ -0,0 +1,33 @@
name: wiki
on:
push:
paths:
- 'docs/*'
branches:
- master
permissions:
contents: read
jobs:
build:
permissions:
contents: write # for Git to git push
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@master
- name: publish
shell: bash
env:
TOKEN: ${{secrets.CI_TOKEN}}
run: |
git clone https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git wiki
cp -r docs/* wiki
cd wiki
git config --local user.email "info@victoriametrics.com"
git config --local user.name "Vika"
git add .
git commit -m "update wiki pages"
remote_repo="https://vika:${TOKEN}@github.com/VictoriaMetrics/VictoriaMetrics.wiki.git"
git push "${remote_repo}"
cd ..
rm -rf wiki

.gitignore (1 line changed)

@@ -22,4 +22,3 @@ Gemfile.lock
/_site
_site
*.tmp
/docs/.jekyll-metadata

CONTRIBUTING.md

@@ -14,8 +14,3 @@ We are open to third-party pull requests provided they follow [KISS design princ
- Avoid automated decisions, which may hurt cluster availability, consistency or performance.
Adhering to the `KISS` principle simplifies the resulting code and architecture, so it can be reviewed, understood and verified by many people.
Before sending a pull request please check the following:
- [ ] All commits are signed and include the `Signed-off-by` line. Use `git commit -s` to add the `Signed-off-by` line to your commits. See this [doc](https://git-scm.com/book/en/v2/Git-Tools-Signing-Your-Work) about how to sign your commits.
- [ ] Tests are passing locally. Use `make test` to run all tests locally.
- [ ] Linting is passing locally. Use `make check-all` to run all linters locally.

Makefile

@@ -1,6 +1,6 @@
PKG_PREFIX := github.com/VictoriaMetrics/VictoriaMetrics
MAKE_CONCURRENCY ?= $(shell getconf _NPROCESSORS_ONLN)
MAKE_CONCURRENCY ?= $(shell cat /proc/cpuinfo | grep -c processor)
MAKE_PARALLEL := $(MAKE) -j $(MAKE_CONCURRENCY)
DATEINFO_TAG ?= $(shell date -u +'%Y%m%d-%H%M%S')
BUILDINFO_TAG ?= $(shell echo $$(git describe --long --all | tr '/' '-')$$( \
@@ -178,8 +178,7 @@ victoria-metrics-crossbuild: \
victoria-metrics-darwin-amd64 \
victoria-metrics-darwin-arm64 \
victoria-metrics-freebsd-amd64 \
victoria-metrics-openbsd-amd64 \
victoria-metrics-windows-amd64
victoria-metrics-openbsd-amd64
vmutils-crossbuild: \
vmutils-linux-386 \
@@ -466,7 +465,7 @@ benchmark-pure:
vendor-update:
go get -u -d ./lib/...
go get -u -d ./app/...
go mod tidy -compat=1.21
go mod tidy -compat=1.20
go mod vendor
app-local:
@@ -492,7 +491,7 @@ golangci-lint: install-golangci-lint
golangci-lint run
install-golangci-lint:
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.57.1
which golangci-lint || curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(shell go env GOPATH)/bin v1.55.1
govulncheck: install-govulncheck
govulncheck ./...
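The `MAKE_CONCURRENCY` change above swaps a Linux-only `/proc/cpuinfo` parse for `getconf _NPROCESSORS_ONLN`, which also works on macOS and the BSDs. For reference, a minimal Go sketch of the same online-CPU lookup (illustrative only, not part of the build):

```go
// Portable logical-CPU count, the in-process analogue of
// `getconf _NPROCESSORS_ONLN`.
package main

import (
	"fmt"
	"runtime"
)

func main() {
	fmt.Println(runtime.NumCPU()) // logical CPUs usable by this process
}
```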

README.md (710 lines changed; diff suppressed because it is too large)

SECURITY.md

@@ -2,17 +2,13 @@
## Supported Versions
The following versions of VictoriaMetrics receive regular security fixes:
| Version | Supported |
|---------|--------------------|
| [latest release](https://docs.victoriametrics.com/CHANGELOG.html) | :white_check_mark: |
| v1.97.x [LTS line](https://docs.victoriametrics.com/lts-releases/) | :white_check_mark: |
| v1.93.x [LTS line](https://docs.victoriametrics.com/lts-releases/) | :white_check_mark: |
| v1.93.x LTS release | :white_check_mark: |
| v1.87.x LTS release | :white_check_mark: |
| other releases | :x: |
See [this page](https://victoriametrics.com/security/) for more details.
## Reporting a Vulnerability
Please report any security issues to security@victoriametrics.com


@@ -1,7 +1,7 @@
ARG base_image
FROM $base_image
EXPOSE 9428
EXPOSE 8428
ENTRYPOINT ["/victoria-logs-prod"]
ARG src_binary

app/victoria-logs/main.go

@@ -11,6 +11,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlselect"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vlstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
@@ -21,10 +22,11 @@ import (
)
var (
httpListenAddrs = flagutil.NewArrayString("httpListenAddr", "TCP address to listen for incoming http requests. See also -httpListenAddr.useProxyProtocol")
useProxyProtocol = flagutil.NewArrayBool("httpListenAddr.useProxyProtocol", "Whether to use proxy protocol for connections accepted at the given -httpListenAddr . "+
httpListenAddr = flag.String("httpListenAddr", ":9428", "TCP address to listen for http connections. See also -httpListenAddr.useProxyProtocol")
useProxyProtocol = flag.Bool("httpListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted at -httpListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . "+
"With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing")
gogc = flag.Int("gogc", 100, "GOGC to use. See https://tip.golang.org/doc/gc-guide")
)
func main() {
@@ -32,21 +34,18 @@ func main() {
flag.CommandLine.SetOutput(os.Stdout)
flag.Usage = usage
envflag.Parse()
cgroup.SetGOGC(*gogc)
buildinfo.Init()
logger.Init()
listenAddrs := *httpListenAddrs
if len(listenAddrs) == 0 {
listenAddrs = []string{":9428"}
}
logger.Infof("starting VictoriaLogs at %q...", listenAddrs)
logger.Infof("starting VictoriaLogs at %q...", *httpListenAddr)
startTime := time.Now()
vlstorage.Init()
vlselect.Init()
vlinsert.Init()
go httpserver.Serve(listenAddrs, useProxyProtocol, requestHandler)
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
logger.Infof("started VictoriaLogs in %.3f seconds; see https://docs.victoriametrics.com/VictoriaLogs/", time.Since(startTime).Seconds())
pushmetrics.Init()
@@ -54,9 +53,9 @@ func main() {
logger.Infof("received signal %s", sig)
pushmetrics.Stop()
logger.Infof("gracefully shutting down webservice at %q", listenAddrs)
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
startTime = time.Now()
if err := httpserver.Stop(listenAddrs); err != nil {
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
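The hunk above replaces the single `-httpListenAddr` flag with a repeatable array flag, so VictoriaLogs can listen on several addresses. `flagutil.NewArrayString` is VictoriaMetrics-internal; the following is a minimal stdlib-only sketch of the same repeatable-flag pattern, with hypothetical names:

```go
// Minimal sketch of a repeatable -httpListenAddr flag using only the
// standard library; VictoriaMetrics uses its internal flagutil package
// (NewArrayString) for the same effect. Names here are illustrative.
package main

import (
	"flag"
	"fmt"
)

// arrayFlags collects every occurrence of a repeated flag.
type arrayFlags []string

func (a *arrayFlags) String() string { return fmt.Sprint(*a) }

func (a *arrayFlags) Set(v string) error {
	*a = append(*a, v)
	return nil
}

func main() {
	var listenAddrs arrayFlags
	flag.Var(&listenAddrs, "httpListenAddr", "TCP address to listen for incoming http requests; may be repeated")
	flag.Parse()

	// Fall back to the default port when the flag isn't set,
	// mirroring the listenAddrs handling in the diff above.
	if len(listenAddrs) == 0 {
		listenAddrs = []string{":9428"}
	}
	fmt.Printf("would listen on %q\n", listenAddrs)
}
```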


@@ -6,7 +6,7 @@ RUN apk update && apk upgrade && apk --update --no-cache add ca-certificates
FROM $root_image
COPY --from=certs /etc/ssl/certs/ca-certificates.crt /etc/ssl/certs/ca-certificates.crt
EXPOSE 9428
EXPOSE 8428
ENTRYPOINT ["/victoria-logs-prod"]
ARG TARGETARCH
COPY victoria-logs-linux-${TARGETARCH}-prod ./victoria-logs-prod

app/victoria-metrics/main.go

@@ -26,12 +26,12 @@ import (
)
var (
httpListenAddrs = flagutil.NewArrayString("httpListenAddr", "TCP addresses to listen for incoming http requests. See also -tls and -httpListenAddr.useProxyProtocol")
useProxyProtocol = flagutil.NewArrayBool("httpListenAddr.useProxyProtocol", "Whether to use proxy protocol for connections accepted at the corresponding -httpListenAddr . "+
httpListenAddr = flag.String("httpListenAddr", ":8428", "TCP address to listen for http connections. See also -tls and -httpListenAddr.useProxyProtocol")
useProxyProtocol = flag.Bool("httpListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted at -httpListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . "+
"With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing")
minScrapeInterval = flag.Duration("dedup.minScrapeInterval", 0, "Leave only the last sample in every time series per each discrete interval "+
"equal to -dedup.minScrapeInterval > 0. See also -streamAggr.dedupInterval and https://docs.victoriametrics.com/#deduplication")
"equal to -dedup.minScrapeInterval > 0. See https://docs.victoriametrics.com/#deduplication and https://docs.victoriametrics.com/#downsampling")
dryRun = flag.Bool("dryRun", false, "Whether to check config files without running VictoriaMetrics. The following config files are checked: "+
"-promscrape.config, -relabelConfig and -streamAggr.config. Unknown config entries aren't allowed in -promscrape.config by default. "+
"This can be changed with -promscrape.config.strictParse=false command-line flag")
@@ -66,11 +66,7 @@ func main() {
return
}
listenAddrs := *httpListenAddrs
if len(listenAddrs) == 0 {
listenAddrs = []string{":8428"}
}
logger.Infof("starting VictoriaMetrics at %q...", listenAddrs)
logger.Infof("starting VictoriaMetrics at %q...", *httpListenAddr)
startTime := time.Now()
storage.SetDedupInterval(*minScrapeInterval)
storage.SetDataFlushInterval(*inmemoryDataFlushInterval)
@@ -80,7 +76,7 @@ func main() {
startSelfScraper()
go httpserver.Serve(listenAddrs, useProxyProtocol, requestHandler)
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
logger.Infof("started VictoriaMetrics in %.3f seconds", time.Since(startTime).Seconds())
pushmetrics.Init()
@@ -90,9 +86,9 @@ func main() {
stopSelfScraper()
logger.Infof("gracefully shutting down webservice at %q", listenAddrs)
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
startTime = time.Now()
if err := httpserver.Stop(listenAddrs); err != nil {
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
@@ -123,12 +119,12 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
{"expand-with-exprs", "WITH expressions' tutorial"},
{"api/v1/targets", "advanced information about discovered targets in JSON format"},
{"config", "-promscrape.config contents"},
{"stream-agg", "streaming aggregation status"},
{"metrics", "available service metrics"},
{"flags", "command-line flags"},
{"api/v1/status/tsdb", "tsdb status page"},
{"api/v1/status/top_queries", "top queries"},
{"api/v1/status/active_queries", "active queries"},
{"-/reload", "reload configuration"},
})
return true
}

app/victoria-metrics/main_test.go

@@ -7,7 +7,6 @@ import (
"fmt"
"io"
"log"
"math/rand"
"net"
"net/http"
"os"
@@ -39,13 +38,11 @@ const (
)
const (
testReadHTTPPath = "http://127.0.0.1" + testHTTPListenAddr
testWriteHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/write"
testOpenTSDBWriteHTTPPath = "http://127.0.0.1" + testOpenTSDBHTTPListenAddr + "/api/put"
testPromWriteHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/api/v1/write"
testImportCSVWriteHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/api/v1/import/csv"
testHealthHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/health"
testReadHTTPPath = "http://127.0.0.1" + testHTTPListenAddr
testWriteHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/write"
testOpenTSDBWriteHTTPPath = "http://127.0.0.1" + testOpenTSDBHTTPListenAddr + "/api/put"
testPromWriteHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/api/v1/write"
testHealthHTTPPath = "http://127.0.0.1" + testHTTPListenAddr + "/health"
)
const (
@@ -58,15 +55,14 @@ var (
)
type test struct {
Name string `json:"name"`
Data []string `json:"data"`
InsertQuery string `json:"insert_query"`
Query []string `json:"query"`
ResultMetrics []Metric `json:"result_metrics"`
ResultSeries Series `json:"result_series"`
ResultQuery Query `json:"result_query"`
Issue string `json:"issue"`
ExpectedResultLinesCount int `json:"expected_result_lines_count"`
Name string `json:"name"`
Data []string `json:"data"`
InsertQuery string `json:"insert_query"`
Query []string `json:"query"`
ResultMetrics []Metric `json:"result_metrics"`
ResultSeries Series `json:"result_series"`
ResultQuery Query `json:"result_query"`
Issue string `json:"issue"`
}
type Metric struct {
@@ -184,7 +180,7 @@ func setUp() {
vmstorage.Init(promql.ResetRollupResultCacheIfNeeded)
vmselect.Init()
vminsert.Init()
go httpserver.Serve(*httpListenAddrs, useProxyProtocol, requestHandler)
go httpserver.Serve(*httpListenAddr, false, requestHandler)
readyStorageCheckFunc := func() bool {
resp, err := http.Get(testHealthHTTPPath)
if err != nil {
@@ -230,7 +226,7 @@ func waitFor(timeout time.Duration, f func() bool) error {
}
func tearDown() {
if err := httpserver.Stop(*httpListenAddrs); err != nil {
if err := httpserver.Stop(*httpListenAddr); err != nil {
log.Printf("cannot stop the webservice: %s", err)
}
vminsert.Stop()
@@ -241,9 +237,8 @@ func tearDown() {
func TestWriteRead(t *testing.T) {
t.Run("write", testWrite)
time.Sleep(500 * time.Millisecond)
vmstorage.Storage.DebugFlush()
time.Sleep(1500 * time.Millisecond)
time.Sleep(1 * time.Second)
t.Run("read", testRead)
}
@@ -265,14 +260,6 @@ func testWrite(t *testing.T) {
httpWrite(t, testPromWriteHTTPPath, test.InsertQuery, bytes.NewBuffer(data))
}
})
t.Run("csv", func(t *testing.T) {
for _, test := range readIn("csv", t, insertionTime) {
if test.Data == nil {
continue
}
httpWrite(t, testImportCSVWriteHTTPPath, test.InsertQuery, bytes.NewBuffer([]byte(strings.Join(test.Data, "\n"))))
}
})
t.Run("influxdb", func(t *testing.T) {
for _, x := range readIn("influxdb", t, insertionTime) {
@@ -314,7 +301,7 @@ func testWrite(t *testing.T) {
}
func testRead(t *testing.T) {
for _, engine := range []string{"csv", "prometheus", "graphite", "opentsdb", "influxdb", "opentsdbhttp"} {
for _, engine := range []string{"prometheus", "graphite", "opentsdb", "influxdb", "opentsdbhttp"} {
t.Run(engine, func(t *testing.T) {
for _, x := range readIn(engine, t, insertionTime) {
test := x
@@ -325,12 +312,7 @@ func testRead(t *testing.T) {
if test.Issue != "" {
test.Issue = "\nRegression in " + test.Issue
}
switch {
case strings.HasPrefix(q, "/api/v1/export/csv"):
data := strings.Split(string(httpReadData(t, testReadHTTPPath, q)), "\n")
if len(data) != test.ExpectedResultLinesCount {
t.Fatalf("unexpected number of csv lines: want=%d, got=%d; test=%s.%s\nresponse=%q", test.ExpectedResultLinesCount, len(data), q, test.Issue, strings.Join(data, "\n"))
}
switch true {
case strings.HasPrefix(q, "/api/v1/export"):
if err := checkMetricsResult(httpReadMetrics(t, testReadHTTPPath, q), test.ResultMetrics); err != nil {
t.Fatalf("Export. %s fails with error %s.%s", q, err, test.Issue)
@@ -369,7 +351,7 @@ func readIn(readFor string, t *testing.T, insertTime time.Time) []test {
t.Helper()
s := newSuite(t)
var tt []test
s.noError(filepath.Walk(filepath.Join(testFixturesDir, readFor), func(path string, _ os.FileInfo, err error) error {
s.noError(filepath.Walk(filepath.Join(testFixturesDir, readFor), func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
@@ -431,7 +413,6 @@ func httpReadMetrics(t *testing.T, address, query string) []Metric {
}
return rows
}
func httpReadStruct(t *testing.T, address, query string, dst interface{}) {
t.Helper()
s := newSuite(t)
@@ -444,20 +425,6 @@ func httpReadStruct(t *testing.T, address, query string, dst interface{}) {
s.noError(json.NewDecoder(resp.Body).Decode(dst))
}
func httpReadData(t *testing.T, address, query string) []byte {
t.Helper()
s := newSuite(t)
resp, err := http.Get(address + query)
s.noError(err)
defer func() {
_ = resp.Body.Close()
}()
s.equalInt(resp.StatusCode, 200)
data, err := io.ReadAll(resp.Body)
s.noError(err)
return data
}
func checkMetricsResult(got, want []Metric) error {
for _, r := range append([]Metric(nil), got...) {
want = removeIfFoundMetrics(r, want)
@@ -530,73 +497,3 @@ func (s *suite) greaterThan(a, b int) {
s.t.FailNow()
}
}
func TestImportJSONLines(t *testing.T) {
f := func(labelsCount, labelLen int) {
t.Helper()
reqURL := fmt.Sprintf("http://localhost%s/api/v1/import", testHTTPListenAddr)
line := generateJSONLine(labelsCount, labelLen)
req, err := http.NewRequest("POST", reqURL, bytes.NewBufferString(line))
if err != nil {
t.Fatalf("cannot create request: %s", err)
}
resp, err := http.DefaultClient.Do(req)
if err != nil {
t.Fatalf("cannot perform request for labelsCount=%d, labelLen=%d: %s", labelsCount, labelLen, err)
}
if resp.StatusCode != 204 {
t.Fatalf("unexpected statusCode for labelsCount=%d, labelLen=%d; got %d; want 204", labelsCount, labelLen, resp.StatusCode)
}
}
// labels with various lengths
for i := 0; i < 500; i++ {
f(10, i*5)
}
// Too many labels
f(1000, 100)
// Too long labels
f(1, 100_000)
f(10, 100_000)
f(10, 10_000)
}
func generateJSONLine(labelsCount, labelLen int) string {
m := make(map[string]string, labelsCount)
m["__name__"] = generateSizedRandomString(labelLen)
for j := 1; j < labelsCount; j++ {
labelName := generateSizedRandomString(labelLen)
labelValue := generateSizedRandomString(labelLen)
m[labelName] = labelValue
}
type jsonLine struct {
Metric map[string]string `json:"metric"`
Values []float64 `json:"values"`
Timestamps []int64 `json:"timestamps"`
}
line := &jsonLine{
Metric: m,
Values: []float64{1.34},
Timestamps: []int64{time.Now().UnixNano() / 1e6},
}
data, err := json.Marshal(&line)
if err != nil {
panic(fmt.Errorf("cannot marshal JSON: %w", err))
}
data = append(data, '\n')
return string(data)
}
const alphabetSample = `qwertyuiopasdfghjklzxcvbnm`
func generateSizedRandomString(size int) string {
dst := make([]byte, size)
for i := range dst {
dst[i] = alphabetSample[rand.Intn(len(alphabetSample))]
}
return string(dst)
}

app/victoria-metrics/self_scraper.go

@@ -8,7 +8,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/appmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompb"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/prometheus"
@@ -50,8 +49,16 @@ func selfScraper(scrapeInterval time.Duration) {
var mrs []storage.MetricRow
var labels []prompb.Label
t := time.NewTicker(scrapeInterval)
f := func(currentTime time.Time, sendStaleMarkers bool) {
currentTimestamp := currentTime.UnixNano() / 1e6
var currentTimestamp int64
for {
select {
case <-selfScraperStopCh:
t.Stop()
logger.Infof("stopped self-scraping `/metrics` page")
return
case currentTime := <-t.C:
currentTimestamp = currentTime.UnixNano() / 1e6
}
bb.Reset()
appmetrics.WritePrometheusMetrics(&bb)
s := bytesutil.ToUnsafeString(bb.B)
@@ -76,27 +83,12 @@ func selfScraper(scrapeInterval time.Duration) {
mr := &mrs[len(mrs)-1]
mr.MetricNameRaw = storage.MarshalMetricNameRaw(mr.MetricNameRaw[:0], labels)
mr.Timestamp = currentTimestamp
if sendStaleMarkers {
mr.Value = decimal.StaleNaN
} else {
mr.Value = r.Value
}
mr.Value = r.Value
}
if err := vmstorage.AddRows(mrs); err != nil {
logger.Errorf("cannot store self-scraped metrics: %s", err)
}
}
for {
select {
case <-selfScraperStopCh:
f(time.Now(), true)
t.Stop()
logger.Infof("stopped self-scraping `/metrics` page")
return
case currentTime := <-t.C:
f(currentTime, false)
}
}
}
func addLabel(dst []prompb.Label, key, value string) []prompb.Label {
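The refactoring above extracts the scrape body into a closure `f` so the shutdown path can run one final scrape with `sendStaleMarkers=true`, writing `decimal.StaleNaN` for every self-scraped series. A stdlib-only sketch of that ticker-loop-with-final-flush shape, where the print stands in for the real storage write:

```go
// Sketch of the ticker loop restructured in the diff: the scrape body is
// extracted into a closure so the stop path can run it once more with
// sendStaleMarkers=true before exiting. Stdlib-only; the real code writes
// decimal.StaleNaN values through vmstorage instead of printing.
package main

import (
	"fmt"
	"time"
)

func runScrapeLoop(stopCh <-chan struct{}, interval time.Duration) {
	t := time.NewTicker(interval)
	f := func(currentTime time.Time, sendStaleMarkers bool) {
		// A real implementation would scrape /metrics here and store
		// either the sampled values or staleness markers.
		fmt.Printf("scrape at %s, staleMarkers=%v\n", currentTime.Format(time.RFC3339), sendStaleMarkers)
	}
	for {
		select {
		case <-stopCh:
			// Final scrape marks all self-scraped series as stale.
			f(time.Now(), true)
			t.Stop()
			return
		case currentTime := <-t.C:
			f(currentTime, false)
		}
	}
}

func main() {
	stopCh := make(chan struct{})
	go func() {
		time.Sleep(2500 * time.Millisecond)
		close(stopCh)
	}()
	runScrapeLoop(stopCh, time.Second)
}
```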


@@ -1,14 +0,0 @@
{
"name": "csv export",
"data": [
"rfc3339,4,{TIME_MS}",
"rfc3339milli,6,{TIME_MS}",
"ts,8,{TIME_MS}",
"tsms,10,{TIME_MS},"
],
"insert_query": "?format=1:label:tfmt,2:metric:test_csv,3:time:unix_ms",
"query": [
"/api/v1/export/csv?format=__name__,tfmt,__value__,__timestamp__:rfc3339&match[]={__name__=\"test_csv\"}&step=30s&start={TIME_MS-180s}"
],
"expected_result_lines_count": 4
}


@@ -1,14 +0,0 @@
{
"name": "csv export with extra_labels",
"data": [
"location-1,4,{TIME_MS}",
"location-2,6,{TIME_MS}",
"location-3,8,{TIME_MS}",
"location-4,10,{TIME_MS},"
],
"insert_query": "?format=1:label:location,2:metric:test_csv_labels,3:time:unix_ms&extra_label=location=location-1",
"query": [
"/api/v1/export/csv?format=__name__,location,__value__,__timestamp__:unix_ms&match[]={__name__=\"test_csv\"}&step=30s&start={TIME_MS-180s}"
],
"expected_result_lines_count": 4
}


@@ -33,7 +33,7 @@ func benchmarkReadBulkRequest(b *testing.B, isGzip bool) {
timeField := "@timestamp"
msgField := "message"
processLogMessage := func(_ int64, _ []logstorage.Field) {}
processLogMessage := func(timestmap int64, fields []logstorage.Field) {}
b.ReportAllocs()
b.SetBytes(int64(len(data)))


@@ -11,7 +11,7 @@ import (
func TestParseJSONRequestFailure(t *testing.T) {
f := func(s string) {
t.Helper()
n, err := parseJSONRequest([]byte(s), func(_ int64, _ []logstorage.Field) {
n, err := parseJSONRequest([]byte(s), func(timestamp int64, fields []logstorage.Field) {
t.Fatalf("unexpected call to parseJSONRequest callback!")
})
if err == nil {


@@ -27,7 +27,7 @@ func benchmarkParseJSONRequest(b *testing.B, streams, rows, labels int) {
b.RunParallel(func(pb *testing.PB) {
data := getJSONBody(streams, rows, labels)
for pb.Next() {
_, err := parseJSONRequest(data, func(_ int64, _ []logstorage.Field) {})
_, err := parseJSONRequest(data, func(timestamp int64, fields []logstorage.Field) {})
if err != nil {
panic(fmt.Errorf("unexpected error: %w", err))
}


@@ -29,7 +29,7 @@ func benchmarkParseProtobufRequest(b *testing.B, streams, rows, labels int) {
b.RunParallel(func(pb *testing.PB) {
body := getProtobufBody(streams, rows, labels)
for pb.Next() {
_, err := parseProtobufRequest(body, func(_ int64, _ []logstorage.Field) {})
_, err := parseProtobufRequest(body, func(timestamp int64, fields []logstorage.Field) {})
if err != nil {
panic(fmt.Errorf("unexpected error: %w", err))
}


@@ -7,7 +7,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logstorage"
)
@@ -18,18 +17,13 @@ var (
)
// ProcessQueryRequest handles /select/logsql/query request
func ProcessQueryRequest(w http.ResponseWriter, r *http.Request, stopCh <-chan struct{}, cancel func()) {
func ProcessQueryRequest(w http.ResponseWriter, r *http.Request, stopCh <-chan struct{}) {
// Extract tenantID
tenantID, err := logstorage.GetTenantIDFromRequest(r)
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return
}
limit, err := httputils.GetInt(r, "limit")
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return
}
qStr := r.FormValue("query")
q, err := logstorage.ParseQuery(qStr)
@@ -40,7 +34,7 @@ func ProcessQueryRequest(w http.ResponseWriter, r *http.Request, stopCh <-chan s
w.Header().Set("Content-Type", "application/stream+json; charset=utf-8")
sw := getSortWriter()
sw.Init(w, maxSortBufferSize.IntN(), limit)
sw.Init(w, maxSortBufferSize.IntN())
tenantIDs := []logstorage.TenantID{tenantID}
vlstorage.RunQuery(tenantIDs, q, stopCh, func(columns []logstorage.BlockColumn) {
if len(columns) == 0 {
@@ -48,36 +42,11 @@ func ProcessQueryRequest(w http.ResponseWriter, r *http.Request, stopCh <-chan s
}
rowsCount := len(columns[0].Values)
// skip entries with empty _stream column
// _stream is empty in case indexdb entry was not flushed to the storage yet
// skipping such entries makes the result more consistent
streamCol := 0
// fast path
// _stream column is a built-in column and it is always supposed to be at the same position
if len(columns) >= 2 && columns[1].Name == "_stream" {
streamCol = 1
} else {
for i := 1; i < len(columns); i++ {
if columns[i].Name == "_stream" {
streamCol = i
break
}
}
}
bb := blockResultPool.Get()
for rowIdx := 0; rowIdx < rowsCount; rowIdx++ {
if columns[streamCol].Values[rowIdx] == "" {
continue
}
WriteJSONRow(bb, columns, rowIdx)
}
if !sw.TryWrite(bb.B) {
cancel()
}
sw.MustWrite(bb.B)
blockResultPool.Put(bb)
})
sw.FinalFlush()
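On one side of this hunk, `ProcessQueryRequest` gains a `limit` query arg and cancels the query context as soon as the sort writer refuses more data, so storage stops producing blocks early. A minimal sketch of that cancel-on-limit pattern, with hypothetical names:

```go
// Sketch of the cancel-on-limit pattern this hunk introduces: when the
// writer refuses more data (limit reached), the query's context is
// cancelled so the producer stops early. Names are illustrative.
package main

import (
	"context"
	"fmt"
)

// makeLimitedWriter returns a tryWrite func that accepts at most limit rows.
func makeLimitedWriter(limit int) func(string) bool {
	written := 0
	return func(row string) bool {
		if written >= limit {
			return false
		}
		written++
		fmt.Println(row)
		return true
	}
}

func runQuery(ctx context.Context, cancel context.CancelFunc, rows []string, tryWrite func(string) bool) {
	for _, row := range rows {
		select {
		case <-ctx.Done():
			return // producer observes cancellation and stops early
		default:
		}
		if !tryWrite(row) {
			cancel() // mirrors `if !sw.TryWrite(bb.B) { cancel() }`
		}
	}
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	rows := []string{"r1", "r2", "r3", "r4", "r5"}
	runQuery(ctx, cancel, rows, makeLimitedWriter(2)) // prints r1, r2, then stops
}
```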


@@ -36,12 +36,8 @@ var sortWriterPool sync.Pool
// If the buf isn't empty at FinalFlush() call, then the buffered data
// is sorted by _time field.
type sortWriter struct {
mu sync.Mutex
w io.Writer
maxLines int
linesWritten int
mu sync.Mutex
w io.Writer
maxBufLen int
buf []byte
bufFlushed bool
@@ -51,119 +47,58 @@ type sortWriter struct {
func (sw *sortWriter) reset() {
sw.w = nil
sw.maxLines = 0
sw.linesWritten = 0
sw.maxBufLen = 0
sw.buf = sw.buf[:0]
sw.bufFlushed = false
sw.hasErr = false
}
// Init initializes sw.
//
// If maxLines is set to a positive value, then sw accepts up to maxLines lines
// and then rejects all the other lines by returning false from TryWrite.
func (sw *sortWriter) Init(w io.Writer, maxBufLen, maxLines int) {
func (sw *sortWriter) Init(w io.Writer, maxBufLen int) {
sw.reset()
sw.w = w
sw.maxBufLen = maxBufLen
sw.maxLines = maxLines
}
// TryWrite writes p to sw.
//
// True is returned on successful write, false otherwise.
//
// Unsuccessful write may occur on underlying write error or when maxLines lines are already written to sw.
func (sw *sortWriter) TryWrite(p []byte) bool {
func (sw *sortWriter) MustWrite(p []byte) {
sw.mu.Lock()
defer sw.mu.Unlock()
if sw.hasErr {
return false
return
}
if sw.bufFlushed {
if !sw.writeToUnderlyingWriterLocked(p) {
if _, err := sw.w.Write(p); err != nil {
sw.hasErr = true
return false
}
return true
return
}
if len(sw.buf)+len(p) < sw.maxBufLen {
sw.buf = append(sw.buf, p...)
return true
return
}
sw.bufFlushed = true
if !sw.writeToUnderlyingWriterLocked(sw.buf) {
sw.hasErr = true
return false
}
sw.buf = sw.buf[:0]
if !sw.writeToUnderlyingWriterLocked(p) {
sw.hasErr = true
return false
}
return true
}
func (sw *sortWriter) writeToUnderlyingWriterLocked(p []byte) bool {
if len(p) == 0 {
return true
}
if sw.maxLines > 0 {
if sw.linesWritten >= sw.maxLines {
return false
if len(sw.buf) > 0 {
if _, err := sw.w.Write(sw.buf); err != nil {
sw.hasErr = true
return
}
var linesLeft int
p, linesLeft = trimLines(p, sw.maxLines-sw.linesWritten)
sw.linesWritten += linesLeft
sw.buf = sw.buf[:0]
}
if _, err := sw.w.Write(p); err != nil {
return false
sw.hasErr = true
}
return true
}
func trimLines(p []byte, maxLines int) ([]byte, int) {
if maxLines <= 0 {
return nil, 0
}
n := bytes.Count(p, newline)
if n < maxLines {
return p, n
}
for n >= maxLines {
idx := bytes.LastIndexByte(p, '\n')
p = p[:idx]
n--
}
return p[:len(p)+1], maxLines
}
var newline = []byte("\n")
func (sw *sortWriter) FinalFlush() {
if sw.hasErr || sw.bufFlushed {
return
}
rs := getRowsSorter()
rs.parseRows(sw.buf)
rs.sort()
rows := rs.rows
if sw.maxLines > 0 && len(rows) > sw.maxLines {
rows = rows[:sw.maxLines]
}
WriteJSONRows(sw.w, rows)
WriteJSONRows(sw.w, rs.rows)
putRowsSorter(rs)
}
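The `trimLines` helper shown in this hunk keeps at most `maxLines` newline-terminated lines from the front of `p`. Below is a self-contained copy for experimentation; the final reslice is safe because the trailing `'\n'` is still present in the backing array:

```go
// Quick check of the trimLines helper shown above. Self-contained copy
// for experimentation; the canonical version lives in the diff.
package main

import (
	"bytes"
	"fmt"
)

var newline = []byte("\n")

func trimLines(p []byte, maxLines int) ([]byte, int) {
	if maxLines <= 0 {
		return nil, 0
	}
	n := bytes.Count(p, newline)
	if n < maxLines {
		return p, n
	}
	for n >= maxLines {
		idx := bytes.LastIndexByte(p, '\n')
		p = p[:idx]
		n--
	}
	// Re-include the trailing '\n' that the loop cut off; the byte is
	// still present in the backing array, so reslicing is safe.
	return p[:len(p)+1], maxLines
}

func main() {
	p := []byte("line1\nline2\nline3\n")
	trimmed, n := trimLines(p, 2)
	fmt.Printf("%q %d\n", trimmed, n) // "line1\nline2\n" 2
}
```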


@@ -7,16 +7,15 @@ import (
)
func TestSortWriter(t *testing.T) {
f := func(maxBufLen, maxLines int, data string, expectedResult string) {
f := func(maxBufLen int, data string, expectedResult string) {
t.Helper()
var bb bytes.Buffer
sw := getSortWriter()
sw.Init(&bb, maxBufLen, maxLines)
sw.Init(&bb, maxBufLen)
for _, s := range strings.Split(data, "\n") {
if !sw.TryWrite([]byte(s + "\n")) {
break
}
sw.MustWrite([]byte(s + "\n"))
}
sw.FinalFlush()
putSortWriter(sw)
@@ -27,20 +26,14 @@ func TestSortWriter(t *testing.T) {
}
}
f(100, 0, "", "")
f(100, 0, "{}", "{}\n")
f(100, "", "")
f(100, "{}", "{}\n")
data := `{"_time":"def","_msg":"xxx"}
{"_time":"abc","_msg":"foo"}`
resultExpected := `{"_time":"abc","_msg":"foo"}
{"_time":"def","_msg":"xxx"}
`
f(100, 0, data, resultExpected)
f(10, 0, data, data+"\n")
// Test with the maxLines
f(100, 1, data, `{"_time":"abc","_msg":"foo"}`+"\n")
f(10, 1, data, `{"_time":"def","_msg":"xxx"}`+"\n")
f(10, 2, data, data+"\n")
f(100, 2, data, resultExpected)
f(100, data, resultExpected)
f(10, data, data+"\n")
}


@@ -1,7 +1,6 @@
package vlselect
import (
"context"
"embed"
"flag"
"fmt"
@@ -102,8 +101,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
// Limit the number of concurrent queries, which can consume big amounts of CPU.
startTime := time.Now()
ctx := r.Context()
stopCh := ctx.Done()
stopCh := r.Context().Done()
select {
case concurrencyLimitCh <- struct{}{}:
defer func() { <-concurrencyLimitCh }()
@@ -141,15 +139,11 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
}
ctxWithCancel, cancel := context.WithCancel(ctx)
defer cancel()
stopCh = ctxWithCancel.Done()
switch {
case path == "/logsql/query":
logsqlQueryRequests.Inc()
httpserver.EnableCORS(w, r)
logsql.ProcessQueryRequest(w, r, stopCh, cancel)
logsql.ProcessQueryRequest(w, r, stopCh)
return true
default:
return false
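`RequestHandler` above throttles concurrent queries with a buffered channel used as a counting semaphore (`concurrencyLimitCh`). A standalone sketch of that pattern; capacity and names are illustrative:

```go
// The concurrency limit in RequestHandler uses a buffered channel as a
// counting semaphore; minimal standalone sketch of that pattern.
package main

import (
	"fmt"
	"sync"
)

var concurrencyLimitCh = make(chan struct{}, 2) // at most 2 concurrent queries

func handle(id int, wg *sync.WaitGroup) {
	defer wg.Done()
	concurrencyLimitCh <- struct{}{}        // acquire a slot (blocks when full)
	defer func() { <-concurrencyLimitCh }() // release on return
	fmt.Printf("query %d running\n", id)
}

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go handle(i, &wg)
	}
	wg.Wait()
}
```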

asset-manifest.json (vmui build output)

@@ -1,13 +1,13 @@
{
"files": {
"main.css": "./static/css/main.bc07cc78.css",
"main.js": "./static/js/main.034044a7.js",
"static/js/685.bebe1265.chunk.js": "./static/js/685.bebe1265.chunk.js",
"static/media/MetricsQL.md": "./static/media/MetricsQL.10add6e7bdf0f1d98cf7.md",
"main.css": "./static/css/main.d1313636.css",
"main.js": "./static/js/main.1919fefe.js",
"static/js/522.da77e7b3.chunk.js": "./static/js/522.da77e7b3.chunk.js",
"static/media/MetricsQL.md": "./static/media/MetricsQL.8644fd7c964802dd34a9.md",
"index.html": "./index.html"
},
"entrypoints": [
"static/css/main.bc07cc78.css",
"static/js/main.034044a7.js"
"static/css/main.d1313636.css",
"static/js/main.1919fefe.js"
]
}

index.html (vmui build output)

@@ -1 +1 @@
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=5"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.034044a7.js"></script><link href="./static/css/main.bc07cc78.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>
<!doctype html><html lang="en"><head><meta charset="utf-8"/><link rel="icon" href="./favicon.ico"/><meta name="viewport" content="width=device-width,initial-scale=1,maximum-scale=5"/><meta name="theme-color" content="#000000"/><meta name="description" content="UI for VictoriaMetrics"/><link rel="apple-touch-icon" href="./apple-touch-icon.png"/><link rel="icon" type="image/png" sizes="32x32" href="./favicon-32x32.png"><link rel="manifest" href="./manifest.json"/><title>VM UI</title><script src="./dashboards/index.js" type="module"></script><meta name="twitter:card" content="summary_large_image"><meta name="twitter:image" content="./preview.jpg"><meta name="twitter:title" content="UI for VictoriaMetrics"><meta name="twitter:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta name="twitter:site" content="@VictoriaMetrics"><meta property="og:title" content="Metric explorer for VictoriaMetrics"><meta property="og:description" content="Explore and troubleshoot your VictoriaMetrics data"><meta property="og:image" content="./preview.jpg"><meta property="og:type" content="website"><script defer="defer" src="./static/js/main.1919fefe.js"></script><link href="./static/css/main.d1313636.css" rel="stylesheet"></head><body><noscript>You need to enable JavaScript to run this app.</noscript><div id="root"></div></body></html>

Six more file diffs suppressed because one or more lines are too long.


@@ -4,8 +4,10 @@
http://jedwatson.github.io/classnames
*/
/*! regenerator-runtime -- Copyright (c) 2014-present, Facebook, Inc. -- license (MIT): https://github.com/facebook/regenerator/blob/main/LICENSE */
/**
* @remix-run/router v1.15.1
* @remix-run/router v1.10.0
*
* Copyright (c) Remix Software Inc.
*
@@ -16,7 +18,7 @@
*/
/**
* React Router DOM v6.22.1
* React Router DOM v6.17.0
*
* Copyright (c) Remix Software Inc.
*
@@ -27,7 +29,7 @@
*/
/**
* React Router v6.22.1
* React Router v6.17.0
*
* Copyright (c) Remix Software Inc.
*

docs/MetricsQL.md

@@ -26,18 +26,12 @@ and introduction into [basic querying via MetricsQL](https://docs.victoriametric
The following functionality is implemented differently in MetricsQL compared to PromQL. This improves user experience:
* MetricsQL takes into account the last [raw sample](https://docs.victoriametrics.com/keyconcepts/#raw-samples) before the lookbehind window
in square brackets for [increase](#increase) and [rate](#rate) functions. This allows returning the exact results users expect for `increase(metric[$__interval])` queries
instead of incomplete results Prometheus returns for such queries. Prometheus misses the increase between the last sample before the lookbehind window
and the first sample inside the lookbehind window.
* MetricsQL doesn't extrapolate [rate](#rate) and [increase](#increase) function results, so it always returns the expected results. For example, it returns
integer results from `increase()` over slow-changing integer counter. Prometheus in this case returns unexpected fractional results,
which may significantly differ from the expected results. This addresses [this issue from Prometheus](https://github.com/prometheus/prometheus/issues/3746).
* MetricsQL takes into account the previous point before the window in square brackets for range functions such as [rate](#rate) and [increase](#increase).
This allows returning the exact results users expect for `increase(metric[$__interval])` queries instead of incomplete results Prometheus returns for such queries.
* MetricsQL doesn't extrapolate range function results. This addresses [this issue from Prometheus](https://github.com/prometheus/prometheus/issues/3746).
See technical details about VictoriaMetrics and Prometheus calculations for [rate](#rate)
and [increase](#increase) [in this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/1215#issuecomment-850305711).
* MetricsQL returns the expected non-empty responses for [rate](#rate) function when Grafana or [vmui](https://docs.victoriametrics.com/#vmui)
passes `step` values smaller than the interval between [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples)
to [/api/v1/query_range](https://docs.victoriametrics.com/keyconcepts/#range-query).
* MetricsQL returns the expected non-empty responses for [rate](#rate) with `step` values smaller than scrape interval.
This addresses [this issue from Grafana](https://github.com/grafana/grafana/issues/11451).
See also [this blog post](https://www.percona.com/blog/2020/02/28/better-prometheus-rate-function-with-victoriametrics/).
* MetricsQL treats `scalar` type the same as `instant vector` without labels, since subtle differences between these types usually confuse users.
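A toy numeric model of the `rate`/`increase` difference described above: MetricsQL anchors the calculation at the last raw sample before the lookbehind window, while Prometheus only sees samples inside it (extrapolation ignored here for simplicity). Hypothetical data, not the real implementation:

```go
// Worked example of the increase() difference described in this hunk:
// MetricsQL anchors the calculation at the last raw sample *before* the
// lookbehind window, while Prometheus only sees samples inside it.
// Toy model with hypothetical sample data.
package main

import "fmt"

type sample struct {
	ts    int64 // unix seconds
	value float64
}

// increaseMetricsQL uses the last sample before windowStart as the baseline.
func increaseMetricsQL(samples []sample, windowStart, windowEnd int64) float64 {
	var baseline, last float64
	haveBaseline := false
	for _, s := range samples {
		if s.ts < windowStart {
			baseline, haveBaseline = s.value, true
		} else if s.ts <= windowEnd {
			if !haveBaseline {
				baseline, haveBaseline = s.value, true
			}
			last = s.value
		}
	}
	return last - baseline
}

// increasePrometheus only uses samples inside [windowStart, windowEnd].
func increasePrometheus(samples []sample, windowStart, windowEnd int64) float64 {
	var first, last float64
	haveFirst := false
	for _, s := range samples {
		if s.ts >= windowStart && s.ts <= windowEnd {
			if !haveFirst {
				first, haveFirst = s.value, true
			}
			last = s.value
		}
	}
	return last - first
}

func main() {
	// Counter scraped every 15s; the query window covers the last 30s only.
	samples := []sample{{0, 100}, {15, 110}, {30, 120}, {45, 130}, {60, 140}}
	fmt.Println(increaseMetricsQL(samples, 31, 60))  // 20: counts 120 -> 140
	fmt.Println(increasePrometheus(samples, 31, 60)) // 10: misses 120 -> 130
}
```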
@@ -67,17 +61,16 @@ The list of MetricsQL features on top of PromQL:
* Graphite-compatible filters can be passed via `{__graphite__="foo.*.bar"}` syntax.
See [these docs](https://docs.victoriametrics.com/#selecting-graphite-metrics).
VictoriaMetrics can be used as Graphite datasource in Grafana. See [these docs](https://docs.victoriametrics.com/#graphite-api-usage) for details.
VictoriaMetrics also can be used as Graphite datasource in Grafana.
See [these docs](https://docs.victoriametrics.com/#graphite-api-usage) for details.
See also [label_graphite_group](#label_graphite_group) function, which can be used for extracting the given groups from Graphite metric name.
* Lookbehind window in square brackets for [rollup functions](#rollup-functions) may be omitted. VictoriaMetrics automatically selects the lookbehind window
depending on the `step` query arg passed to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query)
and the real interval between [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) (aka `scrape_interval`).
* Lookbehind window in square brackets may be omitted. VictoriaMetrics automatically selects the lookbehind window
depending on the current step used for building the graph (e.g. `step` query arg passed to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query)).
For instance, the following query is valid in VictoriaMetrics: `rate(node_network_receive_bytes_total)`.
It is roughly equivalent to `rate(node_network_receive_bytes_total[$__interval])` when used in Grafana.
The difference is documented in [rate() docs](#rate).
It is equivalent to `rate(node_network_receive_bytes_total[$__interval])` when used in Grafana.
* Numeric values can contain `_` delimiters for better readability. For example, `1_234_567_890` can be used in queries instead of `1234567890`.
* [Series selectors](https://docs.victoriametrics.com/keyConcepts.html#filtering) accept multiple `or` filters. For example, `{env="prod",job="a" or env="dev",job="b"}`
selects series with `{env="prod",job="a"}` or `{env="dev",job="b"}` labels.
selects series with either `{env="prod",job="a"}` or `{env="dev",job="b"}` labels.
See [these docs](https://docs.victoriametrics.com/keyConcepts.html#filtering-by-multiple-or-filters) for details.
* Support for `group_left(*)` and `group_right(*)` for copying all the labels from time series on the `one` side
of [many-to-one operations](https://prometheus.io/docs/prometheus/latest/querying/operators/#many-to-one-and-one-to-many-vector-matches).
@@ -105,7 +98,7 @@ The list of MetricsQL features on top of PromQL:
* Trailing commas on all the lists are allowed - label filters, function args and with expressions.
For instance, the following queries are valid: `m{foo="bar",}`, `f(a, b,)`, `WITH (x=y,) x`.
This simplifies maintenance of multi-line queries.
* Metric names and label names may contain any unicode letter. For example `температура{город="Київ"}` is a valid MetricsQL expression.
* Metric names and label names may contain any unicode letter. For example `температура{город="Киев"}` is a valid MetricsQL expression.
* Metric names and label names may contain escaped chars. For example, `foo\-bar{baz\=aa="b"}` is a valid expression.
It returns time series with name `foo-bar` containing label `baz=aa` with value `b`.
Additionally, the following escape sequences are supported:
@@ -124,8 +117,7 @@ The list of MetricsQL features on top of PromQL:
Go to [WITH templates playground](https://play.victoriametrics.com/select/accounting/1/6a716b0f-38bc-4856-90ce-448fd713e3fe/expand-with-exprs) and try it.
* String literals may be concatenated. This is useful with `WITH` templates:
`WITH (commonPrefix="long_metric_prefix_") {__name__=commonPrefix+"suffix1"} / {__name__=commonPrefix+"suffix2"}`.
* `keep_metric_names` modifier can be applied to all the [rollup functions](#rollup-functions), [transform functions](#transform-functions)
and [binary operators](https://prometheus.io/docs/prometheus/latest/querying/operators/#binary-operators).
* `keep_metric_names` modifier can be applied to all the [rollup functions](#rollup-functions), [transform functions](#transform-functions) and [binary operators](https://prometheus.io/docs/prometheus/latest/querying/operators/#binary-operators).
This modifier prevents from dropping metric names in function results. See [these docs](#keep_metric_names).
## keep_metric_names
@@ -163,15 +155,14 @@ Additional details:
The interval between points is set as `step` query arg passed by Grafana to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query).
* If the given [series selector](https://docs.victoriametrics.com/keyConcepts.html#filtering) returns multiple time series,
then rollups are calculated individually per each returned series.
* If lookbehind window in square brackets is missing, then it is automatically set to the following value:
- To `step` value passed to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query) or [/api/v1/query](https://docs.victoriametrics.com/keyconcepts/#instant-query)
for all the [rollup functions](#rollup-functions) except of [default_rollup](#default_rollup) and [rate](#rate). This value is known as `$__interval` in Grafana or `1i` in MetricsQL.
For example, `avg_over_time(temperature)` is automatically transformed to `avg_over_time(temperature[1i])`.
- To the `max(step, scrape_interval)`, where `scrape_interval` is the interval between [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples)
for [default_rollup](#default_rollup) and [rate](#rate) functions. This allows avoiding unexpected gaps on the graph when `step` is smaller than `scrape_interval`.
* If lookbehind window in square brackets is missing, then MetricsQL automatically sets the lookbehind window
to the interval between points on the graph (aka `step` query arg at [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query),
`$__interval` value from Grafana or `1i` duration in MetricsQL).
For example, `rate(http_requests_total)` is equivalent to `rate(http_requests_total[$__interval])` in Grafana.
It is also equivalent to `rate(http_requests_total[1i])`.
* Every [series selector](https://docs.victoriametrics.com/keyConcepts.html#filtering) in MetricsQL must be wrapped into a rollup function.
Otherwise, it is automatically wrapped into [default_rollup](#default_rollup). For example, `foo{bar="baz"}`
is automatically converted to `default_rollup(foo{bar="baz"})` before performing the calculations.
is automatically converted to `default_rollup(foo{bar="baz"}[1i])` before performing the calculations.
* If something other than [series selector](https://docs.victoriametrics.com/keyConcepts.html#filtering) is passed to rollup function,
then the inner arg is automatically converted to a [subquery](#subqueries).
* All the rollup functions accept optional `keep_metric_names` modifier. If it is set, then the function keeps metric names in results.
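The automatic lookbehind selection described in this hunk reduces, on one side of the diff, to `max(step, scrape_interval)` for `default_rollup` and `rate`. A one-function sketch of that rule (illustrative only):

```go
// The default lookbehind selection described above, as a sketch: for
// default_rollup and rate the window is max(step, scrape_interval), which
// avoids gaps on the graph when step is smaller than scrape_interval.
package main

import (
	"fmt"
	"time"
)

func defaultLookbehind(step, scrapeInterval time.Duration) time.Duration {
	if step > scrapeInterval {
		return step
	}
	return scrapeInterval
}

func main() {
	fmt.Println(defaultLookbehind(10*time.Second, 30*time.Second)) // 30s: no gaps
	fmt.Println(defaultLookbehind(5*time.Minute, 30*time.Second))  // 5m0s
}
```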
@@ -186,9 +177,7 @@ The list of supported rollup functions:
`absent_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns 1
if the given lookbehind window `d` doesn't contain raw samples. Otherwise, it returns an empty result.
This function is supported by PromQL.
See also [present_over_time](#present_over_time).
This function is supported by PromQL. See also [present_over_time](#present_over_time).
#### aggr_over_time
@@ -218,9 +207,7 @@ See also [descent_over_time](#descent_over_time).
over raw samples on the given lookbehind window `d` per each time series returned
from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
This function is supported by PromQL.
See also [median_over_time](#median_over_time).
This function is supported by PromQL. See also [median_over_time](#median_over_time).
#### changes
@@ -233,9 +220,7 @@ See [this article](https://medium.com/@romanhavronenko/victoriametrics-promql-co
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [changes_prometheus](#changes_prometheus).
This function is supported by PromQL. See also [changes_prometheus](#changes_prometheus).
#### changes_prometheus
@@ -248,9 +233,7 @@ See [this article](https://medium.com/@romanhavronenko/victoriametrics-promql-co
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [changes](#changes).
This function is supported by PromQL. See also [changes](#changes).
#### count_eq_over_time
@@ -260,7 +243,7 @@ from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.ht
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [count_over_time](#count_over_time), [share_eq_over_time](#share_eq_over_time) and [count_values_over_time](#count_values_over_time).
See also [count_over_time](#count_over_time).
#### count_gt_over_time
@@ -270,7 +253,7 @@ from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.ht
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [count_over_time](#count_over_time) and [share_gt_over_time](#share_gt_over_time).
See also [count_over_time](#count_over_time).
#### count_le_over_time
@@ -280,7 +263,7 @@ from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.ht
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [count_over_time](#count_over_time) and [share_le_over_time](#share_le_over_time).
See also [count_over_time](#count_over_time).
#### count_ne_over_time
@@ -299,19 +282,8 @@ on the given lookbehind window `d` per each time series returned from the given
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [count_le_over_time](#count_le_over_time), [count_gt_over_time](#count_gt_over_time), [count_eq_over_time](#count_eq_over_time) and [count_ne_over_time](#count_ne_over_time).
#### count_values_over_time
`count_values_over_time("label", series_selector[d])` is a [rollup function](#rollup-functions), which counts the number of raw samples
with the same value over the given lookbehind window and stores the counts in a time series with an additional `label`, which contains each initial value.
The results are calculated independently per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [count_eq_over_time](#count_eq_over_time), [count_values](#count_values), [distinct_over_time](#distinct_over_time) and [label_match](#label_match).
This function is supported by PromQL. See also [count_le_over_time](#count_le_over_time), [count_gt_over_time](#count_gt_over_time),
[count_eq_over_time](#count_eq_over_time) and [count_ne_over_time](#count_ne_over_time).
#### decreases_over_time
@@ -327,11 +299,6 @@ See also [increases_over_time](#increases_over_time).
`default_rollup(series_selector[d])` is a [rollup function](#rollup-functions), which returns the last raw sample value on the given lookbehind window `d`
per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
If the lookbehind window is skipped in square brackets, then it is automatically calculated as `max(step, scrape_interval)`, where `step` is the query arg value
passed to [/api/v1/query_range](https://docs.victoriametrics.com/keyconcepts/#range-query) or [/api/v1/query](https://docs.victoriametrics.com/keyconcepts/#instant-query),
while `scrape_interval` is the interval between [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) for the selected time series.
This allows avoiding unexpected gaps on the graph when `step` is smaller than the `scrape_interval`.
#### delta
`delta(series_selector[d])` is a [rollup function](#rollup-functions), which calculates the difference between
@@ -343,9 +310,7 @@ See [this article](https://medium.com/@romanhavronenko/victoriametrics-promql-co
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [increase](#increase) and [delta_prometheus](#delta_prometheus).
This function is supported by PromQL. See also [increase](#increase) and [delta_prometheus](#delta_prometheus).
#### delta_prometheus
@@ -368,9 +333,7 @@ The derivative is calculated using linear regression.
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [deriv_fast](#deriv_fast) and [ideriv](#ideriv).
This function is supported by PromQL. See also [deriv_fast](#deriv_fast) and [ideriv](#ideriv).
#### deriv_fast
@@ -401,8 +364,6 @@ on the given lookbehind window `d` per each time series returned from the given
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [count_values_over_time](#count_values_over_time).
#### duration_over_time
`duration_over_time(series_selector[d], max_interval)` is a [rollup function](#rollup-functions), which returns the duration in seconds
@@ -462,9 +423,7 @@ over the given lookbehind window `d` using the given smoothing factor `sf` and t
Both `sf` and `tf` must be in the range `[0...1]`. It is expected that the [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering)
returns time series of [gauge type](https://docs.victoriametrics.com/keyConcepts.html#gauge).
This function is supported by PromQL.
See also [range_linear_regression](#range_linear_regression).
This function is supported by PromQL. See also [range_linear_regression](#range_linear_regression).
#### idelta
@@ -473,9 +432,7 @@ on the given lookbehind window `d` per each time series returned from the given
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [delta](#delta).
This function is supported by PromQL. See also [delta](#delta).
#### ideriv
@@ -498,9 +455,7 @@ See [this article](https://medium.com/@romanhavronenko/victoriametrics-promql-co
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [increase_pure](#increase_pure), [increase_prometheus](#increase_prometheus) and [delta](#delta).
This function is supported by PromQL. See also [increase_pure](#increase_pure), [increase_prometheus](#increase_prometheus) and [delta](#delta).
#### increase_prometheus
@@ -544,9 +499,7 @@ It is expected that the `series_selector` returns time series of [counter type](
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [rate](#rate) and [rollup_rate](#rollup_rate).
This function is supported by PromQL. See also [rate](#rate) and [rollup_rate](#rollup_rate).
#### lag
@@ -563,9 +516,7 @@ See also [lifetime](#lifetime) and [duration_over_time](#duration_over_time).
`last_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the last raw sample value on the given lookbehind window `d`
per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
This function is supported by PromQL.
See also [first_over_time](#first_over_time) and [tlast_over_time](#tlast_over_time).
This function is supported by PromQL. See also [first_over_time](#first_over_time) and [tlast_over_time](#tlast_over_time).
#### lifetime
@@ -588,9 +539,7 @@ See also [mad](#mad), [range_mad](#range_mad) and [outlier_iqr_over_time](#outli
`max_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which calculates the maximum value over raw samples
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
This function is supported by PromQL.
See also [tmax_over_time](#tmax_over_time).
This function is supported by PromQL. See also [tmax_over_time](#tmax_over_time).
#### median_over_time
@@ -605,9 +554,7 @@ See also [avg_over_time](#avg_over_time).
`min_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which calculates the minimum value over raw samples
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
This function is supported by PromQL.
See also [tmin_over_time](#tmin_over_time).
This function is supported by PromQL. See also [tmin_over_time](#tmin_over_time).
#### mode_over_time
@@ -623,7 +570,7 @@ if its value is either smaller than the `q25-1.5*iqr` or bigger than `q75+1.5*iq
- `q25` and `q75` are 25th and 75th [percentiles](https://en.wikipedia.org/wiki/Percentile) over raw samples on the lookbehind window `d`.
The `outlier_iqr_over_time()` is useful for detecting anomalies in gauge values based on the previous history of values.
For example, `outlier_iqr_over_time(memory_usage_bytes[1h])` triggers when `memory_usage_bytes` suddenly goes outside the usual value range for the last hour.
For example, `outlier_iqr_over_time(memory_usage_bytes[24h])` triggers when `memory_usage_bytes` suddenly goes outside the usual value range for the last 24 hours.
See also [outliers_iqr](#outliers_iqr).
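For illustration, an alerting-style sketch built on the example above (the `memory_usage_bytes` gauge is a hypothetical metric):

```metricsql
# Returns a non-empty result only when the current memory usage leaves
# the [q25 - 1.5*iqr ... q75 + 1.5*iqr] range computed over the last hour:
outlier_iqr_over_time(memory_usage_bytes[1h])
```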
@@ -633,9 +580,7 @@ See also [outliers_iqr](#outliers_iqr).
linear interpolation over raw samples on the given lookbehind window `d`. The predicted value is calculated individually per each time series
returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
This function is supported by PromQL.
See also [range_linear_regression](#range_linear_regression).
This function is supported by PromQL. See also [range_linear_regression](#range_linear_regression).
#### present_over_time
@@ -652,9 +597,7 @@ This function is supported by PromQL.
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
The `phi` value must be in the range `[0...1]`.
This function is supported by PromQL.
See also [quantiles_over_time](#quantiles_over_time).
This function is supported by PromQL. See also [quantiles_over_time](#quantiles_over_time).
#### quantiles_over_time
@@ -679,16 +622,9 @@ Metric names are stripped from the resulting rollups. Add [keep_metric_names](#k
over the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
It is expected that the `series_selector` returns time series of [counter type](https://docs.victoriametrics.com/keyConcepts.html#counter).
If the lookbehind window is skipped in square brackets, then it is automatically calculated as `max(step, scrape_interval)`, where `step` is the query arg value
passed to [/api/v1/query_range](https://docs.victoriametrics.com/keyconcepts/#range-query) or [/api/v1/query](https://docs.victoriametrics.com/keyconcepts/#instant-query),
while `scrape_interval` is the interval between [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples) for the selected time series.
This allows avoiding unexpected gaps on the graph when `step` is smaller than the `scrape_interval`.
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [irate](#irate) and [rollup_rate](#rollup_rate).
This function is supported by PromQL. See also [irate](#irate) and [rollup_rate](#rollup_rate).
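For illustration, a sketch assuming a hypothetical `http_requests_total` counter:

```metricsql
# Explicit 5-minute lookbehind window:
rate(http_requests_total[5m])

# Window omitted - it is set automatically to max(step, scrape_interval):
rate(http_requests_total)
```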
#### rate_over_sum
@@ -716,7 +652,6 @@ on the given lookbehind window `d` and returns them in time series with `rollup=
These values are calculated individually per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Optional 2nd argument `"min"`, `"max"` or `"avg"` can be passed to keep only one calculation result and without adding a label.
See also [label_match](#label_match).
#### rollup_candlestick
@@ -725,8 +660,7 @@ over raw samples on the given lookbehind window `d` and returns them in time ser
The calculations are performed individually per each time series returned
from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering). This function is useful for financial applications.
Optional 2nd argument `"open"`, `"high"` or `"low"` or `"close"` can be passed to keep only one calculation result and without adding a label.
See also [label_match](#label_match).
Optional 2nd argument `"min"`, `"max"` or `"avg"` can be passed to keep only one calculation result and without adding a label.
#### rollup_delta
@@ -736,7 +670,6 @@ and returns them in time series with `rollup="min"`, `rollup="max"` and `rollup=
The calculations are performed individually per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Optional 2nd argument `"min"`, `"max"` or `"avg"` can be passed to keep only one calculation result and without adding a label.
See also [label_match](#label_match).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
@@ -750,7 +683,6 @@ and returns them in time series with `rollup="min"`, `rollup="max"` and `rollup=
The calculations are performed individually per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Optional 2nd argument `"min"`, `"max"` or `"avg"` can be passed to keep only one calculation result and without adding a label.
See also [label_match](#label_match).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
@@ -762,7 +694,6 @@ and returns them in time series with `rollup="min"`, `rollup="max"` and `rollup=
The calculations are performed individually per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Optional 2nd argument `"min"`, `"max"` or `"avg"` can be passed to keep only one calculation result and without adding a label.
See also [label_match](#label_match).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names. See also [rollup_delta](#rollup_delta).
@@ -776,10 +707,10 @@ See [this article](https://valyala.medium.com/why-irate-from-prometheus-doesnt-c
when to use `rollup_rate()`.
Optional 2nd argument `"min"`, `"max"` or `"avg"` can be passed to keep only one calculation result and without adding a label.
See also [label_match](#label_match).
The calculations are performed individually per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
#### rollup_scrape_interval
@@ -790,7 +721,6 @@ and returns them in time series with `rollup="min"`, `rollup="max"` and `rollup=
The calculations are performed individually per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Optional 2nd argument `"min"`, `"max"` or `"avg"` can be passed to keep only one calculation result and without adding a label.
See also [label_match](#label_match).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names. See also [scrape_interval](#scrape_interval).
@@ -813,7 +743,7 @@ This function is useful for calculating SLI and SLO. Example: `share_gt_over_tim
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [share_le_over_time](#share_le_over_time) and [count_gt_over_time](#count_gt_over_time).
See also [share_le_over_time](#share_le_over_time).
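For illustration, an SLI-style sketch built on the standard `up` metric:

```metricsql
# Share of scrapes during the last 24 hours when the target was up,
# i.e. availability in the range [0...1]:
share_gt_over_time(up[24h], 0)
```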
#### share_le_over_time
@@ -826,7 +756,7 @@ the share of time series values for the last 24 hours when memory usage was belo
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [share_gt_over_time](#share_gt_over_time) and [count_le_over_time](#count_le_over_time).
See also [share_gt_over_time](#share_gt_over_time).
#### share_eq_over_time
@@ -836,8 +766,6 @@ from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.ht
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [count_eq_over_time](#count_eq_over_time).
#### stale_samples_over_time
`stale_samples_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which calculates the number
@@ -853,9 +781,7 @@ on the given lookbehind window `d` per each time series returned from the given
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [stdvar_over_time](#stdvar_over_time).
This function is supported by PromQL. See also [stdvar_over_time](#stdvar_over_time).
#### stdvar_over_time
@@ -864,36 +790,7 @@ on the given lookbehind window `d` per each time series returned from the given
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [stddev_over_time](#stddev_over_time).
#### sum_eq_over_time
`sum_eq_over_time(series_selector[d], eq)` is a [rollup function](#rollup-functions), which calculates the sum of raw sample values equal to `eq`
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [sum_over_time](#sum_over_time) and [count_eq_over_time](#count_eq_over_time).
#### sum_gt_over_time
`sum_gt_over_time(series_selector[d], gt)` is a [rollup function](#rollup-functions), which calculates the sum of raw sample values bigger than `gt`
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [sum_over_time](#sum_over_time) and [count_gt_over_time](#count_gt_over_time).
#### sum_le_over_time
`sum_le_over_time(series_selector[d], le)` is a [rollup function](#rollup-functions), which calculates the sum of raw sample values smaller than or equal to `le`
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
See also [sum_over_time](#sum_over_time) and [count_le_over_time](#count_le_over_time).
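For illustration, a sketch assuming a hypothetical `request_duration_seconds` gauge:

```metricsql
# Sum of raw samples not exceeding 0.5 during the last hour:
sum_le_over_time(request_duration_seconds[1h], 0.5)

# Sum of raw samples bigger than 0.5 during the same window:
sum_gt_over_time(request_duration_seconds[1h], 0.5)
```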
This function is supported by PromQL. See also [stddev_over_time](#stddev_over_time).
#### sum_over_time
@@ -913,27 +810,25 @@ Metric names are stripped from the resulting rollups. Add [keep_metric_names](#k
#### timestamp
`timestamp(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds with millisecond precision for the last raw sample
`timestamp(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds for the last raw sample
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [time](#time) and [now](#now).
This function is supported by PromQL. See also [timestamp_with_name](#timestamp_with_name).
#### timestamp_with_name
`timestamp_with_name(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds with millisecond precision for the last raw sample
`timestamp_with_name(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds for the last raw sample
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Metric names are preserved in the resulting rollups.
See also [timestamp](#timestamp) and [keep_metric_names](#keep_metric_names) modifier.
See also [timestamp](#timestamp).
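For illustration, a sketch based on the standard `up` metric:

```metricsql
# Seconds elapsed since the last received sample, per series:
time() - timestamp(up)

# The same last-sample timestamps, with metric names preserved:
timestamp_with_name(up)
```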
#### tfirst_over_time
`tfirst_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds with millisecond precision for the first raw sample
`tfirst_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds for the first raw sample
on the given lookbehind window `d` per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
@@ -942,7 +837,7 @@ See also [first_over_time](#first_over_time).
#### tlast_change_over_time
`tlast_change_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds with millisecond precision for the last change
`tlast_change_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds for the last change
per each time series returned from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering) on the given lookbehind window `d`.
Metric names are stripped from the resulting rollups. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
@@ -957,7 +852,7 @@ See also [tlast_change_over_time](#tlast_change_over_time).
#### tmax_over_time
`tmax_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds with millisecond precision for the raw sample
`tmax_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds for the raw sample
with the maximum value on the given lookbehind window `d`. It is calculated independently per each time series returned
from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
@@ -967,7 +862,7 @@ See also [max_over_time](#max_over_time).
#### tmin_over_time
`tmin_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds with millisecond precision for the raw sample
`tmin_over_time(series_selector[d])` is a [rollup function](#rollup-functions), which returns the timestamp in seconds for the raw sample
with the minimum value on the given lookbehind window `d`. It is calculated independently per each time series returned
from the given [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering).
@@ -996,7 +891,7 @@ Additional details:
* If transform function is applied directly to a [series selector](https://docs.victoriametrics.com/keyConcepts.html#filtering),
then the [default_rollup()](#default_rollup) function is automatically applied before calculating the transformations.
For example, `abs(temperature)` is implicitly transformed to `abs(default_rollup(temperature))`.
For example, `abs(temperature)` is implicitly transformed to `abs(default_rollup(temperature[1i]))`.
* All the transform functions accept optional `keep_metric_names` modifier. If it is set,
then the function doesn't drop metric names from the resulting time series. See [these docs](#keep_metric_names).
@@ -1014,9 +909,7 @@ This function is supported by PromQL.
`absent(q)` is a [transform function](#transform-functions), which returns 1 if `q` has no points. Otherwise, returns an empty result.
This function is supported by PromQL.
See also [absent_over_time](#absent_over_time).
This function is supported by PromQL. See also [absent_over_time](#absent_over_time).
#### acos
@@ -1025,9 +918,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [asin](#asin) and [cos](#cos).
This function is supported by PromQL. See also [asin](#asin) and [cos](#cos).
#### acosh
@@ -1036,9 +927,7 @@ See also [asin](#asin) and [cos](#cos).
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [cosh](#cosh).
This function is supported by PromQL. See also [cosh](#cosh).
#### asin
@@ -1047,9 +936,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [acos](#acos) and [sin](#sin).
This function is supported by PromQL. See also [acos](#acos) and [sin](#sin).
#### asinh
@@ -1058,9 +945,7 @@ See also [acos](#acos) and [sin](#sin).
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [sinh](#sinh).
This function is supported by PromQL. See also [sinh](#sinh).
#### atan
@@ -1069,9 +954,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [tan](#tan).
This function is supported by PromQL. See also [tan](#tan).
#### atanh
@@ -1080,9 +963,7 @@ See also [tan](#tan).
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [tanh](#tanh).
This function is supported by PromQL. See also [tanh](#tanh).
#### bitmap_and
@@ -1113,33 +994,25 @@ See also [prometheus_buckets](#prometheus_buckets) and [histogram_quantile](#his
`ceil(q)` is a [transform function](#transform-functions), which rounds every point for every time series returned by `q` to the upper nearest integer.
This function is supported by PromQL.
See also [floor](#floor) and [round](#round).
This function is supported by PromQL. See also [floor](#floor) and [round](#round).
#### clamp
`clamp(q, min, max)` is a [transform function](#transform-functions), which clamps every point for every time series returned by `q` with the given `min` and `max` values.
This function is supported by PromQL.
See also [clamp_min](#clamp_min) and [clamp_max](#clamp_max).
This function is supported by PromQL. See also [clamp_min](#clamp_min) and [clamp_max](#clamp_max).
#### clamp_max
`clamp_max(q, max)` is a [transform function](#transform-functions), which clamps every point for every time series returned by `q` with the given `max` value.
This function is supported by PromQL.
See also [clamp](#clamp) and [clamp_min](#clamp_min).
This function is supported by PromQL. See also [clamp](#clamp) and [clamp_min](#clamp_min).
#### clamp_min
`clamp_min(q, min)` is a [transform function](#transform-functions), which clamps every point for every time series returned by `q` with the given `min` value.
This function is supported by PromQL.
See also [clamp](#clamp) and [clamp_max](#clamp_max).
This function is supported by PromQL. See also [clamp](#clamp) and [clamp_max](#clamp_max).
#### cos
@@ -1147,9 +1020,7 @@ See also [clamp](#clamp) and [clamp_max](#clamp_max).
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [sin](#sin).
This function is supported by PromQL. See also [sin](#sin).
#### cosh
@@ -1158,9 +1029,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [acosh](#acosh).
This function is supported by PromQL. See also [acosh](#acosh).
#### day_of_month
@@ -1171,8 +1040,6 @@ Metric names are stripped from the resulting series. Add [keep_metric_names](#ke
This function is supported by PromQL.
See also [day_of_week](#day_of_week) and [day_of_year](#day_of_year).
#### day_of_week
`day_of_week(q)` is a [transform function](#transform-functions), which returns the day of week for every point of every time series returned by `q`.
@@ -1182,19 +1049,6 @@ Metric names are stripped from the resulting series. Add [keep_metric_names](#ke
This function is supported by PromQL.
See also [day_of_month](#day_of_month) and [day_of_year](#day_of_year).
#### day_of_year
`day_of_year(q)` is a [transform function](#transform-functions), which returns the day of year for every point of every time series returned by `q`.
It is expected that `q` returns unix timestamps. The returned values are in the range `[1...365]` for non-leap years, and `[1...366]` in leap years.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [day_of_week](#day_of_week) and [day_of_month](#day_of_month).
#### days_in_month
`days_in_month(q)` is a [transform function](#transform-functions), which returns the number of days in the month identified
@@ -1212,9 +1066,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [rad](#rad).
This function is supported by PromQL. See also [rad](#rad).
#### drop_empty_series
@@ -1240,17 +1092,13 @@ See also [start](#start), [time](#time) and [now](#now).
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [ln](#ln).
This function is supported by PromQL. See also [ln](#ln).
#### floor
`floor(q)` is a [transform function](#transform-functions), which rounds every point for every time series returned by `q` to the lower nearest integer.
This function is supported by PromQL.
See also [ceil](#ceil) and [round](#round).
This function is supported by PromQL. See also [ceil](#ceil) and [round](#round).
#### histogram_avg
@@ -1273,9 +1121,8 @@ When the [percentile](https://en.wikipedia.org/wiki/Percentile) is calculated ov
then all the input histograms **must** have buckets with identical boundaries, e.g. they must have the same set of `le` or `vmrange` labels.
Otherwise, the returned result may be invalid. See [this issue](https://github.com/VictoriaMetrics/VictoriaMetrics/issues/3231) for details.
This function is supported by PromQL (except for the `boundLabel` arg).
See also [histogram_quantiles](#histogram_quantiles), [histogram_share](#histogram_share) and [quantile](#quantile).
This function is supported by PromQL (except for the `boundLabel` arg). See also [histogram_quantiles](#histogram_quantiles), [histogram_share](#histogram_share)
and [quantile](#quantile).
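For illustration, a typical sketch assuming a hypothetical `http_request_duration_seconds_bucket` histogram:

```metricsql
# 90th percentile of request duration over the last 5 minutes,
# aggregated across instances while keeping the `le` bucket label:
histogram_quantile(0.9, sum(rate(http_request_duration_seconds_bucket[5m])) by (le))
```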
#### histogram_quantiles
@@ -1347,9 +1194,7 @@ This allows implementing simple paging for `q` time series. See also [limitk](#l
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [exp](#exp) and [log2](#log2).
This function is supported by PromQL. See also [exp](#exp) and [log2](#log2).
#### log2
@@ -1357,9 +1202,7 @@ See also [exp](#exp) and [log2](#log2).
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [log10](#log10) and [ln](#ln).
This function is supported by PromQL. See also [log10](#log10) and [ln](#ln).
#### log10
@@ -1367,9 +1210,7 @@ See also [log10](#log10) and [ln](#ln).
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [log2](#log2) and [ln](#ln).
This function is supported by PromQL. See also [log2](#log2) and [ln](#ln).
#### minute
@@ -1408,9 +1249,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by PromQL.
See also [deg](#deg).
This function is supported by PromQL. See also [deg](#deg).
#### prometheus_buckets
@@ -1538,9 +1377,7 @@ for points returned by `q`, e.g. it is equivalent to the following query: `(q -
`round(q, nearest)` is a [transform function](#transform-functions), which rounds every point of every time series returned by `q` to the `nearest` multiple.
If `nearest` is missing then the rounding is performed to the nearest integer.
This function is supported by PromQL.
See also [floor](#floor) and [ceil](#ceil).
This function is supported by PromQL. See also [floor](#floor) and [ceil](#ceil).
#### ru
@@ -1584,9 +1421,7 @@ This function is supported by PromQL.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by MetricsQL.
See also [cos](#cos).
This function is supported by MetricsQL. See also [cos](#cos).
#### sinh
@@ -1595,9 +1430,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by MetricsQL.
See also [cosh](#cosh).
This function is supported by MetricsQL. See also [cosh](#cosh).
#### tan
@@ -1605,9 +1438,7 @@ See also [cosh](#cosh).
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by MetricsQL.
See also [atan](#atan).
This function is supported by MetricsQL. See also [atan](#atan).
#### tanh
@@ -1616,9 +1447,7 @@ for every point of every time series returned by `q`.
Metric names are stripped from the resulting series. Add [keep_metric_names](#keep_metric_names) modifier in order to keep metric names.
This function is supported by MetricsQL.
See also [atanh](#atanh).
This function is supported by MetricsQL. See also [atanh](#atanh).
#### smooth_exponential
@@ -1629,17 +1458,13 @@ by `q` using [exponential moving average](https://en.wikipedia.org/wiki/Moving_a
`sort(q)` is a [transform function](#transform-functions), which sorts series in ascending order by the last point in every time series returned by `q`.
This function is supported by PromQL.
See also [sort_desc](#sort_desc) and [sort_by_label](#sort_by_label).
This function is supported by PromQL. See also [sort_desc](#sort_desc) and [sort_by_label](#sort_by_label).
#### sort_desc
`sort_desc(q)` is a [transform function](#transform-functions), which sorts series in descending order by the last point in every time series returned by `q`.
This function is supported by PromQL.
See also [sort](#sort) and [sort_by_label](#sort_by_label_desc).
This function is supported by PromQL. See also [sort](#sort) and [sort_by_label](#sort_by_label_desc).
#### sqrt
@@ -1668,9 +1493,7 @@ See also [start](#start) and [end](#end).
`time()` is a [transform function](#transform-functions), which returns unix timestamp for every returned point.
This function is supported by PromQL.
See also [timestamp](#timestamp), [now](#now), [start](#start) and [end](#end).
This function is supported by PromQL. See also [now](#now), [start](#start) and [end](#end).
#### timezone_offset
@@ -1719,7 +1542,7 @@ Additional details:
* If label manipulation function is applied directly to a [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering),
then the [default_rollup()](#default_rollup) function is automatically applied before performing the label transformation.
For example, `alias(temperature, "foo")` is implicitly transformed to `alias(default_rollup(temperature), "foo")`.
For example, `alias(temperature, "foo")` is implicitly transformed to `alias(default_rollup(temperature[1i]), "foo")`.
See also [implicit query conversions](#implicit-query-conversions).
@@ -1896,7 +1719,7 @@ Additional details:
Multiple labels can be put in `by` and `without` modifiers.
* If the aggregate function is applied directly to a [series_selector](https://docs.victoriametrics.com/keyConcepts.html#filtering),
then the [default_rollup()](#default_rollup) function is automatically applied before calculating the aggregate.
For example, `count(up)` is implicitly transformed to `count(default_rollup(up))`.
For example, `count(up)` is implicitly transformed to `count(default_rollup(up[1i]))`.
* Aggregate functions accept arbitrary number of args. For example, `avg(q1, q2, q3)` would return the average values for every point
across time series returned by `q1`, `q2` and `q3`.
* Aggregate functions support optional `limit N` suffix, which can be used for limiting the number of output groups.
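For illustration, a sketch assuming a hypothetical `process_resident_memory_bytes` gauge:

```metricsql
# Average per job, but return at most 3 output groups:
avg(process_resident_memory_bytes) by (job) limit 3
```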
@@ -1924,9 +1747,7 @@ This function is supported by PromQL.
`bottomk(k, q)` is [aggregate function](#aggregate-functions), which returns up to `k` points with the smallest values across all the time series returned by `q`.
The aggregate is calculated individually per each group of points with the same timestamp.
This function is supported by PromQL.
See also [topk](#topk), [bottomk_min](#bottomk_min) and [bottomk_last](#bottomk_last).
This function is supported by PromQL. See also [topk](#topk).
#### bottomk_avg
@@ -1988,14 +1809,10 @@ The aggregate is calculated individually per each group of points with the same
This function is supported by PromQL.
See also [count_values_over_time](#count_values_over_time) and [label_match](#label_match).
#### distinct
`distinct(q)` is [aggregate function](#aggregate-functions), which calculates the number of unique values per each group of points with the same timestamp.
See also [distinct_over_time](#distinct_over_time).
#### geomean
`geomean(q)` is [aggregate function](#aggregate-functions), which calculates geometric mean per each group of points with the same timestamp.
@@ -2087,9 +1904,7 @@ See also [outliers_iqr](#outliers_iqr) and [outliers_mad](#outliers_mad).
for all the time series returned by `q`. `phi` must be in the range `[0...1]`.
The aggregate is calculated individually per each group of points with the same timestamp.
This function is supported by PromQL.
See also [quantiles](#quantiles) and [histogram_quantile](#histogram_quantile).
This function is supported by PromQL. See also [quantiles](#quantiles) and [histogram_quantile](#histogram_quantile).
#### quantiles
@@ -2148,9 +1963,7 @@ for all the time series returned by `q`. The aggregate is calculated individuall
`topk(k, q)` is [aggregate function](#aggregate-functions), which returns up to `k` points with the biggest values across all the time series returned by `q`.
The aggregate is calculated individually per each group of points with the same timestamp.
This function is supported by PromQL.
See also [bottomk](#bottomk), [topk_max](#topk_max) and [topk_last](#topk_last).
This function is supported by PromQL. See also [bottomk](#bottomk).
#### topk_avg
@@ -2210,7 +2023,7 @@ See also [zscore_over_time](#zscore_over_time), [range_trim_zscore](#range_trim_
MetricsQL supports and extends PromQL subqueries. See [this article](https://valyala.medium.com/prometheus-subqueries-in-victoriametrics-9b1492b720b3) for details.
Any [rollup function](#rollup-functions) applied to something other than a [series selector](https://docs.victoriametrics.com/keyConcepts.html#filtering) forms a subquery.
Nested rollup functions can be implicit thanks to the [implicit query conversions](#implicit-query-conversions).
For example, `delta(sum(m))` is implicitly converted to `delta(sum(default_rollup(m))[1i:1i])`, so it becomes a subquery,
For example, `delta(sum(m))` is implicitly converted to `delta(sum(default_rollup(m[1i]))[1i:1i])`, so it becomes a subquery,
since it contains [default_rollup](#default_rollup) nested into [delta](#delta).
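For illustration, an explicit subquery sketch assuming a hypothetical `http_requests_total` counter:

```metricsql
# Maximum of the 5-minute rate over the last hour,
# with the inner expression evaluated every 10 minutes:
max_over_time(rate(http_requests_total[5m])[1h:10m])
```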
VictoriaMetrics performs subqueries in the following way:
@@ -2225,23 +2038,21 @@ VictoriaMetrics performs subqueries in the following way:
VictoriaMetrics performs the following implicit conversions for incoming queries before starting the calculations:
* If lookbehind window in square brackets is missing inside [rollup function](#rollup-functions), then it is automatically set to the following value:
- To `step` value passed to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query) or [/api/v1/query](https://docs.victoriametrics.com/keyconcepts/#instant-query)
for all the [rollup functions](#rollup-functions) except of [default_rollup](#default_rollup) and [rate](#rate). This value is known as `$__interval` in Grafana or `1i` in MetricsQL.
For example, `avg_over_time(temperature)` is automatically transformed to `avg_over_time(temperature[1i])`.
- To the `max(step, scrape_interval)`, where `scrape_interval` is the interval between [raw samples](https://docs.victoriametrics.com/keyconcepts/#raw-samples)
for [default_rollup](#default_rollup) and [rate](#rate) functions. This allows avoiding unexpected gaps on the graph when `step` is smaller than `scrape_interval`.
* If lookbehind window in square brackets is missing inside [rollup function](#rollup-functions),
then `[1i]` is automatically added there. The `[1i]` means one `step` value, which is passed
to [/api/v1/query_range](https://docs.victoriametrics.com/keyConcepts.html#range-query).
It is also known as `$__interval` in Grafana. For example, `rate(http_requests_count)` is automatically transformed to `rate(http_requests_count[1i])`.
* All the [series selectors](https://docs.victoriametrics.com/keyConcepts.html#filtering),
which aren't wrapped into [rollup functions](#rollup-functions), are automatically wrapped into [default_rollup](#default_rollup) function.
Examples:
* `foo` is transformed to `default_rollup(foo)`
* `foo + bar` is transformed to `default_rollup(foo) + default_rollup(bar)`
* `count(up)` is transformed to `count(default_rollup(up))`, because [count](#count) isn't a [rollup function](#rollup-functions) -
* `foo` is transformed to `default_rollup(foo[1i])`
* `foo + bar` is transformed to `default_rollup(foo[1i]) + default_rollup(bar[1i])`
* `count(up)` is transformed to `count(default_rollup(up[1i]))`, because [count](#count) isn't a [rollup function](#rollup-functions) -
it is [aggregate function](#aggregate-functions)
* `abs(temperature)` is transformed to `abs(default_rollup(temperature))`, because [abs](#abs) isn't a [rollup function](#rollup-functions) -
* `abs(temperature)` is transformed to `abs(default_rollup(temperature[1i]))`, because [abs](#abs) isn't a [rollup function](#rollup-functions) -
it is [transform function](#transform-functions)
* If `step` in square brackets is missing inside [subquery](#subqueries), then `1i` step is automatically added there.
For example, `avg_over_time(rate(http_requests_total[5m])[1h])` is automatically converted to `avg_over_time(rate(http_requests_total[5m])[1h:1i])`.
* If something other than [series selector](https://docs.victoriametrics.com/keyConcepts.html#filtering)
is passed to [rollup function](#rollup-functions), then a [subquery](#subqueries) with `1i` lookbehind window and `1i` step is automatically formed.
For example, `rate(sum(up))` is automatically converted to `rate((sum(default_rollup(up)))[1i:1i])`.
For example, `rate(sum(up))` is automatically converted to `rate((sum(default_rollup(up[1i])))[1i:1i])`.
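Putting these rules together, a worked sketch (the `process_resident_memory_bytes` metric is an arbitrary example):

```metricsql
# What the user writes:
max(process_resident_memory_bytes) by (job)

# What is executed after the implicit conversions, since max() is
# an aggregate function rather than a rollup function:
max(default_rollup(process_resident_memory_bytes[1i])) by (job)
```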

View File

@@ -1,95 +0,0 @@
package datadogsketches
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/remotewrite"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogsketches"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogsketches/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vmagent_rows_inserted_total{type="datadogsketches"}`)
rowsTenantInserted = tenantmetrics.NewCounterMap(`vmagent_tenant_inserted_rows_total{type="datadogsketches"}`)
rowsPerInsert = metrics.NewHistogram(`vmagent_rows_per_insert{type="datadogsketches"}`)
)
// InsertHandlerForHTTP processes remote write for DataDog POST /api/beta/sketches request.
func InsertHandlerForHTTP(at *auth.Token, req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
ce := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, ce, func(sketches []*datadogsketches.Sketch) error {
return insertRows(at, sketches, extraLabels)
})
}
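// insertRows converts the parsed DataDog sketches into time series,
// appends the optional extra labels to every series and pushes the result
// to the configured remote storage. It returns remotewrite.ErrQueueFullHTTPRetry
// when the push queue has no spare capacity, so the request can be retried later.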
func insertRows(at *auth.Token, sketches []*datadogsketches.Sketch, extraLabels []prompbmarshal.Label) error {
ctx := common.GetPushCtx()
defer common.PutPushCtx(ctx)
rowsTotal := 0
tssDst := ctx.WriteRequest.Timeseries[:0]
labels := ctx.Labels[:0]
samples := ctx.Samples[:0]
for _, sketch := range sketches {
ms := sketch.ToSummary()
for _, m := range ms {
labelsLen := len(labels)
labels = append(labels, prompbmarshal.Label{
Name: "__name__",
Value: m.Name,
})
for _, label := range m.Labels {
labels = append(labels, prompbmarshal.Label{
Name: label.Name,
Value: label.Value,
})
}
for _, tag := range sketch.Tags {
name, value := datadogutils.SplitTag(tag)
if name == "host" {
name = "exported_host"
}
labels = append(labels, prompbmarshal.Label{
Name: name,
Value: value,
})
}
labels = append(labels, extraLabels...)
samplesLen := len(samples)
for _, p := range m.Points {
samples = append(samples, prompbmarshal.Sample{
Timestamp: p.Timestamp,
Value: p.Value,
})
}
rowsTotal += len(m.Points)
tssDst = append(tssDst, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
}
}
ctx.WriteRequest.Timeseries = tssDst
ctx.Labels = labels
ctx.Samples = samples
if !remotewrite.TryPush(at, &ctx.WriteRequest) {
return remotewrite.ErrQueueFullHTTPRetry
}
rowsInserted.Add(rowsTotal)
if at != nil {
rowsTenantInserted.Get(at).Add(rowsTotal)
}
rowsPerInsert.Update(float64(rowsTotal))
return nil
}

View File

@@ -8,10 +8,10 @@ import (
"net/http"
"os"
"strings"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadogsketches"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadogv1"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/datadogv2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmagent/graphite"
@@ -40,16 +40,16 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/firehose"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
"github.com/VictoriaMetrics/metrics"
)
var (
httpListenAddrs = flagutil.NewArrayString("httpListenAddr", "TCP address to listen for incoming http requests. "+
httpListenAddr = flag.String("httpListenAddr", ":8429", "TCP address to listen for http connections. "+
"Set this flag to empty value in order to disable listening on any port. This mode may be useful for running multiple vmagent instances on the same server. "+
"Note that /targets and /metrics pages aren't available if -httpListenAddr=''. See also -tls and -httpListenAddr.useProxyProtocol")
useProxyProtocol = flagutil.NewArrayBool("httpListenAddr.useProxyProtocol", "Whether to use proxy protocol for connections accepted at the corresponding -httpListenAddr . "+
useProxyProtocol = flag.Bool("httpListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted at -httpListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . "+
"With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing")
influxListenAddr = flag.String("influxListenAddr", "", "TCP and UDP address to listen for InfluxDB line protocol data. Usually :8089 must be set. Doesn't work if empty. "+
@@ -70,8 +70,7 @@ var (
"See also -opentsdbHTTPListenAddr.useProxyProtocol")
opentsdbHTTPUseProxyProtocol = flag.Bool("opentsdbHTTPListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted "+
"at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
configAuthKey = flagutil.NewPassword("configAuthKey", "Authorization key for accessing /config page. It must be passed via authKey query arg")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
configAuthKey = flag.String("configAuthKey", "", "Authorization key for accessing /config page. It must be passed via authKey query arg")
dryRun = flag.Bool("dryRun", false, "Whether to check config files without running vmagent. The following files are checked: "+
"-promscrape.config, -remoteWrite.relabelConfig, -remoteWrite.urlRelabelConfig, -remoteWrite.streamAggr.config . "+
"Unknown config entries aren't allowed in -promscrape.config by default. This can be changed by passing -promscrape.config.strictParse=false command-line flag")
@@ -120,13 +119,8 @@ func main() {
return
}
listenAddrs := *httpListenAddrs
if len(listenAddrs) == 0 {
listenAddrs = []string{":8429"}
}
logger.Infof("starting vmagent at %q...", listenAddrs)
logger.Infof("starting vmagent at %q...", *httpListenAddr)
startTime := time.Now()
remotewrite.StartIngestionRateLimiter()
remotewrite.Init()
common.StartUnmarshalWorkers()
if len(*influxListenAddr) > 0 {
@@ -148,21 +142,24 @@ func main() {
promscrape.Init(remotewrite.PushDropSamplesOnFailure)
go httpserver.Serve(listenAddrs, useProxyProtocol, requestHandler)
if len(*httpListenAddr) > 0 {
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
}
logger.Infof("started vmagent in %.3f seconds", time.Since(startTime).Seconds())
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("received signal %s", sig)
remotewrite.StopIngestionRateLimiter()
pushmetrics.Stop()
startTime = time.Now()
logger.Infof("gracefully shutting down webservice at %q", listenAddrs)
if err := httpserver.Stop(listenAddrs); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
if len(*httpListenAddr) > 0 {
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
}
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
promscrape.Stop()
@@ -233,6 +230,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
{"metric-relabel-debug", "debug metric relabeling"},
{"api/v1/targets", "advanced information about discovered targets in JSON format"},
{"config", "-promscrape.config contents"},
{"stream-agg", "streaming aggregation status"},
{"metrics", "available service metrics"},
{"flags", "command-line flags"},
{"-/reload", "reload configuration"},
@@ -264,7 +262,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
path = strings.TrimSuffix(path, "/")
}
switch path {
case "/prometheus/api/v1/write", "/api/v1/write", "/api/v1/push", "/prometheus/api/v1/push":
case "/prometheus/api/v1/write", "/api/v1/write":
if common.HandleVMProtoServerHandshake(w, r) {
return true
}
@@ -316,14 +314,14 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
influxQueryRequests.Inc()
influxutils.WriteDatabaseNames(w)
return true
case "/opentelemetry/api/v1/push", "/opentelemetry/v1/metrics":
case "/opentelemetry/api/v1/push":
opentelemetryPushRequests.Inc()
if err := opentelemetry.InsertHandler(nil, r); err != nil {
opentelemetryPushErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
firehose.WriteSuccessResponse(w, r)
w.WriteHeader(http.StatusOK)
return true
case "/newrelic":
newrelicCheckRequest.Inc()
@@ -371,15 +369,6 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/api/beta/sketches":
datadogsketchesWriteRequests.Inc()
if err := datadogsketches.InsertHandlerForHTTP(nil, r); err != nil {
datadogsketchesWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(202)
return true
case "/datadog/api/v1/validate":
datadogValidateRequests.Inc()
// See https://docs.datadoghq.com/api/latest/authentication/#validate-api-key
@@ -434,7 +423,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
}
return true
case "/prometheus/config", "/config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey.Get(), "configAuthKey") {
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
return true
}
promscrapeConfigRequests.Inc()
@@ -443,7 +432,7 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/prometheus/api/v1/status/config", "/api/v1/status/config":
// See https://prometheus.io/docs/prometheus/latest/querying/api/#config
if !httpserver.CheckAuthFlag(w, r, configAuthKey.Get(), "configAuthKey") {
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
return true
}
promscrapeStatusConfigRequests.Inc()
@@ -453,15 +442,15 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%q}}`, bb.B)
return true
case "/prometheus/-/reload", "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey.Get(), "reloadAuthKey") {
return true
}
promscrapeConfigReloadRequests.Inc()
procutil.SelfSIGHUP()
w.WriteHeader(http.StatusOK)
return true
case "/stream-agg":
streamaggr.WriteHumanReadableState(w, r, remotewrite.GetAggregators())
return true
case "/ready":
if rdy := promscrape.PendingScrapeConfigs.Load(); rdy > 0 {
if rdy := atomic.LoadInt32(&promscrape.PendingScrapeConfigs); rdy > 0 {
errMsg := fmt.Sprintf("waiting for scrapes to init, left: %d", rdy)
http.Error(w, errMsg, http.StatusTooEarly)
} else {
@@ -513,7 +502,7 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
p.Suffix = strings.TrimSuffix(p.Suffix, "/")
}
switch p.Suffix {
case "prometheus/", "prometheus", "prometheus/api/v1/write", "prometheus/api/v1/push":
case "prometheus/", "prometheus", "prometheus/api/v1/write":
prometheusWriteRequests.Inc()
if err := promremotewrite.InsertHandler(at, r); err != nil {
prometheusWriteErrors.Inc()
@@ -562,14 +551,14 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
influxQueryRequests.Inc()
influxutils.WriteDatabaseNames(w)
return true
case "opentelemetry/api/v1/push", "opentelemetry/v1/metrics":
case "opentelemetry/api/v1/push":
opentelemetryPushRequests.Inc()
if err := opentelemetry.InsertHandler(at, r); err != nil {
opentelemetryPushErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
firehose.WriteSuccessResponse(w, r)
w.WriteHeader(http.StatusOK)
return true
case "newrelic":
newrelicCheckRequest.Inc()
@@ -615,15 +604,6 @@ func processMultitenantRequest(w http.ResponseWriter, r *http.Request, path stri
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "datadog/api/beta/sketches":
datadogsketchesWriteRequests.Inc()
if err := datadogsketches.InsertHandlerForHTTP(at, r); err != nil {
datadogsketchesWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(202)
return true
case "datadog/api/v1/validate":
datadogValidateRequests.Inc()
// See https://docs.datadoghq.com/api/latest/authentication/#validate-api-key
@@ -680,16 +660,13 @@ var (
datadogv2WriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v2/series", protocol="datadog"}`)
datadogv2WriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/datadog/api/v2/series", protocol="datadog"}`)
datadogsketchesWriteRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/beta/sketches", protocol="datadog"}`)
datadogsketchesWriteErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/datadog/api/beta/sketches", protocol="datadog"}`)
datadogValidateRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/validate", protocol="datadog"}`)
datadogCheckRunRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/check_run", protocol="datadog"}`)
datadogIntakeRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/intake", protocol="datadog"}`)
datadogMetadataRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/datadog/api/v1/metadata", protocol="datadog"}`)
opentelemetryPushRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
opentelemetryPushErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
opentelemetryPushRequests = metrics.NewCounter(`vmagent_http_requests_total{path="/opentelemetry/api/v1/push", protocol="opentelemetry"}`)
opentelemetryPushErrors = metrics.NewCounter(`vmagent_http_request_errors_total{path="/opentelemetry/api/v1/push", protocol="opentelemetry"}`)
newrelicWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)
newrelicWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)

View File

@@ -9,7 +9,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/firehose"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
@@ -28,15 +27,10 @@ func InsertHandler(at *auth.Token, req *http.Request) error {
return err
}
isGzipped := req.Header.Get("Content-Encoding") == "gzip"
var processBody func([]byte) ([]byte, error)
if req.Header.Get("Content-Type") == "application/json" {
if req.Header.Get("X-Amz-Firehose-Protocol-Version") != "" {
processBody = firehose.ProcessRequestBody
} else {
return fmt.Errorf("json encoding isn't supported for opentelemetry format. Use protobuf encoding")
}
return fmt.Errorf("json encoding isn't supported for opentelemetry format. Use protobuf encoding")
}
return stream.ParseStream(req.Body, isGzipped, processBody, func(tss []prompbmarshal.TimeSeries) error {
return stream.ParseStream(req.Body, isGzipped, func(tss []prompbmarshal.TimeSeries) error {
return insertRows(at, tss, extraLabels)
})
}

View File

@@ -17,9 +17,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/ratelimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timerpool"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
"github.com/VictoriaMetrics/metrics"
)
@@ -31,12 +29,11 @@ var (
rateLimit = flagutil.NewArrayInt("remoteWrite.rateLimit", 0, "Optional rate limit in bytes per second for data sent to the corresponding -remoteWrite.url. "+
"By default, the rate limit is disabled. It can be useful for limiting load on remote storage when big amounts of buffered data "+
"is sent after temporary unavailability of the remote storage. See also -maxIngestionRate")
"is sent after temporary unavailability of the remote storage")
sendTimeout = flagutil.NewArrayDuration("remoteWrite.sendTimeout", time.Minute, "Timeout for sending a single block of data to the corresponding -remoteWrite.url")
proxyURL = flagutil.NewArrayString("remoteWrite.proxyURL", "Optional proxy URL for writing data to the corresponding -remoteWrite.url. "+
"Supported proxies: http, https, socks5. Example: -remoteWrite.proxyURL=socks5://proxy:1234")
tlsHandshakeTimeout = flagutil.NewArrayDuration("remoteWrite.tlsHandshakeTimeout", 20*time.Second, "The timeout for establishing tls connections to the corresponding -remoteWrite.url")
tlsInsecureSkipVerify = flagutil.NewArrayBool("remoteWrite.tlsInsecureSkipVerify", "Whether to skip tls verification when connecting to the corresponding -remoteWrite.url")
tlsCertFile = flagutil.NewArrayString("remoteWrite.tlsCertFile", "Optional path to client-side TLS certificate file to use when connecting "+
"to the corresponding -remoteWrite.url")
@@ -92,7 +89,7 @@ type client struct {
authCfg *promauth.Config
awsCfg *awsapi.Config
rl *ratelimiter.RateLimiter
rl rateLimiter
bytesSent *metrics.Counter
blocksSent *metrics.Counter
@@ -113,13 +110,18 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
if err != nil {
logger.Fatalf("cannot initialize auth config for -remoteWrite.url=%q: %s", remoteWriteURL, err)
}
tlsCfg, err := authCfg.NewTLSConfig()
if err != nil {
logger.Fatalf("cannot initialize tls config for -remoteWrite.url=%q: %s", remoteWriteURL, err)
}
awsCfg, err := getAWSAPIConfig(argIdx)
if err != nil {
logger.Fatalf("cannot initialize AWS Config for -remoteWrite.url=%q: %s", remoteWriteURL, err)
}
tr := &http.Transport{
DialContext: statDial,
TLSHandshakeTimeout: tlsHandshakeTimeout.GetOptionalArg(argIdx),
TLSClientConfig: tlsCfg,
TLSHandshakeTimeout: 10 * time.Second,
MaxConnsPerHost: 2 * concurrency,
MaxIdleConnsPerHost: 2 * concurrency,
IdleConnTimeout: time.Minute,
@@ -137,7 +139,7 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
tr.Proxy = http.ProxyURL(pu)
}
hc := &http.Client{
Transport: authCfg.NewRoundTripper(tr),
Transport: tr,
Timeout: sendTimeout.GetOptionalArg(argIdx),
}
c := &client{
@@ -173,11 +175,12 @@ func newHTTPClient(argIdx int, remoteWriteURL, sanitizedURL string, fq *persiste
}
func (c *client) init(argIdx, concurrency int, sanitizedURL string) {
limitReached := metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_rate_limit_reached_total{url=%q}`, c.sanitizedURL))
if bytesPerSec := rateLimit.GetOptionalArg(argIdx); bytesPerSec > 0 {
logger.Infof("applying %d bytes per second rate limit for -remoteWrite.url=%q", bytesPerSec, sanitizedURL)
c.rl = ratelimiter.New(int64(bytesPerSec), limitReached, c.stopCh)
c.rl.perSecondLimit = int64(bytesPerSec)
}
c.rl.limitReached = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_rate_limit_reached_total{url=%q}`, c.sanitizedURL))
c.bytesSent = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_bytes_sent_total{url=%q}`, c.sanitizedURL))
c.blocksSent = metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_remotewrite_blocks_sent_total{url=%q}`, c.sanitizedURL))
c.rateLimit = metrics.GetOrCreateGauge(fmt.Sprintf(`vmagent_remotewrite_rate_limit{url=%q}`, c.sanitizedURL), func() float64 {
@@ -391,9 +394,8 @@ func (c *client) newRequest(url string, body []byte) (*http.Request, error) {
// The function returns false only if c.stopCh is closed.
// Otherwise it tries sending the block to remote storage indefinitely.
func (c *client) sendBlockHTTP(block []byte) bool {
c.rl.Register(len(block))
maxRetryDuration := timeutil.AddJitterToDuration(time.Minute)
retryDuration := timeutil.AddJitterToDuration(time.Second)
c.rl.register(len(block), c.stopCh)
retryDuration := time.Second
retriesCount := 0
again:
@@ -403,8 +405,8 @@ again:
if err != nil {
c.errorsCount.Inc()
retryDuration *= 2
if retryDuration > maxRetryDuration {
retryDuration = maxRetryDuration
if retryDuration > time.Minute {
retryDuration = time.Minute
}
logger.Warnf("couldn't send a block with size %d bytes to %q: %s; re-sending the block in %.3f seconds",
len(block), c.sanitizedURL, err, retryDuration.Seconds())
@@ -450,8 +452,8 @@ again:
// Unexpected status code returned
retriesCount++
retryDuration *= 2
if retryDuration > maxRetryDuration {
retryDuration = maxRetryDuration
if retryDuration > time.Minute {
retryDuration = time.Minute
}
body, err := io.ReadAll(resp.Body)
_ = resp.Body.Close()
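The retry loop in both hunks above is a plain capped exponential backoff. A minimal sketch of the schedule in isolation, assuming a one-minute cap as on both sides of the diff (the jitter applied by timeutil.AddJitterToDuration on the new side is omitted here, since its exact magnitude is internal to that helper):

func nextRetryDelay(cur, max time.Duration) time.Duration {
    cur *= 2 // double the delay after every failed attempt
    if cur > max {
        cur = max // never wait longer than the cap, one minute in this client
    }
    return cur
}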
@@ -474,3 +476,45 @@ again:
}
var remoteWriteRejectedLogger = logger.WithThrottler("remoteWriteRejected", 5*time.Second)
type rateLimiter struct {
perSecondLimit int64
// mu protects budget and deadline from concurrent access.
mu sync.Mutex
// The current budget. It is increased by perSecondLimit every second.
budget int64
// The next deadline for increasing the budget by perSecondLimit
deadline time.Time
limitReached *metrics.Counter
}
func (rl *rateLimiter) register(dataLen int, stopCh <-chan struct{}) {
limit := rl.perSecondLimit
if limit <= 0 {
return
}
rl.mu.Lock()
defer rl.mu.Unlock()
for rl.budget <= 0 {
if d := time.Until(rl.deadline); d > 0 {
rl.limitReached.Inc()
t := timerpool.Get(d)
select {
case <-stopCh:
timerpool.Put(t)
return
case <-t.C:
timerpool.Put(t)
}
}
rl.budget += limit
rl.deadline = time.Now().Add(time.Second)
}
rl.budget -= int64(dataLen)
}
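A usage sketch for the inlined limiter above; blocks and sendBlock are placeholders, while the budget semantics come straight from the code: register blocks until perSecondLimit bytes of budget accumulate, or returns early once stopCh is closed.

func sendAll(rl *rateLimiter, blocks [][]byte, stopCh <-chan struct{}, sendBlock func([]byte)) {
    for _, b := range blocks {
        // Blocks until enough budget is available (perSecondLimit bytes
        // are added to the budget once per second) or stopCh is closed.
        rl.register(len(b), stopCh)
        sendBlock(b)
    }
}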

View File

@@ -7,7 +7,6 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/cgroup"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/decimal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding/zstd"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
@@ -16,7 +15,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/persistentqueue"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/timeutil"
"github.com/VictoriaMetrics/metrics"
"github.com/golang/snappy"
)
@@ -71,8 +69,7 @@ func (ps *pendingSeries) periodicFlusher() {
if flushSeconds <= 0 {
flushSeconds = 1
}
d := timeutil.AddJitterToDuration(*flushInterval)
ticker := time.NewTicker(d)
ticker := time.NewTicker(*flushInterval)
defer ticker.Stop()
for {
select {
@@ -82,7 +79,7 @@ func (ps *pendingSeries) periodicFlusher() {
ps.mu.Unlock()
return
case <-ticker.C:
if fasttime.UnixTimestamp()-ps.wr.lastFlushTime.Load() < uint64(flushSeconds) {
if fasttime.UnixTimestamp()-atomic.LoadUint64(&ps.wr.lastFlushTime) < uint64(flushSeconds) {
continue
}
}
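The new side spreads flusher ticks with timeutil.AddJitterToDuration so that many pendingSeries flushers don't fire in lockstep. A hypothetical stand-in for such a helper (the real jitter fraction may differ; up to 10% is an assumption for illustration):

import (
    "math/rand"
    "time"
)

func addJitter(d time.Duration) time.Duration {
    // Extend d by a random fraction of up to 10% of its length,
    // so identical intervals don't tick simultaneously.
    return d + time.Duration(rand.Float64()*0.1*float64(d))
}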
@@ -93,7 +90,8 @@ func (ps *pendingSeries) periodicFlusher() {
}
type writeRequest struct {
lastFlushTime atomic.Uint64
// Move lastFlushTime to the top of the struct in order to guarantee atomic access on 32-bit architectures.
lastFlushTime uint64
// The queue to send blocks to.
fq *persistentqueue.FastQueue
@@ -109,12 +107,11 @@ type writeRequest struct {
wr prompbmarshal.WriteRequest
tss []prompbmarshal.TimeSeries
tss []prompbmarshal.TimeSeries
labels []prompbmarshal.Label
samples []prompbmarshal.Sample
// buf holds labels data
buf []byte
buf []byte
}
func (wr *writeRequest) reset() {
@@ -154,7 +151,7 @@ func (wr *writeRequest) mustWriteBlock(block []byte) bool {
func (wr *writeRequest) tryFlush() bool {
wr.wr.Timeseries = wr.tss
wr.lastFlushTime.Store(fasttime.UnixTimestamp())
atomic.StoreUint64(&wr.lastFlushTime, fasttime.UnixTimestamp())
if !tryPushWriteRequest(&wr.wr, wr.fq.TryWriteBlock, wr.isVMRemoteWrite) {
return false
}
@@ -225,45 +222,33 @@ func (wr *writeRequest) copyTimeSeries(dst, src *prompbmarshal.TimeSeries) {
wr.buf = buf
}
// marshalConcurrency limits the maximum number of concurrent workers, which marshal and compress WriteRequest.
var marshalConcurrencyCh = make(chan struct{}, cgroup.AvailableCPUs())
func tryPushWriteRequest(wr *prompbmarshal.WriteRequest, tryPushBlock func(block []byte) bool, isVMRemoteWrite bool) bool {
if len(wr.Timeseries) == 0 {
// Nothing to push
return true
}
marshalConcurrencyCh <- struct{}{}
bb := writeRequestBufPool.Get()
bb.B = wr.MarshalProtobuf(bb.B[:0])
if len(bb.B) <= maxUnpackedBlockSize.IntN() {
zb := compressBufPool.Get()
zb := snappyBufPool.Get()
if isVMRemoteWrite {
zb.B = zstd.CompressLevel(zb.B[:0], bb.B, *vmProtoCompressLevel)
} else {
zb.B = snappy.Encode(zb.B[:cap(zb.B)], bb.B)
}
writeRequestBufPool.Put(bb)
<-marshalConcurrencyCh
if len(zb.B) <= persistentqueue.MaxBlockSize {
zbLen := len(zb.B)
ok := tryPushBlock(zb.B)
compressBufPool.Put(zb)
if ok {
blockSizeRows.Update(float64(len(wr.Timeseries)))
blockSizeBytes.Update(float64(zbLen))
if !tryPushBlock(zb.B) {
return false
}
return ok
blockSizeRows.Update(float64(len(wr.Timeseries)))
blockSizeBytes.Update(float64(len(zb.B)))
snappyBufPool.Put(zb)
return true
}
compressBufPool.Put(zb)
snappyBufPool.Put(zb)
} else {
writeRequestBufPool.Put(bb)
<-marshalConcurrencyCh
}
// Too big block. Recursively split it into smaller parts if possible.
@@ -309,7 +294,5 @@ var (
blockSizeRows = metrics.NewHistogram(`vmagent_remotewrite_block_size_rows`)
)
var (
writeRequestBufPool bytesutil.ByteBufferPool
compressBufPool bytesutil.ByteBufferPool
)
var writeRequestBufPool bytesutil.ByteBufferPool
var snappyBufPool bytesutil.ByteBufferPool
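marshalConcurrencyCh above is a counting semaphore sized to the number of available CPUs: it bounds how many goroutines may marshal and compress a WriteRequest at once. The pattern in isolation (runtime.NumCPU stands in for cgroup.AvailableCPUs; note the real code releases the slot right after compression, before pushing the block):

var marshalSem = make(chan struct{}, runtime.NumCPU())

func withMarshalSlot(work func()) {
    marshalSem <- struct{}{}        // acquire a slot; blocks when all slots are busy
    defer func() { <-marshalSem }() // release the slot even if work panics
    work()
}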

View File

@@ -27,7 +27,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promutils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/ratelimiter"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/tenantmetrics"
"github.com/VictoriaMetrics/metrics"
@@ -51,17 +50,13 @@ var (
"By default the data is replicated across all the -remoteWrite.url . See https://docs.victoriametrics.com/vmagent.html#sharding-among-remote-storages")
shardByURLLabels = flagutil.NewArrayString("remoteWrite.shardByURL.labels", "Optional list of labels, which must be used for sharding outgoing samples "+
"among remote storage systems if -remoteWrite.shardByURL command-line flag is set. By default all the labels are used for sharding in order to gain "+
"even distribution of series over the specified -remoteWrite.url systems. See also -remoteWrite.shardByURL.ignoreLabels")
shardByURLIgnoreLabels = flagutil.NewArrayString("remoteWrite.shardByURL.ignoreLabels", "Optional list of labels, which must be ignored when sharding outgoing samples "+
"among remote storage systems if -remoteWrite.shardByURL command-line flag is set. By default all the labels are used for sharding in order to gain "+
"even distribution of series over the specified -remoteWrite.url systems. See also -remoteWrite.shardByURL.labels")
"even distribution of series over the specified -remoteWrite.url systems")
tmpDataPath = flag.String("remoteWrite.tmpDataPath", "vmagent-remotewrite-data", "Path to directory for storing pending data, which isn't sent to the configured -remoteWrite.url . "+
"See also -remoteWrite.maxDiskUsagePerURL and -remoteWrite.disableOnDiskQueue")
keepDanglingQueues = flag.Bool("remoteWrite.keepDanglingQueues", false, "Keep persistent queues contents at -remoteWrite.tmpDataPath in case there are no matching -remoteWrite.url. "+
"Useful when -remoteWrite.url is changed temporarily and persistent queue files will be needed later on.")
queues = flag.Int("remoteWrite.queues", cgroup.AvailableCPUs()*2, "The number of concurrent queues to each -remoteWrite.url. Set more queues if default number of queues "+
"isn't enough for sending high volume of collected data to remote storage. "+
"Default value depends on the number of available CPU cores. It should work fine in most cases since it minimizes resource usage")
"isn't enough for sending high volume of collected data to remote storage. Default value is 2 * numberOfAvailableCPUs")
showRemoteWriteURL = flag.Bool("remoteWrite.showURL", false, "Whether to show -remoteWrite.url in the exported metrics. "+
"It is hidden by default, since it can contain sensitive info such as auth key")
maxPendingBytesPerURL = flagutil.NewArrayBytes("remoteWrite.maxDiskUsagePerURL", 0, "The maximum file-based buffer size in bytes at -remoteWrite.tmpDataPath "+
@@ -84,8 +79,6 @@ var (
"Excess series are logged and dropped. This can be useful for limiting series cardinality. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter")
maxDailySeries = flag.Int("remoteWrite.maxDailySeries", 0, "The maximum number of unique series vmagent can send to remote storage systems during the last 24 hours. "+
"Excess series are logged and dropped. This can be useful for limiting series churn rate. See https://docs.victoriametrics.com/vmagent.html#cardinality-limiter")
maxIngestionRate = flag.Int("maxIngestionRate", 0, "The maximum number of samples vmagent can receive per second. Data ingestion is paused when the limit is exceeded. "+
"By default there are no limits on samples ingestion rate. See also -remoteWrite.rateLimit")
streamAggrConfig = flagutil.NewArrayString("remoteWrite.streamAggr.config", "Optional path to file with stream aggregation config. "+
"See https://docs.victoriametrics.com/stream-aggregation.html . "+
@@ -96,13 +89,8 @@ var (
streamAggrDropInput = flagutil.NewArrayBool("remoteWrite.streamAggr.dropInput", "Whether to drop all the input samples after the aggregation "+
"with -remoteWrite.streamAggr.config. By default, only aggregates samples are dropped, while the remaining samples "+
"are written to the corresponding -remoteWrite.url . See also -remoteWrite.streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDedupInterval = flagutil.NewArrayDuration("remoteWrite.streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval before optional aggregation "+
"with -remoteWrite.streamAggr.config . See also -dedup.minScrapeInterval and https://docs.victoriametrics.com/stream-aggregation.html#deduplication")
streamAggrIgnoreOldSamples = flagutil.NewArrayBool("remoteWrite.streamAggr.ignoreOldSamples", "Whether to ignore input samples with old timestamps outside the current aggregation interval "+
"for the corresponding -remoteWrite.streamAggr.config . See https://docs.victoriametrics.com/stream-aggregation.html#ignoring-old-samples")
streamAggrDropInputLabels = flagutil.NewArrayString("streamAggr.dropInputLabels", "An optional list of labels to drop from samples "+
"before stream de-duplication and aggregation . See https://docs.victoriametrics.com/stream-aggregation.html#dropping-unneeded-labels")
streamAggrDedupInterval = flagutil.NewArrayDuration("remoteWrite.streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval before being aggregated. "+
"Only the last sample per each time series per each interval is aggregated if the interval is greater than zero")
disableOnDiskQueue = flag.Bool("remoteWrite.disableOnDiskQueue", false, "Whether to disable storing pending data to -remoteWrite.tmpDataPath "+
"when the configured remote storage systems cannot keep up with the data ingestion rate. See https://docs.victoriametrics.com/vmagent.html#disabling-on-disk-persistence ."+
"See also -remoteWrite.dropSamplesOnOverload")
@@ -152,10 +140,7 @@ func InitSecretFlags() {
}
}
var (
shardByURLLabelsMap map[string]struct{}
shardByURLIgnoreLabelsMap map[string]struct{}
)
var shardByURLLabelsMap map[string]struct{}
// Init initializes remotewrite.
//
@@ -187,21 +172,19 @@ func Init() {
return float64(dailySeriesLimiter.CurrentItems())
})
}
if *queues > maxQueues {
*queues = maxQueues
}
if *queues <= 0 {
*queues = 1
}
if len(*shardByURLLabels) > 0 && len(*shardByURLIgnoreLabels) > 0 {
logger.Fatalf("-remoteWrite.shardByURL.labels and -remoteWrite.shardByURL.ignoreLabels cannot be set simultaneously; " +
"see https://docs.victoriametrics.com/vmagent/#sharding-among-remote-storages")
if len(*shardByURLLabels) > 0 {
m := make(map[string]struct{}, len(*shardByURLLabels))
for _, label := range *shardByURLLabels {
m[label] = struct{}{}
}
shardByURLLabelsMap = m
}
shardByURLLabelsMap = newMapFromStrings(*shardByURLLabels)
shardByURLIgnoreLabelsMap = newMapFromStrings(*shardByURLIgnoreLabels)
initLabelsGlobal()
// Register SIGHUP handler for config reload before loadRelabelConfigs.
@@ -353,35 +336,6 @@ func newRemoteWriteCtxs(at *auth.Token, urls []string) []*remoteWriteCtx {
var configReloaderStopCh = make(chan struct{})
var configReloaderWG sync.WaitGroup
// StartIngestionRateLimiter starts ingestion rate limiter.
//
// Ingestion rate limiter must be started before Init() call.
//
// StopIngestionRateLimiter must be called before Stop() call in order to unblock all the callers
// to ingestion rate limiter. Otherwise deadlock may occur at Stop() call.
func StartIngestionRateLimiter() {
if *maxIngestionRate <= 0 {
return
}
ingestionRateLimitReached := metrics.NewCounter(`vmagent_max_ingestion_rate_limit_reached_total`)
ingestionRateLimiterStopCh = make(chan struct{})
ingestionRateLimiter = ratelimiter.New(int64(*maxIngestionRate), ingestionRateLimitReached, ingestionRateLimiterStopCh)
}
// StopIngestionRateLimiter stops ingestion rate limiter.
func StopIngestionRateLimiter() {
if ingestionRateLimiterStopCh == nil {
return
}
close(ingestionRateLimiterStopCh)
ingestionRateLimiterStopCh = nil
}
var (
ingestionRateLimiter *ratelimiter.RateLimiter
ingestionRateLimiterStopCh chan struct{}
)
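The comments above pin down a strict ordering between the limiter and the rest of the remotewrite lifecycle. An illustrative call sequence (the surrounding function is hypothetical; only the ordering is prescribed by the code):

func run() {
    remotewrite.StartIngestionRateLimiter() // must happen before Init()
    remotewrite.Init()
    // ... ingest data, push to remote storage ...
    remotewrite.StopIngestionRateLimiter() // unblock limiter callers first,
    remotewrite.Stop()                     // otherwise Stop() may deadlock
}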
// Stop stops remotewrite.
//
// It is expected that nobody calls TryPush during and after the call to this func.
@@ -508,9 +462,6 @@ func tryPush(at *auth.Token, wr *prompbmarshal.WriteRequest, dropSamplesOnFailur
break
}
}
ingestionRateLimiter.Register(samplesCount)
tssBlock := tss
if i < len(tss) {
tssBlock = tss[:i]
@@ -575,15 +526,6 @@ func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmar
hashLabels = append(hashLabels, label)
}
}
tmpLabels.Labels = hashLabels
} else if len(shardByURLIgnoreLabelsMap) > 0 {
hashLabels = tmpLabels.Labels[:0]
for _, label := range ts.Labels {
if _, ok := shardByURLIgnoreLabelsMap[label.Name]; !ok {
hashLabels = append(hashLabels, label)
}
}
tmpLabels.Labels = hashLabels
}
h := getLabelsHash(hashLabels)
idx := h % uint64(len(tssByURL))
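The two branches above select which labels feed the shard hash: either only the labels listed in -remoteWrite.shardByURL.labels, or everything except those in -remoteWrite.shardByURL.ignoreLabels (Init() rejects setting both at once). Condensed into one helper for clarity (a sketch, not the exact code):

func shardLabels(labels []prompbmarshal.Label, include, ignore map[string]struct{}) []prompbmarshal.Label {
    if len(include) == 0 && len(ignore) == 0 {
        return labels // default: hash all the labels
    }
    dst := make([]prompbmarshal.Label, 0, len(labels))
    for _, label := range labels {
        if len(include) > 0 {
            if _, ok := include[label.Name]; ok {
                dst = append(dst, label)
            }
            continue
        }
        if _, ok := ignore[label.Name]; !ok {
            dst = append(dst, label)
        }
    }
    return dst
}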
@@ -594,22 +536,22 @@ func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmar
// Push sharded data to remote storages in parallel in order to reduce
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
var anyPushFailed atomic.Bool
wg.Add(len(rwctxs))
var anyPushFailed uint64
for i, rwctx := range rwctxs {
tssShard := tssByURL[i]
if len(tssShard) == 0 {
continue
}
wg.Add(1)
go func(rwctx *remoteWriteCtx, tss []prompbmarshal.TimeSeries) {
defer wg.Done()
if !rwctx.TryPush(tss) {
anyPushFailed.Store(true)
atomic.StoreUint64(&anyPushFailed, 1)
}
}(rwctx, tssShard)
}
wg.Wait()
return !anyPushFailed.Load()
return atomic.LoadUint64(&anyPushFailed) == 0
}
// Replicate data among rwctxs.
@@ -617,17 +559,17 @@ func tryPushBlockToRemoteStorages(rwctxs []*remoteWriteCtx, tssBlock []prompbmar
// the time needed for sending the data to multiple remote storage systems.
var wg sync.WaitGroup
wg.Add(len(rwctxs))
var anyPushFailed atomic.Bool
var anyPushFailed uint64
for _, rwctx := range rwctxs {
go func(rwctx *remoteWriteCtx) {
defer wg.Done()
if !rwctx.TryPush(tssBlock) {
anyPushFailed.Store(true)
atomic.StoreUint64(&anyPushFailed, 1)
}
}(rwctx)
}
wg.Wait()
return !anyPushFailed.Load()
return atomic.LoadUint64(&anyPushFailed) == 0
}
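Both the sharding and the replication paths above use the same fan-out-and-join shape; here it is reduced to its essentials, matching the new side of the diff:

func pushToAll(rwctxs []*remoteWriteCtx, tss []prompbmarshal.TimeSeries) bool {
    var wg sync.WaitGroup
    var anyPushFailed atomic.Bool
    for _, rwctx := range rwctxs {
        wg.Add(1)
        go func(rwctx *remoteWriteCtx) {
            defer wg.Done()
            if !rwctx.TryPush(tss) {
                anyPushFailed.Store(true) // sticky failure flag; safe under concurrency
            }
        }(rwctx)
    }
    wg.Wait()
    return !anyPushFailed.Load()
}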
// sortLabelsIfNeeded sorts labels if -sortLabels command-line flag is set.
@@ -723,14 +665,12 @@ type remoteWriteCtx struct {
fq *persistentqueue.FastQueue
c *client
sas atomic.Pointer[streamaggr.Aggregators]
deduplicator *streamaggr.Deduplicator
sas atomic.Pointer[streamaggr.Aggregators]
streamAggrKeepInput bool
streamAggrDropInput bool
pss []*pendingSeries
pssNextIdx atomic.Uint64
pssNextIdx uint64
rowsPushedAfterRelabel *metrics.Counter
rowsDroppedByRelabel *metrics.Counter
@@ -798,15 +738,9 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
// Initialize sas
sasFile := streamAggrConfig.GetOptionalArg(argIdx)
dedupInterval := streamAggrDedupInterval.GetOptionalArg(argIdx)
ignoreOldSamples := streamAggrIgnoreOldSamples.GetOptionalArg(argIdx)
if sasFile != "" {
opts := &streamaggr.Options{
DedupInterval: dedupInterval,
DropInputLabels: *streamAggrDropInputLabels,
IgnoreOldSamples: ignoreOldSamples,
}
sas, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternalTrackDropped, opts)
dedupInterval := streamAggrDedupInterval.GetOptionalArg(argIdx)
sas, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternalTrackDropped, dedupInterval)
if err != nil {
logger.Fatalf("cannot initialize stream aggregators from -remoteWrite.streamAggr.config=%q: %s", sasFile, err)
}
@@ -815,24 +749,17 @@ func newRemoteWriteCtx(argIdx int, remoteWriteURL *url.URL, maxInmemoryBlocks in
rwctx.streamAggrDropInput = streamAggrDropInput.GetOptionalArg(argIdx)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_successful{path=%q}`, sasFile)).Set(1)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_success_timestamp_seconds{path=%q}`, sasFile)).Set(fasttime.UnixTimestamp())
} else if dedupInterval > 0 {
rwctx.deduplicator = streamaggr.NewDeduplicator(rwctx.pushInternalTrackDropped, dedupInterval, *streamAggrDropInputLabels)
}
return rwctx
}
func (rwctx *remoteWriteCtx) MustStop() {
// sas and deduplicator must be stopped before rwctx is closed
// sas must be stopped before rwctx is closed
// because sas can write pending series to rwctx.pss if there are any
sas := rwctx.sas.Swap(nil)
sas.MustStop()
if rwctx.deduplicator != nil {
rwctx.deduplicator.MustStop()
rwctx.deduplicator = nil
}
for _, ps := range rwctx.pss {
ps.MustStop()
}
@@ -871,7 +798,7 @@ func (rwctx *remoteWriteCtx) TryPush(tss []prompbmarshal.TimeSeries) bool {
rowsCount := getRowsCount(tss)
rwctx.rowsPushedAfterRelabel.Add(rowsCount)
// Apply stream aggregation or deduplication if they are configured
// Apply stream aggregation if any
sas := rwctx.sas.Load()
if sas != nil {
matchIdxs := matchIdxsPool.Get()
@@ -886,10 +813,6 @@ func (rwctx *remoteWriteCtx) TryPush(tss []prompbmarshal.TimeSeries) bool {
tss = dropAggregatedSeries(tss, matchIdxs.B, rwctx.streamAggrDropInput)
}
matchIdxsPool.Put(matchIdxs)
} else if rwctx.deduplicator != nil {
rwctx.deduplicator.Push(tss)
clear(tss)
tss = tss[:0]
}
// Try pushing the data to remote storage
@@ -918,7 +841,7 @@ func dropAggregatedSeries(src []prompbmarshal.TimeSeries, matchIdxs []byte, drop
}
}
tail := src[len(dst):]
clear(tail)
_ = prompbmarshal.ResetTimeSeries(tail)
return dst
}
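Both sides of the hunk above zero the tail for the same reason: after compacting the kept series into dst, the slots past len(dst) still reference label and sample slices, and clearing them lets the GC reclaim that memory while the backing array is reused. The shape of the operation:

func compactKeepingPrefix(src []prompbmarshal.TimeSeries, kept int) []prompbmarshal.TimeSeries {
    tail := src[kept:]
    clear(tail) // drop references so the GC can reclaim labels and samples
    return src[:kept]
}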
@@ -949,7 +872,7 @@ func (rwctx *remoteWriteCtx) tryPushInternal(tss []prompbmarshal.TimeSeries) boo
}
pss := rwctx.pss
idx := rwctx.pssNextIdx.Add(1) % uint64(len(pss))
idx := atomic.AddUint64(&rwctx.pssNextIdx, 1) % uint64(len(pss))
ok := pss[idx].TryPush(tss)
@@ -971,12 +894,8 @@ func (rwctx *remoteWriteCtx) reinitStreamAggr() {
logger.Infof("reloading stream aggregation configs pointed by -remoteWrite.streamAggr.config=%q", sasFile)
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reloads_total{path=%q}`, sasFile)).Inc()
opts := &streamaggr.Options{
DedupInterval: streamAggrDedupInterval.GetOptionalArg(rwctx.idx),
DropInputLabels: *streamAggrDropInputLabels,
IgnoreOldSamples: streamAggrIgnoreOldSamples.GetOptionalArg(rwctx.idx),
}
sasNew, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternalTrackDropped, opts)
dedupInterval := streamAggrDedupInterval.GetOptionalArg(rwctx.idx)
sasNew, err := streamaggr.LoadFromFile(sasFile, rwctx.pushInternalTrackDropped, dedupInterval)
if err != nil {
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reloads_errors_total{path=%q}`, sasFile)).Inc()
metrics.GetOrCreateCounter(fmt.Sprintf(`vmagent_streamaggr_config_reload_successful{path=%q}`, sasFile)).Set(0)
@@ -1013,17 +932,13 @@ func getRowsCount(tss []prompbmarshal.TimeSeries) int {
// CheckStreamAggrConfigs checks configs pointed by -remoteWrite.streamAggr.config
func CheckStreamAggrConfigs() error {
pushNoop := func(_ []prompbmarshal.TimeSeries) {}
pushNoop := func(tss []prompbmarshal.TimeSeries) {}
for idx, sasFile := range *streamAggrConfig {
if sasFile == "" {
continue
}
opts := &streamaggr.Options{
DedupInterval: streamAggrDedupInterval.GetOptionalArg(idx),
DropInputLabels: *streamAggrDropInputLabels,
IgnoreOldSamples: streamAggrIgnoreOldSamples.GetOptionalArg(idx),
}
sas, err := streamaggr.LoadFromFile(sasFile, pushNoop, opts)
dedupInterval := streamAggrDedupInterval.GetOptionalArg(idx)
sas, err := streamaggr.LoadFromFile(sasFile, pushNoop, dedupInterval)
if err != nil {
return fmt.Errorf("cannot load -remoteWrite.streamAggr.config=%q: %w", sasFile, err)
}
@@ -1032,10 +947,23 @@ func CheckStreamAggrConfigs() error {
return nil
}
func newMapFromStrings(a []string) map[string]struct{} {
m := make(map[string]struct{}, len(a))
for _, s := range a {
m[s] = struct{}{}
// GetAggregators returns aggregators for all the configured remote writes.
func GetAggregators() map[string]*streamaggr.Aggregators {
var result = map[string]*streamaggr.Aggregators{}
if len(*remoteWriteMultitenantURLs) > 0 {
rwctxsMapLock.Lock()
for tenant, rwctxs := range rwctxsMap {
for rwNum, rw := range rwctxs {
result[fmt.Sprintf("rw %d for tenant %v:%v", rwNum, tenant.AccountID, tenant.ProjectID)] = rw.sas.Load()
}
}
rwctxsMapLock.Unlock()
} else {
for rwNum, rw := range rwctxsDefault {
result[fmt.Sprintf("remote write %d", rwNum)] = rw.sas.Load()
}
}
return m
return result
}
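A hypothetical consumer of GetAggregators (the streaming-aggregation UI in this branch presumably iterates the map in a similar way; the function here is illustrative):

func printAggregators() {
    for name, sas := range remotewrite.GetAggregators() {
        if sas == nil {
            continue // no -remoteWrite.streamAggr.config for this remote write
        }
        fmt.Printf("%s: stream aggregation configured\n", name)
    }
}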

View File

@@ -50,7 +50,7 @@ var (
)
type statConn struct {
closed atomic.Int32
closed uint64
net.Conn
}
@@ -76,7 +76,7 @@ func (sc *statConn) Write(p []byte) (int, error) {
func (sc *statConn) Close() error {
err := sc.Conn.Close()
if sc.closed.Add(1) == 1 {
if atomic.AddUint64(&sc.closed, 1) == 1 {
conns.Dec()
}
return err
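The Add(1) == 1 check above makes Close idempotent with respect to the connection gauge: net.Conn.Close may legitimately be called more than once, and only the first transition should decrement conns. The guard in isolation:

func decOnFirstClose(closed *atomic.Int32, dec func()) {
    if closed.Add(1) == 1 {
        dec() // only the first Close call updates the gauge
    }
}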

View File

@@ -25,7 +25,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmselect/promql"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
@@ -185,8 +184,7 @@ func processFlags() {
func setUp() {
vmstorage.Init(promql.ResetRollupResultCacheIfNeeded)
var ab flagutil.ArrayBool
go httpserver.Serve([]string{httpListenAddr}, &ab, func(w http.ResponseWriter, r *http.Request) bool {
go httpserver.Serve(httpListenAddr, false, func(w http.ResponseWriter, r *http.Request) bool {
switch r.URL.Path {
case "/prometheus/api/v1/query":
if err := prometheus.QueryHandler(nil, time.Now(), w, r); err != nil {
@@ -227,7 +225,7 @@ checkCheck:
}
func tearDown() {
if err := httpserver.Stop([]string{httpListenAddr}); err != nil {
if err := httpserver.Stop(httpListenAddr); err != nil {
logger.Errorf("cannot stop the webservice: %s", err)
}
vmstorage.Stop()

View File

@@ -68,7 +68,6 @@ publish-vmalert:
test-vmalert:
go test -v -race -cover ./app/vmalert -loggerLevel=ERROR
go test -v -race -cover ./app/vmalert/rule
go test -v -race -cover ./app/vmalert/templates
go test -v -race -cover ./app/vmalert/datasource
go test -v -race -cover ./app/vmalert/notifier

View File

@@ -1158,9 +1158,9 @@
$labels.pod }}.'
runbook_url: https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-cputhrottlinghigh
expr: |
sum(increase(container_cpu_cfs_throttled_periods_total{container!="", }[5m])) by (cluster, container, pod, namespace)
sum(increase(container_cpu_cfs_throttled_periods_total{container!="", }[5m])) by (container, pod, namespace)
/
sum(increase(container_cpu_cfs_periods_total{}[5m])) by (cluster, container, pod, namespace)
sum(increase(container_cpu_cfs_periods_total{}[5m])) by (container, pod, namespace)
> ( 25 / 100 )
for: 15m
labels:

View File

@@ -22,7 +22,6 @@ groups:
{{ . | first | value }}
{{ end }}
description: "It is {{ $value }} connections for {{$labels.instance}}"
link: http://localhost:3000/d/wNf0q_kZk?viewPanel=51&from={{($activeAt.Add (parseDurationTime "1h")).UnixMilli}}&to={{($activeAt.Add (parseDurationTime "-1h")).UnixMilli}}
- alert: ExampleAlertAlwaysFiring
update_entries_limit: -1
expr: sum by(job)

View File

@@ -10,7 +10,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
)
@@ -46,7 +45,7 @@ var (
oauth2TokenURL = flag.String("datasource.oauth2.tokenUrl", "", "Optional OAuth2 tokenURL to use for -datasource.url")
oauth2Scopes = flag.String("datasource.oauth2.scopes", "", "Optional OAuth2 scopes to use for -datasource.url. Scopes must be delimited by ';'")
lookBack = flag.Duration("datasource.lookback", 0, `Deprecated: please adjust "-search.latencyOffset" at datasource side `+
lookBack = flag.Duration("datasource.lookback", 0, `Will be deprecated soon; please adjust "-search.latencyOffset" at datasource side `+
`or specify "latency_offset" in rule group's params. Lookback defines how far into the past to look when evaluating queries. `+
`For example, if the datasource.lookback=5m then param "time" with value now()-5m will be added to every query.`)
queryStep = flag.Duration("datasource.queryStep", 5*time.Minute, "How far a value can fall back to when evaluating queries. "+
@@ -91,10 +90,10 @@ func Init(extraParams url.Values) (QuerierBuilder, error) {
logger.Warnf("flag `-datasource.queryTimeAlignment` is deprecated and will be removed in next releases. Please use `eval_alignment` in rule group instead.")
}
if *lookBack != 0 {
logger.Warnf("flag `-datasource.lookback` is deprecated and will be removed in next releases. Please adjust `-search.latencyOffset` at datasource side or specify `latency_offset` in rule group's params. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5155 for details.")
logger.Warnf("flag `-datasource.lookback` will be deprecated soon. Please use `-rule.evalDelay` command-line flag instead. See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/5155 for details.")
}
tr, err := httputils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
tr, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
@@ -133,6 +132,7 @@ func Init(extraParams url.Values) (QuerierBuilder, error) {
authCfg: authCfg,
datasourceURL: strings.TrimSuffix(*addr, "/"),
appendTypePrefix: *appendTypePrefix,
lookBack: *lookBack,
queryStep: *queryStep,
dataSourceType: datasourcePrometheus,
extraParams: extraParams,

View File

@@ -35,6 +35,7 @@ type VMStorage struct {
authCfg *promauth.Config
datasourceURL string
appendTypePrefix bool
lookBack time.Duration
queryStep time.Duration
dataSourceType datasourceType
@@ -62,6 +63,7 @@ func (s *VMStorage) Clone() *VMStorage {
authCfg: s.authCfg,
datasourceURL: s.datasourceURL,
appendTypePrefix: s.appendTypePrefix,
lookBack: s.lookBack,
queryStep: s.queryStep,
dataSourceType: s.dataSourceType,
@@ -120,12 +122,13 @@ func (s *VMStorage) BuildWithParams(params QuerierParams) Querier {
}
// NewVMStorage is a constructor for VMStorage
func NewVMStorage(baseURL string, authCfg *promauth.Config, queryStep time.Duration, appendTypePrefix bool, c *http.Client) *VMStorage {
func NewVMStorage(baseURL string, authCfg *promauth.Config, lookBack time.Duration, queryStep time.Duration, appendTypePrefix bool, c *http.Client) *VMStorage {
return &VMStorage{
c: c,
authCfg: authCfg,
datasourceURL: strings.TrimSuffix(baseURL, "/"),
appendTypePrefix: appendTypePrefix,
lookBack: lookBack,
queryStep: queryStep,
dataSourceType: datasourcePrometheus,
extraParams: url.Values{},
@@ -134,11 +137,11 @@ func NewVMStorage(baseURL string, authCfg *promauth.Config, queryStep time.Durat
// Query executes the given query and returns parsed response
func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) (Result, *http.Request, error) {
req, err := s.newQueryRequest(ctx, query, ts)
req, err := s.newQueryRequest(query, ts)
if err != nil {
return Result{}, nil, err
}
resp, err := s.do(req)
resp, err := s.do(ctx, req)
if err != nil {
if !errors.Is(err, io.EOF) && !errors.Is(err, io.ErrUnexpectedEOF) {
// Return unexpected error to the caller.
@@ -146,11 +149,11 @@ func (s *VMStorage) Query(ctx context.Context, query string, ts time.Time) (Resu
}
// Something in the middle between client and datasource might be closing
// the connection. So we make one more attempt in the hope that the request will succeed.
req, err = s.newQueryRequest(ctx, query, ts)
req, err = s.newQueryRequest(query, ts)
if err != nil {
return Result{}, nil, fmt.Errorf("second attempt: %w", err)
}
resp, err = s.do(req)
resp, err = s.do(ctx, req)
if err != nil {
return Result{}, nil, fmt.Errorf("second attempt: %w", err)
}
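The Query and QueryRange paths above share a retry-exactly-once policy: only an EOF-shaped failure (a connection torn down somewhere between client and datasource) triggers a second attempt, while any other error is returned to the caller immediately. The policy in isolation:

import (
    "errors"
    "fmt"
    "io"
    "net/http"
)

func doWithOneRetry(do func() (*http.Response, error)) (*http.Response, error) {
    resp, err := do()
    if err == nil {
        return resp, nil
    }
    if !errors.Is(err, io.EOF) && !errors.Is(err, io.ErrUnexpectedEOF) {
        return nil, err // unexpected error: report it right away
    }
    resp, err = do()
    if err != nil {
        return nil, fmt.Errorf("second attempt: %w", err)
    }
    return resp, nil
}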
@@ -179,11 +182,11 @@ func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end tim
if end.IsZero() {
return res, fmt.Errorf("end param is missing")
}
req, err := s.newQueryRangeRequest(ctx, query, start, end)
req, err := s.newQueryRangeRequest(query, start, end)
if err != nil {
return res, err
}
resp, err := s.do(req)
resp, err := s.do(ctx, req)
if err != nil {
if !errors.Is(err, io.EOF) && !errors.Is(err, io.ErrUnexpectedEOF) {
// Return unexpected error to the caller.
@@ -191,11 +194,11 @@ func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end tim
}
// Something in the middle between client and datasource might be closing
// the connection. So we make one more attempt in the hope that the request will succeed.
req, err = s.newQueryRangeRequest(ctx, query, start, end)
req, err = s.newQueryRangeRequest(query, start, end)
if err != nil {
return res, fmt.Errorf("second attempt: %w", err)
}
resp, err = s.do(req)
resp, err = s.do(ctx, req)
if err != nil {
return res, fmt.Errorf("second attempt: %w", err)
}
@@ -207,7 +210,7 @@ func (s *VMStorage) QueryRange(ctx context.Context, query string, start, end tim
return res, err
}
func (s *VMStorage) do(req *http.Request) (*http.Response, error) {
func (s *VMStorage) do(ctx context.Context, req *http.Request) (*http.Response, error) {
ru := req.URL.Redacted()
if *showDatasourceURL {
ru = req.URL.String()
@@ -215,7 +218,7 @@ func (s *VMStorage) do(req *http.Request) (*http.Response, error) {
if s.debug {
logger.Infof("DEBUG datasource request: executing %s request with params %q", req.Method, ru)
}
resp, err := s.c.Do(req)
resp, err := s.c.Do(req.WithContext(ctx))
if err != nil {
return nil, fmt.Errorf("error getting response from %s: %w", ru, err)
}
@@ -227,8 +230,8 @@ func (s *VMStorage) do(req *http.Request) (*http.Response, error) {
return resp, nil
}
func (s *VMStorage) newQueryRangeRequest(ctx context.Context, query string, start, end time.Time) (*http.Request, error) {
req, err := s.newRequest(ctx)
func (s *VMStorage) newQueryRangeRequest(query string, start, end time.Time) (*http.Request, error) {
req, err := s.newRequest()
if err != nil {
return nil, fmt.Errorf("cannot create query_range request to datasource %q: %w", s.datasourceURL, err)
}
@@ -236,8 +239,8 @@ func (s *VMStorage) newQueryRangeRequest(ctx context.Context, query string, star
return req, nil
}
func (s *VMStorage) newQueryRequest(ctx context.Context, query string, ts time.Time) (*http.Request, error) {
req, err := s.newRequest(ctx)
func (s *VMStorage) newQueryRequest(query string, ts time.Time) (*http.Request, error) {
req, err := s.newRequest()
if err != nil {
return nil, fmt.Errorf("cannot create query request to datasource %q: %w", s.datasourceURL, err)
}
@@ -245,15 +248,15 @@ func (s *VMStorage) newQueryRequest(ctx context.Context, query string, ts time.T
case "", datasourcePrometheus:
s.setPrometheusInstantReqParams(req, query, ts)
case datasourceGraphite:
s.setGraphiteReqParams(req, query)
s.setGraphiteReqParams(req, query, ts)
default:
logger.Panicf("BUG: engine not found: %q", s.dataSourceType)
}
return req, nil
}
func (s *VMStorage) newRequest(ctx context.Context) (*http.Request, error) {
req, err := http.NewRequestWithContext(ctx, http.MethodPost, s.datasourceURL, nil)
func (s *VMStorage) newRequest() (*http.Request, error) {
req, err := http.NewRequest(http.MethodPost, s.datasourceURL, nil)
if err != nil {
logger.Panicf("BUG: unexpected error from http.NewRequest(%q): %s", s.datasourceURL, err)
}

View File

@@ -4,6 +4,8 @@ import (
"encoding/json"
"fmt"
"net/http"
"strconv"
"time"
)
type graphiteResponse []graphiteResponseTarget
@@ -46,13 +48,17 @@ const (
graphitePrefix = "/graphite"
)
func (s *VMStorage) setGraphiteReqParams(r *http.Request, query string) {
func (s *VMStorage) setGraphiteReqParams(r *http.Request, query string, timestamp time.Time) {
if s.appendTypePrefix {
r.URL.Path += graphitePrefix
}
r.URL.Path += graphitePath
q := r.URL.Query()
from := "-5min"
if s.lookBack > 0 {
lookBack := timestamp.Add(-s.lookBack)
from = strconv.FormatInt(lookBack.Unix(), 10)
}
q.Set("from", from)
q.Set("format", "json")
q.Set("target", query)

View File

@@ -161,6 +161,9 @@ func (s *VMStorage) setPrometheusInstantReqParams(r *http.Request, query string,
r.URL.Path += "/api/v1/query"
}
q := r.URL.Query()
if s.lookBack > 0 {
timestamp = timestamp.Add(-s.lookBack)
}
q.Set("time", timestamp.Format(time.RFC3339))
if !*disableStepParam && s.evaluationInterval > 0 { // set step as evaluationInterval by default
// always convert to seconds to keep compatibility with older

View File

@@ -71,7 +71,7 @@ func TestVMInstantQuery(t *testing.T) {
w.Write([]byte(`{"status":"success","data":{"resultType":"scalar","result":[1583786142, "1"]},"stats":{"seriesFetched": "42"}}`))
}
})
mux.HandleFunc("/render", func(w http.ResponseWriter, _ *http.Request) {
mux.HandleFunc("/render", func(w http.ResponseWriter, request *http.Request) {
c++
switch c {
case 8:
@@ -86,7 +86,7 @@ func TestVMInstantQuery(t *testing.T) {
if err != nil {
t.Fatalf("unexpected: %s", err)
}
s := NewVMStorage(srv.URL, authCfg, 0, false, srv.Client())
s := NewVMStorage(srv.URL, authCfg, time.Minute, 0, false, srv.Client())
p := datasourcePrometheus
pq := s.BuildWithParams(QuerierParams{DataSourceType: string(p), EvaluationInterval: 15 * time.Second})
@@ -225,7 +225,7 @@ func TestVMInstantQueryWithRetry(t *testing.T) {
srv := httptest.NewServer(mux)
defer srv.Close()
s := NewVMStorage(srv.URL, nil, 0, false, srv.Client())
s := NewVMStorage(srv.URL, nil, time.Minute, 0, false, srv.Client())
pq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourcePrometheus)})
expErr := func(err string) {
@@ -334,7 +334,7 @@ func TestVMRangeQuery(t *testing.T) {
if err != nil {
t.Fatalf("unexpected: %s", err)
}
s := NewVMStorage(srv.URL, authCfg, *queryStep, false, srv.Client())
s := NewVMStorage(srv.URL, authCfg, time.Minute, *queryStep, false, srv.Client())
pq := s.BuildWithParams(QuerierParams{DataSourceType: string(datasourcePrometheus), EvaluationInterval: 15 * time.Second})
@@ -487,6 +487,17 @@ func TestRequestParams(t *testing.T) {
checkEqualString(t, "bar", p)
},
},
{
"lookback",
false,
&VMStorage{
lookBack: time.Minute,
},
func(t *testing.T, r *http.Request) {
exp := url.Values{"query": {query}, "time": {timestamp.Add(-time.Minute).Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
"evaluation interval",
false,
@@ -499,6 +510,20 @@ func TestRequestParams(t *testing.T) {
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
"lookback + evaluation interval",
false,
&VMStorage{
lookBack: time.Minute,
evaluationInterval: 15 * time.Second,
},
func(t *testing.T, r *http.Request) {
evalInterval := 15 * time.Second
tt := timestamp.Add(-time.Minute)
exp := url.Values{"query": {query}, "step": {evalInterval.String()}, "time": {tt.Format(time.RFC3339)}}
checkEqualString(t, exp.Encode(), r.URL.RawQuery)
},
},
{
"step override",
false,
@@ -612,7 +637,7 @@ func TestRequestParams(t *testing.T) {
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
req, err := tc.vm.newRequest(ctx)
req, err := tc.vm.newRequest()
if err != nil {
t.Fatal(err)
}
@@ -624,7 +649,7 @@ func TestRequestParams(t *testing.T) {
tc.vm.setPrometheusInstantReqParams(req, query, timestamp)
}
case datasourceGraphite:
tc.vm.setGraphiteReqParams(req, query)
tc.vm.setGraphiteReqParams(req, query, timestamp)
}
tc.checkFn(t, req)
})
@@ -710,7 +735,7 @@ func TestHeaders(t *testing.T) {
for _, tt := range testCases {
t.Run(tt.name, func(t *testing.T) {
vm := tt.vmFn()
req, err := vm.newQueryRequest(ctx, "foo", time.Now())
req, err := vm.newQueryRequest("foo", time.Now())
if err != nil {
t.Fatal(err)
}

View File

@@ -59,8 +59,8 @@ absolute path to all .tpl files in root.
configCheckInterval = flag.Duration("configCheckInterval", 0, "Interval for checking for changes in '-rule' or '-notifier.config' files. "+
"By default, the checking is disabled. Send SIGHUP signal in order to force config check for changes.")
httpListenAddrs = flagutil.NewArrayString("httpListenAddr", "Address to listen for incoming http requests. See also -tls and -httpListenAddr.useProxyProtocol")
useProxyProtocol = flagutil.NewArrayBool("httpListenAddr.useProxyProtocol", "Whether to use proxy protocol for connections accepted at the corresponding -httpListenAddr . "+
httpListenAddr = flag.String("httpListenAddr", ":8880", "Address to listen for http connections. See also -tls and -httpListenAddr.useProxyProtocol")
useProxyProtocol = flag.Bool("httpListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted at -httpListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . "+
"With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing")
evaluationInterval = flag.Duration("evaluationInterval", time.Minute, "How often to evaluate the rules")
@@ -178,19 +178,15 @@ func main() {
go configReload(ctx, manager, groupsCfg, sighupCh)
listenAddrs := *httpListenAddrs
if len(listenAddrs) == 0 {
listenAddrs = []string{":8880"}
}
rh := &requestHandler{m: manager}
go httpserver.Serve(listenAddrs, useProxyProtocol, rh.handler)
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, rh.handler)
pushmetrics.Init()
sig := procutil.WaitForSigterm()
logger.Infof("service received signal %s", sig)
pushmetrics.Stop()
if err := httpserver.Stop(listenAddrs); err != nil {
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
cancel()
@@ -252,13 +248,7 @@ func newManager(ctx context.Context) (*manager, error) {
func getExternalURL(customURL string) (*url.URL, error) {
if customURL == "" {
// use local hostname as external URL
listenAddr := ":8880"
if len(*httpListenAddrs) > 0 {
listenAddr = (*httpListenAddrs)[0]
}
isTLS := httpserver.IsTLS(0)
return getHostnameAsExternalURL(listenAddr, isTLS)
return getHostnameAsExternalURL(*httpListenAddr, httpserver.IsTLS())
}
u, err := url.Parse(customURL)
if err != nil {
@@ -270,13 +260,13 @@ func getExternalURL(customURL string) (*url.URL, error) {
return u, nil
}
func getHostnameAsExternalURL(addr string, isSecure bool) (*url.URL, error) {
func getHostnameAsExternalURL(httpListenAddr string, isSecure bool) (*url.URL, error) {
hname, err := os.Hostname()
if err != nil {
return nil, fmt.Errorf("failed to get hostname: %w", err)
}
port := ""
if ipport := strings.Split(addr, ":"); len(ipport) > 1 {
if ipport := strings.Split(httpListenAddr, ":"); len(ipport) > 1 {
port = ":" + ipport[1]
}
schema := "http://"
@@ -304,7 +294,7 @@ func getAlertURLGenerator(externalURL *url.URL, externalAlertSource string, vali
"tpl": externalAlertSource,
}
return func(alert notifier.Alert) string {
qFn := func(_ string) ([]datasource.Metric, error) {
qFn := func(query string) ([]datasource.Metric, error) {
return nil, fmt.Errorf("`query` template isn't supported for alert source template")
}
templated, err := alert.ExecTemplate(qFn, alert.Labels, m)

View File

@@ -156,14 +156,11 @@ func (m *manager) update(ctx context.Context, groupsCfg []config.Group, restore
var wg sync.WaitGroup
for _, item := range toUpdate {
wg.Add(1)
// cancel evaluation so the Update will be applied as fast as possible.
// it is important to call InterruptEval before the update, because cancel fn
// can be re-assigned during the update.
item.old.InterruptEval()
go func(old *rule.Group, new *rule.Group) {
old.UpdateWith(new)
wg.Done()
}(item.old, item.new)
item.old.InterruptEval()
}
wg.Wait()
}

View File

@@ -178,7 +178,7 @@ func TestAlert_ExecTemplate(t *testing.T) {
},
}
qFn := func(_ string) ([]datasource.Metric, error) {
qFn := func(q string) ([]datasource.Metric, error) {
return []datasource.Metric{
{
Labels: []datasource.Label{

View File

@@ -11,7 +11,6 @@ import (
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promauth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
)
@@ -105,7 +104,7 @@ func (am *AlertManager) send(ctx context.Context, alerts []Alert, headers map[st
if *showNotifierURL {
amURL = am.addr.String()
}
if resp.StatusCode/100 != 2 {
if resp.StatusCode != http.StatusOK {
body, err := io.ReadAll(resp.Body)
if err != nil {
return fmt.Errorf("failed to read response from %q: %w", amURL, err)
@@ -128,7 +127,7 @@ func NewAlertManager(alertManagerURL string, fn AlertURLGenerator, authCfg proma
if authCfg.TLSConfig != nil {
tls = authCfg.TLSConfig
}
tr, err := httputils.Transport(alertManagerURL, tls.CertFile, tls.KeyFile, tls.CAFile, tls.ServerName, tls.InsecureSkipVerify)
tr, err := utils.Transport(alertManagerURL, tls.CertFile, tls.KeyFile, tls.CAFile, tls.ServerName, tls.InsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}

View File

@@ -8,7 +8,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/datasource"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
)
var (
@@ -61,7 +60,7 @@ func Init() (datasource.QuerierBuilder, error) {
if *addr == "" {
return nil, nil
}
tr, err := httputils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
tr, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}
@@ -79,5 +78,5 @@ func Init() (datasource.QuerierBuilder, error) {
return nil, fmt.Errorf("failed to configure auth: %w", err)
}
c := &http.Client{Transport: tr}
return datasource.NewVMStorage(*addr, authCfg, 0, false, c), nil
return datasource.NewVMStorage(*addr, authCfg, 0, 0, false, c), nil
}

View File

@@ -151,22 +151,12 @@ func (c *Client) run(ctx context.Context) {
ticker := time.NewTicker(c.flushInterval)
wr := &prompbmarshal.WriteRequest{}
shutdown := func() {
lastCtx, cancel := context.WithTimeout(context.Background(), defaultWriteTimeout)
logger.Infof("shutting down remote write client and flushing remained series")
shutdownFlushCnt := 0
for ts := range c.input {
wr.Timeseries = append(wr.Timeseries, ts)
if len(wr.Timeseries) >= c.maxBatchSize {
shutdownFlushCnt += len(wr.Timeseries)
c.flush(lastCtx, wr)
}
}
// flush the last batch. `flush` will re-check and avoid flushing an empty batch.
shutdownFlushCnt += len(wr.Timeseries)
lastCtx, cancel := context.WithTimeout(context.Background(), defaultWriteTimeout)
logger.Infof("shutting down remote write client and flushing remained %d series", len(wr.Timeseries))
c.flush(lastCtx, wr)
logger.Infof("shutting down remote write client flushed %d series", shutdownFlushCnt)
cancel()
}
c.wg.Add(1)
@@ -289,7 +279,7 @@ L:
func (c *Client) send(ctx context.Context, data []byte) error {
r := bytes.NewReader(data)
req, err := http.NewRequestWithContext(ctx, http.MethodPost, c.addr, r)
req, err := http.NewRequest(http.MethodPost, c.addr, r)
if err != nil {
return fmt.Errorf("failed to create new HTTP request: %w", err)
}
@@ -312,7 +302,7 @@ func (c *Client) send(ctx context.Context, data []byte) error {
if !*disablePathAppend {
req.URL.Path = path.Join(req.URL.Path, "/api/v1/write")
}
resp, err := c.c.Do(req)
resp, err := c.c.Do(req.WithContext(ctx))
if err != nil {
return fmt.Errorf("error while sending request to %s: %w; Data len %d(%d)",
req.URL.Redacted(), err, len(data), r.Size())

View File

@@ -84,70 +84,6 @@ func TestClient_Push(t *testing.T) {
}
}
func TestClient_run_maxBatchSizeDuringShutdown(t *testing.T) {
batchSize := 20
testTable := []struct {
name string // name of the test case
pushCnt int // how many time series is pushed to the client
batchCnt int // the expected batch count sent by the client
}{
{
name: "pushCnt % batchSize == 0",
pushCnt: batchSize * 40,
batchCnt: 40,
},
{
name: "pushCnt % batchSize != 0",
pushCnt: batchSize*40 + 1,
batchCnt: 40 + 1,
},
}
for _, tt := range testTable {
t.Run(tt.name, func(t *testing.T) {
// run new server
bcServer := newBatchCntRWServer()
// run new client
rwClient, err := NewClient(context.Background(), Config{
MaxBatchSize: batchSize,
// Set everything to 1 to simplify the calculation.
Concurrency: 1,
MaxQueueSize: 1000,
FlushInterval: time.Minute,
// batch count server
Addr: bcServer.URL,
})
if err != nil {
t.Fatalf("new remote write client failed, err: %v", err)
}
// push time series to the client.
for i := 0; i < tt.pushCnt; i++ {
if err = rwClient.Push(prompbmarshal.TimeSeries{}); err != nil {
t.Fatalf("push time series to the client failed, err: %v", err)
}
}
// close the client so the rest ts will be flushed in `shutdown`
if err = rwClient.Close(); err != nil {
t.Fatalf("shutdown client failed, err: %v", err)
}
// finally check how many batches is sent.
if tt.batchCnt != bcServer.acceptedBatches() {
t.Errorf("client sent batch count incorrect, want: %d, get: %d", tt.batchCnt, bcServer.acceptedBatches())
}
if tt.pushCnt != bcServer.accepted() {
t.Errorf("client sent time series count incorrect, want: %d, get: %d", tt.pushCnt, bcServer.accepted())
}
})
}
}
func newRWServer() *rwServer {
rw := &rwServer{}
rw.Server = httptest.NewServer(http.HandlerFunc(rw.handler))
@@ -155,12 +91,14 @@ func newRWServer() *rwServer {
}
type rwServer struct {
acceptedRows atomic.Uint64
// WARN: ordering of fields is important for alignment!
// see https://golang.org/pkg/sync/atomic/#pkg-note-BUG
acceptedRows uint64
*httptest.Server
}
func (rw *rwServer) accepted() int {
return int(rw.acceptedRows.Load())
return int(atomic.LoadUint64(&rw.acceptedRows))
}
func (rw *rwServer) err(w http.ResponseWriter, err error) {
@@ -206,7 +144,7 @@ func (rw *rwServer) handler(w http.ResponseWriter, r *http.Request) {
rw.err(w, fmt.Errorf("unmarhsal err: %w", err))
return
}
rw.acceptedRows.Add(uint64(len(wr.Timeseries)))
atomic.AddUint64(&rw.acceptedRows, uint64(len(wr.Timeseries)))
w.WriteHeader(http.StatusNoContent)
}
@@ -248,27 +186,3 @@ func (frw *faultyRWServer) handler(w http.ResponseWriter, r *http.Request) {
w.Write([]byte("server overloaded"))
}
}
type batchCntRWServer struct {
*rwServer
batchCnt atomic.Int64 // accepted batch count, which also equals to request count
}
func newBatchCntRWServer() *batchCntRWServer {
bc := &batchCntRWServer{
rwServer: &rwServer{},
}
bc.Server = httptest.NewServer(http.HandlerFunc(bc.handler))
return bc
}
func (bc *batchCntRWServer) handler(w http.ResponseWriter, r *http.Request) {
bc.batchCnt.Add(1)
bc.rwServer.handler(w, r)
}
func (bc *batchCntRWServer) acceptedBatches() int {
return int(bc.batchCnt.Load())
}

View File

@@ -11,7 +11,7 @@ import (
"github.com/golang/snappy"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
)
@@ -30,7 +30,7 @@ func NewDebugClient() (*DebugClient, error) {
return nil, nil
}
t, err := httputils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
t, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}

View File

@@ -8,7 +8,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
)
var (
@@ -65,7 +64,7 @@ func Init(ctx context.Context) (*Client, error) {
return nil, nil
}
t, err := httputils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
t, err := utils.Transport(*addr, *tlsCertFile, *tlsKeyFile, *tlsCAFile, *tlsServerName, *tlsInsecureSkipVerify)
if err != nil {
return nil, fmt.Errorf("failed to create transport: %w", err)
}

View File

@@ -310,7 +310,7 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
}
var result []prompbmarshal.TimeSeries
holdAlertState := make(map[uint64]*notifier.Alert)
qFn := func(_ string) ([]datasource.Metric, error) {
qFn := func(query string) ([]datasource.Metric, error) {
return nil, fmt.Errorf("`query` template isn't supported in replay mode")
}
for _, s := range res.Data {
@@ -324,6 +324,16 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
return nil, fmt.Errorf("failed to create alert: %w", err)
}
// if the alert is instant, i.e. For: 0
if ar.For == 0 {
a.State = notifier.StateFiring
for i := range s.Values {
result = append(result, ar.alertToTimeSeries(a, s.Timestamps[i])...)
}
continue
}
// if the alert has For > 0
prevT := time.Time{}
for i := range s.Values {
at := time.Unix(s.Timestamps[i], 0)
@@ -344,10 +354,6 @@ func (ar *AlertingRule) execRange(ctx context.Context, start, end time.Time) ([]
a.Start = at
}
prevT = at
if ar.For == 0 {
// rules with `for: 0` are always firing when they have Value
a.State = notifier.StateFiring
}
result = append(result, ar.alertToTimeSeries(a, s.Timestamps[i])...)
// save alert's state on last iteration, so it can be used on the next execRange call
@@ -440,13 +446,14 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
a.KeepFiringSince = time.Time{}
continue
}
a, err := ar.newAlert(m, ls, ts, qFn)
a, err := ar.newAlert(m, ls, start, qFn)
if err != nil {
curState.Err = fmt.Errorf("failed to create alert: %w", err)
return nil, curState.Err
}
a.ID = h
a.State = notifier.StatePending
a.ActiveAt = ts
ar.alerts[h] = a
ar.logDebugf(ts, a, "created in state PENDING")
}
@@ -472,7 +479,7 @@ func (ar *AlertingRule) exec(ctx context.Context, ts time.Time, limit int) ([]pr
}
// alerts with ar.KeepFiringFor>0 may remain FIRING
// even if their expression isn't true anymore
if ts.Sub(a.KeepFiringSince) >= ar.KeepFiringFor {
if ts.Sub(a.KeepFiringSince) > ar.KeepFiringFor {
a.State = notifier.StateInactive
a.ResolvedAt = ts
ar.logDebugf(ts, a, "FIRING => INACTIVE: is absent in current evaluation round")
@@ -552,9 +559,9 @@ func (ar *AlertingRule) newAlert(m datasource.Metric, ls *labelSet, start time.T
}
const (
// alertMetricName is the metric name for time series reflecting the alert state.
// alertMetricName is the metric name for synthetic alert timeseries.
alertMetricName = "ALERTS"
// alertForStateMetricName is the metric name for time series reflecting the moment of time when alert became active.
// alertForStateMetricName is the metric name for 'for' state of alert.
alertForStateMetricName = "ALERTS_FOR_STATE"
// alertNameLabel is the label name indicating the name of an alert.
@@ -569,10 +576,12 @@ const (
// alertToTimeSeries converts the given alert with the given timestamp to time series
func (ar *AlertingRule) alertToTimeSeries(a *notifier.Alert, timestamp int64) []prompbmarshal.TimeSeries {
return []prompbmarshal.TimeSeries{
alertToTimeSeries(a, timestamp),
alertForToTimeSeries(a, timestamp),
var tss []prompbmarshal.TimeSeries
tss = append(tss, alertToTimeSeries(a, timestamp))
if ar.For > 0 {
tss = append(tss, alertForToTimeSeries(a, timestamp))
}
return tss
}
func alertToTimeSeries(a *notifier.Alert, timestamp int64) prompbmarshal.TimeSeries {
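
An illustrative reduction of the logic above (toSeries stands in for alertToTimeSeries, it is not the real signature): an ALERTS sample is always produced for an alert, while ALERTS_FOR_STATE is produced only for rules with a non-zero `for` duration, since it tracks when a pending alert became active.

package main

import "fmt"

func toSeries(hasFor bool) []string {
	tss := []string{"ALERTS"} // state series is always emitted
	if hasFor {
		tss = append(tss, "ALERTS_FOR_STATE") // only for rules with `for` > 0
	}
	return tss
}

func main() {
	fmt.Println(toSeries(false)) // [ALERTS]
	fmt.Println(toSeries(true))  // [ALERTS ALERTS_FOR_STATE]
}
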
@@ -655,19 +664,15 @@ func (ar *AlertingRule) restore(ctx context.Context, q datasource.Querier, ts ti
// alertsToSend walks through the current alerts of AlertingRule
// and returns only those which should be sent to notifier.
// Isn't concurrent safe.
func (ar *AlertingRule) alertsToSend(resolveDuration, resendDelay time.Duration) []notifier.Alert {
currentTime := time.Now()
func (ar *AlertingRule) alertsToSend(ts time.Time, resolveDuration, resendDelay time.Duration) []notifier.Alert {
needsSending := func(a *notifier.Alert) bool {
if a.State == notifier.StatePending {
return false
}
if a.State == notifier.StateFiring && a.End.Before(a.LastSent) {
if a.ResolvedAt.After(a.LastSent) {
return true
}
if a.State == notifier.StateInactive && a.ResolvedAt.After(a.LastSent) {
return true
}
return a.LastSent.Add(resendDelay).Before(currentTime)
return a.LastSent.Add(resendDelay).Before(ts)
}
var alerts []notifier.Alert
@@ -675,11 +680,11 @@ func (ar *AlertingRule) alertsToSend(resolveDuration, resendDelay time.Duration)
if !needsSending(a) {
continue
}
a.End = currentTime.Add(resolveDuration)
a.End = ts.Add(resolveDuration)
if a.State == notifier.StateInactive {
a.End = a.ResolvedAt
}
a.LastSent = currentTime
a.LastSent = ts
alerts = append(alerts, *a)
}
return alerts
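
A compact sketch of the send decision above, with the alert and the needsSending closure reduced to plain values (illustrative only): passing the evaluation timestamp ts instead of calling time.Now() makes the resend/resolve decisions reproducible for a given evaluation round.

package main

import (
	"fmt"
	"time"
)

type alert struct {
	pending    bool
	resolvedAt time.Time
	lastSent   time.Time
}

// needsSending mirrors the decision above: pending alerts are never sent;
// an alert is sent if it was resolved after the last notification, or if
// the resend delay has elapsed since the last notification.
func needsSending(a alert, ts time.Time, resendDelay time.Duration) bool {
	if a.pending {
		return false
	}
	if a.resolvedAt.After(a.lastSent) {
		return true
	}
	return a.lastSent.Add(resendDelay).Before(ts)
}

func main() {
	ts := time.Unix(1000, 0)
	a := alert{lastSent: ts.Add(-2 * time.Minute)}
	fmt.Println(needsSending(a, ts, time.Minute)) // true: resend delay elapsed
}
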


@@ -28,28 +28,20 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
}{
{
newTestAlertingRule("instant", 0),
&notifier.Alert{State: notifier.StateFiring, ActiveAt: timestamp.Add(time.Second)},
&notifier.Alert{State: notifier.StateFiring},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StateFiring.String(),
}),
newTimeSeries([]float64{float64(timestamp.Add(time.Second).Unix())},
[]int64{timestamp.UnixNano()},
map[string]string{
"__name__": alertForStateMetricName,
}),
},
},
{
newTestAlertingRule("instant extra labels", 0),
&notifier.Alert{
State: notifier.StateFiring, ActiveAt: timestamp.Add(time.Second),
Labels: map[string]string{
"job": "foo",
"instance": "bar",
},
},
&notifier.Alert{State: notifier.StateFiring, Labels: map[string]string{
"job": "foo",
"instance": "bar",
}},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
@@ -57,35 +49,19 @@ func TestAlertingRule_ToTimeSeries(t *testing.T) {
"job": "foo",
"instance": "bar",
}),
newTimeSeries([]float64{float64(timestamp.Add(time.Second).Unix())},
[]int64{timestamp.UnixNano()},
map[string]string{
"__name__": alertForStateMetricName,
"job": "foo",
"instance": "bar",
}),
},
},
{
newTestAlertingRule("instant labels override", 0),
&notifier.Alert{
State: notifier.StateFiring, ActiveAt: timestamp.Add(time.Second),
Labels: map[string]string{
alertStateLabel: "foo",
"__name__": "bar",
},
},
&notifier.Alert{State: notifier.StateFiring, Labels: map[string]string{
alertStateLabel: "foo",
"__name__": "bar",
}},
[]prompbmarshal.TimeSeries{
newTimeSeries([]float64{1}, []int64{timestamp.UnixNano()}, map[string]string{
"__name__": alertMetricName,
alertStateLabel: notifier.StateFiring.String(),
}),
newTimeSeries([]float64{float64(timestamp.Add(time.Second).Unix())},
[]int64{timestamp.UnixNano()},
map[string]string{
"__name__": alertForStateMetricName,
alertStateLabel: "foo",
}),
},
},
{
@@ -332,17 +308,14 @@ func TestAlertingRule_Exec(t *testing.T) {
fq := &datasource.FakeQuerier{}
tc.rule.q = fq
tc.rule.GroupID = fakeGroup.ID()
ts := time.Now()
for i, step := range tc.steps {
fq.Reset()
fq.Add(step...)
if _, err := tc.rule.exec(context.TODO(), ts, 0); err != nil {
if _, err := tc.rule.exec(context.TODO(), time.Now(), 0); err != nil {
t.Fatalf("unexpected err: %s", err)
}
// shift the execution timestamp before the next iteration
ts = ts.Add(defaultStep)
// artificial delay between applying steps
time.Sleep(defaultStep)
if _, ok := tc.expAlerts[i]; !ok {
continue
}
@@ -394,7 +367,7 @@ func TestAlertingRule_ExecRange(t *testing.T) {
{Values: []float64{1}, Timestamps: []int64{1}},
},
[]*notifier.Alert{
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0)},
{State: notifier.StateFiring},
},
nil,
},
@@ -405,9 +378,8 @@ func TestAlertingRule_ExecRange(t *testing.T) {
},
[]*notifier.Alert{
{
Labels: map[string]string{"name": "foo"},
State: notifier.StateFiring,
ActiveAt: time.Unix(1, 0),
Labels: map[string]string{"name": "foo"},
State: notifier.StateFiring,
},
},
nil,
@@ -418,9 +390,9 @@ func TestAlertingRule_ExecRange(t *testing.T) {
{Values: []float64{1, 1, 1}, Timestamps: []int64{1e3, 2e3, 3e3}},
},
[]*notifier.Alert{
{State: notifier.StateFiring, ActiveAt: time.Unix(1e3, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(2e3, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(3e3, 0)},
{State: notifier.StateFiring},
{State: notifier.StateFiring},
{State: notifier.StateFiring},
},
nil,
},
@@ -488,20 +460,6 @@ func TestAlertingRule_ExecRange(t *testing.T) {
For: time.Second,
}},
},
{
newTestAlertingRuleWithEvalInterval("firing=>inactive=>inactive=>firing=>firing", 0, time.Second),
[]datasource.Metric{
{Values: []float64{1, 1, 1, 1}, Timestamps: []int64{1, 4, 5, 6}},
},
[]*notifier.Alert{
{State: notifier.StateFiring, ActiveAt: time.Unix(1, 0)},
// It is expected for ActiveAT to remain the same while rule continues to fire in each iteration
{State: notifier.StateFiring, ActiveAt: time.Unix(4, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(4, 0)},
{State: notifier.StateFiring, ActiveAt: time.Unix(4, 0)},
},
nil,
},
{
newTestAlertingRule("for=>pending=>firing=>pending=>firing=>pending", time.Second),
[]datasource.Metric{
@@ -576,33 +534,21 @@ func TestAlertingRule_ExecRange(t *testing.T) {
},
},
[]*notifier.Alert{
{
State: notifier.StateFiring, ActiveAt: time.Unix(1, 0),
Labels: map[string]string{
"source": "vm",
},
},
{
State: notifier.StateFiring, ActiveAt: time.Unix(100, 0),
Labels: map[string]string{
"source": "vm",
},
},
{State: notifier.StateFiring, Labels: map[string]string{
"source": "vm",
}},
{State: notifier.StateFiring, Labels: map[string]string{
"source": "vm",
}},
//
{
State: notifier.StateFiring, ActiveAt: time.Unix(1, 0),
Labels: map[string]string{
"foo": "bar",
"source": "vm",
},
},
{
State: notifier.StateFiring, ActiveAt: time.Unix(5, 0),
Labels: map[string]string{
"foo": "bar",
"source": "vm",
},
},
{State: notifier.StateFiring, Labels: map[string]string{
"foo": "bar",
"source": "vm",
}},
{State: notifier.StateFiring, Labels: map[string]string{
"foo": "bar",
"source": "vm",
}},
},
nil,
},
@@ -1054,7 +1000,7 @@ func TestAlertsToSend(t *testing.T) {
for i, a := range alerts {
ar.alerts[uint64(i)] = a
}
gotAlerts := ar.alertsToSend(resolveDuration, resendDelay)
gotAlerts := ar.alertsToSend(ts, resolveDuration, resendDelay)
if gotAlerts == nil && expAlerts == nil {
return
}
@@ -1070,36 +1016,60 @@ func TestAlertsToSend(t *testing.T) {
})
for i, exp := range expAlerts {
got := gotAlerts[i]
if got.Name != exp.Name {
t.Fatalf("expected Name to be %v; got %v", exp.Name, got.Name)
if got.LastSent != exp.LastSent {
t.Fatalf("expected LastSent to be %v; got %v", exp.LastSent, got.LastSent)
}
if got.End != exp.End {
t.Fatalf("expected End to be %v; got %v", exp.End, got.End)
}
}
}
f( // check if firing alerts need to be sent with non-zero resendDelay
[]*notifier.Alert{
{Name: "a", State: notifier.StateFiring, Start: ts},
// no need to resend firing
{Name: "b", State: notifier.StateFiring, Start: ts, LastSent: ts.Add(-30 * time.Second), End: ts.Add(5 * time.Minute)},
// last message is for resolved, send firing message this time
{Name: "c", State: notifier.StateFiring, Start: ts, LastSent: ts.Add(-30 * time.Second), End: ts.Add(-1 * time.Minute)},
// resend firing
{Name: "d", State: notifier.StateFiring, Start: ts, LastSent: ts.Add(-1 * time.Minute)},
},
[]*notifier.Alert{{Name: "a"}, {Name: "c"}, {Name: "d"}},
f( // send firing alert with custom resolve time
[]*notifier.Alert{{State: notifier.StateFiring}},
[]*notifier.Alert{{LastSent: ts, End: ts.Add(5 * time.Minute)}},
5*time.Minute, time.Minute,
)
f( // check if resolved alerts need to be sent with non-zero resendDelay
[]*notifier.Alert{
{Name: "a", State: notifier.StateInactive, ResolvedAt: ts, LastSent: ts.Add(-30 * time.Second)},
// no need to resend resolved
{Name: "b", State: notifier.StateInactive, ResolvedAt: ts, LastSent: ts},
// resend resolved
{Name: "c", State: notifier.StateInactive, ResolvedAt: ts.Add(-1 * time.Minute), LastSent: ts.Add(-1 * time.Minute)},
},
[]*notifier.Alert{{Name: "a"}, {Name: "c"}},
f( // resolve inactive alert at the current timestamp
[]*notifier.Alert{{State: notifier.StateInactive, ResolvedAt: ts}},
[]*notifier.Alert{{LastSent: ts, End: ts}},
time.Minute, time.Minute,
)
f( // mixed case of firing and resolved alerts. Names are added for deterministic sorting
[]*notifier.Alert{{Name: "a", State: notifier.StateFiring}, {Name: "b", State: notifier.StateInactive, ResolvedAt: ts}},
[]*notifier.Alert{{Name: "a", LastSent: ts, End: ts.Add(5 * time.Minute)}, {Name: "b", LastSent: ts, End: ts}},
5*time.Minute, time.Minute,
)
f( // mixed case of pending and resolved alerts. Names are added for deterministic sorting
[]*notifier.Alert{{Name: "a", State: notifier.StatePending}, {Name: "b", State: notifier.StateInactive, ResolvedAt: ts}},
[]*notifier.Alert{{Name: "b", LastSent: ts, End: ts}},
5*time.Minute, time.Minute,
)
f( // attempt to send alert that was already sent in the resendDelay interval
[]*notifier.Alert{{State: notifier.StateFiring, LastSent: ts.Add(-time.Second)}},
nil,
time.Minute, time.Minute,
)
f( // attempt to send alert that was sent out of the resendDelay interval
[]*notifier.Alert{{State: notifier.StateFiring, LastSent: ts.Add(-2 * time.Minute)}},
[]*notifier.Alert{{LastSent: ts, End: ts.Add(time.Minute)}},
time.Minute, time.Minute,
)
f( // alert must be sent even if resendDelay interval is 0
[]*notifier.Alert{{State: notifier.StateFiring, LastSent: ts.Add(-time.Second)}},
[]*notifier.Alert{{LastSent: ts, End: ts.Add(time.Minute)}},
time.Minute, 0,
)
f( // inactive alert which has been sent already
[]*notifier.Alert{{State: notifier.StateInactive, LastSent: ts.Add(-time.Second), ResolvedAt: ts.Add(-2 * time.Second)}},
nil,
time.Minute, time.Minute,
)
f( // inactive alert which has been resolved after last send
[]*notifier.Alert{{State: notifier.StateInactive, LastSent: ts.Add(-time.Second), ResolvedAt: ts}},
[]*notifier.Alert{{LastSent: ts, End: ts}},
time.Minute, time.Minute,
)
}
func newTestRuleWithLabels(name string, labels ...string) *AlertingRule {
@@ -1125,12 +1095,6 @@ func newTestAlertingRule(name string, waitFor time.Duration) *AlertingRule {
return &rule
}
func newTestAlertingRuleWithEvalInterval(name string, waitFor, evalInterval time.Duration) *AlertingRule {
rule := newTestAlertingRule(name, waitFor)
rule.EvalInterval = evalInterval
return rule
}
func newTestAlertingRuleWithKeepFiring(name string, waitFor, keepFiringFor time.Duration) *AlertingRule {
rule := newTestAlertingRule(name, waitFor)
rule.KeepFiringFor = keepFiringFor


@@ -704,7 +704,7 @@ func (e *executor) exec(ctx context.Context, r Rule, ts time.Time, resolveDurati
return nil
}
alerts := ar.alertsToSend(resolveDuration, *resendDelay)
alerts := ar.alertsToSend(ts, resolveDuration, *resendDelay)
if len(alerts) < 1 {
return nil
}
@@ -724,7 +724,7 @@ func (e *executor) exec(ctx context.Context, r Rule, ts time.Time, resolveDurati
return errGr.Err()
}
// getStaleSeries checks whether there are stale series from previously sent ones.
// getStaledSeries checks whether there are stale series from previously sent ones.
func (e *executor) getStaleSeries(r Rule, tss []prompbmarshal.TimeSeries, timestamp time.Time) []prompbmarshal.TimeSeries {
ruleLabels := make(map[string][]prompbmarshal.Label, len(tss))
for _, ts := range tss {


@@ -1,12 +1,5 @@
function expandAll() {
$('.group-heading').each(function () {
let style = $(this).attr("style")
// expand only groups that are currently visible; skip hidden ones
if (style === "display: none;") {
return
}
$(this).next().addClass('show')
});
$('.collapse').addClass('show');
}
function collapseAll() {
@@ -22,100 +15,6 @@ function toggleByID(id) {
}
}
function debounce(func, delay) {
let timer;
return function (...args) {
clearTimeout(timer);
timer = setTimeout(() => {
func.apply(this, args);
}, delay);
};
}
$('#search').on("keyup", debounce(search, 500));
// search shows or hides groups&rules that satisfy the search phrase.
// case-insensitive, respects GET param `search`.
function search() {
$(".rule").show();
let groupHeader = $(".group-heading")
let searchPhrase = $("#search").val().toLowerCase()
if (searchPhrase.length === 0) {
groupHeader.show()
setParamURL('search', '')
return
}
$(".rule-table").removeClass('show');
groupHeader.hide()
searchPhrase = searchPhrase.toLowerCase()
filterRuleByName(searchPhrase);
filterRuleByLabels(searchPhrase);
filterGroupsByName(searchPhrase);
setParamURL('search', searchPhrase)
}
function setParamURL(key, value) {
let url = new URL(location.href)
url.searchParams.set(key, value);
window.history.replaceState(null, null, `?${url.searchParams.toString()}`);
}
function getParamURL(key) {
let url = new URL(location.href)
return url.searchParams.get(key)
}
function filterGroupsByName(searchPhrase) {
$(".group-heading").each(function () {
const groupName = $(this).attr('data-group-name').toLowerCase();
const hasValue = groupName.indexOf(searchPhrase) >= 0
if (!hasValue) {
return
}
const target = $(this).attr("data-bs-target");
$(`div[id="${target}"] .rule`).show();
$(this).show();
});
}
function filterRuleByName(searchPhrase) {
$(".rule").each(function () {
const ruleName = $(this).attr("data-rule-name").toLowerCase();
const hasValue = ruleName.indexOf(searchPhrase) >= 0
if (!hasValue) {
$(this).hide();
return
}
const target = $(this).attr('data-bs-target')
$(`#rules-${target}`).addClass('show');
$(`div[data-bs-target='rules-${target}']`).show();
$(this).show();
});
}
function filterRuleByLabels(searchPhrase) {
$(".rule").each(function () {
const matches = $(".label", this).filter(function () {
const label = $(this).text().toLowerCase();
return label.indexOf(searchPhrase) >= 0;
}).length;
if (matches > 0) {
const target = $(this).attr('data-bs-target')
$(`#rules-${target}`).addClass('show');
$(`div[data-bs-target='rules-${target}']`).show();
$(this).show();
}
});
}
$(document).ready(function () {
$(".group-heading a").click(function (e) {
e.stopPropagation(); // prevent collapse logic on link click
@@ -133,13 +32,6 @@ $(document).ready(function () {
});
});
// update search element with value from URL, if any
let searchPhrase = getParamURL('search')
$("#search").val(searchPhrase)
// apply filtering by search phrase
search()
let hash = window.location.hash.substr(1);
toggleByID(hash);
});


@@ -476,7 +476,7 @@ func templateFuncs() textTpl.FuncMap {
// For example, {{ query "foo" | first | value }} will
// execute "/api/v1/query?query=foo" request and will return
// the first value in response.
"query": func(_ string) ([]metric, error) {
"query": func(q string) ([]metric, error) {
// query function supposed to be substituted at FuncsWithQuery().
// it is present here only for validation purposes, when there is no
// provided datasource.
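
A small sketch of the substitution pattern described in the comment above (funcsWithQuery and metric are illustrative, not the actual vmalert helpers): validation runs with a stub "query" function, and a real implementation is injected via a closure once a datasource is available.

package main

import (
	"fmt"
	"text/template"
)

type metric struct{ Value float64 }

// funcsWithQuery returns a FuncMap whose "query" entry is backed by the
// provided function; callers pass a stub for validation and a live
// datasource-backed closure for execution.
func funcsWithQuery(query func(q string) ([]metric, error)) template.FuncMap {
	return template.FuncMap{"query": query}
}

func main() {
	stub := func(q string) ([]metric, error) {
		return nil, fmt.Errorf("datasource is not configured")
	}
	live := func(q string) ([]metric, error) {
		return []metric{{Value: 42}}, nil // e.g. the result of /api/v1/query
	}
	for _, fm := range []template.FuncMap{funcsWithQuery(stub), funcsWithQuery(live)} {
		query := fm["query"].(func(string) ([]metric, error))
		res, err := query("foo")
		fmt.Println(res, err)
	}
}
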


@@ -1,4 +1,4 @@
package httputils
package utils
import (
"crypto/tls"
@@ -11,12 +11,12 @@ import (
// Transport creates http.Transport object based on provided URL.
// Returns Transport with TLS configuration if URL contains `https` prefix
func Transport(URL, certFile, keyFile, caFile, serverName string, insecureSkipVerify bool) (*http.Transport, error) {
func Transport(URL, certFile, keyFile, CAFile, serverName string, insecureSkipVerify bool) (*http.Transport, error) {
t := http.DefaultTransport.(*http.Transport).Clone()
if !strings.HasPrefix(URL, "https") {
return t, nil
}
tlsCfg, err := TLSConfig(certFile, keyFile, caFile, serverName, insecureSkipVerify)
tlsCfg, err := TLSConfig(certFile, keyFile, CAFile, serverName, insecureSkipVerify)
if err != nil {
return nil, err
}
@@ -25,7 +25,7 @@ func Transport(URL, certFile, keyFile, caFile, serverName string, insecureSkipVe
}
// TLSConfig creates tls.Config object from provided arguments
func TLSConfig(certFile, keyFile, caFile, serverName string, insecureSkipVerify bool) (*tls.Config, error) {
func TLSConfig(certFile, keyFile, CAFile, serverName string, insecureSkipVerify bool) (*tls.Config, error) {
var certs []tls.Certificate
if certFile != "" {
cert, err := tls.LoadX509KeyPair(certFile, keyFile)
@@ -37,15 +37,15 @@ func TLSConfig(certFile, keyFile, caFile, serverName string, insecureSkipVerify
}
var rootCAs *x509.CertPool
if caFile != "" {
pem, err := os.ReadFile(caFile)
if CAFile != "" {
pem, err := os.ReadFile(CAFile)
if err != nil {
return nil, fmt.Errorf("cannot read `ca_file` %q: %w", caFile, err)
return nil, fmt.Errorf("cannot read `ca_file` %q: %w", CAFile, err)
}
rootCAs = x509.NewCertPool()
if !rootCAs.AppendCertsFromPEM(pem) {
return nil, fmt.Errorf("cannot parse data from `ca_file` %q", caFile)
return nil, fmt.Errorf("cannot parse data from `ca_file` %q", CAFile)
}
}
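
A usage sketch for the Transport helper above, assuming the app/vmalert/utils package from this diff and a made-up CA file path: plain http URLs get the default transport unchanged, while https URLs get a TLS config built from the cert/key/CA files and server name.

package main

import (
	"fmt"
	"net/http"
	"time"

	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/utils"
)

func main() {
	// "ca.pem" is a placeholder path; empty cert/key means no client cert.
	tr, err := utils.Transport("https://vmselect:8481", "", "", "ca.pem", "", false)
	if err != nil {
		fmt.Println("cannot create transport:", err)
		return
	}
	c := &http.Client{Transport: tr, Timeout: 10 * time.Second}
	_ = c // use c for requests to the TLS-protected backend
}
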


@@ -1,4 +1,4 @@
package httputils
package utils
import "testing"


@@ -12,15 +12,11 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/notifier"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/rule"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmalert/tpl"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
)
var reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
var (
apiLinks = [][2]string{
// api links are relative since they can be used by external clients,
@@ -88,10 +84,7 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
WriteRuleDetails(w, r, rule)
return true
case "/vmalert/groups":
var data []apiGroup
rf := extractRulesFilter(r)
data = rh.groups(rf)
WriteListGroups(w, r, data)
WriteListGroups(w, r, rh.groups())
return true
case "/vmalert/notifiers":
WriteListTargets(w, r, notifier.GetTargets())
@@ -102,20 +95,12 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
case "/rules":
// Grafana makes an extra request to `/rules`
// handler in addition to `/api/v1/rules` calls in alerts UI,
var data []apiGroup
rf := extractRulesFilter(r)
data = rh.groups(rf)
WriteListGroups(w, r, data)
WriteListGroups(w, r, rh.groups())
return true
case "/vmalert/api/v1/rules", "/api/v1/rules":
// path used by Grafana for ng alerting
var data []byte
var err error
rf := extractRulesFilter(r)
data, err = rh.listGroups(rf)
data, err := rh.listGroups()
if err != nil {
httpserver.Errorf(w, r, "%s", err)
return true
@@ -123,7 +108,6 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
w.Header().Set("Content-Type", "application/json")
w.Write(data)
return true
case "/vmalert/api/v1/alerts", "/api/v1/alerts":
// path used by Grafana for ng alerting
data, err := rh.listAlerts()
@@ -167,9 +151,6 @@ func (rh *requestHandler) handler(w http.ResponseWriter, r *http.Request) bool {
w.Write(data)
return true
case "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey.Get(), "reloadAuthKey") {
return true
}
logger.Infof("api config reload was called, sending sighup")
procutil.SelfSIGHUP()
w.WriteHeader(http.StatusOK)
@@ -220,94 +201,26 @@ type listGroupsResponse struct {
} `json:"data"`
}
// see https://prometheus.io/docs/prometheus/latest/querying/api/#rules
type rulesFilter struct {
files []string
groupNames []string
ruleNames []string
ruleType string
excludeAlerts bool
}
func extractRulesFilter(r *http.Request) rulesFilter {
rf := rulesFilter{}
var ruleType string
ruleTypeParam := r.URL.Query().Get("type")
// for some reason, `type` in filter doesn't match `type` in response,
// so we use this matching here
if ruleTypeParam == "alert" {
ruleType = ruleTypeAlerting
} else if ruleTypeParam == "record" {
ruleType = ruleTypeRecording
}
rf.ruleType = ruleType
rf.excludeAlerts = httputils.GetBool(r, "exclude_alerts")
rf.ruleNames = append([]string{}, r.Form["rule_name[]"]...)
rf.groupNames = append([]string{}, r.Form["rule_group[]"]...)
rf.files = append([]string{}, r.Form["file[]"]...)
return rf
}
func (rh *requestHandler) groups(rf rulesFilter) []apiGroup {
func (rh *requestHandler) groups() []apiGroup {
rh.m.groupsMu.RLock()
defer rh.m.groupsMu.RUnlock()
isInList := func(list []string, needle string) bool {
if len(list) < 1 {
return true
}
for _, i := range list {
if i == needle {
return true
}
}
return false
}
groups := make([]apiGroup, 0)
for _, group := range rh.m.groups {
if !isInList(rf.groupNames, group.Name) {
continue
}
if !isInList(rf.files, group.File) {
continue
}
g := groupToAPI(group)
// the returned list should always be non-nil
// https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4221
filteredRules := make([]apiRule, 0)
for _, r := range g.Rules {
if rf.ruleType != "" && rf.ruleType != r.Type {
continue
}
if !isInList(rf.ruleNames, r.Name) {
continue
}
if rf.excludeAlerts {
r.Alerts = nil
}
filteredRules = append(filteredRules, r)
}
g.Rules = filteredRules
groups = append(groups, g)
for _, g := range rh.m.groups {
groups = append(groups, groupToAPI(g))
}
// sort list of groups for deterministic output
// sort list of alerts for deterministic output
sort.Slice(groups, func(i, j int) bool {
a, b := groups[i], groups[j]
if a.Name != b.Name {
return a.Name < b.Name
}
return a.File < b.File
return groups[i].Name < groups[j].Name
})
return groups
}
func (rh *requestHandler) listGroups(rf rulesFilter) ([]byte, error) {
func (rh *requestHandler) listGroups() ([]byte, error) {
lr := listGroupsResponse{Status: "success"}
lr.Data.Groups = rh.groups(rf)
lr.Data.Groups = rh.groups()
b, err := json.Marshal(lr)
if err != nil {
return nil, &httpserver.ErrorWithStatusCode{

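The filter parameters handled in this file follow the Prometheus /api/v1/rules API; a sketch of building such a request URL on the client side (the values are examples):

package main

import (
	"fmt"
	"net/url"
)

func main() {
	// Query args understood by the filtering code: type, rule_name[],
	// rule_group[], file[] and exclude_alerts.
	q := url.Values{}
	q.Set("type", "alert")
	q.Add("rule_name[]", "HighErrorRate")
	q.Add("rule_group[]", "group")
	q.Add("file[]", "rules.yaml")
	q.Set("exclude_alerts", "true")
	fmt.Println("/api/v1/rules?" + q.Encode())
}
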

@@ -70,29 +70,15 @@ btn-primary
}
}
%}
<div class="btn-toolbar mb-3" role="toolbar">
<div>
<a class="btn {%= buttonActive(filter, "") %}" role="button" onclick="window.location = window.location.pathname">All</a>
<a class="btn btn-primary" role="button" onclick="collapseAll()">Collapse All</a>
<a class="btn btn-primary" role="button" onclick="expandAll()">Expand All</a>
<a class="btn {%= buttonActive(filter, "unhealthy") %}" role="button" onclick="location.href='?filter=unhealthy'" title="Show only rules with errors">Unhealthy</a>
<a class="btn {%= buttonActive(filter, "noMatch") %}" role="button" onclick="location.href='?filter=noMatch'" title="Show only rules matching no time series during last evaluation">NoMatch</a>
</div>
<div class="col-md-4 col-lg-5">
<div class="px-3 input-group">
<div class="input-group-prepend">
<span class="input-group-text">
<svg fill="#000000" height="25px" width="20px" version="1.1" id="Capa_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 490.4 490.4" xml:space="preserve"><g id="SVGRepo_bgCarrier" stroke-width="0"></g><g id="SVGRepo_tracerCarrier" stroke-linecap="round" stroke-linejoin="round"></g><g id="SVGRepo_iconCarrier"> <g> <path d="M484.1,454.796l-110.5-110.6c29.8-36.3,47.6-82.8,47.6-133.4c0-116.3-94.3-210.6-210.6-210.6S0,94.496,0,210.796 s94.3,210.6,210.6,210.6c50.8,0,97.4-18,133.8-48l110.5,110.5c12.9,11.8,25,4.2,29.2,0C492.5,475.596,492.5,463.096,484.1,454.796z M41.1,210.796c0-93.6,75.9-169.5,169.5-169.5s169.6,75.9,169.6,169.5s-75.9,169.5-169.5,169.5S41.1,304.396,41.1,210.796z"></path> </g> </g></svg>
</span>
</div>
<input id="search" placeholder="Filter by group, rule or labels" type="text" class="form-control"/>
</div>
</div>
</div>
<a class="btn {%= buttonActive(filter, "") %}" role="button" onclick="window.location = window.location.pathname">All</a>
<a class="btn btn-primary" role="button" onclick="collapseAll()">Collapse All</a>
<a class="btn btn-primary" role="button" onclick="expandAll()">Expand All</a>
<a class="btn {%= buttonActive(filter, "unhealthy") %}" role="button" onclick="location.href='?filter=unhealthy'" title="Show only rules with errors">Unhealthy</a>
<a class="btn {%= buttonActive(filter, "noMatch") %}" role="button" onclick="location.href='?filter=noMatch'" title="Show only rules matching no time series during last evaluation">NoMatch</a>
{% if len(groups) > 0 %}
{% for _, g := range groups %}
<div
class="group-heading{% if rNotOk[g.ID] > 0 %} alert-danger{%endif%}" data-bs-target="rules-{%s g.ID %}" data-group-name="{%s g.Name %}">
class="group-heading{% if rNotOk[g.ID] > 0 %} alert-danger{%endif%}" data-bs-target="rules-{%s g.ID %}">
<span class="anchor" id="group-{%s g.ID %}"></span>
<a href="#group-{%s g.ID %}">{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %} (every {%f.0 g.Interval %}s) #</a>
{% if rNotOk[g.ID] > 0 %}<span class="badge bg-danger" title="Number of rules with status Error">{%d rNotOk[g.ID] %}</span> {% endif %}
@@ -114,7 +100,7 @@ btn-primary
</div>
{% endif %}
</div>
<div class="collapse rule-table" id="rules-{%s g.ID %}">
<div class="collapse" id="rules-{%s g.ID %}">
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
@@ -125,7 +111,7 @@ btn-primary
</thead>
<tbody>
{% for _, r := range g.Rules %}
<tr class="rule{% if r.LastError != "" %} alert-danger{% endif %}" data-rule-name="{%s r.Name %}" data-bs-target="{%s g.ID %}">
<tr{% if r.LastError != "" %} class="alert-danger"{% endif %}>
<td>
<div class="row">
<div class="col-12 mb-2">
@@ -148,7 +134,7 @@ btn-primary
<div class="col-12 mb-2">
{% if len(r.Labels) > 0 %} <b>Labels:</b>{% endif %}
{% for k, v := range r.Labels %}
<span class="ms-1 badge bg-primary label">{%s k %}={%s v %}</span>
<span class="ms-1 badge bg-primary">{%s k %}={%s v %}</span>
{% endfor %}
</div>
{% if r.LastError != "" %}
@@ -184,25 +170,11 @@ btn-primary
{%code prefix := utils.Prefix(r.URL.Path) %}
{%= tpl.Header(r, navItems, "Alerts", getLastConfigError()) %}
{% if len(groupAlerts) > 0 %}
<div class="btn-toolbar mb-3" role="toolbar">
<div>
<a class="btn btn-primary" role="button" onclick="collapseAll()">Collapse All</a>
<a class="btn btn-primary" role="button" onclick="expandAll()">Expand All</a>
</div>
<div class="col-md-4 col-lg-5">
<div class="px-3 input-group">
<div class="input-group-prepend">
<span class="input-group-text">
<svg fill="#000000" height="25px" width="20px" version="1.1" id="Capa_1" xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 490.4 490.4" xml:space="preserve"><g id="SVGRepo_bgCarrier" stroke-width="0"></g><g id="SVGRepo_tracerCarrier" stroke-linecap="round" stroke-linejoin="round"></g><g id="SVGRepo_iconCarrier"> <g> <path d="M484.1,454.796l-110.5-110.6c29.8-36.3,47.6-82.8,47.6-133.4c0-116.3-94.3-210.6-210.6-210.6S0,94.496,0,210.796 s94.3,210.6,210.6,210.6c50.8,0,97.4-18,133.8-48l110.5,110.5c12.9,11.8,25,4.2,29.2,0C492.5,475.596,492.5,463.096,484.1,454.796z M41.1,210.796c0-93.6,75.9-169.5,169.5-169.5s169.6,75.9,169.6,169.5s-75.9,169.5-169.5,169.5S41.1,304.396,41.1,210.796z"></path> </g> </g></svg>
</span>
</div>
<input id="search" placeholder="Filter by group, rule or labels" type="text" class="form-control"/>
</div>
</div>
</div>
<a class="btn btn-primary" role="button" onclick="collapseAll()">Collapse All</a>
<a class="btn btn-primary" role="button" onclick="expandAll()">Expand All</a>
{% for _, ga := range groupAlerts %}
{%code g := ga.Group %}
<div class="group-heading alert-danger" data-bs-target="rules-{%s g.ID %}" data-group-name="{%s g.Name %}">
<div class="group-heading alert-danger" data-bs-target="rules-{%s g.ID %}">
<span class="anchor" id="group-{%s g.ID %}"></span>
<a href="#group-{%s g.ID %}">{%s g.Name %}{% if g.Type != "prometheus" %} ({%s g.Type %}){% endif %}</a>
<span class="badge bg-danger" title="Number of active alerts">{%d len(ga.Alerts) %}</span>
@@ -220,7 +192,7 @@ btn-primary
}
sort.Strings(keys)
%}
<div class="collapse rule-table" id="rules-{%s g.ID %}">
<div class="collapse" id="rules-{%s g.ID %}">
{% for _, ruleID := range keys %}
{%code
defaultAR := alertsByRule[ruleID][0]
@@ -231,46 +203,45 @@ btn-primary
sort.Strings(labelKeys)
%}
<br>
<div class="rule" data-rule-name="{%s defaultAR.Name %}" data-bs-target="{%s g.ID %}">
<b>alert:</b> {%s defaultAR.Name %} ({%d len(alertsByRule[ruleID]) %})
| <span><a target="_blank" href="{%s defaultAR.SourceLink %}">Source</a></span>
<br>
<b>expr:</b><code><pre>{%s defaultAR.Expression %}</pre></code>
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
<th scope="col">Labels</th>
<th scope="col">State</th>
<th scope="col">Active at</th>
<th scope="col">Value</th>
<th scope="col">Link</th>
</tr>
</thead>
<tbody>
{% for _, ar := range alertsByRule[ruleID] %}
<tr>
<td>
{% for _, k := range labelKeys %}
<span class="ms-1 badge bg-primary label">{%s k %}={%s ar.Labels[k] %}</span>
{% endfor %}
</td>
<td>{%= badgeState(ar.State) %}</td>
<td>
{%s ar.ActiveAt.Format("2006-01-02T15:04:05Z07:00") %}
{% if ar.Restored %}{%= badgeRestored() %}{% endif %}
{% if ar.Stabilizing %}{%= badgeStabilizing() %}{% endif %}
</td>
<td>{%s ar.Value %}</td>
<td>
<a href="{%s prefix+ar.WebLink() %}">Details</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
<b>alert:</b> {%s defaultAR.Name %} ({%d len(alertsByRule[ruleID]) %})
| <span><a target="_blank" href="{%s defaultAR.SourceLink %}">Source</a></span>
<br>
<b>expr:</b><code><pre>{%s defaultAR.Expression %}</pre></code>
<table class="table table-striped table-hover table-sm">
<thead>
<tr>
<th scope="col">Labels</th>
<th scope="col">State</th>
<th scope="col">Active at</th>
<th scope="col">Value</th>
<th scope="col">Link</th>
</tr>
</thead>
<tbody>
{% for _, ar := range alertsByRule[ruleID] %}
<tr>
<td>
{% for _, k := range labelKeys %}
<span class="ms-1 badge bg-primary">{%s k %}={%s ar.Labels[k] %}</span>
{% endfor %}
</td>
<td>{%= badgeState(ar.State) %}</td>
<td>
{%s ar.ActiveAt.Format("2006-01-02T15:04:05Z07:00") %}
{% if ar.Restored %}{%= badgeRestored() %}{% endif %}
{% if ar.Stabilizing %}{%= badgeStabilizing() %}{% endif %}
</td>
<td>{%s ar.Value %}</td>
<td>
<a href="{%s prefix+ar.WebLink() %}">Details</a>
</td>
</tr>
{% endfor %}
</tbody>
</table>
{% endfor %}
</div>
<br>
{% endfor %}
{% else %}

File diff suppressed because it is too large


@@ -23,7 +23,6 @@ func TestHandler(t *testing.T) {
})
g := &rule.Group{
Name: "group",
File: "rules.yaml",
Concurrency: 1,
}
ar := rule.NewAlertingRule(fq, g, config.Rule{ID: 0, Alert: "alert"})
@@ -36,7 +35,7 @@ func TestHandler(t *testing.T) {
}}
rh := &requestHandler{m: m}
getResp := func(t *testing.T, url string, to interface{}, code int) {
getResp := func(url string, to interface{}, code int) {
t.Helper()
resp, err := http.Get(url)
if err != nil {
@@ -60,43 +59,43 @@ func TestHandler(t *testing.T) {
defer ts.Close()
t.Run("/", func(t *testing.T) {
getResp(t, ts.URL, nil, 200)
getResp(t, ts.URL+"/vmalert", nil, 200)
getResp(t, ts.URL+"/vmalert/alerts", nil, 200)
getResp(t, ts.URL+"/vmalert/groups", nil, 200)
getResp(t, ts.URL+"/vmalert/notifiers", nil, 200)
getResp(t, ts.URL+"/rules", nil, 200)
getResp(ts.URL, nil, 200)
getResp(ts.URL+"/vmalert", nil, 200)
getResp(ts.URL+"/vmalert/alerts", nil, 200)
getResp(ts.URL+"/vmalert/groups", nil, 200)
getResp(ts.URL+"/vmalert/notifiers", nil, 200)
getResp(ts.URL+"/rules", nil, 200)
})
t.Run("/vmalert/rule", func(t *testing.T) {
a := ruleToAPI(ar)
getResp(t, ts.URL+"/vmalert/"+a.WebLink(), nil, 200)
getResp(ts.URL+"/vmalert/"+a.WebLink(), nil, 200)
r := ruleToAPI(rr)
getResp(t, ts.URL+"/vmalert/"+r.WebLink(), nil, 200)
getResp(ts.URL+"/vmalert/"+r.WebLink(), nil, 200)
})
t.Run("/vmalert/alert", func(t *testing.T) {
alerts := ruleToAPIAlert(ar)
for _, a := range alerts {
getResp(t, ts.URL+"/vmalert/"+a.WebLink(), nil, 200)
getResp(ts.URL+"/vmalert/"+a.WebLink(), nil, 200)
}
})
t.Run("/vmalert/rule?badParam", func(t *testing.T) {
params := fmt.Sprintf("?%s=0&%s=1", paramGroupID, paramRuleID)
getResp(t, ts.URL+"/vmalert/rule"+params, nil, 404)
getResp(ts.URL+"/vmalert/rule"+params, nil, 404)
params = fmt.Sprintf("?%s=1&%s=0", paramGroupID, paramRuleID)
getResp(t, ts.URL+"/vmalert/rule"+params, nil, 404)
getResp(ts.URL+"/vmalert/rule"+params, nil, 404)
})
t.Run("/api/v1/alerts", func(t *testing.T) {
lr := listAlertsResponse{}
getResp(t, ts.URL+"/api/v1/alerts", &lr, 200)
getResp(ts.URL+"/api/v1/alerts", &lr, 200)
if length := len(lr.Data.Alerts); length != 1 {
t.Errorf("expected 1 alert got %d", length)
}
lr = listAlertsResponse{}
getResp(t, ts.URL+"/vmalert/api/v1/alerts", &lr, 200)
getResp(ts.URL+"/vmalert/api/v1/alerts", &lr, 200)
if length := len(lr.Data.Alerts); length != 1 {
t.Errorf("expected 1 alert got %d", length)
}
@@ -104,13 +103,13 @@ func TestHandler(t *testing.T) {
t.Run("/api/v1/alert?alertID&groupID", func(t *testing.T) {
expAlert := newAlertAPI(ar, ar.GetAlerts()[0])
alert := &apiAlert{}
getResp(t, ts.URL+"/"+expAlert.APILink(), alert, 200)
getResp(ts.URL+"/"+expAlert.APILink(), alert, 200)
if !reflect.DeepEqual(alert, expAlert) {
t.Errorf("expected %v is equal to %v", alert, expAlert)
}
alert = &apiAlert{}
getResp(t, ts.URL+"/vmalert/"+expAlert.APILink(), alert, 200)
getResp(ts.URL+"/vmalert/"+expAlert.APILink(), alert, 200)
if !reflect.DeepEqual(alert, expAlert) {
t.Errorf("expected %v is equal to %v", alert, expAlert)
}
@@ -118,28 +117,28 @@ func TestHandler(t *testing.T) {
t.Run("/api/v1/alert?badParams", func(t *testing.T) {
params := fmt.Sprintf("?%s=0&%s=1", paramGroupID, paramAlertID)
getResp(t, ts.URL+"/api/v1/alert"+params, nil, 404)
getResp(t, ts.URL+"/vmalert/api/v1/alert"+params, nil, 404)
getResp(ts.URL+"/api/v1/alert"+params, nil, 404)
getResp(ts.URL+"/vmalert/api/v1/alert"+params, nil, 404)
params = fmt.Sprintf("?%s=1&%s=0", paramGroupID, paramAlertID)
getResp(t, ts.URL+"/api/v1/alert"+params, nil, 404)
getResp(t, ts.URL+"/vmalert/api/v1/alert"+params, nil, 404)
getResp(ts.URL+"/api/v1/alert"+params, nil, 404)
getResp(ts.URL+"/vmalert/api/v1/alert"+params, nil, 404)
// bad request, alertID is missing
params = fmt.Sprintf("?%s=1", paramGroupID)
getResp(t, ts.URL+"/api/v1/alert"+params, nil, 400)
getResp(t, ts.URL+"/vmalert/api/v1/alert"+params, nil, 400)
getResp(ts.URL+"/api/v1/alert"+params, nil, 400)
getResp(ts.URL+"/vmalert/api/v1/alert"+params, nil, 400)
})
t.Run("/api/v1/rules", func(t *testing.T) {
lr := listGroupsResponse{}
getResp(t, ts.URL+"/api/v1/rules", &lr, 200)
getResp(ts.URL+"/api/v1/rules", &lr, 200)
if length := len(lr.Data.Groups); length != 1 {
t.Errorf("expected 1 group got %d", length)
}
lr = listGroupsResponse{}
getResp(t, ts.URL+"/vmalert/api/v1/rules", &lr, 200)
getResp(ts.URL+"/vmalert/api/v1/rules", &lr, 200)
if length := len(lr.Data.Groups); length != 1 {
t.Errorf("expected 1 group got %d", length)
}
@@ -147,93 +146,25 @@ func TestHandler(t *testing.T) {
t.Run("/api/v1/rule?ruleID&groupID", func(t *testing.T) {
expRule := ruleToAPI(ar)
gotRule := apiRule{}
getResp(t, ts.URL+"/"+expRule.APILink(), &gotRule, 200)
getResp(ts.URL+"/"+expRule.APILink(), &gotRule, 200)
if expRule.ID != gotRule.ID {
t.Errorf("expected to get Rule %q; got %q instead", expRule.ID, gotRule.ID)
}
gotRule = apiRule{}
getResp(t, ts.URL+"/vmalert/"+expRule.APILink(), &gotRule, 200)
getResp(ts.URL+"/vmalert/"+expRule.APILink(), &gotRule, 200)
if expRule.ID != gotRule.ID {
t.Errorf("expected to get Rule %q; got %q instead", expRule.ID, gotRule.ID)
}
gotRuleWithUpdates := apiRuleWithUpdates{}
getResp(t, ts.URL+"/"+expRule.APILink(), &gotRuleWithUpdates, 200)
getResp(ts.URL+"/"+expRule.APILink(), &gotRuleWithUpdates, 200)
if gotRuleWithUpdates.StateUpdates == nil || len(gotRuleWithUpdates.StateUpdates) < 1 {
t.Fatalf("expected %+v to have state updates field not empty", gotRuleWithUpdates.StateUpdates)
}
})
t.Run("/api/v1/rules&filters", func(t *testing.T) {
check := func(url string, expGroups, expRules int) {
t.Helper()
lr := listGroupsResponse{}
getResp(t, ts.URL+url, &lr, 200)
if length := len(lr.Data.Groups); length != expGroups {
t.Errorf("expected %d groups got %d", expGroups, length)
}
if len(lr.Data.Groups) < 1 {
return
}
var rulesN int
for _, gr := range lr.Data.Groups {
rulesN += len(gr.Rules)
}
if rulesN != expRules {
t.Errorf("expected %d rules got %d", expRules, rulesN)
}
}
check("/api/v1/rules?type=alert", 1, 1)
check("/api/v1/rules?type=record", 1, 1)
check("/vmalert/api/v1/rules?type=alert", 1, 1)
check("/vmalert/api/v1/rules?type=record", 1, 1)
// no filtering expected due to bad params
check("/api/v1/rules?type=badParam", 1, 2)
check("/api/v1/rules?foo=bar", 1, 2)
check("/api/v1/rules?rule_group[]=foo&rule_group[]=bar", 0, 0)
check("/api/v1/rules?rule_group[]=foo&rule_group[]=group&rule_group[]=bar", 1, 2)
check("/api/v1/rules?rule_group[]=group&file[]=foo", 0, 0)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml", 1, 2)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml&rule_name[]=foo", 1, 0)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml&rule_name[]=alert", 1, 1)
check("/api/v1/rules?rule_group[]=group&file[]=rules.yaml&rule_name[]=alert&rule_name[]=record", 1, 2)
})
t.Run("/api/v1/rules&exclude_alerts=true", func(t *testing.T) {
// check if response returns active alerts by default
lr := listGroupsResponse{}
getResp(t, ts.URL+"/api/v1/rules?rule_group[]=group&file[]=rules.yaml", &lr, 200)
activeAlerts := 0
for _, gr := range lr.Data.Groups {
for _, r := range gr.Rules {
activeAlerts += len(r.Alerts)
}
}
if activeAlerts == 0 {
t.Fatalf("expected at least 1 active alert in response; got 0")
}
// disable returning alerts via param
lr = listGroupsResponse{}
getResp(t, ts.URL+"/api/v1/rules?rule_group[]=group&file[]=rules.yaml&exclude_alerts=true", &lr, 200)
activeAlerts = 0
for _, gr := range lr.Data.Groups {
for _, r := range gr.Rules {
activeAlerts += len(r.Alerts)
}
}
if activeAlerts != 0 {
t.Fatalf("expected to get 0 active alert in response; got %d", activeAlerts)
}
})
}
func TestEmptyResponse(t *testing.T) {
@@ -241,7 +172,7 @@ func TestEmptyResponse(t *testing.T) {
ts := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { rhWithNoGroups.handler(w, r) }))
defer ts.Close()
getResp := func(t *testing.T, url string, to interface{}, code int) {
getResp := func(url string, to interface{}, code int) {
t.Helper()
resp, err := http.Get(url)
if err != nil {
@@ -264,13 +195,13 @@ func TestEmptyResponse(t *testing.T) {
t.Run("no groups /api/v1/alerts", func(t *testing.T) {
lr := listAlertsResponse{}
getResp(t, ts.URL+"/api/v1/alerts", &lr, 200)
getResp(ts.URL+"/api/v1/alerts", &lr, 200)
if lr.Data.Alerts == nil {
t.Errorf("expected /api/v1/alerts response to have non-nil data")
}
lr = listAlertsResponse{}
getResp(t, ts.URL+"/vmalert/api/v1/alerts", &lr, 200)
getResp(ts.URL+"/vmalert/api/v1/alerts", &lr, 200)
if lr.Data.Alerts == nil {
t.Errorf("expected /api/v1/alerts response to have non-nil data")
}
@@ -278,13 +209,13 @@ func TestEmptyResponse(t *testing.T) {
t.Run("no groups /api/v1/rules", func(t *testing.T) {
lr := listGroupsResponse{}
getResp(t, ts.URL+"/api/v1/rules", &lr, 200)
getResp(ts.URL+"/api/v1/rules", &lr, 200)
if lr.Data.Groups == nil {
t.Errorf("expected /api/v1/rules response to have non-nil data")
}
lr = listGroupsResponse{}
getResp(t, ts.URL+"/vmalert/api/v1/rules", &lr, 200)
getResp(ts.URL+"/vmalert/api/v1/rules", &lr, 200)
if lr.Data.Groups == nil {
t.Errorf("expected /api/v1/rules response to have non-nil data")
}
@@ -295,13 +226,13 @@ func TestEmptyResponse(t *testing.T) {
t.Run("empty group /api/v1/rules", func(t *testing.T) {
lr := listGroupsResponse{}
getResp(t, ts.URL+"/api/v1/rules", &lr, 200)
getResp(ts.URL+"/api/v1/rules", &lr, 200)
if lr.Data.Groups == nil {
t.Fatalf("expected /api/v1/rules response to have non-nil data")
}
lr = listGroupsResponse{}
getResp(t, ts.URL+"/vmalert/api/v1/rules", &lr, 200)
getResp(ts.URL+"/vmalert/api/v1/rules", &lr, 200)
if lr.Data.Groups == nil {
t.Fatalf("expected /api/v1/rules response to have non-nil data")
}


@@ -193,15 +193,10 @@ func ruleToAPI(r interface{}) apiRule {
return apiRule{}
}
const (
ruleTypeRecording = "recording"
ruleTypeAlerting = "alerting"
)
func recordingToAPI(rr *rule.RecordingRule) apiRule {
lastState := rule.GetLastEntry(rr)
r := apiRule{
Type: ruleTypeRecording,
Type: "recording",
DatasourceType: rr.Type.String(),
Name: rr.Name,
Query: rr.Expr,
@@ -229,7 +224,7 @@ func recordingToAPI(rr *rule.RecordingRule) apiRule {
func alertingToAPI(ar *rule.AlertingRule) apiRule {
lastState := rule.GetLastEntry(ar)
r := apiRule{
Type: ruleTypeAlerting,
Type: "alerting",
DatasourceType: ar.Type.String(),
Name: ar.Name,
Query: ar.Expr,


@@ -2,30 +2,26 @@ package main
import (
"bytes"
"context"
"encoding/base64"
"flag"
"fmt"
"math"
"net"
"net/http"
"net/url"
"os"
"regexp"
"sort"
"strconv"
"strings"
"sync"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/metrics"
"github.com/cespare/xxhash/v2"
"gopkg.in/yaml.v2"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envtemplate"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs/fscore"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
)
@@ -38,34 +34,22 @@ var (
defaultRetryStatusCodes = flagutil.NewArrayInt("retryStatusCodes", 0, "Comma-separated list of default HTTP response status codes when vmauth re-tries the request on other backends. "+
"See https://docs.victoriametrics.com/vmauth.html#load-balancing for details")
defaultLoadBalancingPolicy = flag.String("loadBalancingPolicy", "least_loaded", "The default load balancing policy to use for backend urls specified inside url_prefix section. "+
"Supported policies: least_loaded, first_available. See https://docs.victoriametrics.com/vmauth.html#load-balancing")
discoverBackendIPsGlobal = flag.Bool("discoverBackendIPs", false, "Whether to discover backend IPs via periodic DNS queries to hostnames specified in url_prefix. "+
"This may be useful when url_prefix points to a hostname with dynamically scaled instances behind it. See https://docs.victoriametrics.com/vmauth.html#discovering-backend-ips")
discoverBackendIPsInterval = flag.Duration("discoverBackendIPsInterval", 10*time.Second, "The interval for re-discovering backend IPs if -discoverBackendIPs command-line flag is set. "+
"Too low value may lead to DNS errors")
httpAuthHeader = flagutil.NewArrayString("httpAuthHeader", "HTTP request header to use for obtaining authorization tokens. By default auth tokens are read from Authorization request header")
"Supported policies: least_loaded, first_available. See https://docs.victoriametrics.com/vmauth.html#load-balancing for more details")
)
// AuthConfig represents auth config.
type AuthConfig struct {
Users []UserInfo `yaml:"users,omitempty"`
UnauthorizedUser *UserInfo `yaml:"unauthorized_user,omitempty"`
// ms holds all the metrics for the given AuthConfig
ms *metrics.Set
}
// UserInfo is user information read from authConfigPath
type UserInfo struct {
Name string `yaml:"name,omitempty"`
BearerToken string `yaml:"bearer_token,omitempty"`
AuthToken string `yaml:"auth_token,omitempty"`
Username string `yaml:"username,omitempty"`
Password string `yaml:"password,omitempty"`
Name string `yaml:"name,omitempty"`
BearerToken string `yaml:"bearer_token,omitempty"`
Username string `yaml:"username,omitempty"`
Password string `yaml:"password,omitempty"`
URLPrefix *URLPrefix `yaml:"url_prefix,omitempty"`
DiscoverBackendIPs *bool `yaml:"discover_backend_ips,omitempty"`
URLMaps []URLMap `yaml:"url_map,omitempty"`
HeadersConf HeadersConf `yaml:",inline"`
MaxConcurrentRequests int `yaml:"max_concurrent_requests,omitempty"`
@@ -76,15 +60,12 @@ type UserInfo struct {
TLSInsecureSkipVerify *bool `yaml:"tls_insecure_skip_verify,omitempty"`
TLSCAFile string `yaml:"tls_ca_file,omitempty"`
MetricLabels map[string]string `yaml:"metric_labels,omitempty"`
concurrencyLimitCh chan struct{}
concurrencyLimitReached *metrics.Counter
httpTransport *http.Transport
requests *metrics.Counter
backendErrors *metrics.Counter
requestsDuration *metrics.Summary
}
@@ -120,8 +101,6 @@ func (ui *UserInfo) getMaxConcurrentRequests() int {
type Header struct {
Name string
Value string
sOriginal string
}
// UnmarshalYAML unmarshals h from f.
@@ -130,8 +109,6 @@ func (h *Header) UnmarshalYAML(f func(interface{}) error) error {
if err := f(&s); err != nil {
return err
}
h.sOriginal = s
n := strings.IndexByte(s, ':')
if n < 0 {
return fmt.Errorf("missing speparator char ':' between Name and Value in the header %q; expected format - 'Name: Value'", s)
@@ -143,29 +120,21 @@ func (h *Header) UnmarshalYAML(f func(interface{}) error) error {
// MarshalYAML marshals h to yaml.
func (h *Header) MarshalYAML() (interface{}, error) {
return h.sOriginal, nil
s := fmt.Sprintf("%s: %s", h.Name, h.Value)
return s, nil
}
// URLMap is a mapping from source paths to target urls.
type URLMap struct {
// SrcPaths is an optional list of regular expressions, which must match the request path.
SrcPaths []*Regex `yaml:"src_paths,omitempty"`
// SrcHosts is an optional list of regular expressions, which must match the request hostname.
// SrcHosts is the list of regular expressions, which match the request hostname.
SrcHosts []*Regex `yaml:"src_hosts,omitempty"`
// SrcQueryArgs is an optional list of query args, which must match request URL query args.
SrcQueryArgs []QueryArg `yaml:"src_query_args,omitempty"`
// SrcHeaders is an optional list of headers, which must match request headers.
SrcHeaders []Header `yaml:"src_headers,omitempty"`
// SrcPaths is the list of regular expressions, which match the request path.
SrcPaths []*Regex `yaml:"src_paths,omitempty"`
// UrlPrefix contains backend url prefixes for the proxied request url.
URLPrefix *URLPrefix `yaml:"url_prefix,omitempty"`
// DiscoverBackendIPs instructs discovering URLPrefix backend IPs via DNS.
DiscoverBackendIPs *bool `yaml:"discover_backend_ips,omitempty"`
// HeadersConf is the config for augmenting request and response headers.
HeadersConf HeadersConf `yaml:",inline"`
@@ -181,70 +150,25 @@ type URLMap struct {
// Regex represents a regex
type Regex struct {
re *regexp.Regexp
sOriginal string
}
// QueryArg represents HTTP query arg
type QueryArg struct {
Name string
Value string
sOriginal string
}
// UnmarshalYAML unmarshals up from yaml.
func (qa *QueryArg) UnmarshalYAML(f func(interface{}) error) error {
var s string
if err := f(&s); err != nil {
return err
}
qa.sOriginal = s
n := strings.IndexByte(s, '=')
if n >= 0 {
qa.Name = s[:n]
qa.Value = s[n+1:]
}
return nil
}
// MarshalYAML marshals up to yaml.
func (qa *QueryArg) MarshalYAML() (interface{}, error) {
return qa.sOriginal, nil
re *regexp.Regexp
}
// URLPrefix represents passed `url_prefix`
type URLPrefix struct {
n uint32
// the list of backend urls
bus []*backendURL
// requests are re-tried on other backend urls for these http response status codes
retryStatusCodes []int
// load balancing policy used
loadBalancingPolicy string
// how many request path prefix parts to drop before routing the request to backendURL
// how many request path prefix parts to drop before routing the request to backendURL.
dropSrcPathPrefixParts int
// busOriginal contains the original list of backends specified in yaml config.
busOriginal []*url.URL
// n is an atomic counter, which is used for balancing load among available backends.
n atomic.Uint32
// the list of backend urls
//
// the list can be dynamically updated if `discover_backend_ips` option is set.
bus atomic.Pointer[[]*backendURL]
// if this option is set, then backend ips for busOriginal are periodically re-discovered and put to bus.
discoverBackendIPs bool
// The next deadline for DNS-based discovery of backend IPs
nextDiscoveryDeadline atomic.Uint64
// vOriginal contains the original yaml value for URLPrefix.
vOriginal interface{}
}
func (up *URLPrefix) setLoadBalancingPolicy(loadBalancingPolicy string) error {
@@ -260,146 +184,49 @@ func (up *URLPrefix) setLoadBalancingPolicy(loadBalancingPolicy string) error {
}
type backendURL struct {
brokenDeadline atomic.Uint64
concurrentRequests atomic.Int32
url *url.URL
brokenDeadline uint64
concurrentRequests int32
url *url.URL
}
func (bu *backendURL) isBroken() bool {
ct := fasttime.UnixTimestamp()
return ct < bu.brokenDeadline.Load()
return ct < atomic.LoadUint64(&bu.brokenDeadline)
}
func (bu *backendURL) setBroken() {
deadline := fasttime.UnixTimestamp() + uint64((*failTimeout).Seconds())
bu.brokenDeadline.Store(deadline)
atomic.StoreUint64(&bu.brokenDeadline, deadline)
}
func (bu *backendURL) get() {
bu.concurrentRequests.Add(1)
atomic.AddInt32(&bu.concurrentRequests, 1)
}
func (bu *backendURL) put() {
bu.concurrentRequests.Add(-1)
atomic.AddInt32(&bu.concurrentRequests, -1)
}
func (up *URLPrefix) getBackendsCount() int {
pbus := up.bus.Load()
return len(*pbus)
return len(up.bus)
}
// getBackendURL returns the backendURL depending on the load balance policy.
//
// backendURL.put() must be called on the returned backendURL after the request is complete.
func (up *URLPrefix) getBackendURL() *backendURL {
up.discoverBackendIPsIfNeeded()
pbus := up.bus.Load()
bus := *pbus
if up.loadBalancingPolicy == "first_available" {
return getFirstAvailableBackendURL(bus)
return up.getFirstAvailableBackendURL()
}
return getLeastLoadedBackendURL(bus, &up.n)
}
func (up *URLPrefix) discoverBackendIPsIfNeeded() {
if !up.discoverBackendIPs {
// The discovery is disabled.
return
}
ct := fasttime.UnixTimestamp()
deadline := up.nextDiscoveryDeadline.Load()
if ct < deadline {
// There is no need in discovering backends.
return
}
intervalSec := math.Ceil(discoverBackendIPsInterval.Seconds())
if intervalSec <= 0 {
intervalSec = 1
}
nextDeadline := ct + uint64(intervalSec)
if !up.nextDiscoveryDeadline.CompareAndSwap(deadline, nextDeadline) {
// Concurrent goroutine already started the discovery.
return
}
// Discover ips for all the backendURLs
ctx, cancel := context.WithTimeout(context.Background(), time.Second*time.Duration(intervalSec))
hostToIPs := make(map[string][]string)
for _, bu := range up.busOriginal {
host := bu.Hostname()
if hostToIPs[host] != nil {
// ips for the given host have been already discovered
continue
}
addrs, err := resolver.LookupIPAddr(ctx, host)
var ips []string
if err != nil {
logger.Warnf("cannot discover backend IPs for %s: %s; use it literally", bu, err)
ips = []string{host}
} else {
ips = make([]string, len(addrs))
for i, addr := range addrs {
ips[i] = addr.String()
}
// sort ips, so they could be compared below in areEqualBackendURLs()
sort.Strings(ips)
}
hostToIPs[host] = ips
}
cancel()
// generate new backendURLs for the resolved IPs
var busNew []*backendURL
for _, bu := range up.busOriginal {
host := bu.Hostname()
port := bu.Port()
for _, ip := range hostToIPs[host] {
buCopy := *bu
buCopy.Host = ip
if port != "" {
buCopy.Host += ":" + port
}
busNew = append(busNew, &backendURL{
url: &buCopy,
})
}
}
pbus := up.bus.Load()
if areEqualBackendURLs(*pbus, busNew) {
return
}
// Store new backend urls
up.bus.Store(&busNew)
}
func areEqualBackendURLs(a, b []*backendURL) bool {
if len(a) != len(b) {
return false
}
for i, aURL := range a {
bURL := b[i]
if aURL.url.String() != bURL.url.String() {
return false
}
}
return true
}
var resolver = &net.Resolver{
PreferGo: true,
StrictErrors: true,
return up.getLeastLoadedBackendURL()
}
// getFirstAvailableBackendURL returns the first available backendURL, which isn't broken.
//
// backendURL.put() must be called on the returned backendURL after the request is complete.
func getFirstAvailableBackendURL(bus []*backendURL) *backendURL {
func (up *URLPrefix) getFirstAvailableBackendURL() *backendURL {
bus := up.bus
bu := bus[0]
if !bu.isBroken() {
// Fast path - send the request to the first url.
@@ -421,7 +248,8 @@ func getFirstAvailableBackendURL(bus []*backendURL) *backendURL {
// getLeastLoadedBackendURL returns the backendURL with the minimum number of concurrent requests.
//
// backendURL.put() must be called on the returned backendURL after the request is complete.
func getLeastLoadedBackendURL(bus []*backendURL, atomicCounter *atomic.Uint32) *backendURL {
func (up *URLPrefix) getLeastLoadedBackendURL() *backendURL {
bus := up.bus
if len(bus) == 1 {
// Fast path - return the only backend url.
bu := bus[0]
@@ -430,7 +258,7 @@ func getLeastLoadedBackendURL(bus []*backendURL, atomicCounter *atomic.Uint32) *
}
// Slow path - select other backend urls.
n := atomicCounter.Add(1)
n := atomic.AddUint32(&up.n, 1)
for i := uint32(0); i < uint32(len(bus)); i++ {
idx := (n + i) % uint32(len(bus))
@@ -438,22 +266,20 @@ func getLeastLoadedBackendURL(bus []*backendURL, atomicCounter *atomic.Uint32) *
if bu.isBroken() {
continue
}
if bu.concurrentRequests.Load() == 0 {
if atomic.CompareAndSwapInt32(&bu.concurrentRequests, 0, 1) {
// Fast path - return the backend with zero concurrently executed requests.
// Do not use CompareAndSwap() instead of Load(), since it is much slower on systems with many CPU cores.
bu.concurrentRequests.Add(1)
return bu
}
}
// Slow path - return the backend with the minimum number of concurrently executed requests.
buMin := bus[n%uint32(len(bus))]
minRequests := buMin.concurrentRequests.Load()
minRequests := atomic.LoadInt32(&buMin.concurrentRequests)
for _, bu := range bus {
if bu.isBroken() {
continue
}
if n := bu.concurrentRequests.Load(); n < minRequests {
if n := atomic.LoadInt32(&bu.concurrentRequests); n < minRequests {
buMin = bu
minRequests = n
}
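
A self-contained sketch of the least_loaded policy above, with backendURL reduced to an in-flight counter (illustrative: the real code also skips broken backends and has a zero-requests fast path):

package main

import (
	"fmt"
	"sync/atomic"
)

type backend struct {
	concurrent atomic.Int32
	addr       string
}

// pickLeastLoaded returns the backend with the fewest in-flight requests,
// starting the scan at a rotating offset so equally loaded backends share
// the traffic; the caller must decrement the counter when done.
func pickLeastLoaded(bus []*backend, rr *atomic.Uint32) *backend {
	n := rr.Add(1)
	best := bus[n%uint32(len(bus))]
	for _, bu := range bus {
		if bu.concurrent.Load() < best.concurrent.Load() {
			best = bu
		}
	}
	best.concurrent.Add(1)
	return best
}

func main() {
	bus := []*backend{{addr: "a"}, {addr: "b"}}
	var rr atomic.Uint32
	bus[0].concurrent.Store(3)
	fmt.Println(pickLeastLoaded(bus, &rr).addr) // "b": fewer in-flight requests
}
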
@@ -468,7 +294,6 @@ func (up *URLPrefix) UnmarshalYAML(f func(interface{}) error) error {
if err := f(&v); err != nil {
return err
}
up.vOriginal = v
var urls []string
switch x := v.(type) {
@@ -491,21 +316,38 @@ func (up *URLPrefix) UnmarshalYAML(f func(interface{}) error) error {
return fmt.Errorf("unexpected type for `url_prefix`: %T; want string or []string", v)
}
bus := make([]*url.URL, len(urls))
bus := make([]*backendURL, len(urls))
for i, u := range urls {
pu, err := url.Parse(u)
if err != nil {
return fmt.Errorf("cannot unmarshal %q into url: %w", u, err)
}
bus[i] = pu
bus[i] = &backendURL{
url: pu,
}
}
up.busOriginal = bus
up.bus = bus
return nil
}
// MarshalYAML marshals up to yaml.
func (up *URLPrefix) MarshalYAML() (interface{}, error) {
return up.vOriginal, nil
var b []byte
if len(up.bus) == 1 {
u := up.bus[0].url.String()
b = strconv.AppendQuote(b, u)
return string(b), nil
}
b = append(b, '[')
for i, bu := range up.bus {
u := bu.url.String()
b = strconv.AppendQuote(b, u)
if i+1 < len(up.bus) {
b = append(b, ',')
}
}
b = append(b, ']')
return string(b), nil
}
func (r *Regex) match(s string) bool {
@@ -526,13 +368,12 @@ func (r *Regex) UnmarshalYAML(f func(interface{}) error) error {
if err := f(&s); err != nil {
return err
}
r.sOriginal = s
sAnchored := "^(?:" + s + ")$"
re, err := regexp.Compile(sAnchored)
if err != nil {
return fmt.Errorf("cannot build regexp from %q: %w", s, err)
}
r.sOriginal = s
r.re = re
return nil
}
@@ -621,19 +462,17 @@ func authConfigReloader(sighupCh <-chan os.Signal) {
// authConfigData needs to be updated each time authConfig is updated.
var authConfigData atomic.Pointer[[]byte]
var (
authConfig atomic.Pointer[AuthConfig]
authUsers atomic.Pointer[map[string]*UserInfo]
authConfigWG sync.WaitGroup
stopCh chan struct{}
)
var authConfig atomic.Pointer[AuthConfig]
var authUsers atomic.Pointer[map[string]*UserInfo]
var authConfigWG sync.WaitGroup
var stopCh chan struct{}
// loadAuthConfig loads and applies the config from *authConfigPath.
// It returns a bool value indicating whether the new config was applied.
// The config may not be applied if there is a parsing error
// or if there are no changes compared to the current authConfig.
func loadAuthConfig() (bool, error) {
data, err := fscore.ReadFileOrHTTP(*authConfigPath)
data, err := fs.ReadFileOrHTTP(*authConfigPath)
if err != nil {
return false, fmt.Errorf("failed to read -auth.config=%q: %w", *authConfigPath, err)
}
@@ -655,11 +494,6 @@ func loadAuthConfig() (bool, error) {
}
logger.Infof("loaded information about %d users from -auth.config=%q", len(m), *authConfigPath)
prevAc := authConfig.Load()
if prevAc != nil {
metrics.UnregisterSet(prevAc.ms)
}
metrics.RegisterSet(ac.ms)
authConfig.Store(ac)
authConfigData.Store(&data)
authUsers.Store(&m)
@@ -672,13 +506,10 @@ func parseAuthConfig(data []byte) (*AuthConfig, error) {
if err != nil {
return nil, fmt.Errorf("cannot expand environment vars: %w", err)
}
ac := &AuthConfig{
ms: metrics.NewSet(),
}
if err = yaml.UnmarshalStrict(data, ac); err != nil {
var ac AuthConfig
if err = yaml.UnmarshalStrict(data, &ac); err != nil {
return nil, fmt.Errorf("cannot unmarshal AuthConfig data: %w", err)
}
ui := ac.UnauthorizedUser
if ui != nil {
if ui.Username != "" {
@@ -690,39 +521,29 @@ func parseAuthConfig(data []byte) (*AuthConfig, error) {
if ui.BearerToken != "" {
return nil, fmt.Errorf("field bearer_token can't be specified for unauthorized_user section")
}
if ui.AuthToken != "" {
return nil, fmt.Errorf("field auth_token can't be specified for unauthorized_user section")
}
if ui.Name != "" {
return nil, fmt.Errorf("field name can't be specified for unauthorized_user section")
}
if err := ui.initURLs(); err != nil {
return nil, err
}
metricLabels, err := ui.getMetricLabels()
if err != nil {
return nil, fmt.Errorf("cannot parse metric_labels for unauthorized_user: %w", err)
}
ui.requests = ac.ms.NewCounter(`vmauth_unauthorized_user_requests_total` + metricLabels)
ui.backendErrors = ac.ms.NewCounter(`vmauth_unauthorized_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = ac.ms.NewSummary(`vmauth_unauthorized_user_request_duration_seconds` + metricLabels)
ui.requests = metrics.GetOrCreateCounter(`vmauth_unauthorized_user_requests_total`)
ui.requestsDuration = metrics.GetOrCreateSummary(`vmauth_unauthorized_user_request_duration_seconds`)
ui.concurrencyLimitCh = make(chan struct{}, ui.getMaxConcurrentRequests())
ui.concurrencyLimitReached = ac.ms.NewCounter(`vmauth_unauthorized_user_concurrent_requests_limit_reached_total` + metricLabels)
_ = ac.ms.NewGauge(`vmauth_unauthorized_user_concurrent_requests_capacity`+metricLabels, func() float64 {
ui.concurrencyLimitReached = metrics.GetOrCreateCounter(`vmauth_unauthorized_user_concurrent_requests_limit_reached_total`)
_ = metrics.GetOrCreateGauge(`vmauth_unauthorized_user_concurrent_requests_capacity`, func() float64 {
return float64(cap(ui.concurrencyLimitCh))
})
_ = ac.ms.NewGauge(`vmauth_unauthorized_user_concurrent_requests_current`+metricLabels, func() float64 {
_ = metrics.GetOrCreateGauge(`vmauth_unauthorized_user_concurrent_requests_current`, func() float64 {
return float64(len(ui.concurrencyLimitCh))
})
tr, err := getTransport(ui.TLSInsecureSkipVerify, ui.TLSCAFile)
if err != nil {
return nil, fmt.Errorf("cannot initialize HTTP transport: %w", err)
}
ui.httpTransport = tr
}
return ac, nil
return &ac, nil
}
func parseAuthConfigUsers(ac *AuthConfig) (map[string]*UserInfo, error) {
@@ -733,34 +554,43 @@ func parseAuthConfigUsers(ac *AuthConfig) (map[string]*UserInfo, error) {
byAuthToken := make(map[string]*UserInfo, len(uis))
for i := range uis {
ui := &uis[i]
ats, err := getAuthTokens(ui.AuthToken, ui.BearerToken, ui.Username, ui.Password)
if err != nil {
return nil, err
if ui.BearerToken == "" && ui.Username == "" {
return nil, fmt.Errorf("either bearer_token or username must be set")
}
for _, at := range ats {
if uiOld := byAuthToken[at]; uiOld != nil {
return nil, fmt.Errorf("duplicate auth token=%q found for username=%q, name=%q; the previous one is set for username=%q, name=%q",
at, ui.Username, ui.Name, uiOld.Username, uiOld.Name)
}
if ui.BearerToken != "" && ui.Username != "" {
return nil, fmt.Errorf("bearer_token=%q and username=%q cannot be set simultaneously", ui.BearerToken, ui.Username)
}
at1, at2 := getAuthTokens(ui.BearerToken, ui.Username, ui.Password)
if byAuthToken[at1] != nil {
return nil, fmt.Errorf("duplicate auth token found for bearer_token=%q, username=%q: %q", ui.BearerToken, ui.Username, at1)
}
if byAuthToken[at2] != nil {
return nil, fmt.Errorf("duplicate auth token found for bearer_token=%q, username=%q: %q", ui.BearerToken, ui.Username, at2)
}
if err := ui.initURLs(); err != nil {
return nil, err
}
metricLabels, err := ui.getMetricLabels()
if err != nil {
return nil, fmt.Errorf("cannot parse metric_labels: %w", err)
name := ui.name()
if ui.BearerToken != "" {
if ui.Password != "" {
return nil, fmt.Errorf("password shouldn't be set for bearer_token %q", ui.BearerToken)
}
ui.requests = metrics.GetOrCreateCounter(fmt.Sprintf(`vmauth_user_requests_total{username=%q}`, name))
ui.requestsDuration = metrics.GetOrCreateSummary(fmt.Sprintf(`vmauth_user_request_duration_seconds{username=%q}`, name))
}
if ui.Username != "" {
ui.requests = metrics.GetOrCreateCounter(fmt.Sprintf(`vmauth_user_requests_total{username=%q}`, name))
ui.requestsDuration = metrics.GetOrCreateSummary(fmt.Sprintf(`vmauth_user_request_duration_seconds{username=%q}`, name))
}
ui.requests = ac.ms.GetOrCreateCounter(`vmauth_user_requests_total` + metricLabels)
ui.backendErrors = ac.ms.GetOrCreateCounter(`vmauth_user_request_backend_errors_total` + metricLabels)
ui.requestsDuration = ac.ms.GetOrCreateSummary(`vmauth_user_request_duration_seconds` + metricLabels)
mcr := ui.getMaxConcurrentRequests()
ui.concurrencyLimitCh = make(chan struct{}, mcr)
ui.concurrencyLimitReached = ac.ms.GetOrCreateCounter(`vmauth_user_concurrent_requests_limit_reached_total` + metricLabels)
_ = ac.ms.GetOrCreateGauge(`vmauth_user_concurrent_requests_capacity`+metricLabels, func() float64 {
ui.concurrencyLimitReached = metrics.GetOrCreateCounter(fmt.Sprintf(`vmauth_user_concurrent_requests_limit_reached_total{username=%q}`, name))
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmauth_user_concurrent_requests_capacity{username=%q}`, name), func() float64 {
return float64(cap(ui.concurrencyLimitCh))
})
_ = ac.ms.GetOrCreateGauge(`vmauth_user_concurrent_requests_current`+metricLabels, func() float64 {
_ = metrics.GetOrCreateGauge(fmt.Sprintf(`vmauth_user_concurrent_requests_current{username=%q}`, name), func() float64 {
return float64(len(ui.concurrencyLimitCh))
})
@@ -770,43 +600,18 @@ func parseAuthConfigUsers(ac *AuthConfig) (map[string]*UserInfo, error) {
}
ui.httpTransport = tr
for _, at := range ats {
byAuthToken[at] = ui
}
byAuthToken[at1] = ui
byAuthToken[at2] = ui
}
return byAuthToken, nil
}
var labelNameRegexp = regexp.MustCompile("^[a-zA-Z_:.][a-zA-Z0-9_:.]*$")
func (ui *UserInfo) getMetricLabels() (string, error) {
name := ui.name()
if len(name) == 0 && len(ui.MetricLabels) == 0 {
// fast path
return "", nil
}
labels := make([]string, 0, len(ui.MetricLabels)+1)
if len(name) > 0 {
labels = append(labels, fmt.Sprintf(`username=%q`, name))
}
for k, v := range ui.MetricLabels {
if !labelNameRegexp.MatchString(k) {
return "", fmt.Errorf("incorrect label name=%q, it must match regex=%q for user=%q", k, labelNameRegexp, name)
}
labels = append(labels, fmt.Sprintf(`%s=%q`, k, v))
}
sort.Strings(labels)
labelsStr := "{" + strings.Join(labels, ",") + "}"
return labelsStr, nil
}
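
getMetricLabels above produces a deterministic label set by validating each label name, adding a `username` label, and sorting the pairs. A minimal sketch of that construction (label values are made up; the real function also handles the empty fast path):

```go
package main

import (
	"fmt"
	"regexp"
	"sort"
	"strings"
)

var labelName = regexp.MustCompile("^[a-zA-Z_:.][a-zA-Z0-9_:.]*$")

func labelsString(username string, extra map[string]string) (string, error) {
	labels := []string{fmt.Sprintf(`username=%q`, username)}
	for k, v := range extra {
		if !labelName.MatchString(k) {
			return "", fmt.Errorf("invalid label name %q", k)
		}
		labels = append(labels, fmt.Sprintf(`%s=%q`, k, v))
	}
	// Sorting makes the output independent of map iteration order.
	sort.Strings(labels)
	return "{" + strings.Join(labels, ",") + "}", nil
}

func main() {
	s, err := labelsString("user1", map[string]string{"env": "prod", "dc": "dc1"})
	fmt.Println(s, err) // {dc="dc1",env="prod",username="user1"} <nil>
}
```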
func (ui *UserInfo) initURLs() error {
retryStatusCodes := defaultRetryStatusCodes.Values()
loadBalancingPolicy := *defaultLoadBalancingPolicy
dropSrcPathPrefixParts := 0
discoverBackendIPs := *discoverBackendIPsGlobal
if ui.URLPrefix != nil {
if err := ui.URLPrefix.sanitizeAndInitialize(); err != nil {
if err := ui.URLPrefix.sanitize(); err != nil {
return err
}
if ui.RetryStatusCodes != nil {
@@ -818,35 +623,30 @@ func (ui *UserInfo) initURLs() error {
if ui.DropSrcPathPrefixParts != nil {
dropSrcPathPrefixParts = *ui.DropSrcPathPrefixParts
}
if ui.DiscoverBackendIPs != nil {
discoverBackendIPs = *ui.DiscoverBackendIPs
}
ui.URLPrefix.retryStatusCodes = retryStatusCodes
ui.URLPrefix.dropSrcPathPrefixParts = dropSrcPathPrefixParts
ui.URLPrefix.discoverBackendIPs = discoverBackendIPs
if err := ui.URLPrefix.setLoadBalancingPolicy(loadBalancingPolicy); err != nil {
return err
}
}
if ui.DefaultURL != nil {
if err := ui.DefaultURL.sanitizeAndInitialize(); err != nil {
if err := ui.DefaultURL.sanitize(); err != nil {
return err
}
}
for _, e := range ui.URLMaps {
if len(e.SrcPaths) == 0 && len(e.SrcHosts) == 0 && len(e.SrcQueryArgs) == 0 && len(e.SrcHeaders) == 0 {
return fmt.Errorf("missing `src_paths`, `src_hosts`, `src_query_args` and `src_headers` in `url_map`")
if len(e.SrcPaths) == 0 && len(e.SrcHosts) == 0 {
return fmt.Errorf("missing `src_paths` and `src_hosts` in `url_map`")
}
if e.URLPrefix == nil {
return fmt.Errorf("missing `url_prefix` in `url_map`")
}
if err := e.URLPrefix.sanitizeAndInitialize(); err != nil {
if err := e.URLPrefix.sanitize(); err != nil {
return err
}
rscs := retryStatusCodes
lbp := loadBalancingPolicy
dsp := dropSrcPathPrefixParts
dbd := discoverBackendIPs
if e.RetryStatusCodes != nil {
rscs = e.RetryStatusCodes
}
@@ -856,18 +656,14 @@ func (ui *UserInfo) initURLs() error {
if e.DropSrcPathPrefixParts != nil {
dsp = *e.DropSrcPathPrefixParts
}
if e.DiscoverBackendIPs != nil {
dbd = *e.DiscoverBackendIPs
}
e.URLPrefix.retryStatusCodes = rscs
if err := e.URLPrefix.setLoadBalancingPolicy(lbp); err != nil {
return err
}
e.URLPrefix.dropSrcPathPrefixParts = dsp
e.URLPrefix.discoverBackendIPs = dbd
}
if len(ui.URLMaps) == 0 && ui.URLPrefix == nil {
return fmt.Errorf("missing `url_prefix` or `url_map`")
return fmt.Errorf("missing `url_prefix`")
}
return nil
}
@@ -880,100 +676,39 @@ func (ui *UserInfo) name() string {
return ui.Username
}
if ui.BearerToken != "" {
h := xxhash.Sum64([]byte(ui.BearerToken))
return fmt.Sprintf("bearer_token:hash:%016X", h)
}
if ui.AuthToken != "" {
h := xxhash.Sum64([]byte(ui.AuthToken))
return fmt.Sprintf("auth_token:hash:%016X", h)
return "bearer_token"
}
return ""
}
func getAuthTokens(authToken, bearerToken, username, password string) ([]string, error) {
if authToken != "" {
if bearerToken != "" {
return nil, fmt.Errorf("bearer_token cannot be specified if auth_token is set")
}
if username != "" || password != "" {
return nil, fmt.Errorf("username and password cannot be specified if auth_token is set")
}
at := getHTTPAuthToken(authToken)
return []string{at}, nil
}
func getAuthTokens(bearerToken, username, password string) (string, string) {
if bearerToken != "" {
if username != "" || password != "" {
return nil, fmt.Errorf("username and password cannot be specified if bearer_token is set")
}
// Accept the bearerToken as Basic Auth username with empty password
at1 := getHTTPAuthBearerToken(bearerToken)
at2 := getHTTPAuthBasicToken(bearerToken, "")
return []string{at1, at2}, nil
at1 := getAuthToken(bearerToken, "", "")
at2 := getAuthToken("", bearerToken, "")
return at1, at2
}
if username != "" {
at := getHTTPAuthBasicToken(username, password)
return []string{at}, nil
at := getAuthToken("", username, password)
return at, at
}
func getAuthToken(bearerToken, username, password string) string {
if bearerToken != "" {
return "Bearer " + bearerToken
}
return nil, fmt.Errorf("missing authorization options; bearer_token or username must be set")
}
func getHTTPAuthToken(authToken string) string {
return "http_auth:" + authToken
}
func getHTTPAuthBearerToken(bearerToken string) string {
return "http_auth:Bearer " + bearerToken
}
func getHTTPAuthBasicToken(username, password string) string {
token := username + ":" + password
token64 := base64.StdEncoding.EncodeToString([]byte(token))
return "http_auth:Basic " + token64
return "Basic " + token64
}
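
The Basic auth token above is the standard RFC 7617 construction: `Basic ` plus base64 of `username:password`. Note also that a bearer token is registered under two map keys, so clients may send it either as `Authorization: Bearer <token>` or as the Basic username with an empty password. A standalone check of the Basic form:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

func basicToken(username, password string) string {
	return "Basic " + base64.StdEncoding.EncodeToString([]byte(username+":"+password))
}

func main() {
	fmt.Println(basicToken("foo", "bar")) // Basic Zm9vOmJhcg==
}
```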
var defaultHeaderNames = []string{"Authorization"}
func getAuthTokensFromRequest(r *http.Request) []string {
var ats []string
// Obtain possible auth tokens from one of the allowed auth headers
headerNames := *httpAuthHeader
if len(headerNames) == 0 {
headerNames = defaultHeaderNames
}
for _, headerName := range headerNames {
if ah := r.Header.Get(headerName); ah != "" {
if strings.HasPrefix(ah, "Token ") {
// Handle InfluxDB's proprietary token authentication scheme as a bearer token authentication
// See https://docs.influxdata.com/influxdb/v2.0/api/
ah = strings.Replace(ah, "Token", "Bearer", 1)
}
at := "http_auth:" + ah
ats = append(ats, at)
}
}
return ats
}
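
getAuthTokensFromRequest accepts InfluxDB's proprietary `Authorization: Token <secret>` scheme by rewriting it to a standard bearer token before the lookup. A minimal sketch of that normalization step:

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeAuthHeader converts InfluxDB's "Token ..." scheme into "Bearer ...".
func normalizeAuthHeader(ah string) string {
	if strings.HasPrefix(ah, "Token ") {
		ah = strings.Replace(ah, "Token", "Bearer", 1)
	}
	return ah
}

func main() {
	fmt.Println(normalizeAuthHeader("Token abc123"))  // Bearer abc123
	fmt.Println(normalizeAuthHeader("Bearer abc123")) // unchanged
}
```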
func (up *URLPrefix) sanitizeAndInitialize() error {
for i, bu := range up.busOriginal {
puNew, err := sanitizeURLPrefix(bu)
func (up *URLPrefix) sanitize() error {
for _, bu := range up.bus {
puNew, err := sanitizeURLPrefix(bu.url)
if err != nil {
return err
}
up.busOriginal[i] = puNew
bu.url = puNew
}
// Initialize up.bus
bus := make([]*backendURL, len(up.busOriginal))
for i, bu := range up.busOriginal {
bus[i] = &backendURL{
url: bu,
}
}
up.bus.Store(&bus)
return nil
}


@@ -17,9 +17,9 @@ func TestParseAuthConfigFailure(t *testing.T) {
if err != nil {
return
}
users, err := parseAuthConfigUsers(ac)
_, err = parseAuthConfigUsers(ac)
if err == nil {
t.Fatalf("expecting non-nil error; got %v", users)
t.Fatalf("expecting non-nil error")
}
}
@@ -88,22 +88,6 @@ users:
url_prefix: []
`)
// auth_token and username in a single config
f(`
users:
- auth_token: foo
username: bbb
url_prefix: http://foo.bar
`)
// auth_token and bearer_token in a single config
f(`
users:
- auth_token: foo
bearer_token: bbb
url_prefix: http://foo.bar
`)
// Username and bearer_token in a single config
f(`
users:
@@ -208,7 +192,7 @@ users:
- url_prefix: http://foobar
`)
// Invalid regexp in src_paths
// Invalid regexp in src_path.
f(`
users:
- username: a
@@ -226,24 +210,6 @@ users:
url_prefix: http://foobar
`)
// Invalid src_query_args
f(`
users:
- username: a
url_map:
- src_query_args: abc
url_prefix: http://foobar
`)
// Invalid src_headers
f(`
users:
- username: a
url_map:
- src_headers: abc
url_prefix: http://foobar
`)
// Invalid headers in url_map (missing ':')
f(`
users:
@@ -263,14 +229,6 @@ users:
url_prefix: http://foobar
headers:
aaa: bbb
`)
// Invalid metric label name
f(`
users:
- username: foo
url_prefix: http://foo.bar
metric_labels:
not-prometheus-compatible: value
`)
}
@@ -291,9 +249,8 @@ func TestParseAuthConfigSuccess(t *testing.T) {
}
}
insecureSkipVerifyTrue := true
// Single user
insecureSkipVerifyTrue := true
f(`
users:
- username: foo
@@ -302,7 +259,7 @@ users:
max_concurrent_requests: 5
tls_insecure_skip_verify: true
`, map[string]*UserInfo{
getHTTPAuthBasicToken("foo", "bar"): {
getAuthToken("", "foo", "bar"): {
Username: "foo",
Password: "bar",
URLPrefix: mustParseURL("http://aaa:343/bbb"),
@@ -311,22 +268,6 @@ users:
},
})
// Single user with auth_token
f(`
users:
- auth_token: foo
url_prefix: http://aaa:343/bbb
max_concurrent_requests: 5
tls_insecure_skip_verify: true
`, map[string]*UserInfo{
getHTTPAuthToken("foo"): {
AuthToken: "foo",
URLPrefix: mustParseURL("http://aaa:343/bbb"),
MaxConcurrentRequests: 5,
TLSInsecureSkipVerify: &insecureSkipVerifyTrue,
},
})
// Multiple url_prefix entries
insecureSkipVerifyFalse := false
f(`
@@ -341,7 +282,7 @@ users:
load_balancing_policy: first_available
drop_src_path_prefix_parts: 1
`, map[string]*UserInfo{
getHTTPAuthBasicToken("foo", "bar"): {
getAuthToken("", "foo", "bar"): {
Username: "foo",
Password: "bar",
URLPrefix: mustParseURLs([]string{
@@ -361,60 +302,19 @@ users:
- username: foo
url_prefix: http://foo
- username: bar
url_prefix: https://bar/x/
url_prefix: https://bar/x///
`, map[string]*UserInfo{
getHTTPAuthBasicToken("foo", ""): {
getAuthToken("", "foo", ""): {
Username: "foo",
URLPrefix: mustParseURL("http://foo"),
},
getHTTPAuthBasicToken("bar", ""): {
getAuthToken("", "bar", ""): {
Username: "bar",
URLPrefix: mustParseURL("https://bar/x/"),
URLPrefix: mustParseURL("https://bar/x"),
},
})
// non-empty URLMap
sharedUserInfo := &UserInfo{
BearerToken: "foo",
URLMaps: []URLMap{
{
SrcPaths: getRegexs([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
URLPrefix: mustParseURL("http://vmselect/select/0/prometheus"),
},
{
SrcHosts: getRegexs([]string{"foo\\.bar", "baz:1234"}),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
SrcQueryArgs: []QueryArg{
{
Name: "foo",
Value: "bar",
},
},
SrcHeaders: []Header{
{
Name: "TenantID",
Value: "345",
},
},
URLPrefix: mustParseURLs([]string{
"http://vminsert1/insert/0/prometheus",
"http://vminsert2/insert/0/prometheus",
}),
HeadersConf: HeadersConf{
RequestHeaders: []Header{
{
Name: "foo",
Value: "bar",
},
{
Name: "xxx",
Value: "y",
},
},
},
},
},
}
f(`
users:
- bearer_token: foo
@@ -423,18 +323,71 @@ users:
url_prefix: http://vmselect/select/0/prometheus
- src_paths: ["/api/v1/write"]
src_hosts: ["foo\\.bar", "baz:1234"]
src_query_args: ['foo=bar']
src_headers: ['TenantID: 345']
url_prefix: ["http://vminsert1/insert/0/prometheus","http://vminsert2/insert/0/prometheus"]
headers:
- "foo: bar"
- "xxx: y"
`, map[string]*UserInfo{
getHTTPAuthBearerToken("foo"): sharedUserInfo,
getHTTPAuthBasicToken("foo", ""): sharedUserInfo,
getAuthToken("foo", "", ""): {
BearerToken: "foo",
URLMaps: []URLMap{
{
SrcPaths: getRegexs([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
URLPrefix: mustParseURL("http://vmselect/select/0/prometheus"),
},
{
SrcHosts: getRegexs([]string{"foo\\.bar", "baz:1234"}),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
URLPrefix: mustParseURLs([]string{
"http://vminsert1/insert/0/prometheus",
"http://vminsert2/insert/0/prometheus",
}),
HeadersConf: HeadersConf{
RequestHeaders: []Header{
{
Name: "foo",
Value: "bar",
},
{
Name: "xxx",
Value: "y",
},
},
},
},
},
},
getAuthToken("", "foo", ""): {
BearerToken: "foo",
URLMaps: []URLMap{
{
SrcPaths: getRegexs([]string{"/api/v1/query", "/api/v1/query_range", "/api/v1/label/[^./]+/.+"}),
URLPrefix: mustParseURL("http://vmselect/select/0/prometheus"),
},
{
SrcHosts: getRegexs([]string{"foo\\.bar", "baz:1234"}),
SrcPaths: getRegexs([]string{"/api/v1/write"}),
URLPrefix: mustParseURLs([]string{
"http://vminsert1/insert/0/prometheus",
"http://vminsert2/insert/0/prometheus",
}),
HeadersConf: HeadersConf{
RequestHeaders: []Header{
{
Name: "foo",
Value: "bar",
},
{
Name: "xxx",
Value: "y",
},
},
},
},
},
},
})
// Multiple users with the same name - this should work, since these users have different passwords
// Multiple users with the same name
f(`
users:
- username: foo-same
@@ -442,20 +395,19 @@ users:
url_prefix: http://foo
- username: foo-same
password: bar
url_prefix: https://bar/x
url_prefix: https://bar/x///
`, map[string]*UserInfo{
getHTTPAuthBasicToken("foo-same", "baz"): {
getAuthToken("", "foo-same", "baz"): {
Username: "foo-same",
Password: "baz",
URLPrefix: mustParseURL("http://foo"),
},
getHTTPAuthBasicToken("foo-same", "bar"): {
getAuthToken("", "foo-same", "bar"): {
Username: "foo-same",
Password: "bar",
URLPrefix: mustParseURL("https://bar/x"),
},
})
// with default url
f(`
users:
@@ -472,7 +424,7 @@ users:
- http://default1/select/0/prometheus
- http://default2/select/0/prometheus
`, map[string]*UserInfo{
getHTTPAuthBearerToken("foo"): {
getAuthToken("foo", "", ""): {
BearerToken: "foo",
URLMaps: []URLMap{
{
@@ -504,7 +456,7 @@ users:
"http://default2/select/0/prometheus",
}),
},
getHTTPAuthBasicToken("foo", ""): {
getAuthToken("", "foo", ""): {
BearerToken: "foo",
URLMaps: []URLMap{
{
@@ -538,41 +490,6 @@ users:
},
})
// With metric_labels
f(`
users:
- username: foo-same
password: baz
url_prefix: http://foo
metric_labels:
dc: eu
team: dev
- username: foo-same
password: bar
url_prefix: https://bar/x
metric_labels:
backend_env: test
team: accounting
`, map[string]*UserInfo{
getHTTPAuthBasicToken("foo-same", "baz"): {
Username: "foo-same",
Password: "baz",
URLPrefix: mustParseURL("http://foo"),
MetricLabels: map[string]string{
"dc": "eu",
"team": "dev",
},
},
getHTTPAuthBasicToken("foo-same", "bar"): {
Username: "foo-same",
Password: "bar",
URLPrefix: mustParseURL("https://bar/x"),
MetricLabels: map[string]string{
"backend_env": "test",
"team": "accounting",
},
},
})
}
func TestParseAuthConfigPassesTLSVerificationConfig(t *testing.T) {
@@ -599,7 +516,7 @@ unauthorized_user:
t.Fatalf("unexpected error: %s", err)
}
ui := m[getHTTPAuthBasicToken("foo", "bar")]
ui := m[getAuthToken("", "foo", "bar")]
if !isSetBool(ui.TLSInsecureSkipVerify, true) || !ui.httpTransport.TLSClientConfig.InsecureSkipVerify {
t.Fatalf("unexpected TLSInsecureSkipVerify value for user foo")
}
@@ -609,86 +526,6 @@ unauthorized_user:
}
}
func TestUserInfoGetMetricLabels(t *testing.T) {
t.Run("empty-labels", func(t *testing.T) {
ui := &UserInfo{
Username: "user1",
}
labels, err := ui.getMetricLabels()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
labelsExpected := `{username="user1"}`
if labels != labelsExpected {
t.Fatalf("unexpected labels; got %s; want %s", labels, labelsExpected)
}
})
t.Run("non-empty-username", func(t *testing.T) {
ui := &UserInfo{
Username: "user1",
MetricLabels: map[string]string{
"env": "prod",
"datacenter": "dc1",
},
}
labels, err := ui.getMetricLabels()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
labelsExpected := `{datacenter="dc1",env="prod",username="user1"}`
if labels != labelsExpected {
t.Fatalf("unexpected labels; got %s; want %s", labels, labelsExpected)
}
})
t.Run("non-empty-name", func(t *testing.T) {
ui := &UserInfo{
Name: "user1",
BearerToken: "abc",
MetricLabels: map[string]string{
"env": "prod",
"datacenter": "dc1",
},
}
labels, err := ui.getMetricLabels()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
labelsExpected := `{datacenter="dc1",env="prod",username="user1"}`
if labels != labelsExpected {
t.Fatalf("unexpected labels; got %s; want %s", labels, labelsExpected)
}
})
t.Run("non-empty-bearer-token", func(t *testing.T) {
ui := &UserInfo{
BearerToken: "abc",
MetricLabels: map[string]string{
"env": "prod",
"datacenter": "dc1",
},
}
labels, err := ui.getMetricLabels()
if err != nil {
t.Fatalf("unexpected error: %s", err)
}
labelsExpected := `{datacenter="dc1",env="prod",username="bearer_token:hash:44BC2CF5AD770999"}`
if labels != labelsExpected {
t.Fatalf("unexpected labels; got %s; want %s", labels, labelsExpected)
}
})
t.Run("invalid-label", func(t *testing.T) {
ui := &UserInfo{
Username: "foo",
MetricLabels: map[string]string{
",{": "aaaa",
},
}
_, err := ui.getMetricLabels()
if err == nil {
t.Fatalf("expecting non-nil error")
}
})
}
func isSetBool(boolP *bool, expectedValue bool) bool {
if boolP == nil {
return false
@@ -734,7 +571,6 @@ func mustParseURL(u string) *URLPrefix {
func mustParseURLs(us []string) *URLPrefix {
bus := make([]*backendURL, len(us))
urls := make([]*url.URL, len(us))
for i, u := range us {
pu, err := url.Parse(u)
if err != nil {
@@ -743,17 +579,10 @@ func mustParseURLs(us []string) *URLPrefix {
bus[i] = &backendURL{
url: pu,
}
urls[i] = pu
}
up := &URLPrefix{}
if len(us) == 1 {
up.vOriginal = us[0]
} else {
up.vOriginal = us
return &URLPrefix{
bus: bus,
}
up.bus.Store(&bus)
up.busOriginal = urls
return up
}
func intp(n int) *int {


@@ -10,11 +10,6 @@ users:
- bearer_token: "XXXX"
url_prefix: "http://localhost:8428"
# Adds labels to the exported metrics for given user section
# label name must be prometheus compatible and match regex: `^[a-zA-Z_:.][a-zA-Z0-9_:.]*$`
metric_labels:
backend_dc: eu
access_team: dev
# Requests with the 'Authorization: Bearer YYY' header are proxied to http://localhost:8428 ,
# The `X-Scope-OrgID: foobar` http header is added to every proxied request.
# The `X-Server-Hostname:` http header is removed from the proxied response.


@@ -13,7 +13,6 @@ import (
"net/textproto"
"net/url"
"os"
"slices"
"strings"
"sync"
"time"
@@ -25,7 +24,7 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/encoding"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/envflag"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs/fscore"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fs"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/netutil"
@@ -34,8 +33,8 @@ import (
)
var (
httpListenAddrs = flagutil.NewArrayString("httpListenAddr", "TCP address to listen for incoming http requests. See also -tls and -httpListenAddr.useProxyProtocol")
useProxyProtocol = flagutil.NewArrayBool("httpListenAddr.useProxyProtocol", "Whether to use proxy protocol for connections accepted at the corresponding -httpListenAddr . "+
httpListenAddr = flag.String("httpListenAddr", ":8427", "TCP address to listen for http connections. See also -tls and -httpListenAddr.useProxyProtocol")
useProxyProtocol = flag.Bool("httpListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted at -httpListenAddr . "+
"See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt . "+
"With enabled proxy protocol http server cannot serve regular /metrics endpoint. Use -pushmetrics.url for metrics pushing")
maxIdleConnsPerBackend = flag.Int("maxIdleConnsPerBackend", 100, "The maximum number of idle connections vmauth can open per each backend host. "+
@@ -46,7 +45,7 @@ var (
maxConcurrentPerUserRequests = flag.Int("maxConcurrentPerUserRequests", 300, "The maximum number of concurrent requests vmauth can process per each configured user. "+
"Other requests are rejected with '429 Too Many Requests' http status code. See also -maxConcurrentRequests command-line option and max_concurrent_requests option "+
"in per-user config")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
reloadAuthKey = flag.String("reloadAuthKey", "", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
logInvalidAuthTokens = flag.Bool("logInvalidAuthTokens", false, "Whether to log requests with invalid auth tokens. "+
`Such requests are always counted at vmauth_http_request_errors_total{reason="invalid_auth_token"} metric, which is exposed at /metrics page`)
failTimeout = flag.Duration("failTimeout", 3*time.Second, "Sets a delay period for load balancing to skip a malfunctioning backend")
@@ -66,14 +65,10 @@ func main() {
buildinfo.Init()
logger.Init()
listenAddrs := *httpListenAddrs
if len(listenAddrs) == 0 {
listenAddrs = []string{":8427"}
}
logger.Infof("starting vmauth at %q...", listenAddrs)
logger.Infof("starting vmauth at %q...", *httpListenAddr)
startTime := time.Now()
initAuthConfig()
go httpserver.Serve(listenAddrs, useProxyProtocol, requestHandler)
go httpserver.Serve(*httpListenAddr, *useProxyProtocol, requestHandler)
logger.Infof("started vmauth in %.3f seconds", time.Since(startTime).Seconds())
pushmetrics.Init()
@@ -82,8 +77,8 @@ func main() {
pushmetrics.Stop()
startTime = time.Now()
logger.Infof("gracefully shutting down webservice at %q", listenAddrs)
if err := httpserver.Stop(listenAddrs); err != nil {
logger.Infof("gracefully shutting down webservice at %q", *httpListenAddr)
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop the webservice: %s", err)
}
logger.Infof("successfully shut down the webservice in %.3f seconds", time.Since(startTime).Seconds())
@@ -94,7 +89,7 @@ func main() {
func requestHandler(w http.ResponseWriter, r *http.Request) bool {
switch r.URL.Path {
case "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey.Get(), "reloadAuthKey") {
if !httpserver.CheckAuthFlag(w, r, *reloadAuthKey, "reloadAuthKey") {
return true
}
configReloadRequests.Inc()
@@ -102,9 +97,8 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
w.WriteHeader(http.StatusOK)
return true
}
ats := getAuthTokensFromRequest(r)
if len(ats) == 0 {
authToken := r.Header.Get("Authorization")
if authToken == "" {
// Process requests for unauthorized users
ui := authConfig.Load().UnauthorizedUser
if ui != nil {
@@ -116,12 +110,18 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
http.Error(w, "missing `Authorization` request header", http.StatusUnauthorized)
return true
}
if strings.HasPrefix(authToken, "Token ") {
// Handle InfluxDB's proprietary token authentication scheme as a bearer token authentication
// See https://docs.influxdata.com/influxdb/v2.0/api/
authToken = strings.Replace(authToken, "Token", "Bearer", 1)
}
ui := getUserInfoByAuthTokens(ats)
ac := *authUsers.Load()
ui := ac[authToken]
if ui == nil {
invalidAuthTokenRequests.Inc()
if *logInvalidAuthTokens {
err := fmt.Errorf("cannot authorize request with auth tokens %q", ats)
err := fmt.Errorf("cannot find the provided auth token %q in config", authToken)
err = &httpserver.ErrorWithStatusCode{
Err: err,
StatusCode: http.StatusUnauthorized,
@@ -137,17 +137,6 @@ func requestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
}
func getUserInfoByAuthTokens(ats []string) *UserInfo {
ac := *authUsers.Load()
for _, at := range ats {
ui := ac[at]
if ui != nil {
return ui
}
}
return nil
}
func processUserRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
startTime := time.Now()
defer ui.requestsDuration.UpdateDuration(startTime)
@@ -176,7 +165,7 @@ func processUserRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
u := normalizeURL(r.URL)
up, hc := ui.getURLPrefixAndHeaders(u, r.Header)
up, hc := ui.getURLPrefixAndHeaders(u)
isDefault := false
if up == nil {
if ui.DefaultURL == nil {
@@ -212,7 +201,7 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
} else { // Update path for regular routes.
targetURL = mergeURLs(targetURL, u, up.dropSrcPathPrefixParts)
}
ok := tryProcessingRequest(w, r, targetURL, hc, up.retryStatusCodes, ui)
ok := tryProcessingRequest(w, r, targetURL, hc, up.retryStatusCodes, ui.httpTransport)
bu.put()
if ok {
return
@@ -224,23 +213,15 @@ func processRequest(w http.ResponseWriter, r *http.Request, ui *UserInfo) {
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
}
func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url.URL, hc HeadersConf, retryStatusCodes []int, ui *UserInfo) bool {
func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url.URL, hc HeadersConf, retryStatusCodes []int, transport *http.Transport) bool {
// This code has been copied from net/http/httputil/reverseproxy.go
req := sanitizeRequestHeaders(r)
req.URL = targetURL
if req.URL.Scheme == "https" {
// Override req.Host only for https requests, since https server verifies hostnames during TLS handshake,
// so it expects the targetURL.Host in the request.
// There is no need to override req.Host for http requests, since the backend server
// is expected to properly process queries with the original req.Host.
req.Host = targetURL.Host
}
req.Host = targetURL.Host
updateHeadersByConfig(req.Header, hc.RequestHeaders)
res, err := ui.httpTransport.RoundTrip(req)
res, err := transport.RoundTrip(req)
rtb, rtbOK := req.Body.(*readTrackingBody)
if err != nil {
if errors.Is(err, context.Canceled) || errors.Is(err, context.DeadlineExceeded) {
@@ -248,20 +229,15 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
logger.Warnf("remoteAddr: %s; requestURI: %s; error when proxying response body from %s: %s", remoteAddr, requestURI, targetURL, err)
if errors.Is(err, context.DeadlineExceeded) {
// Timed out request must be counted as errors, since this usually means that the backend is slow.
ui.backendErrors.Inc()
}
return true
}
if !rtbOK || !rtb.canRetry() {
// Request body cannot be re-sent to another backend. Return the error to the client then.
err = &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("cannot proxy the request to %s: %w", targetURL, err),
Err: fmt.Errorf("cannot proxy the request to %q: %w", targetURL, err),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
return true
}
// Retry the request if its body wasn't read yet. This usually means that the backend isn't reachable.
@@ -271,20 +247,7 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
logger.Warnf("remoteAddr: %s; requestURI: %s; retrying the request to %s because of response error: %s", remoteAddr, req.URL, targetURL, err)
return false
}
if slices.Contains(retryStatusCodes, res.StatusCode) {
_ = res.Body.Close()
if !rtbOK || !rtb.canRetry() {
// If we get an error from the retry_status_codes list, but cannot execute retry,
// we consider such a request an error as well.
err := &httpserver.ErrorWithStatusCode{
Err: fmt.Errorf("got response status code=%d from %s, but cannot retry the request on another backend, because the request has been already consumed",
res.StatusCode, targetURL),
StatusCode: http.StatusServiceUnavailable,
}
httpserver.Errorf(w, r, "%s", err)
ui.backendErrors.Inc()
return true
}
if (rtbOK && rtb.canRetry()) && hasInt(retryStatusCodes, res.StatusCode) {
// Retry requests at other backends if it matches retryStatusCodes.
// See https://github.com/VictoriaMetrics/VictoriaMetrics/issues/4893
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
@@ -303,7 +266,6 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
copyBuf.B = bytesutil.ResizeNoCopyNoOverallocate(copyBuf.B, 16*1024)
_, err = io.CopyBuffer(w, res.Body, copyBuf.B)
copyBufPool.Put(copyBuf)
_ = res.Body.Close()
if err != nil && !netutil.IsTrivialNetworkError(err) {
remoteAddr := httpserver.GetQuotedRemoteAddr(r)
requestURI := httpserver.GetRequestURI(r)
@@ -313,6 +275,15 @@ func tryProcessingRequest(w http.ResponseWriter, r *http.Request, targetURL *url
return true
}
func hasInt(a []int, n int) bool {
for _, x := range a {
if x == n {
return true
}
}
return false
}
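
The retry branch above only fires when the response status is listed in `retry_status_codes` and the request body can still be replayed to another backend; otherwise the error is returned to the client. A compact restatement of that decision (`slices.Contains` matches the master side, `hasInt` the branch side):

```go
package main

import (
	"fmt"
	"slices"
)

func shouldRetry(retryStatusCodes []int, status int, bodyReplayable bool) bool {
	return bodyReplayable && slices.Contains(retryStatusCodes, status)
}

func main() {
	fmt.Println(shouldRetry([]int{500, 502, 503}, 502, true))  // true
	fmt.Println(shouldRetry([]int{500, 502, 503}, 502, false)) // false: body already consumed
	fmt.Println(shouldRetry([]int{500, 502, 503}, 200, true))  // false: not a retryable status
}
```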
var copyBufPool bytesutil.ByteBufferPool
func copyHeader(dst, src http.Header) {
@@ -422,10 +393,8 @@ func getTransport(insecureSkipVerifyP *bool, caFile string) (*http.Transport, er
return tr, nil
}
var (
transportMap = make(map[string]*http.Transport)
transportMapLock sync.Mutex
)
var transportMap = make(map[string]*http.Transport)
var transportMapLock sync.Mutex
func appendTransportKey(dst []byte, insecureSkipVerify bool, caFile string) []byte {
dst = encoding.MarshalBool(dst, insecureSkipVerify)
@@ -453,7 +422,7 @@ func newTransport(insecureSkipVerify bool, caFile string) (*http.Transport, erro
tlsCfg.ClientSessionCache = tls.NewLRUClientSessionCache(0)
tlsCfg.InsecureSkipVerify = insecureSkipVerify
if caFile != "" {
data, err := fscore.ReadFileOrHTTP(caFile)
data, err := fs.ReadFileOrHTTP(caFile)
if err != nil {
return nil, fmt.Errorf("cannot read tls_ca_file: %w", err)
}


@@ -1,10 +1,8 @@
package main
import (
"net/http"
"net/url"
"path"
"slices"
"strings"
)
@@ -51,22 +49,11 @@ func dropPrefixParts(path string, parts int) string {
return path
}
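
drop_src_path_prefix_parts removes the leading N path segments before the request is proxied, so with 2 parts `/a/b/c` becomes `/c` (as exercised by the tests further down). A simplified re-implementation, assuming '/'-separated segments:

```go
package main

import (
	"fmt"
	"strings"
)

func dropPrefixParts(path string, parts int) string {
	for i := 0; i < parts; i++ {
		path = strings.TrimPrefix(path, "/")
		n := strings.IndexByte(path, '/')
		if n < 0 {
			return "" // fewer segments than parts to drop
		}
		path = path[n:]
	}
	return path
}

func main() {
	fmt.Println(dropPrefixParts("/a/b/c", 2)) // /c
	fmt.Println(dropPrefixParts("/a/b/c", 0)) // /a/b/c
}
```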
func (ui *UserInfo) getURLPrefixAndHeaders(u *url.URL, h http.Header) (*URLPrefix, HeadersConf) {
func (ui *UserInfo) getURLPrefixAndHeaders(u *url.URL) (*URLPrefix, HeadersConf) {
for _, e := range ui.URLMaps {
if !matchAnyRegex(e.SrcHosts, u.Host) {
continue
if matchAnyRegex(e.SrcHosts, u.Host) && matchAnyRegex(e.SrcPaths, u.Path) {
return e.URLPrefix, e.HeadersConf
}
if !matchAnyRegex(e.SrcPaths, u.Path) {
continue
}
if !matchAnyQueryArg(e.SrcQueryArgs, u.Query()) {
continue
}
if !matchAnyHeader(e.SrcHeaders, h) {
continue
}
return e.URLPrefix, e.HeadersConf
}
if ui.URLPrefix != nil {
return ui.URLPrefix, ui.HeadersConf
@@ -86,30 +73,6 @@ func matchAnyRegex(rs []*Regex, s string) bool {
return false
}
func matchAnyQueryArg(qas []QueryArg, args url.Values) bool {
if len(qas) == 0 {
return true
}
for _, qa := range qas {
if slices.Contains(args[qa.Name], qa.Value) {
return true
}
}
return false
}
func matchAnyHeader(headers []Header, h http.Header) bool {
if len(headers) == 0 {
return true
}
for _, header := range headers {
if slices.Contains(h.Values(header.Name), header.Value) {
return true
}
}
return false
}
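
On the master side a `url_map` entry matches only if host, path, query args and headers all match, and an empty matcher list matches everything. A self-contained sketch of the query-arg and header matchers (the `pair` type stands in for vmauth's `QueryArg`/`Header`):

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"slices"
)

type pair struct{ Name, Value string }

func matchAnyQueryArg(qas []pair, args url.Values) bool {
	if len(qas) == 0 {
		return true // no constraint configured
	}
	for _, qa := range qas {
		if slices.Contains(args[qa.Name], qa.Value) {
			return true
		}
	}
	return false
}

func matchAnyHeader(hs []pair, h http.Header) bool {
	if len(hs) == 0 {
		return true
	}
	for _, hp := range hs {
		if slices.Contains(h.Values(hp.Name), hp.Value) {
			return true
		}
	}
	return false
}

func main() {
	args := url.Values{"db": {"foo"}}
	fmt.Println(matchAnyQueryArg([]pair{{"db", "foo"}}, args)) // true
	fmt.Println(matchAnyQueryArg(nil, args))                   // true: empty list matches all

	h := http.Header{}
	h.Set("TenantID", "345")
	fmt.Println(matchAnyHeader([]pair{{"TenantID", "345"}}, h)) // true
}
```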
func normalizeURL(uOrig *url.URL) *url.URL {
u := *uOrig
// Prevent from attacks with using `..` in r.URL.Path


@@ -4,7 +4,6 @@ import (
"fmt"
"net/url"
"reflect"
"strings"
"testing"
)
@@ -90,21 +89,19 @@ func TestCreateTargetURLSuccess(t *testing.T) {
t.Fatalf("cannot parse %q: %s", requestURI, err)
}
u = normalizeURL(u)
up, hc := ui.getURLPrefixAndHeaders(u, nil)
up, hc := ui.getURLPrefixAndHeaders(u)
if up == nil {
t.Fatalf("cannot determine backend: %s", err)
}
bu := up.getBackendURL()
bu := up.getLeastLoadedBackendURL()
target := mergeURLs(bu.url, u, up.dropSrcPathPrefixParts)
bu.put()
if target.String() != expectedTarget {
t.Fatalf("unexpected target; got %q; want %q", target, expectedTarget)
}
if s := headersToString(hc.RequestHeaders); s != expectedRequestHeaders {
t.Fatalf("unexpected request headers; got %q; want %q", s, expectedRequestHeaders)
}
if s := headersToString(hc.ResponseHeaders); s != expectedResponseHeaders {
t.Fatalf("unexpected response headers; got %q; want %q", s, expectedResponseHeaders)
headersStr := fmt.Sprintf("%q", hc.RequestHeaders)
if headersStr != expectedRequestHeaders {
t.Fatalf("unexpected request headers; got %s; want %s", headersStr, expectedRequestHeaders)
}
if !reflect.DeepEqual(up.retryStatusCodes, expectedRetryStatusCodes) {
t.Fatalf("unexpected retryStatusCodes; got %d; want %d", up.retryStatusCodes, expectedRetryStatusCodes)
@@ -119,55 +116,41 @@ func TestCreateTargetURLSuccess(t *testing.T) {
// Simple routing with `url_prefix`
f(&UserInfo{
URLPrefix: mustParseURL("http://foo.bar"),
}, "", "http://foo.bar/.", "", "", nil, "least_loaded", 0)
}, "", "http://foo.bar/.", "[]", "[]", nil, "least_loaded", 0)
f(&UserInfo{
URLPrefix: mustParseURL("http://foo.bar"),
HeadersConf: HeadersConf{
RequestHeaders: []Header{
{
Name: "bb",
Value: "aaa",
},
},
ResponseHeaders: []Header{
{
Name: "x",
Value: "y",
},
},
RequestHeaders: []Header{{
Name: "bb",
Value: "aaa",
}},
},
RetryStatusCodes: []int{503, 501},
LoadBalancingPolicy: "first_available",
DropSrcPathPrefixParts: intp(2),
}, "/a/b/c", "http://foo.bar/c", `bb: aaa`, `x: y`, []int{503, 501}, "first_available", 2)
}, "/a/b/c", "http://foo.bar/c", `[{"bb" "aaa"}]`, `[]`, []int{503, 501}, "first_available", 2)
f(&UserInfo{
URLPrefix: mustParseURL("http://foo.bar/federate"),
}, "/", "http://foo.bar/federate", "", "", nil, "least_loaded", 0)
}, "/", "http://foo.bar/federate", "[]", "[]", nil, "least_loaded", 0)
f(&UserInfo{
URLPrefix: mustParseURL("http://foo.bar"),
}, "a/b?c=d", "http://foo.bar/a/b?c=d", "", "", nil, "least_loaded", 0)
}, "a/b?c=d", "http://foo.bar/a/b?c=d", "[]", "[]", nil, "least_loaded", 0)
f(&UserInfo{
URLPrefix: mustParseURL("https://sss:3894/x/y"),
}, "/z", "https://sss:3894/x/y/z", "", "", nil, "least_loaded", 0)
}, "/z", "https://sss:3894/x/y/z", "[]", "[]", nil, "least_loaded", 0)
f(&UserInfo{
URLPrefix: mustParseURL("https://sss:3894/x/y"),
}, "/../../aaa", "https://sss:3894/x/y/aaa", "", "", nil, "least_loaded", 0)
}, "/../../aaa", "https://sss:3894/x/y/aaa", "[]", "[]", nil, "least_loaded", 0)
f(&UserInfo{
URLPrefix: mustParseURL("https://sss:3894/x/y"),
}, "/./asd/../../aaa?a=d&s=s/../d", "https://sss:3894/x/y/aaa?a=d&s=s%2F..%2Fd", "", "", nil, "least_loaded", 0)
}, "/./asd/../../aaa?a=d&s=s/../d", "https://sss:3894/x/y/aaa?a=d&s=s%2F..%2Fd", "[]", "[]", nil, "least_loaded", 0)
// Complex routing with `url_map`
ui := &UserInfo{
URLMaps: []URLMap{
{
SrcHosts: getRegexs([]string{"host42"}),
SrcPaths: getRegexs([]string{"/vmsingle/api/v1/query"}),
SrcQueryArgs: []QueryArg{
{
Name: "db",
Value: "foo",
},
},
SrcHosts: getRegexs([]string{"host42"}),
SrcPaths: getRegexs([]string{"/vmsingle/api/v1/query"}),
URLPrefix: mustParseURL("http://vmselect/0/prometheus"),
HeadersConf: HeadersConf{
RequestHeaders: []Header{
@@ -212,12 +195,12 @@ func TestCreateTargetURLSuccess(t *testing.T) {
RetryStatusCodes: []int{502},
DropSrcPathPrefixParts: intp(2),
}
f(ui, "http://host42/vmsingle/api/v1/query?query=up&db=foo", "http://vmselect/0/prometheus/api/v1/query?db=foo&query=up",
"xx: aa\nyy: asdf", "qwe: rty", []int{503, 500, 501}, "first_available", 1)
f(ui, "http://host42/vmsingle/api/v1/query?query=up", "http://vmselect/0/prometheus/api/v1/query?query=up",
`[{"xx" "aa"} {"yy" "asdf"}]`, `[{"qwe" "rty"}]`, []int{503, 500, 501}, "first_available", 1)
f(ui, "http://host123/vmsingle/api/v1/query?query=up", "http://default-server/v1/query?query=up",
"bb: aaa", "x: y", []int{502}, "least_loaded", 2)
f(ui, "https://foo-host/api/v1/write", "http://vminsert/0/prometheus/api/v1/write", "", "", []int{}, "least_loaded", 0)
f(ui, "https://foo-host/foo/bar/api/v1/query_range", "http://default-server/api/v1/query_range", "bb: aaa", "x: y", []int{502}, "least_loaded", 2)
`[{"bb" "aaa"}]`, `[{"x" "y"}]`, []int{502}, "least_loaded", 2)
f(ui, "https://foo-host/api/v1/write", "http://vminsert/0/prometheus/api/v1/write", "[]", "[]", []int{}, "least_loaded", 0)
f(ui, "https://foo-host/foo/bar/api/v1/query_range", "http://default-server/api/v1/query_range", `[{"bb" "aaa"}]`, `[{"x" "y"}]`, []int{502}, "least_loaded", 2)
// Complex routing regexp paths in `url_map`
ui = &UserInfo{
@@ -237,19 +220,19 @@ func TestCreateTargetURLSuccess(t *testing.T) {
},
URLPrefix: mustParseURL("http://default-server"),
}
f(ui, "/api/v1/query?query=up", "http://vmselect/0/prometheus/api/v1/query?query=up", "", "", nil, "least_loaded", 0)
f(ui, "/api/v1/query_range?query=up", "http://vmselect/0/prometheus/api/v1/query_range?query=up", "", "", nil, "least_loaded", 0)
f(ui, "/api/v1/label/foo/values", "http://vmselect/0/prometheus/api/v1/label/foo/values", "", "", nil, "least_loaded", 0)
f(ui, "/api/v1/write", "http://vminsert/0/prometheus/api/v1/write", "", "", nil, "least_loaded", 0)
f(ui, "/api/v1/foo/bar", "http://default-server/api/v1/foo/bar", "", "", nil, "least_loaded", 0)
f(ui, "https://vmui.foobar.com/a/b?c=d", "http://vmui.host:1234/vmui/a/b?c=d", "", "", nil, "least_loaded", 0)
f(ui, "/api/v1/query?query=up", "http://vmselect/0/prometheus/api/v1/query?query=up", "[]", "[]", nil, "least_loaded", 0)
f(ui, "/api/v1/query_range?query=up", "http://vmselect/0/prometheus/api/v1/query_range?query=up", "[]", "[]", nil, "least_loaded", 0)
f(ui, "/api/v1/label/foo/values", "http://vmselect/0/prometheus/api/v1/label/foo/values", "[]", "[]", nil, "least_loaded", 0)
f(ui, "/api/v1/write", "http://vminsert/0/prometheus/api/v1/write", "[]", "[]", nil, "least_loaded", 0)
f(ui, "/api/v1/foo/bar", "http://default-server/api/v1/foo/bar", "[]", "[]", nil, "least_loaded", 0)
f(ui, "https://vmui.foobar.com/a/b?c=d", "http://vmui.host:1234/vmui/a/b?c=d", "[]", "[]", nil, "least_loaded", 0)
f(&UserInfo{
URLPrefix: mustParseURL("http://foo.bar?extra_label=team=dev"),
}, "/api/v1/query", "http://foo.bar/api/v1/query?extra_label=team=dev", "", "", nil, "least_loaded", 0)
}, "/api/v1/query", "http://foo.bar/api/v1/query?extra_label=team=dev", "[]", "[]", nil, "least_loaded", 0)
f(&UserInfo{
URLPrefix: mustParseURL("http://foo.bar?extra_label=team=mobile"),
}, "/api/v1/query?extra_label=team=dev", "http://foo.bar/api/v1/query?extra_label=team%3Dmobile", "", "", nil, "least_loaded", 0)
}, "/api/v1/query?extra_label=team=dev", "http://foo.bar/api/v1/query?extra_label=team%3Dmobile", "[]", "[]", nil, "least_loaded", 0)
}
func TestCreateTargetURLFailure(t *testing.T) {
@@ -260,7 +243,7 @@ func TestCreateTargetURLFailure(t *testing.T) {
t.Fatalf("cannot parse %q: %s", requestURI, err)
}
u = normalizeURL(u)
up, hc := ui.getURLPrefixAndHeaders(u, nil)
up, hc := ui.getURLPrefixAndHeaders(u)
if up != nil {
t.Fatalf("unexpected non-empty up=%#v", up)
}
@@ -281,11 +264,3 @@ func TestCreateTargetURLFailure(t *testing.T) {
},
}, "/api/v1/write")
}
func headersToString(hs []Header) string {
a := make([]string, len(hs))
for i, h := range hs {
a[i] = fmt.Sprintf("%s: %s", h.Name, h.Value)
}
return strings.Join(a, "\n")
}


@@ -20,7 +20,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/pushmetrics"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/snapshot"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/snapshot/snapshotutil"
)
var (
@@ -94,8 +93,7 @@ func main() {
}
}
listenAddrs := []string{*httpListenAddr}
go httpserver.Serve(listenAddrs, nil, nil)
go httpserver.Serve(*httpListenAddr, false, nil)
pushmetrics.Init()
err := makeBackup()
@@ -106,8 +104,8 @@ func main() {
pushmetrics.Stop()
startTime := time.Now()
logger.Infof("gracefully shutting down http server for metrics at %q", listenAddrs)
if err := httpserver.Stop(listenAddrs); err != nil {
logger.Infof("gracefully shutting down http server for metrics at %q", *httpListenAddr)
if err := httpserver.Stop(*httpListenAddr); err != nil {
logger.Fatalf("cannot stop http server for metrics: %s", err)
}
logger.Infof("successfully shut down http server for metrics in %.3f seconds", time.Since(startTime).Seconds())
@@ -170,7 +168,7 @@ See the docs at https://docs.victoriametrics.com/vmbackup.html .
}
func newSrcFS() (*fslocal.FS, error) {
if err := snapshotutil.Validate(*snapshotName); err != nil {
if err := snapshot.Validate(*snapshotName); err != nil {
return nil, fmt.Errorf("invalid -snapshotName=%q: %w", *snapshotName, err)
}
snapshotPath := filepath.Join(*storageDataPath, "snapshots", *snapshotName)


@@ -15,6 +15,7 @@ func TestRetry_Do(t *testing.T) {
backoffFactor float64
backoffMinDuration time.Duration
retryableFunc retryableFunc
ctx context.Context
cancelTimeout time.Duration
want uint64
wantErr bool
@@ -24,6 +25,7 @@ func TestRetry_Do(t *testing.T) {
retryableFunc: func() error {
return ErrBadRequest
},
ctx: context.Background(),
want: 0,
wantErr: true,
},
@@ -33,6 +35,7 @@ func TestRetry_Do(t *testing.T) {
time.Sleep(time.Millisecond * 100)
return nil
},
ctx: context.Background(),
want: 0,
wantErr: true,
},
@@ -55,6 +58,7 @@ func TestRetry_Do(t *testing.T) {
}
return nil
},
ctx: context.Background(),
want: 1,
wantErr: false,
},
@@ -71,6 +75,7 @@ func TestRetry_Do(t *testing.T) {
}
return nil
},
ctx: context.Background(),
want: 5,
wantErr: true,
},
@@ -80,8 +85,14 @@ func TestRetry_Do(t *testing.T) {
backoffFactor: 1.7,
backoffMinDuration: time.Millisecond * 10,
retryableFunc: func() error {
return fmt.Errorf("got some error")
t := time.NewTicker(time.Millisecond * 5)
defer t.Stop()
for range t.C {
return fmt.Errorf("got some error")
}
return nil
},
ctx: context.Background(),
cancelTimeout: time.Millisecond * 40,
want: 3,
wantErr: true,
@@ -90,13 +101,12 @@ func TestRetry_Do(t *testing.T) {
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
r := &Backoff{retries: tt.backoffRetries, factor: tt.backoffFactor, minDuration: tt.backoffMinDuration}
ctx := context.Background()
if tt.cancelTimeout != 0 {
newCtx, cancelFn := context.WithTimeout(context.Background(), tt.cancelTimeout)
ctx = newCtx
newCtx, cancelFn := context.WithTimeout(tt.ctx, tt.cancelTimeout)
tt.ctx = newCtx
defer cancelFn()
}
got, err := r.Retry(ctx, tt.retryableFunc)
got, err := r.Retry(tt.ctx, tt.retryableFunc)
if (err != nil) != tt.wantErr {
t.Errorf("Retry() error = %v, wantErr %v", err, tt.wantErr)
return


@@ -40,11 +40,6 @@ const (
vmSignificantFigures = "vm-significant-figures"
vmRoundDigits = "vm-round-digits"
vmDisableProgressBar = "vm-disable-progress-bar"
vmCertFile = "vm-cert-file"
vmKeyFile = "vm-key-file"
vmCAFile = "vm-CA-file"
vmServerName = "vm-server-name"
vmInsecureSkipVerify = "vm-insecure-skip-verify"
// also used in vm-native
vmExtraLabel = "vm-extra-label"
@@ -124,45 +119,19 @@ var (
Name: vmDisableProgressBar,
Usage: "Whether to disable progress bar per each worker during the import.",
},
&cli.StringFlag{
Name: vmCertFile,
Usage: "Optional path to client-side TLS certificate file to use when connecting to '--vmAddr'",
},
&cli.StringFlag{
Name: vmKeyFile,
Usage: "Optional path to client-side TLS key to use when connecting to '--vmAddr'",
},
&cli.StringFlag{
Name: vmCAFile,
Usage: "Optional path to TLS CA file to use for verifying connections to '--vmAddr'. By default, system CA is used",
},
&cli.StringFlag{
Name: vmServerName,
Usage: "Optional TLS server name to use for connections to '--vmAddr'. By default, the server name from '--vmAddr' is used",
},
&cli.BoolFlag{
Name: vmInsecureSkipVerify,
Usage: "Whether to skip tls verification when connecting to '--vmAddr'",
Value: false,
},
}
)
const (
otsdbAddr = "otsdb-addr"
otsdbConcurrency = "otsdb-concurrency"
otsdbQueryLimit = "otsdb-query-limit"
otsdbOffsetDays = "otsdb-offset-days"
otsdbHardTSStart = "otsdb-hard-ts-start"
otsdbRetentions = "otsdb-retentions"
otsdbFilters = "otsdb-filters"
otsdbNormalize = "otsdb-normalize"
otsdbMsecsTime = "otsdb-msecstime"
otsdbCertFile = "otsdb-cert-file"
otsdbKeyFile = "otsdb-key-file"
otsdbCAFile = "otsdb-CA-file"
otsdbServerName = "otsdb-server-name"
otsdbInsecureSkipVerify = "otsdb-insecure-skip-verify"
otsdbAddr = "otsdb-addr"
otsdbConcurrency = "otsdb-concurrency"
otsdbQueryLimit = "otsdb-query-limit"
otsdbOffsetDays = "otsdb-offset-days"
otsdbHardTSStart = "otsdb-hard-ts-start"
otsdbRetentions = "otsdb-retentions"
otsdbFilters = "otsdb-filters"
otsdbNormalize = "otsdb-normalize"
otsdbMsecsTime = "otsdb-msecstime"
)
var (
@@ -222,27 +191,6 @@ var (
Value: false,
Usage: "Whether to normalize all data received to lower case before forwarding to VictoriaMetrics",
},
&cli.StringFlag{
Name: otsdbCertFile,
Usage: "Optional path to client-side TLS certificate file to use when connecting to -otsdb-addr",
},
&cli.StringFlag{
Name: otsdbKeyFile,
Usage: "Optional path to client-side TLS key to use when connecting to -otsdb-addr",
},
&cli.StringFlag{
Name: otsdbCAFile,
Usage: "Optional path to TLS CA file to use for verifying connections to -otsdb-addr. By default, system CA is used",
},
&cli.StringFlag{
Name: otsdbServerName,
Usage: "Optional TLS server name to use for connections to -otsdb-addr. By default, the server name from -otsdb-addr is used",
},
&cli.BoolFlag{
Name: otsdbInsecureSkipVerify,
Usage: "Whether to skip tls verification when connecting to -otsdb-addr",
Value: false,
},
}
)
@@ -260,11 +208,6 @@ const (
influxMeasurementFieldSeparator = "influx-measurement-field-separator"
influxSkipDatabaseLabel = "influx-skip-database-label"
influxPrometheusMode = "influx-prometheus-mode"
influxCertFile = "influx-cert-file"
influxKeyFile = "influx-key-file"
influxCAFile = "influx-CA-file"
influxServerName = "influx-server-name"
influxInsecureSkipVerify = "influx-insecure-skip-verify"
)
var (
@@ -329,28 +272,7 @@ var (
},
&cli.BoolFlag{
Name: influxPrometheusMode,
Usage: "Whether to restore the original timeseries name previously written from Prometheus to InfluxDB v1 via remote_write.",
Value: false,
},
&cli.StringFlag{
Name: influxCertFile,
Usage: "Optional path to client-side TLS certificate file to use when connecting to -influx-addr",
},
&cli.StringFlag{
Name: influxKeyFile,
Usage: "Optional path to client-side TLS key to use when connecting to -influx-addr",
},
&cli.StringFlag{
Name: influxCAFile,
Usage: "Optional path to TLS CA file to use for verifying connections to -influx-addr. By default, system CA is used",
},
&cli.StringFlag{
Name: influxServerName,
Usage: "Optional TLS server name to use for connections to -influx-addr. By default, the server name from -influx-addr is used",
},
&cli.BoolFlag{
Name: influxInsecureSkipVerify,
Usage: "Whether to skip tls verification when connecting to -influx-addr",
Usage: "Whether to restore the original timeseries name previously written from Prometheus to InfluxDB v1 via remote_write.",
Value: false,
},
}
@@ -413,10 +335,6 @@ const (
vmNativeSrcPassword = "vm-native-src-password"
vmNativeSrcHeaders = "vm-native-src-headers"
vmNativeSrcBearerToken = "vm-native-src-bearer-token"
vmNativeSrcCertFile = "vm-native-src-cert-file"
vmNativeSrcKeyFile = "vm-native-src-key-file"
vmNativeSrcCAFile = "vm-native-src-ca-file"
vmNativeSrcServerName = "vm-native-src-server-name"
vmNativeSrcInsecureSkipVerify = "vm-native-src-insecure-skip-verify"
vmNativeDstAddr = "vm-native-dst-addr"
@@ -424,10 +342,6 @@ const (
vmNativeDstPassword = "vm-native-dst-password"
vmNativeDstHeaders = "vm-native-dst-headers"
vmNativeDstBearerToken = "vm-native-dst-bearer-token"
vmNativeDstCertFile = "vm-native-dst-cert-file"
vmNativeDstKeyFile = "vm-native-dst-key-file"
vmNativeDstCAFile = "vm-native-dst-ca-file"
vmNativeDstServerName = "vm-native-dst-server-name"
vmNativeDstInsecureSkipVerify = "vm-native-dst-insecure-skip-verify"
)
@@ -492,28 +406,6 @@ var (
Name: vmNativeSrcBearerToken,
Usage: "Optional bearer auth token to use for the corresponding `--vm-native-src-addr`",
},
&cli.StringFlag{
Name: vmNativeSrcCertFile,
Usage: "Optional path to client-side TLS certificate file to use when connecting to `--vm-native-src-addr`",
},
&cli.StringFlag{
Name: vmNativeSrcKeyFile,
Usage: "Optional path to client-side TLS key to use when connecting to `--vm-native-src-addr`",
},
&cli.StringFlag{
Name: vmNativeSrcCAFile,
Usage: "Optional path to TLS CA file to use for verifying connections to `--vm-native-src-addr`. By default, system CA is used",
},
&cli.StringFlag{
Name: vmNativeSrcServerName,
Usage: "Optional TLS server name to use for connections to `--vm-native-src-addr`. By default, the server name from `--vm-native-src-addr` is used",
},
&cli.BoolFlag{
Name: vmNativeSrcInsecureSkipVerify,
Usage: "Whether to skip TLS certificate verification when connecting to `--vm-native-src-addr`",
Value: false,
},
&cli.StringFlag{
Name: vmNativeDstAddr,
Usage: "VictoriaMetrics address to perform import to. \n" +
@@ -541,28 +433,6 @@ var (
Name: vmNativeDstBearerToken,
Usage: "Optional bearer auth token to use for the corresponding `--vm-native-dst-addr`",
},
&cli.StringFlag{
Name: vmNativeDstCertFile,
Usage: "Optional path to client-side TLS certificate file to use when connecting to `--vm-native-dst-addr`",
},
&cli.StringFlag{
Name: vmNativeDstKeyFile,
Usage: "Optional path to client-side TLS key to use when connecting to `--vm-native-dst-addr`",
},
&cli.StringFlag{
Name: vmNativeDstCAFile,
Usage: "Optional path to TLS CA file to use for verifying connections to `--vm-native-dst-addr`. By default, system CA is used",
},
&cli.StringFlag{
Name: vmNativeDstServerName,
Usage: "Optional TLS server name to use for connections to `--vm-native-dst-addr`. By default, the server name from `--vm-native-dst-addr` is used",
},
&cli.BoolFlag{
Name: vmNativeDstInsecureSkipVerify,
Usage: "Whether to skip TLS certificate verification when connecting to `--vm-native-dst-addr`",
Value: false,
},
&cli.StringSliceFlag{
Name: vmExtraLabel,
Value: nil,
@@ -598,6 +468,16 @@ var (
"Non-binary export/import API is less efficient, but supports deduplication if it is configured on vm-native-src-addr side.",
Value: false,
},
&cli.BoolFlag{
Name: vmNativeSrcInsecureSkipVerify,
Usage: "Whether to skip TLS certificate verification when connecting to the source address",
Value: false,
},
&cli.BoolFlag{
Name: vmNativeDstInsecureSkipVerify,
Usage: "Whether to skip TLS certificate verification when connecting to the destination address",
Value: false,
},
}
)
@@ -616,10 +496,6 @@ const (
remoteReadPassword = "remote-read-password"
remoteReadHTTPTimeout = "remote-read-http-timeout"
remoteReadHeaders = "remote-read-headers"
remoteReadCertFile = "remote-read-cert-file"
remoteReadKeyFile = "remote-read-key-file"
remoteReadCAFile = "remote-read-CA-file"
remoteReadServerName = "remote-read-server-name"
remoteReadInsecureSkipVerify = "remote-read-insecure-skip-verify"
remoteReadDisablePathAppend = "remote-read-disable-path-append"
)
@@ -698,22 +574,6 @@ var (
"For example, --remote-read-headers='My-Auth:foobar' would send 'My-Auth: foobar' HTTP header with every request to the corresponding remote source storage. \n" +
"Multiple headers must be delimited by '^^': --remote-read-headers='header1:value1^^header2:value2'",
},
&cli.StringFlag{
Name: remoteReadCertFile,
Usage: "Optional path to client-side TLS certificate file to use when connecting to -remote-read-src-addr",
},
&cli.StringFlag{
Name: remoteReadKeyFile,
Usage: "Optional path to client-side TLS key to use when connecting to -remote-read-src-addr",
},
&cli.StringFlag{
Name: remoteReadCAFile,
Usage: "Optional path to TLS CA file to use for verifying connections to -remote-read-src-addr. By default, system CA is used",
},
&cli.StringFlag{
Name: remoteReadServerName,
Usage: "Optional TLS server name to use for connections to remoteReadSrcAddr. By default, the server name from -remote-read-src-addr is used",
},
&cli.BoolFlag{
Name: remoteReadInsecureSkipVerify,
Usage: "Whether to skip TLS certificate verification when connecting to the remote read address",


@@ -1,7 +1,6 @@
package influx
import (
"crypto/tls"
"fmt"
"io"
"log"
@@ -34,8 +33,7 @@ type Config struct {
Retention string
ChunkSize int
Filter Filter
TLSConfig *tls.Config
Filter Filter
}
// Filter contains configuration for filtering
@@ -88,10 +86,10 @@ type LabelPair struct {
// configured with passed Config
func NewClient(cfg Config) (*Client, error) {
c := influx.HTTPConfig{
Addr: cfg.Addr,
Username: cfg.Username,
Password: cfg.Password,
TLSConfig: cfg.TLSConfig,
Addr: cfg.Addr,
Username: cfg.Username,
Password: cfg.Password,
InsecureSkipVerify: true,
}
hc, err := influx.NewHTTPClient(c)
if err != nil {

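The hunk above drops the configurable TLSConfig in favor of a hard-coded InsecureSkipVerify. A minimal sketch of the resulting client setup, assuming the influxdb1-client v2 package and placeholder credentials:

```go
package main

import (
	"log"

	influx "github.com/influxdata/influxdb1-client/v2"
)

func main() {
	hc, err := influx.NewHTTPClient(influx.HTTPConfig{
		Addr:               "https://influxdb.example.com:8086", // placeholder
		Username:           "user",
		Password:           "secret",
		InsecureSkipVerify: true, // certificate checks are always skipped
	})
	if err != nil {
		log.Fatalf("failed to create influx client: %s", err)
	}
	defer hc.Close()
}
```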

@@ -2,6 +2,7 @@ package main
import (
"context"
"crypto/tls"
"fmt"
"log"
"net/http"
@@ -24,7 +25,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/prometheus"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/buildinfo"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httputils"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/native/stream"
)
@@ -49,20 +49,8 @@ func main() {
Action: func(c *cli.Context) error {
fmt.Println("OpenTSDB import mode")
// create Transport with given TLS config
certFile := c.String(otsdbCertFile)
keyFile := c.String(otsdbKeyFile)
caFile := c.String(otsdbCAFile)
serverName := c.String(otsdbServerName)
insecureSkipVerify := c.Bool(otsdbInsecureSkipVerify)
addr := c.String(otsdbAddr)
tr, err := httputils.Transport(addr, certFile, keyFile, caFile, serverName, insecureSkipVerify)
if err != nil {
return fmt.Errorf("failed to create Transport: %s", err)
}
oCfg := opentsdb.Config{
Addr: addr,
Addr: c.String(otsdbAddr),
Limit: c.Int(otsdbQueryLimit),
Offset: c.Int64(otsdbOffsetDays),
HardTS: c.Int64(otsdbHardTSStart),
@@ -70,17 +58,13 @@ func main() {
Filters: c.StringSlice(otsdbFilters),
Normalize: c.Bool(otsdbNormalize),
MsecsTime: c.Bool(otsdbMsecsTime),
Transport: tr,
}
otsdbClient, err := opentsdb.NewClient(oCfg)
if err != nil {
return fmt.Errorf("failed to create opentsdb client: %s", err)
}
vmCfg, err := initConfigVM(c)
if err != nil {
return fmt.Errorf("failed to init VM configuration: %s", err)
}
vmCfg := initConfigVM(c)
// disable progress bars since openTSDB implementation
// does not use progress bar pool
vmCfg.DisableProgressBar = true
@@ -100,18 +84,6 @@ func main() {
Action: func(c *cli.Context) error {
fmt.Println("InfluxDB import mode")
// create TLS config
certFile := c.String(influxCertFile)
keyFile := c.String(influxKeyFile)
caFile := c.String(influxCAFile)
serverName := c.String(influxServerName)
insecureSkipVerify := c.Bool(influxInsecureSkipVerify)
tc, err := httputils.TLSConfig(certFile, keyFile, caFile, serverName, insecureSkipVerify)
if err != nil {
return fmt.Errorf("failed to create TLS Config: %s", err)
}
iCfg := influx.Config{
Addr: c.String(influxAddr),
Username: c.String(influxUser),
@@ -124,18 +96,13 @@ func main() {
TimeEnd: c.String(influxFilterTimeEnd),
},
ChunkSize: c.Int(influxChunkSize),
TLSConfig: tc,
}
influxClient, err := influx.NewClient(iCfg)
if err != nil {
return fmt.Errorf("failed to create influx client: %s", err)
}
vmCfg, err := initConfigVM(c)
if err != nil {
return fmt.Errorf("failed to init VM configuration: %s", err)
}
vmCfg := initConfigVM(c)
importer, err = vm.NewImporter(ctx, vmCfg)
if err != nil {
return fmt.Errorf("failed to create VM importer: %s", err)
@@ -158,42 +125,24 @@ func main() {
Usage: "Migrate time series via Prometheus remote-read protocol",
Flags: mergeFlags(globalFlags, remoteReadFlags, vmFlags),
Action: func(c *cli.Context) error {
fmt.Println("Remote-read import mode")
addr := c.String(remoteReadSrcAddr)
// create TLS config
certFile := c.String(remoteReadCertFile)
keyFile := c.String(remoteReadKeyFile)
caFile := c.String(remoteReadCAFile)
serverName := c.String(remoteReadServerName)
insecureSkipVerify := c.Bool(remoteReadInsecureSkipVerify)
tr, err := httputils.Transport(addr, certFile, keyFile, caFile, serverName, insecureSkipVerify)
if err != nil {
return fmt.Errorf("failed to create transport: %s", err)
}
rr, err := remoteread.NewClient(remoteread.Config{
Addr: addr,
Transport: tr,
Username: c.String(remoteReadUser),
Password: c.String(remoteReadPassword),
Timeout: c.Duration(remoteReadHTTPTimeout),
UseStream: c.Bool(remoteReadUseStream),
Headers: c.String(remoteReadHeaders),
LabelName: c.String(remoteReadFilterLabel),
LabelValue: c.String(remoteReadFilterLabelValue),
DisablePathAppend: c.Bool(remoteReadDisablePathAppend),
Addr: c.String(remoteReadSrcAddr),
Username: c.String(remoteReadUser),
Password: c.String(remoteReadPassword),
Timeout: c.Duration(remoteReadHTTPTimeout),
UseStream: c.Bool(remoteReadUseStream),
Headers: c.String(remoteReadHeaders),
LabelName: c.String(remoteReadFilterLabel),
LabelValue: c.String(remoteReadFilterLabelValue),
InsecureSkipVerify: c.Bool(remoteReadInsecureSkipVerify),
DisablePathAppend: c.Bool(remoteReadDisablePathAppend),
})
if err != nil {
return fmt.Errorf("error create remote read client: %s", err)
}
vmCfg, err := initConfigVM(c)
if err != nil {
return fmt.Errorf("failed to init VM configuration: %s", err)
}
vmCfg := initConfigVM(c)
importer, err := vm.NewImporter(ctx, vmCfg)
if err != nil {
return fmt.Errorf("failed to create VM importer: %s", err)
@@ -222,10 +171,7 @@ func main() {
Action: func(c *cli.Context) error {
fmt.Println("Prometheus import mode")
vmCfg, err := initConfigVM(c)
if err != nil {
return fmt.Errorf("failed to init VM configuration: %s", err)
}
vmCfg := initConfigVM(c)
importer, err = vm.NewImporter(ctx, vmCfg)
if err != nil {
return fmt.Errorf("failed to create VM importer: %s", err)
@@ -267,6 +213,7 @@ func main() {
var srcExtraLabels []string
srcAddr := strings.Trim(c.String(vmNativeSrcAddr), "/")
srcInsecureSkipVerify := c.Bool(vmNativeSrcInsecureSkipVerify)
srcAuthConfig, err := auth.Generate(
auth.WithBasicAuth(c.String(vmNativeSrcUser), c.String(vmNativeSrcPassword)),
auth.WithBearer(c.String(vmNativeSrcBearerToken)),
@@ -274,26 +221,16 @@ func main() {
if err != nil {
return fmt.Errorf("error initilize auth config for source: %s", srcAddr)
}
// create TLS config
srcCertFile := c.String(vmNativeSrcCertFile)
srcKeyFile := c.String(vmNativeSrcKeyFile)
srcCAFile := c.String(vmNativeSrcCAFile)
srcServerName := c.String(vmNativeSrcServerName)
srcInsecureSkipVerify := c.Bool(vmNativeSrcInsecureSkipVerify)
srcTC, err := httputils.TLSConfig(srcCertFile, srcKeyFile, srcCAFile, srcServerName, srcInsecureSkipVerify)
if err != nil {
return fmt.Errorf("failed to create TLS Config: %s", err)
}
srcHTTPClient := &http.Client{Transport: &http.Transport{
DisableKeepAlives: disableKeepAlive,
TLSClientConfig: srcTC,
TLSClientConfig: &tls.Config{
InsecureSkipVerify: srcInsecureSkipVerify,
},
}}
dstAddr := strings.Trim(c.String(vmNativeDstAddr), "/")
dstExtraLabels := c.StringSlice(vmExtraLabel)
dstInsecureSkipVerify := c.Bool(vmNativeDstInsecureSkipVerify)
dstAuthConfig, err := auth.Generate(
auth.WithBasicAuth(c.String(vmNativeDstUser), c.String(vmNativeDstPassword)),
auth.WithBearer(c.String(vmNativeDstBearerToken)),
@@ -301,22 +238,11 @@ func main() {
if err != nil {
return fmt.Errorf("error initilize auth config for destination: %s", dstAddr)
}
// create TLS config
dstCertFile := c.String(vmNativeDstCertFile)
dstKeyFile := c.String(vmNativeDstKeyFile)
dstCAFile := c.String(vmNativeDstCAFile)
dstServerName := c.String(vmNativeDstServerName)
dstInsecureSkipVerify := c.Bool(vmNativeDstInsecureSkipVerify)
dstTC, err := httputils.TLSConfig(dstCertFile, dstKeyFile, dstCAFile, dstServerName, dstInsecureSkipVerify)
if err != nil {
return fmt.Errorf("failed to create TLS Config: %s", err)
}
dstHTTPClient := &http.Client{Transport: &http.Transport{
DisableKeepAlives: disableKeepAlive,
TLSClientConfig: dstTC,
TLSClientConfig: &tls.Config{
InsecureSkipVerify: dstInsecureSkipVerify,
},
}}
p := vmNativeProcessor{
@@ -372,15 +298,14 @@ func main() {
if err != nil {
return cli.Exit(fmt.Errorf("cannot open exported block at path=%q err=%w", blockPath, err), 1)
}
defer f.Close()
var blocksCount atomic.Uint64
if err := stream.Parse(f, isBlockGzipped, func(_ *stream.Block) error {
blocksCount.Add(1)
var blocksCount uint64
if err := stream.Parse(f, isBlockGzipped, func(block *stream.Block) error {
atomic.AddUint64(&blocksCount, 1)
return nil
}); err != nil {
return cli.Exit(fmt.Errorf("cannot parse block at path=%q, blocksCount=%d, err=%w", blockPath, blocksCount.Load(), err), 1)
return cli.Exit(fmt.Errorf("cannot parse block at path=%q, blocksCount=%d, err=%w", blockPath, blocksCount, err), 1)
}
log.Printf("successfully verified block at path=%q, blockCount=%d", blockPath, blocksCount.Load())
log.Printf("successfully verified block at path=%q, blockCount=%d", blockPath, blocksCount)
return nil
},
},
@@ -405,24 +330,9 @@ func main() {
log.Printf("Total time: %v", time.Since(start))
}
func initConfigVM(c *cli.Context) (vm.Config, error) {
addr := c.String(vmAddr)
// create Transport with given TLS config
certFile := c.String(vmCertFile)
keyFile := c.String(vmKeyFile)
caFile := c.String(vmCAFile)
serverName := c.String(vmServerName)
insecureSkipVerify := c.Bool(vmInsecureSkipVerify)
tr, err := httputils.Transport(addr, certFile, keyFile, caFile, serverName, insecureSkipVerify)
if err != nil {
return vm.Config{}, fmt.Errorf("failed to create Transport: %s", err)
}
func initConfigVM(c *cli.Context) vm.Config {
return vm.Config{
Addr: addr,
Transport: tr,
Addr: c.String(vmAddr),
User: c.String(vmUser),
Password: c.String(vmPassword),
Concurrency: uint8(c.Int(vmConcurrency)),
@@ -434,5 +344,5 @@ func initConfigVM(c *cli.Context) (vm.Config, error) {
ExtraLabels: c.StringSlice(vmExtraLabel),
RateLimit: c.Int64(vmRateLimit),
DisableProgressBar: c.Bool(vmDisableProgressBar),
}, nil
}
}

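Both the source and destination clients above reduce to the same construction. A minimal sketch of that pattern, with a hypothetical helper name:

```go
package main

import (
	"crypto/tls"
	"net/http"
)

// newHTTPClient mirrors the pattern above: optionally disable keep-alives
// and skip TLS certificate verification. The helper name is hypothetical.
func newHTTPClient(disableKeepAlive, insecureSkipVerify bool) *http.Client {
	return &http.Client{Transport: &http.Transport{
		DisableKeepAlives: disableKeepAlive,
		TLSClientConfig: &tls.Config{
			InsecureSkipVerify: insecureSkipVerify,
		},
	}}
}

func main() {
	_ = newHTTPClient(true, false)
}
```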

@@ -6,7 +6,6 @@ import (
"fmt"
"io"
"net/http"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/auth"
)
@@ -35,7 +34,7 @@ type Response struct {
}
// Explore finds metric names by provided filter from api/v1/label/__name__/values
func (c *Client) Explore(ctx context.Context, f Filter, tenantID string, start, end time.Time) ([]string, error) {
func (c *Client) Explore(ctx context.Context, f Filter, tenantID string) ([]string, error) {
url := fmt.Sprintf("%s/%s", c.Addr, nativeMetricNamesAddr)
if tenantID != "" {
url = fmt.Sprintf("%s/select/%s/prometheus/%s", c.Addr, tenantID, nativeMetricNamesAddr)
@@ -46,8 +45,12 @@ func (c *Client) Explore(ctx context.Context, f Filter, tenantID string, start,
}
params := req.URL.Query()
params.Set("start", start.Format(time.RFC3339))
params.Set("end", end.Format(time.RFC3339))
if f.TimeStart != "" {
params.Set("start", f.TimeStart)
}
if f.TimeEnd != "" {
params.Set("end", f.TimeEnd)
}
params.Set("match[]", f.Match)
req.URL.RawQuery = params.Encode()
@@ -60,7 +63,11 @@ func (c *Client) Explore(ctx context.Context, f Filter, tenantID string, start,
if err := json.NewDecoder(resp.Body).Decode(&response); err != nil {
return nil, fmt.Errorf("cannot decode series response: %s", err)
}
return response.MetricNames, resp.Body.Close()
if err := resp.Body.Close(); err != nil {
return nil, fmt.Errorf("cannot close series response body: %s", err)
}
return response.MetricNames, nil
}
// ImportPipe uses pipe reader in request to process data


@@ -47,8 +47,6 @@ type Client struct {
Normalize bool
HardTS int64
MsecsTime bool
c *http.Client
}
// Config contains fields required
@@ -62,7 +60,6 @@ type Config struct {
Filters []string
Normalize bool
MsecsTime bool
Transport *http.Transport
}
// TimeRange contains data about time ranges to query
@@ -110,8 +107,7 @@ type Metric struct {
// FindMetrics discovers all metrics that OpenTSDB knows about (given a filter)
// e.g. /api/suggest?type=metrics&q=system&max=100000
func (c Client) FindMetrics(q string) ([]string, error) {
resp, err := c.c.Get(q)
resp, err := http.Get(q)
if err != nil {
return nil, fmt.Errorf("failed to send GET request to %q: %s", q, err)
}
@@ -135,7 +131,7 @@ func (c Client) FindMetrics(q string) ([]string, error) {
// e.g. /api/search/lookup?m=system.load5&limit=1000000
func (c Client) FindSeries(metric string) ([]Meta, error) {
q := fmt.Sprintf("%s/api/search/lookup?m=%s&limit=%d", c.Addr, metric, c.Limit)
resp, err := c.c.Get(q)
resp, err := http.Get(q)
if err != nil {
return nil, fmt.Errorf("failed to set GET request to %q: %s", q, err)
}
@@ -188,7 +184,7 @@ func (c Client) GetData(series Meta, rt RetentionMeta, start int64, end int64, m
series.Metric, tagStr)
q := fmt.Sprintf("%s/api/query?%s", c.Addr, queryStr)
resp, err := c.c.Get(q)
resp, err := http.Get(q)
if err != nil {
return Metric{}, fmt.Errorf("failed to send GET request to %q: %s", q, err)
}
@@ -329,7 +325,6 @@ func NewClient(cfg Config) (*Client, error) {
Normalize: cfg.Normalize,
HardTS: cfg.HardTS,
MsecsTime: cfg.MsecsTime,
c: &http.Client{Transport: cfg.Transport},
}
return client, nil
}

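The hunks above replace a per-client *http.Client with the package-level http.Get. A short sketch of the practical difference, with a placeholder URL:

```go
package main

import (
	"log"
	"net/http"
)

func main() {
	const q = "http://opentsdb.example.com:4242/api/suggest?type=metrics" // placeholder

	// Package-level http.Get always goes through http.DefaultClient.
	if resp, err := http.Get(q); err == nil {
		resp.Body.Close()
	}

	// A dedicated client lets a custom Transport (TLS settings, proxies, ...) be attached.
	c := &http.Client{Transport: http.DefaultTransport.(*http.Transport).Clone()}
	resp, err := c.Get(q)
	if err != nil {
		log.Fatal(err)
	}
	resp.Body.Close()
}
```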

@@ -2,7 +2,6 @@ package main
import (
"context"
"net/http"
"testing"
"time"
@@ -62,8 +61,7 @@ func TestRemoteRead(t *testing.T) {
{
name: "step month on month time range",
remoteReadConfig: remoteread.Config{Addr: "", LabelName: "__name__", LabelValue: ".*"},
vmCfg: vm.Config{Addr: "", Concurrency: 1, DisableProgressBar: true,
Transport: http.DefaultTransport.(*http.Transport)},
vmCfg: vm.Config{Addr: "", Concurrency: 1, DisableProgressBar: true},
start: "2022-09-26T11:23:05+02:00",
end: "2022-11-26T11:24:05+02:00",
numOfSamples: 2,


@@ -11,6 +11,7 @@ import (
"strings"
"time"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/utils"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/vm"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/gogo/protobuf/proto"
@@ -45,8 +46,6 @@ type Client struct {
type Config struct {
// Addr of remote storage
Addr string
// Transport allows specifying custom http.Transport
Transport *http.Transport
// DisablePathAppend disable automatic appending of the remote read path
DisablePathAppend bool
// Timeout defines timeout for HTTP requests
@@ -65,6 +64,8 @@ type Config struct {
// LabelName, LabelValue stands for label=~value pair used for read requests.
// Is optional.
LabelName, LabelValue string
// InsecureSkipVerify defines whether to skip TLS certificate verification when connecting to the remote read address.
InsecureSkipVerify bool
}
// Filter defines a list of filters applied to requested data
@@ -102,13 +103,11 @@ func NewClient(cfg Config) (*Client, error) {
}
}
client := &http.Client{Timeout: cfg.Timeout}
if cfg.Transport != nil {
client.Transport = cfg.Transport
}
c := &Client{
c: client,
c: &http.Client{
Timeout: cfg.Timeout,
Transport: utils.Transport(cfg.Addr, cfg.InsecureSkipVerify),
},
addr: strings.TrimSuffix(cfg.Addr, "/"),
disablePathAppend: cfg.DisablePathAppend,
user: cfg.Username,
@@ -171,7 +170,7 @@ func (c *Client) fetch(ctx context.Context, data []byte, streamCb StreamCallback
if c.disablePathAppend {
u = c.addr
}
req, err := http.NewRequestWithContext(ctx, http.MethodPost, u, r)
req, err := http.NewRequest(http.MethodPost, u, r)
if err != nil {
return fmt.Errorf("failed to create new HTTP request: %w", err)
}
@@ -184,7 +183,7 @@ func (c *Client) fetch(ctx context.Context, data []byte, streamCb StreamCallback
}
req.Header.Set("X-Prometheus-Remote-Read-Version", "0.1.0")
resp, err := c.do(req)
resp, err := c.do(req.WithContext(ctx))
if err != nil {
return fmt.Errorf("error while sending request to %s: %w; Data len %d(%d)",
req.URL.Redacted(), err, len(data), r.Size())

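The last hunk swaps http.NewRequestWithContext for http.NewRequest plus req.WithContext. Both attach the context to the request; a minimal sketch of the equivalence:

```go
package main

import (
	"context"
	"net/http"
	"strings"
)

func main() {
	ctx := context.Background()
	u := "http://example.com" // placeholder

	// Context attached at construction time (the removed form) ...
	req1, _ := http.NewRequestWithContext(ctx, http.MethodPost, u, strings.NewReader("payload"))

	// ... or by cloning an existing request (the added form).
	req2, _ := http.NewRequest(http.MethodPost, u, strings.NewReader("payload"))
	req2 = req2.WithContext(ctx)

	_, _ = req1, req2
}
```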

@@ -13,13 +13,13 @@ const (
maxTimeMsecs = int64(1<<63-1) / 1e6
)
// ParseTime parses time in s string and returns time.Time object
// if parse correctly or error if not
func ParseTime(s string) (time.Time, error) {
msecs, err := promutils.ParseTimeMsec(s)
// GetTime returns time from the given string.
func GetTime(s string) (time.Time, error) {
secs, err := promutils.ParseTime(s)
if err != nil {
return time.Time{}, fmt.Errorf("cannot parse %s: %w", s, err)
}
msecs := int64(secs * 1e3)
if msecs < minTimeMsecs {
msecs = 0
}

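The hunk above converts parsed seconds to milliseconds and clamps the result. A standalone sketch of that clamping, assuming the bounds from the constants shown earlier in the file:

```go
package main

import "fmt"

const (
	minTimeMsecs = 0
	maxTimeMsecs = int64(1<<63-1) / 1e6
)

// clampMsecs keeps a millisecond timestamp within the supported range,
// as GetTime does after converting parsed seconds to msecs.
func clampMsecs(msecs int64) int64 {
	if msecs < minTimeMsecs {
		return 0
	}
	if msecs > maxTimeMsecs {
		return maxTimeMsecs
	}
	return msecs
}

func main() {
	fmt.Println(clampMsecs(-5), clampMsecs(1700000000000))
}
```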

@@ -165,7 +165,7 @@ func TestGetTime(t *testing.T) {
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
got, err := ParseTime(tt.s)
got, err := GetTime(tt.s)
if (err != nil) != tt.wantErr {
t.Errorf("ParseTime() error = %v, wantErr %v", err, tt.wantErr)
return

app/vmctl/utils/tls.go (new file, 25 lines)

@@ -0,0 +1,25 @@
package utils
import (
"crypto/tls"
"net/http"
"strings"
)
// Transport creates an http.Transport object based on the provided URL.
// It returns a Transport with TLS configuration if the URL has the `https` prefix
func Transport(URL string, insecureSkipVerify bool) *http.Transport {
t := http.DefaultTransport.(*http.Transport).Clone()
if !strings.HasPrefix(URL, "https") {
return t
}
t.TLSClientConfig = TLSConfig(insecureSkipVerify)
return t
}
// TLSConfig creates a tls.Config object from the provided arguments
func TLSConfig(insecureSkipVerify bool) *tls.Config {
return &tls.Config{
InsecureSkipVerify: insecureSkipVerify,
}
}

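A hypothetical usage of the new helpers; the address and timeout are placeholders:

```go
package main

import (
	"net/http"
	"time"

	"github.com/VictoriaMetrics/VictoriaMetrics/app/vmctl/utils"
)

func main() {
	// For an https URL the returned transport carries the TLS config;
	// for plain http it is just a clone of http.DefaultTransport.
	tr := utils.Transport("https://vmselect.example.com:8481", true)
	_ = &http.Client{Timeout: 30 * time.Second, Transport: tr}
}
```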

@@ -26,8 +26,6 @@ type Config struct {
// --httpListenAddr value for single node version
// --httpListenAddr value of vmselect component for cluster version
Addr string
// Transport allows specifying custom http.Transport
Transport *http.Transport
// Concurrency defines number of worker
// performing the import requests concurrently
Concurrency uint8
@@ -64,7 +62,6 @@ type Config struct {
// see https://docs.victoriametrics.com/#how-to-import-time-series-data
type Importer struct {
addr string
client *http.Client
importPath string
compress bool
user string
@@ -131,14 +128,8 @@ func NewImporter(ctx context.Context, cfg Config) (*Importer, error) {
return nil, err
}
client := &http.Client{}
if cfg.Transport != nil {
client.Transport = cfg.Transport
}
im := &Importer{
addr: addr,
client: client,
importPath: importPath,
compress: cfg.Compress,
user: cfg.User,
@@ -300,7 +291,7 @@ func (im *Importer) Ping() error {
if im.user != "" {
req.SetBasicAuth(im.user, im.password)
}
resp, err := im.client.Do(req)
resp, err := http.DefaultClient.Do(req)
if err != nil {
return err
}
@@ -330,7 +321,7 @@ func (im *Importer) Import(tsBatch []*TimeSeries) error {
errCh := make(chan error)
go func() {
errCh <- im.do(req)
errCh <- do(req)
close(errCh)
}()
@@ -384,8 +375,8 @@ func (im *Importer) Import(tsBatch []*TimeSeries) error {
// ErrBadRequest represents bad request error.
var ErrBadRequest = errors.New("bad request")
func (im *Importer) do(req *http.Request) error {
resp, err := im.client.Do(req)
func do(req *http.Request) error {
resp, err := http.DefaultClient.Do(req)
if err != nil {
return fmt.Errorf("unexpected error when performing request: %s", err)
}


@@ -54,14 +54,14 @@ func (p *vmNativeProcessor) run(ctx context.Context) error {
startTime: time.Now(),
}
start, err := utils.ParseTime(p.filter.TimeStart)
start, err := utils.GetTime(p.filter.TimeStart)
if err != nil {
return fmt.Errorf("failed to parse %s, provided: %s, error: %w", vmNativeFilterTimeStart, p.filter.TimeStart, err)
}
end := time.Now().In(start.Location())
if p.filter.TimeEnd != "" {
end, err = utils.ParseTime(p.filter.TimeEnd)
end, err = utils.GetTime(p.filter.TimeEnd)
if err != nil {
return fmt.Errorf("failed to parse %s, provided: %s, error: %w", vmNativeFilterTimeEnd, p.filter.TimeEnd, err)
}
@@ -175,29 +175,28 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
dstURL = fmt.Sprintf("%s/insert/%s/prometheus/%s", p.dst.Addr, tenantID, importAddr)
}
barPrefix := "Requests to make"
initMessage := "Initing import process from %q to %q with filter %s"
initParams := []interface{}{srcURL, dstURL, p.filter.String()}
if p.interCluster {
barPrefix = fmt.Sprintf("Requests to make for tenant %s", tenantID)
initMessage = "Initing import process from %q to %q with filter %s for tenant %s"
initParams = []interface{}{srcURL, dstURL, p.filter.String(), tenantID}
}
fmt.Println("") // extra line for better output formatting
log.Printf(initMessage, initParams...)
if len(ranges) > 1 {
log.Printf("Selected time range will be split into %d ranges according to %q step", len(ranges), p.filter.Chunk)
}
var foundSeriesMsg string
var requestsToMake int
var metrics = map[string][][]time.Time{
"": ranges,
}
metrics := []string{p.filter.Match}
if !p.disablePerMetricRequests {
metrics, err = p.explore(ctx, p.src, tenantID, ranges, silent)
log.Printf("Exploring metrics...")
metrics, err = p.src.Explore(ctx, p.filter, tenantID)
if err != nil {
return fmt.Errorf("failed to explore metric names: %s", err)
return fmt.Errorf("cannot get metrics from source %s: %w", p.src.Addr, err)
}
if len(metrics) == 0 {
errMsg := "no metrics found"
if tenantID != "" {
@@ -206,10 +205,7 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
log.Println(errMsg)
return nil
}
for _, m := range metrics {
requestsToMake += len(m)
}
foundSeriesMsg = fmt.Sprintf("Found %d unique metric names to import. Total import/export requests to make %d", len(metrics), requestsToMake)
foundSeriesMsg = fmt.Sprintf("Found %d metrics to import", len(metrics))
}
if !p.interCluster {
@@ -223,13 +219,15 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
log.Print(foundSeriesMsg)
}
var bar *pb.ProgressBar
barPrefix := "Requests to make"
if p.interCluster {
barPrefix = fmt.Sprintf("Requests to make for tenant %s", tenantID)
processingMsg := fmt.Sprintf("Requests to make: %d", len(metrics)*len(ranges))
if len(ranges) > 1 {
processingMsg = fmt.Sprintf("Selected time range will be split into %d ranges according to %q step. %s", len(ranges), p.filter.Chunk, processingMsg)
}
log.Print(processingMsg)
var bar *pb.ProgressBar
if !silent {
bar = barpool.NewSingleProgress(fmt.Sprintf(nativeWithBackoffTpl, barPrefix), requestsToMake)
bar = barpool.NewSingleProgress(fmt.Sprintf(nativeWithBackoffTpl, barPrefix), len(metrics)*len(ranges))
if p.disablePerMetricRequests {
bar = barpool.NewSingleProgress(nativeSingleProcessTpl, 0)
}
@@ -265,19 +263,20 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
}
// any error breaks the import
for mName, mRanges := range metrics {
match, err := buildMatchWithFilter(p.filter.Match, mName)
for _, s := range metrics {
match, err := buildMatchWithFilter(p.filter.Match, s)
if err != nil {
logger.Errorf("failed to build filter %q for metric name %q: %s", p.filter.Match, mName, err)
logger.Errorf("failed to build export filters: %s", err)
continue
}
for _, times := range mRanges {
for _, times := range ranges {
select {
case <-ctx.Done():
return fmt.Errorf("context canceled")
case infErr := <-errCh:
return fmt.Errorf("export/import error: %s", infErr)
return fmt.Errorf("native error: %s", infErr)
case filterCh <- native.Filter{
Match: match,
TimeStart: times[0].Format(time.RFC3339),
@@ -298,32 +297,6 @@ func (p *vmNativeProcessor) runBackfilling(ctx context.Context, tenantID string,
return nil
}
func (p *vmNativeProcessor) explore(ctx context.Context, src *native.Client, tenantID string, ranges [][]time.Time, silent bool) (map[string][][]time.Time, error) {
log.Printf("Exploring metrics...")
var bar *pb.ProgressBar
if !silent {
bar = barpool.NewSingleProgress(fmt.Sprintf(nativeWithBackoffTpl, "Explore requests to make"), len(ranges))
bar.Start()
defer bar.Finish()
}
metrics := make(map[string][][]time.Time)
for _, r := range ranges {
ms, err := src.Explore(ctx, p.filter, tenantID, r[0], r[1])
if err != nil {
return nil, fmt.Errorf("cannot get metrics from %s on interval %v-%v: %w", src.Addr, r[0], r[1], err)
}
for i := range ms {
metrics[ms[i]] = append(metrics[ms[i]], r)
}
if bar != nil {
bar.Increment()
}
}
return metrics, nil
}
// stats represents client statistic
// when processing data
type stats struct {

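The hunks above trade per-metric time ranges (a map keyed by metric name) for a flat metric list crossed against all ranges, which changes how the request count is computed. A small sketch with made-up data:

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	now := time.Now()
	ranges := [][]time.Time{
		{now.Add(-2 * time.Hour), now.Add(-time.Hour)},
		{now.Add(-time.Hour), now},
	}

	// Removed scheme: each metric keeps only the ranges where it was seen.
	metricsByRange := map[string][][]time.Time{"metric_a": ranges[:1], "metric_b": ranges}
	requestsToMake := 0
	for _, mRanges := range metricsByRange {
		requestsToMake += len(mRanges)
	}

	// Added scheme: every metric is exported for every range.
	metricNames := []string{"metric_a", "metric_b"}
	fmt.Println(requestsToMake, len(metricNames)*len(ranges)) // 3 4
}
```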

@@ -142,7 +142,7 @@ func (ctx *InsertCtx) ApplyRelabeling() {
// FlushBufs flushes buffered rows to the underlying storage.
func (ctx *InsertCtx) FlushBufs() error {
sas := sasGlobal.Load()
if (sas != nil || deduplicator != nil) && !ctx.skipStreamAggr {
if sas != nil && !ctx.skipStreamAggr {
matchIdxs := matchIdxsPool.Get()
matchIdxs.B = ctx.streamAggrCtx.push(ctx.mrs, matchIdxs.B)
if !*streamAggrKeepInput {

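The gating condition above relies on an atomically published aggregators pointer. A minimal sketch of that pattern with a stand-in type:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type aggregators struct{ name string }

var sasGlobal atomic.Pointer[aggregators]

func main() {
	// Config reloaders publish a new value atomically ...
	sasGlobal.Store(&aggregators{name: "default"})

	// ... while the hot path loads it once per flush and skips work when unset.
	if sas := sasGlobal.Load(); sas != nil {
		fmt.Println("pushing to", sas.name)
	}
}
```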

@@ -9,10 +9,10 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vmstorage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/fasttime"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/logger"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/procutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promrelabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
"github.com/VictoriaMetrics/metrics"
@@ -28,12 +28,8 @@ var (
streamAggrDropInput = flag.Bool("streamAggr.dropInput", false, "Whether to drop all the input samples after the aggregation with -streamAggr.config. "+
"By default, only aggregated samples are dropped, while the remaining samples are stored in the database. "+
"See also -streamAggr.keepInput and https://docs.victoriametrics.com/stream-aggregation.html")
streamAggrDedupInterval = flag.Duration("streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval before optional aggregation with -streamAggr.config . "+
"See also -streamAggr.dropInputLabels and -dedup.minScrapeInterval and https://docs.victoriametrics.com/stream-aggregation.html#deduplication")
streamAggrDropInputLabels = flagutil.NewArrayString("streamAggr.dropInputLabels", "An optional list of labels to drop from samples "+
"before stream de-duplication and aggregation . See https://docs.victoriametrics.com/stream-aggregation.html#dropping-unneeded-labels")
streamAggrIgnoreOldSamples = flag.Bool("streamAggr.ignoreOldSamples", false, "Whether to ignore input samples with old timestamps outside the current aggregation interval. "+
"See https://docs.victoriametrics.com/stream-aggregation.html#ignoring-old-samples")
streamAggrDedupInterval = flag.Duration("streamAggr.dedupInterval", 0, "Input samples are de-duplicated with this interval before being aggregated. "+
"Only the last sample per each time series per each interval is aggregated if the interval is greater than zero")
)
var (
@@ -45,8 +41,7 @@ var (
saCfgSuccess = metrics.NewGauge(`vminsert_streamagg_config_last_reload_successful`, nil)
saCfgTimestamp = metrics.NewCounter(`vminsert_streamagg_config_last_reload_success_timestamp_seconds`)
sasGlobal atomic.Pointer[streamaggr.Aggregators]
deduplicator *streamaggr.Deduplicator
sasGlobal atomic.Pointer[streamaggr.Aggregators]
)
// CheckStreamAggrConfig checks config pointed by -stramaggr.config
@@ -54,13 +49,8 @@ func CheckStreamAggrConfig() error {
if *streamAggrConfig == "" {
return nil
}
pushNoop := func(_ []prompbmarshal.TimeSeries) {}
opts := &streamaggr.Options{
DedupInterval: *streamAggrDedupInterval,
DropInputLabels: *streamAggrDropInputLabels,
IgnoreOldSamples: *streamAggrIgnoreOldSamples,
}
sas, err := streamaggr.LoadFromFile(*streamAggrConfig, pushNoop, opts)
pushNoop := func(tss []prompbmarshal.TimeSeries) {}
sas, err := streamaggr.LoadFromFile(*streamAggrConfig, pushNoop, *streamAggrDedupInterval)
if err != nil {
return fmt.Errorf("error when loading -streamAggr.config=%q: %w", *streamAggrConfig, err)
}
@@ -75,24 +65,15 @@ func InitStreamAggr() {
saCfgReloaderStopCh = make(chan struct{})
if *streamAggrConfig == "" {
if *streamAggrDedupInterval > 0 {
deduplicator = streamaggr.NewDeduplicator(pushAggregateSeries, *streamAggrDedupInterval, *streamAggrDropInputLabels)
}
return
}
sighupCh := procutil.NewSighupChan()
opts := &streamaggr.Options{
DedupInterval: *streamAggrDedupInterval,
DropInputLabels: *streamAggrDropInputLabels,
IgnoreOldSamples: *streamAggrIgnoreOldSamples,
}
sas, err := streamaggr.LoadFromFile(*streamAggrConfig, pushAggregateSeries, opts)
sas, err := streamaggr.LoadFromFile(*streamAggrConfig, pushAggregateSeries, *streamAggrDedupInterval)
if err != nil {
logger.Fatalf("cannot load -streamAggr.config=%q: %s", *streamAggrConfig, err)
}
sasGlobal.Store(sas)
saCfgSuccess.Set(1)
saCfgTimestamp.Set(fasttime.UnixTimestamp())
@@ -116,12 +97,7 @@ func reloadStreamAggrConfig() {
logger.Infof("reloading -streamAggr.config=%q", *streamAggrConfig)
saCfgReloads.Inc()
opts := &streamaggr.Options{
DedupInterval: *streamAggrDedupInterval,
DropInputLabels: *streamAggrDropInputLabels,
IgnoreOldSamples: *streamAggrIgnoreOldSamples,
}
sasNew, err := streamaggr.LoadFromFile(*streamAggrConfig, pushAggregateSeries, opts)
sasNew, err := streamaggr.LoadFromFile(*streamAggrConfig, pushAggregateSeries, *streamAggrDedupInterval)
if err != nil {
saCfgSuccess.Set(0)
saCfgReloadErr.Inc()
@@ -148,102 +124,62 @@ func MustStopStreamAggr() {
sas := sasGlobal.Swap(nil)
sas.MustStop()
if deduplicator != nil {
deduplicator.MustStop()
deduplicator = nil
}
}
type streamAggrCtx struct {
mn storage.MetricName
tss []prompbmarshal.TimeSeries
labels []prompbmarshal.Label
samples []prompbmarshal.Sample
buf []byte
mn storage.MetricName
tss [1]prompbmarshal.TimeSeries
}
func (ctx *streamAggrCtx) Reset() {
ctx.mn.Reset()
clear(ctx.tss)
ctx.tss = ctx.tss[:0]
clear(ctx.labels)
ctx.labels = ctx.labels[:0]
ctx.samples = ctx.samples[:0]
ctx.buf = ctx.buf[:0]
ts := &ctx.tss[0]
promrelabel.CleanLabels(ts.Labels)
}
func (ctx *streamAggrCtx) push(mrs []storage.MetricRow, matchIdxs []byte) []byte {
mn := &ctx.mn
tss := ctx.tss
labels := ctx.labels
samples := ctx.samples
buf := ctx.buf
matchIdxs = bytesutil.ResizeNoCopyMayOverallocate(matchIdxs, len(mrs))
for i := 0; i < len(matchIdxs); i++ {
matchIdxs[i] = 0
}
tssLen := len(tss)
for _, mr := range mrs {
mn := &ctx.mn
tss := ctx.tss[:]
ts := &tss[0]
labels := ts.Labels
samples := ts.Samples
sas := sasGlobal.Load()
var matchIdxsLocal []byte
for idx, mr := range mrs {
if err := mn.UnmarshalRaw(mr.MetricNameRaw); err != nil {
logger.Panicf("BUG: cannot unmarshal recently marshaled MetricName: %s", err)
}
labelsLen := len(labels)
bufLen := len(buf)
buf = append(buf, mn.MetricGroup...)
metricGroup := bytesutil.ToUnsafeString(buf[bufLen:])
labels = append(labels, prompbmarshal.Label{
labels = append(labels[:0], prompbmarshal.Label{
Name: "__name__",
Value: metricGroup,
Value: bytesutil.ToUnsafeString(mn.MetricGroup),
})
for _, tag := range mn.Tags {
bufLen = len(buf)
buf = append(buf, tag.Key...)
name := bytesutil.ToUnsafeString(buf[bufLen:])
bufLen = len(buf)
buf = append(buf, tag.Value...)
value := bytesutil.ToUnsafeString(buf[bufLen:])
labels = append(labels, prompbmarshal.Label{
Name: name,
Value: value,
Name: bytesutil.ToUnsafeString(tag.Key),
Value: bytesutil.ToUnsafeString(tag.Value),
})
}
samplesLen := len(samples)
samples = append(samples, prompbmarshal.Sample{
samples = append(samples[:0], prompbmarshal.Sample{
Timestamp: mr.Timestamp,
Value: mr.Value,
})
tss = append(tss, prompbmarshal.TimeSeries{
Labels: labels[labelsLen:],
Samples: samples[samplesLen:],
})
}
ctx.tss = tss
ctx.labels = labels
ctx.samples = samples
ctx.buf = buf
ts.Labels = labels
ts.Samples = samples
tss = tss[tssLen:]
sas := sasGlobal.Load()
if sas != nil {
matchIdxs = sas.Push(tss, matchIdxs)
} else if deduplicator != nil {
matchIdxs = bytesutil.ResizeNoCopyMayOverallocate(matchIdxs, len(tss))
for i := range matchIdxs {
matchIdxs[i] = 1
matchIdxsLocal = sas.Push(tss, matchIdxsLocal)
if matchIdxsLocal[0] != 0 {
matchIdxs[idx] = 1
}
deduplicator.Push(tss)
}
ctx.Reset()
return matchIdxs
}
@@ -275,3 +211,8 @@ func pushAggregateSeries(tss []prompbmarshal.TimeSeries) {
logger.Errorf("cannot flush aggregate series: %s", err)
}
}
// GetAggregators returns default aggregator for vmsingle.
func GetAggregators() map[string]*streamaggr.Aggregators {
return map[string]*streamaggr.Aggregators{"default": sasGlobal.Load()}
}

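The branch's push() above keeps a single reusable TimeSeries element and resets its slices per row instead of growing shared buffers. A simplified sketch of that reset-and-reuse pattern with stand-in types:

```go
package main

import "fmt"

type timeSeries struct {
	labels  []string
	samples []float64
}

func main() {
	var tss [1]timeSeries // one reusable element, as in streamAggrCtx
	ts := &tss[0]
	for _, v := range []float64{1, 2, 3} {
		// append(x[:0], ...) reuses the backing arrays instead of reallocating.
		ts.labels = append(ts.labels[:0], "__name__")
		ts.samples = append(ts.samples[:0], v)
		fmt.Println(tss[:]) // the one-element slice handed to the aggregator
	}
}
```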

@@ -1,85 +0,0 @@
package datadogsketches
import (
"net/http"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/relabel"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
parserCommon "github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogsketches"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogsketches/stream"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/datadogutils"
"github.com/VictoriaMetrics/metrics"
)
var (
rowsInserted = metrics.NewCounter(`vm_rows_inserted_total{type="datadogsketches"}`)
rowsPerInsert = metrics.NewHistogram(`vm_rows_per_insert{type="datadogsketches"}`)
)
// InsertHandlerForHTTP processes remote write for DataDog POST /api/beta/sketches request.
func InsertHandlerForHTTP(req *http.Request) error {
extraLabels, err := parserCommon.GetExtraLabels(req)
if err != nil {
return err
}
ce := req.Header.Get("Content-Encoding")
return stream.Parse(req.Body, ce, func(sketches []*datadogsketches.Sketch) error {
return insertRows(sketches, extraLabels)
})
}
func insertRows(sketches []*datadogsketches.Sketch, extraLabels []prompbmarshal.Label) error {
ctx := common.GetInsertCtx()
defer common.PutInsertCtx(ctx)
rowsLen := 0
for _, sketch := range sketches {
rowsLen += sketch.RowsCount()
}
ctx.Reset(rowsLen)
rowsTotal := 0
hasRelabeling := relabel.HasRelabeling()
for _, sketch := range sketches {
ms := sketch.ToSummary()
for _, m := range ms {
ctx.Labels = ctx.Labels[:0]
ctx.AddLabel("", m.Name)
for _, label := range m.Labels {
ctx.AddLabel(label.Name, label.Value)
}
for _, tag := range sketch.Tags {
name, value := datadogutils.SplitTag(tag)
if name == "host" {
name = "exported_host"
}
ctx.AddLabel(name, value)
}
for j := range extraLabels {
label := &extraLabels[j]
ctx.AddLabel(label.Name, label.Value)
}
if hasRelabeling {
ctx.ApplyRelabeling()
}
if len(ctx.Labels) == 0 {
// Skip metric without labels.
continue
}
ctx.SortLabelsIfNeeded()
var metricNameRaw []byte
var err error
for _, p := range m.Points {
metricNameRaw, err = ctx.WriteDataPointExt(metricNameRaw, ctx.Labels, p.Timestamp, p.Value)
if err != nil {
return err
}
}
rowsTotal += len(m.Points)
}
}
rowsInserted.Add(rowsTotal)
rowsPerInsert.Update(float64(rowsTotal))
return ctx.FlushBufs()
}


@@ -6,13 +6,13 @@ import (
"fmt"
"net/http"
"strings"
"sync/atomic"
"time"
"github.com/VictoriaMetrics/metrics"
vminsertCommon "github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/common"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/csvimport"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/datadogsketches"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/datadogv1"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/datadogv2"
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/graphite"
@@ -29,7 +29,6 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/app/vminsert/vmimport"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/auth"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/bytesutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/flagutil"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/httpserver"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/influxutils"
graphiteserver "github.com/VictoriaMetrics/VictoriaMetrics/lib/ingestserver/graphite"
@@ -40,8 +39,8 @@ import (
"github.com/VictoriaMetrics/VictoriaMetrics/lib/prompbmarshal"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/promscrape"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/common"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/protoparser/opentelemetry/firehose"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/storage"
"github.com/VictoriaMetrics/VictoriaMetrics/lib/streamaggr"
)
var (
@@ -64,8 +63,7 @@ var (
"See also -opentsdbHTTPListenAddr.useProxyProtocol")
opentsdbHTTPUseProxyProtocol = flag.Bool("opentsdbHTTPListenAddr.useProxyProtocol", false, "Whether to use proxy protocol for connections accepted "+
"at -opentsdbHTTPListenAddr . See https://www.haproxy.org/download/1.8/doc/proxy-protocol.txt")
configAuthKey = flagutil.NewPassword("configAuthKey", "Authorization key for accessing /config page. It must be passed via authKey query arg")
reloadAuthKey = flagutil.NewPassword("reloadAuthKey", "Auth key for /-/reload http endpoint. It must be passed as authKey=...")
configAuthKey = flag.String("configAuthKey", "", "Authorization key for accessing /config page. It must be passed via authKey query arg")
maxLabelsPerTimeseries = flag.Int("maxLabelsPerTimeseries", 30, "The maximum number of labels accepted per time series. Superfluous labels are dropped. In this case the vm_metrics_with_dropped_labels_total metric at /metrics page is incremented")
maxLabelValueLen = flag.Int("maxLabelValueLen", 16*1024, "The maximum length of label values in the accepted time series. Longer label values are truncated. In this case the vm_too_long_label_values_total metric at /metrics page is incremented")
)
@@ -101,7 +99,7 @@ func Init() {
if len(*opentsdbHTTPListenAddr) > 0 {
opentsdbhttpServer = opentsdbhttpserver.MustStart(*opentsdbHTTPListenAddr, *opentsdbHTTPUseProxyProtocol, opentsdbhttp.InsertHandler)
}
promscrape.Init(func(_ *auth.Token, wr *prompbmarshal.WriteRequest) {
promscrape.Init(func(at *auth.Token, wr *prompbmarshal.WriteRequest) {
prompush.Push(wr)
})
}
@@ -163,7 +161,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
path = strings.TrimSuffix(path, "/")
}
switch path {
case "/prometheus/api/v1/write", "/api/v1/write", "/api/v1/push", "/prometheus/api/v1/push":
case "/prometheus/api/v1/write", "/api/v1/write":
if common.HandleVMProtoServerHandshake(w, r) {
return true
}
@@ -217,14 +215,14 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
addInfluxResponseHeaders(w)
influxutils.WriteDatabaseNames(w)
return true
case "/opentelemetry/api/v1/push", "/opentelemetry/v1/metrics":
case "/opentelemetry/api/v1/push":
opentelemetryPushRequests.Inc()
if err := opentelemetry.InsertHandler(r); err != nil {
opentelemetryPushErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
firehose.WriteSuccessResponse(w, r)
w.WriteHeader(http.StatusOK)
return true
case "/newrelic":
newrelicCheckRequest.Inc()
@@ -272,15 +270,6 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
w.WriteHeader(202)
fmt.Fprintf(w, `{"status":"ok"}`)
return true
case "/datadog/api/beta/sketches":
datadogsketchesWriteRequests.Inc()
if err := datadogsketches.InsertHandlerForHTTP(r); err != nil {
datadogsketchesWriteErrors.Inc()
httpserver.Errorf(w, r, "%s", err)
return true
}
w.WriteHeader(202)
return true
case "/datadog/api/v1/validate":
datadogValidateRequests.Inc()
// See https://docs.datadoghq.com/api/latest/authentication/#validate-api-key
@@ -327,7 +316,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
}
return true
case "/prometheus/config", "/config":
if !httpserver.CheckAuthFlag(w, r, configAuthKey.Get(), "configAuthKey") {
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
return true
}
promscrapeConfigRequests.Inc()
@@ -336,7 +325,7 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
return true
case "/prometheus/api/v1/status/config", "/api/v1/status/config":
// See https://prometheus.io/docs/prometheus/latest/querying/api/#config
if !httpserver.CheckAuthFlag(w, r, configAuthKey.Get(), "configAuthKey") {
if !httpserver.CheckAuthFlag(w, r, *configAuthKey, "configAuthKey") {
return true
}
promscrapeStatusConfigRequests.Inc()
@@ -346,15 +335,15 @@ func RequestHandler(w http.ResponseWriter, r *http.Request) bool {
fmt.Fprintf(w, `{"status":"success","data":{"yaml":%q}}`, bb.B)
return true
case "/prometheus/-/reload", "/-/reload":
if !httpserver.CheckAuthFlag(w, r, reloadAuthKey.Get(), "reloadAuthKey") {
return true
}
promscrapeConfigReloadRequests.Inc()
procutil.SelfSIGHUP()
w.WriteHeader(http.StatusOK)
w.WriteHeader(http.StatusNoContent)
return true
case "/stream-agg":
streamaggr.WriteHumanReadableState(w, r, vminsertCommon.GetAggregators())
return true
case "/ready":
if rdy := promscrape.PendingScrapeConfigs.Load(); rdy > 0 {
if rdy := atomic.LoadInt32(&promscrape.PendingScrapeConfigs); rdy > 0 {
errMsg := fmt.Sprintf("waiting for scrape config to init targets, configs left: %d", rdy)
http.Error(w, errMsg, http.StatusTooEarly)
} else {
@@ -404,16 +393,13 @@ var (
datadogv2WriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v2/series", protocol="datadog"}`)
datadogv2WriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/datadog/api/v2/series", protocol="datadog"}`)
datadogsketchesWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/beta/sketches", protocol="datadog"}`)
datadogsketchesWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/datadog/api/beta/sketches", protocol="datadog"}`)
datadogValidateRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v1/validate", protocol="datadog"}`)
datadogCheckRunRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v1/check_run", protocol="datadog"}`)
datadogIntakeRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/intake", protocol="datadog"}`)
datadogMetadataRequests = metrics.NewCounter(`vm_http_requests_total{path="/datadog/api/v1/metadata", protocol="datadog"}`)
opentelemetryPushRequests = metrics.NewCounter(`vm_http_requests_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
opentelemetryPushErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/opentelemetry/v1/metrics", protocol="opentelemetry"}`)
opentelemetryPushRequests = metrics.NewCounter(`vm_http_requests_total{path="/opentelemetry/api/v1/push", protocol="opentelemetry"}`)
opentelemetryPushErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/opentelemetry/api/v1/push", protocol="opentelemetry"}`)
newrelicWriteRequests = metrics.NewCounter(`vm_http_requests_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)
newrelicWriteErrors = metrics.NewCounter(`vm_http_request_errors_total{path="/newrelic/infra/v2/metrics/events/bulk", protocol="newrelic"}`)
@@ -435,12 +421,12 @@ var (
promscrapeConfigReloadRequests = metrics.NewCounter(`vm_http_requests_total{path="/-/reload"}`)
_ = metrics.NewGauge(`vm_metrics_with_dropped_labels_total`, func() float64 {
return float64(storage.MetricsWithDroppedLabels.Load())
return float64(atomic.LoadUint64(&storage.MetricsWithDroppedLabels))
})
_ = metrics.NewGauge(`vm_too_long_label_names_total`, func() float64 {
return float64(storage.TooLongLabelNames.Load())
return float64(atomic.LoadUint64(&storage.TooLongLabelNames))
})
_ = metrics.NewGauge(`vm_too_long_label_values_total`, func() float64 {
return float64(storage.TooLongLabelValues.Load())
return float64(atomic.LoadUint64(&storage.TooLongLabelValues))
})
)

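The counters above are exposed through callback gauges. A short sketch of that pattern with the VictoriaMetrics/metrics package and a stand-in counter:

```go
package main

import (
	"sync/atomic"

	"github.com/VictoriaMetrics/metrics"
)

var tooLongLabelValues atomic.Uint64 // stand-in for storage.TooLongLabelValues

func main() {
	// The gauge invokes the callback lazily on every /metrics scrape.
	_ = metrics.NewGauge(`vm_too_long_label_values_total`, func() float64 {
		return float64(tooLongLabelValues.Load())
	})
}
```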
Some files were not shown because too many files have changed in this diff.