MetricsQL
VictoriaMetrics implements MetricsQL, a query language inspired by PromQL. It is backwards compatible with PromQL, so Grafana dashboards backed by a Prometheus datasource should work the same after switching from Prometheus to VictoriaMetrics. The standalone MetricsQL package can be used for parsing MetricsQL in external apps.
If you are unfamiliar with PromQL, reading this tutorial for beginners is recommended.
The following functionality is implemented differently in MetricsQL compared to PromQL in order to improve user experience:
- MetricsQL takes into account the previous point before the window in square brackets for range functions such as `rate` and `increase`. It also doesn't extrapolate range function results. This addresses this issue from Prometheus.
- MetricsQL returns the expected non-empty responses for requests with `step` values smaller than the scrape interval. This addresses this issue from Grafana.
- MetricsQL treats the `scalar` type the same as an `instant vector` without labels, since the subtle difference between these types usually confuses users. See the corresponding Prometheus docs for details.
- MetricsQL removes all the `NaN` values from the output, so some queries like `(-1)^0.5` return empty results in VictoriaMetrics, while returning a series of `NaN` values in Prometheus. Note that Grafana doesn't draw any lines or dots for `NaN` values, so usually the end result looks the same for both VictoriaMetrics and Prometheus.
- MetricsQL keeps metric names after applying functions that don't change the meaning of the original time series. For example, `min_over_time(foo)` or `round(foo)` leave the `foo` metric name in the result. See this issue for details.
Other PromQL functionality should work the same in MetricsQL. File an issue if you notice discrepancies between PromQL and MetricsQL results other than those mentioned above.
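The first difference in the list above (using the previous point before the window and skipping extrapolation) can be illustrated with a small Python model. This is only a sketch under simplifying assumptions, not the VictoriaMetrics implementation: the function name `increase_with_prev_point` and the sample data are hypothetical, counter resets and staleness are ignored, and the fallback used when no earlier point exists is an assumption made for the example.

```python
from bisect import bisect_right


def increase_with_prev_point(samples, start, end):
    """Return the counter increase over the window (start, end].

    samples is a list of (timestamp, value) pairs sorted by timestamp.
    Simplified model: counter resets and staleness are ignored.
    """
    timestamps = [t for t, _ in samples]
    hi = bisect_right(timestamps, end)    # index just past the last sample in the window
    lo = bisect_right(timestamps, start)  # index of the first sample inside the window
    if hi == lo:
        return None  # no samples inside the window
    last_value = samples[hi - 1][1]
    # Use the last sample before the window as the baseline when it exists.
    # Falling back to the first in-window sample is an assumption of this sketch.
    baseline = samples[lo - 1][1] if lo > 0 else samples[lo][1]
    # No extrapolation: the result is a plain difference of two stored values.
    return last_value - baseline


# Hypothetical counter scraped every 10 seconds.
samples = [(0, 100), (10, 110), (20, 125), (30, 140)]
result = increase_with_prev_point(samples, 10, 30)
print(result)
```

In this model the window `(10, 30]` contains only the samples at timestamps 20 and 30, yet the baseline is taken from the sample at timestamp 10, so the growth between timestamps 10 and 20 is not lost.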
MetricsQL provides additional functionality mentioned below, which is aimed at solving practical cases. Feel free to file a feature request if you think MetricsQL misses certain useful functionality.
Note that the functionality mentioned below doesn't work in PromQL, so it is impossible to switch back to Prometheus after you start using it.
This functionality can be tried at an editable Grafana dashboard.
- `WITH` templates. This feature simplifies writing and managing complex queries. Go to the `WITH` templates playground and try it.
- Graphite-compatible filters can be passed via `{__graphite__="foo.*.bar"}` syntax. This is equivalent to `{__name__=~"foo[.][^.]*[.]bar"}`, but usually works faster and is easier to use when migrating from Graphite to VictoriaMetrics.
- Range duration in functions such as `rate` may be omitted. VictoriaMetrics automatically selects the range duration depending on the current step used for building the graph. For instance, the following query is valid in VictoriaMetrics: `rate(node_network_receive_bytes_total)`.
- All the aggregate functions support an optional `limit N` suffix in order to limit the number of output series. For example, `sum(x) by (y) limit 10` limits the number of output time series after the aggregation to 10. All the other time series are dropped.
- Metric names and metric labels may contain escaped chars. For instance, `foo\-bar{baz\=aa="b"}` is a valid expression. It returns time series with name `foo-bar` containing label `baz=aa` with value `b`. Additionally, the `\xXX` escape sequence is supported, where `XX` is the hexadecimal representation of the escaped char.
- `offset`, range duration and step value for a range vector may refer to the current step aka `$__interval` value from Grafana. For instance, `rate(metric[10i] offset 5i)` would return the per-second rate over a range covering 10 previous steps with an offset of 5 steps.
- `offset` may be put anywhere in the query. For instance, `sum(foo) offset 24h`.
- `offset` may be negative. For example, `q offset -1h`.
- Range duration and offset may be fractional. For instance, `rate(node_network_receive_bytes_total[1.5m] offset 0.5d)`.
- `default` binary operator. `q1 default q2` fills gaps in `q1` with the corresponding values from `q2`.
- Most aggregate functions accept an arbitrary number of args. For example, `avg(q1, q2, q3)` would return the average values for every point across `q1`, `q2` and `q3`.
- `histogram_quantile` accepts an optional third arg - `boundsLabel`. In this case it returns `lower` and `upper` bounds for the estimated percentile. See this issue for details.
- `if` binary operator. `q1 if q2` removes values from `q1` for missing values from `q2`.
- `ifnot` binary operator. `q1 ifnot q2` removes values from `q1` for existing values from `q2`.
- Trailing commas are allowed on all the lists - label filters, function args and `WITH` expressions. For instance, the following queries are valid: `m{foo="bar",}`, `f(a, b,)`, `WITH (x=y,) x`. This simplifies maintenance of multi-line queries.
- String literals may be concatenated. This is useful with `WITH` templates: `WITH (commonPrefix="long_metric_prefix_") {__name__=commonPrefix+"suffix1"} / {__name__=commonPrefix+"suffix2"}`.
- Comments starting with `#` and ending with newline. For instance, `up # this is a comment for 'up' metric`.
- Rollup functions - `rollup(m[d])`, `rollup_rate(m[d])`, `rollup_deriv(m[d])`, `rollup_increase(m[d])`, `rollup_delta(m[d])` - return `min`, `max` and `avg` values for all the `m` data points over `d` duration.
- `rollup_candlestick(m[d])` - returns `open`, `close`, `low` and `high` values (OHLC) for all the `m` data points over `d` duration. This function is useful for financial applications.
- `union(q1, ... qN)` function for building multiple graphs for `q1`, ... `qN` subqueries with a single query. The `union` function name may be skipped - the following queries are equivalent: `union(q1, q2)` and `(q1, q2)`.
- `ru(freeResources, maxResources)` function for returning resource utilization percentage in the range `0% - 100%`. For instance, `ru(node_memory_MemFree_bytes, node_memory_MemTotal_bytes)` returns memory utilization over node_exporter metrics.
- `ttf(slowlyChangingFreeResources)` function for returning the time in seconds when the given `slowlyChangingFreeResources` expression reaches zero. For instance, `ttf(node_filesystem_avail_bytes)` returns the time to storage space exhaustion. This function may be useful for capacity planning.
- Functions for label manipulation:
  - `alias(q, name)` for setting metric name across all the time series `q`. For example, `alias(foo, "bar")` would give `bar` name to all the `foo` series.
  - `label_set(q, label1, value1, ... labelN, valueN)` for setting the given values for the given labels on `q`. For example, `label_set(foo, "bar", "baz")` would add `{bar="baz"}` label to all the `foo` series.
  - `label_map(q, label, srcValue1, dstValue1, ... srcValueN, dstValueN)` for mapping `label` values from `src*` to `dst*`. For example, `label_map(foo, "instance", "127.0.0.1", "localhost")` would rename `foo{instance="127.0.0.1"}` to `foo{instance="localhost"}`.
  - `label_uppercase(q, label1, ... labelN)` for uppercasing values for the given labels. For example, `label_uppercase(foo, "instance")` would transform `foo{instance="bar"}` to `foo{instance="BAR"}`.
  - `label_lowercase(q, label1, ... labelN)` for lowercasing values for the given labels. For example, `label_lowercase(foo, "instance")` would transform `foo{instance="BAR"}` to `foo{instance="bar"}`.
  - `label_del(q, label1, ... labelN)` for deleting the given labels from `q`. For example, `label_del(foo, "bar")` would delete `bar` label from all the `foo` series.
  - `label_keep(q, label1, ... labelN)` for deleting all the labels except the given labels from `q`. For example, `label_keep(foo, "bar")` would delete all the labels except `bar` from `foo` series.
  - `label_copy(q, src_label1, dst_label1, ... src_labelN, dst_labelN)` for copying label values from `src_*` to `dst_*`. If `src_label` is empty, then `dst_label` is left untouched. For example, `label_copy(foo, "bar", "baz")` would transform `foo{bar="x"}` to `foo{bar="x",baz="x"}`.
  - `label_move(q, src_label1, dst_label1, ... src_labelN, dst_labelN)` for moving label values from `src_*` to `dst_*`. If `src_label` is empty, then `dst_label` is left untouched. For example, `label_move(foo, "bar", "baz")` would transform `foo{bar="x"}` to `foo{baz="x"}`.
  - `label_transform(q, label, regexp, replacement)` for replacing all the `regexp` occurrences with `replacement` in the `label` values from `q`. For example, `label_transform(foo, "bar", "-", "_")` would transform `foo{bar="a-b-c"}` to `foo{bar="a_b_c"}`.
  - `label_value(q, label)` - returns numeric values for the given `label` from `q`. For example, if `label_value(foo, "bar")` is applied to `foo{bar="1.234"}`, then it will return a time series `foo{bar="1.234"}` with `1.234` value.
- `label_match(q, label, regexp)` and `label_mismatch(q, label, regexp)` for filtering time series with labels matching (or not matching) the given regexps.
- `sort_by_label(q, label1, ... labelN)` and `sort_by_label_desc(q, label1, ... labelN)` for sorting time series by the given set of labels. For example, `sort_by_label(foo, "bar")` would sort `foo` series by values of the label `bar` in these series.
- `step()` function for returning the step in seconds used in the query.
- `start()` and `end()` functions for returning the start and end timestamps of the `[start ... end]` range used in the query.
- `integrate(m[d])` for returning the integral over the given duration `d` for the given metric `m`.
- `ideriv(m[d])` - for calculating the `instant` derivative for the metric `m` over the duration `d`.
- `increase_pure(m[d])` - for calculating the increase of `m` over `d` without edge-case handling compared to `increase(m[d])`. See this issue for details.
- `deriv_fast(m[d])` - for calculating the `fast` derivative for `m` based on the first and the last points from duration `d`.
- `running_` functions - `running_sum`, `running_min`, `running_max`, `running_avg` - for calculating running values on the selected time range.
- `range_` functions - `range_sum`, `range_min`, `range_max`, `range_avg`, `range_first`, `range_last`, `range_median`, `range_quantile` - for calculating a global value over the selected time range. Note that the global value is based on calculated datapoints for the inner query. The calculated datapoints can differ from raw datapoints stored in the database. See these docs for details.
- `smooth_exponential(q, sf)` - smooths `q` using exponential moving average with the given smooth factor `sf`.
- `remove_resets(q)` - removes counter resets from `q`.
- `lag(m[d])` - returns the lag between the current timestamp and the timestamp of the previous data point in `m` over `d`.
- `lifetime(m[d])` - returns the lifetime of `m` over `d` in seconds. It is expected that `d` exceeds the lifetime of `m`.
- `scrape_interval(m[d])` - returns the average interval in seconds between data points of `m` over `d` aka `scrape interval`.
- Trigonometric functions - `sin(q)`, `cos(q)`, `asin(q)`, `acos(q)` and `pi()`.
- `range_over_time(m[d])` - returns the value range for `m` over `d` time window, i.e. `max_over_time(m[d])-min_over_time(m[d])`.
- `median_over_time(m[d])` - calculates median values for `m` over `d` time window. Shorthand for `quantile_over_time(0.5, m[d])`.
- `median(q)` - median aggregate. Shorthand for `quantile(0.5, q)`.
- `limitk(k, q) by (group_labels)` - limits the number of time series returned from `q` to `k` per each `group_labels`. The returned set of `k` time series per each `group_labels` can change with each call.
- `any(q) by (x)` - returns any time series from `q` for each group in `x`.
- `keep_last_value(q)` - fills missing data (gaps) in `q` with the previous non-empty value.
- `keep_next_value(q)` - fills missing data (gaps) in `q` with the next non-empty value.
- `interpolate(q)` - fills missing data (gaps) in `q` with linearly interpolated values.
- `distinct_over_time(m[d])` - returns the number of distinct values for `m` data points over `d` duration.
- `distinct(q)` - returns a time series with the number of unique values for each timestamp in `q`.
- `sum2_over_time(m[d])` - returns the sum of squares for all the `m` values over `d` duration.
- `sum2(q)` - returns a time series with the sum of square values for each timestamp in `q`.
- `geomean_over_time(m[d])` - returns the geomean value for all the `m` values over `d` duration.
- `geomean(q)` - returns a time series with the geomean value for each timestamp in `q`.
- `rand()`, `rand_normal()` and `rand_exponential()` functions - for generating pseudo-random series with uniform, normal and exponential distribution.
- `increases_over_time(m[d])` and `decreases_over_time(m[d])` - return the number of `m` increases or decreases over the given duration `d`.
- `prometheus_buckets(q)` - converts VictoriaMetrics histogram buckets to Prometheus buckets with `le` labels.
- `buckets_limit(k, q)` - limits the number of buckets (Prometheus-style or VictoriaMetrics-style) per each metric returned by `q` to `k`. It also converts VictoriaMetrics-style buckets to Prometheus-style buckets, i.e. the end result is buckets with `le` labels.
- `histogram(q)` - calculates aggregate histogram over `q` time series for each point on the graph. See this article for more details.
- `histogram_over_time(m[d])` - calculates VictoriaMetrics histogram for `m` over `d`. For example, the following query calculates median temperature by country over the last 24 hours: `histogram_quantile(0.5, sum(histogram_over_time(temperature[24h])) by (vmrange,country))`.
- `histogram_share(le, buckets)` - returns share (in the range 0..1) for `buckets` that fall below `le`. Useful for calculating SLI and SLO. For instance, the following query returns the share of requests which are performed under 1.5 seconds during the last 5 minutes: `histogram_share(1.5, sum(rate(request_duration_seconds_bucket[5m])) by (le))`.
- `histogram_avg(buckets)` - returns the average value for the given buckets. It can be used for calculating the average over the given time range across multiple time series. For example, `histogram_avg(sum(histogram_over_time(response_time_duration_seconds[5m])) by (vmrange,job))` would return the average response time per each `job` over the last 5 minutes.
- `histogram_stdvar(buckets)` - returns standard variance for the given buckets. It can be used for calculating standard variance over the given time range across multiple time series. For example, `histogram_stdvar(sum(histogram_over_time(temperature[24h])) by (vmrange,country))` would return standard variance for the temperature per each country over the last 24 hours.
- `histogram_stddev(buckets)` - returns standard deviation for the given buckets.
- `topk_*` and `bottomk_*` aggregate functions, which return up to K time series. Note that the standard `topk` function may return more than K time series - see this article for details.
  - `topk_min(k, q)` - returns top K time series with the max minimums on the given time range
  - `topk_max(k, q)` - returns top K time series with the max maximums on the given time range
  - `topk_avg(k, q)` - returns top K time series with the max averages on the given time range
  - `topk_median(k, q)` - returns top K time series with the max medians on the given time range
  - `bottomk_min(k, q)` - returns bottom K time series with the min minimums on the given time range
  - `bottomk_max(k, q)` - returns bottom K time series with the min maximums on the given time range
  - `bottomk_avg(k, q)` - returns bottom K time series with the min averages on the given time range
  - `bottomk_median(k, q)` - returns bottom K time series with the min medians on the given time range.
- All the `topk_*` and `bottomk_*` functions accept an optional third argument - a label to add to the sum of the remaining time series outside the top K or bottom K time series. For example, `topk_max(3, sum(process_resident_memory_bytes) by (job), "job=other")` would return up to 3 time series with the maximum value for `sum(process_resident_memory_bytes) by (job)` plus a fourth time series with the sum of the remaining time series if any. The fourth time series will contain the `job="other"` label.
- `share_le_over_time(m[d], le)` - returns share (in the range 0..1) of values in `m` over `d`, which are smaller or equal to `le`. Useful for calculating SLI and SLO. Example: `share_le_over_time(memory_usage_bytes[24h], 100*1024*1024)` returns the share of time series values for the last 24 hours when memory usage was below or equal to 100MB.
- `share_gt_over_time(m[d], gt)` - returns share (in the range 0..1) of values in `m` over `d`, which are bigger than `gt`. Useful for calculating SLI and SLO. Example: `share_gt_over_time(up[24h], 0)` - returns service availability for the last 24 hours.
- `count_le_over_time(m[d], le)` - returns the number of raw samples for `m` over `d`, which don't exceed `le`.
- `count_gt_over_time(m[d], gt)` - returns the number of raw samples for `m` over `d`, which are bigger than `gt`.
- `count_eq_over_time(m[d], N)` - returns the number of raw samples for `m` over `d` with values equal to `N`.
- `count_ne_over_time(m[d], N)` - returns the number of raw samples for `m` over `d` with values not equal to `N`.
- `tmin_over_time(m[d])` - returns timestamp for the minimum value for `m` over `d` time range.
- `tmax_over_time(m[d])` - returns timestamp for the maximum value for `m` over `d` time range.
- `tfirst_over_time(m[d])` - returns timestamp for the first sample for `m` over `d` time range.
- `tlast_over_time(m[d])` - returns timestamp for the last sample for `m` over `d` time range.
- `aggr_over_time(("aggr_func1", "aggr_func2", ...), m[d])` - simultaneously calculates all the listed `aggr_func*` for `m` over `d` time range. `aggr_func*` can contain any functions that accept range vector. For instance, `aggr_over_time(("min_over_time", "max_over_time", "rate"), m[d])` would calculate `min_over_time`, `max_over_time` and `rate` for `m[d]`.
- `hoeffding_bound_upper(phi, m[d])` and `hoeffding_bound_lower(phi, m[d])` - return upper and lower Hoeffding bounds for the given `phi` in the range `[0..1]`.
- `last_over_time(m[d])` - returns the last value for `m` on the time range `d`.
- `first_over_time(m[d])` - returns the first value for `m` on the time range `d`.
- `outliersk(N, q) by (group)` - returns up to `N` outlier time series for `q` in every `group`. Outlier time series have the highest deviation from the `median(q)`. This aggregate function is useful to detect anomalies across groups of similar time series.
- `ascent_over_time(m[d])` - returns the sum of positive deltas between adjacent data points in `m` over `d`. Useful for tracking height gains in a GPS track.
- `descent_over_time(m[d])` - returns the absolute sum of negative deltas between adjacent data points in `m` over `d`. Useful for tracking height loss in a GPS track.
- `mode_over_time(m[d])` - returns mode for `m` values over `d`. It is expected that `m` values are discrete.
- `mode(q) by (x)` - returns mode for each point in `q` grouped by `x`. It is expected that `q` points are discrete.
- `rate_over_sum(m[d])` - returns rate over the sum of `m` values over `d` duration.
- `zscore_over_time(m[d])` - returns z-score for `m` values over `d` duration. Useful for detecting anomalies in time series compared to historical samples.
- `zscore(q) by (group)` - returns independent z-score values for every point in every `group` of `q`. Useful for detecting anomalies in the group of related time series.
- `timezone_offset("tz")` - returns offset in seconds for the given timezone `tz` relative to UTC. This can be useful when combining with datetime-related functions. For example, `day_of_week(time()+timezone_offset("America/Los_Angeles"))` would return weekdays for `America/Los_Angeles` time zone. The special `Local` time zone can be used for returning an offset for the time zone set on the host where VictoriaMetrics runs. See the list of supported timezones.
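The gap-related entries in the list above (`default`, `if`, `ifnot` and `keep_last_value`) can be modelled in a few lines of Python. This is a hedged sketch of the semantics as described in this document, not VictoriaMetrics code: each series is a plain list aligned on the same timestamps, `None` stands for a missing point, label matching between series is left out, and the function names `default`, `if_`, `ifnot` and `keep_last_value` are hypothetical helpers named after the MetricsQL features.

```python
def default(q1, q2):
    """q1 default q2: fill gaps in q1 with the corresponding values from q2."""
    return [a if a is not None else b for a, b in zip(q1, q2)]


def if_(q1, q2):
    """q1 if q2: drop q1 values at points where q2 has no value."""
    return [a if b is not None else None for a, b in zip(q1, q2)]


def ifnot(q1, q2):
    """q1 ifnot q2: drop q1 values at points where q2 has a value."""
    return [a if b is None else None for a, b in zip(q1, q2)]


def keep_last_value(q):
    """Fill gaps in q with the previous non-empty value."""
    filled, last = [], None
    for value in q:
        if value is not None:
            last = value
        filled.append(last)
    return filled


# Two hypothetical series sampled at the same four timestamps.
q1 = [1, None, 3, None]
q2 = [10, 20, None, None]

print(default(q1, q2))      # [1, 20, 3, None]
print(if_(q1, q2))          # [1, None, None, None]
print(ifnot(q1, q2))        # [None, None, 3, None]
print(keep_last_value(q1))  # [1, 1, 3, 3]
```

Note that `default` leaves a gap where both series are empty, while `keep_last_value` only looks backwards, so a gap at the very start of a series stays unfilled in this model.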