This function is used to calculate the population covariance.
Name: COV
Input Series: Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE.
Output Series: Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the population covariance.
Note:
- If a row contains missing points, null points or
NaN, it will be ignored; - If all rows are ignored,
NaNwill be output.
Input series:
+-----------------------------+---------------+---------------+
| Time|root.test.d2.s1|root.test.d2.s2|
+-----------------------------+---------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0| 101.0|
|2020-01-01T00:00:03.000+08:00| 101.0| null|
|2020-01-01T00:00:04.000+08:00| 102.0| 101.0|
|2020-01-01T00:00:06.000+08:00| 104.0| 102.0|
|2020-01-01T00:00:08.000+08:00| 126.0| 102.0|
|2020-01-01T00:00:10.000+08:00| 108.0| 103.0|
|2020-01-01T00:00:12.000+08:00| null| 103.0|
|2020-01-01T00:00:14.000+08:00| 112.0| 104.0|
|2020-01-01T00:00:15.000+08:00| 113.0| null|
|2020-01-01T00:00:16.000+08:00| 114.0| 104.0|
|2020-01-01T00:00:18.000+08:00| 116.0| 105.0|
|2020-01-01T00:00:20.000+08:00| 118.0| 105.0|
|2020-01-01T00:00:22.000+08:00| 100.0| 106.0|
|2020-01-01T00:00:26.000+08:00| 124.0| 108.0|
|2020-01-01T00:00:28.000+08:00| 126.0| 108.0|
|2020-01-01T00:00:30.000+08:00| NaN| 108.0|
+-----------------------------+---------------+---------------+
SQL for query:
select cov(s1,s2) from root.test.d2Output series:
+-----------------------------+-------------------------------------+
| Time|cov(root.test.d2.s1, root.test.d2.s2)|
+-----------------------------+-------------------------------------+
|1970-01-01T08:00:00.000+08:00| 12.291666666666666|
+-----------------------------+-------------------------------------+
This function is used to calculate the DTW distance between two input series.
Name: DTW
Input Series: Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE.
Output Series: Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the DTW distance.
Note:
- If a row contains missing points, null points or
NaN, it will be ignored; - If all rows are ignored,
0will be output.
Input series:
+-----------------------------+---------------+---------------+
| Time|root.test.d2.s1|root.test.d2.s2|
+-----------------------------+---------------+---------------+
|1970-01-01T08:00:00.001+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.002+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.003+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.004+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.005+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.006+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.007+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.008+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.009+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.010+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.011+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.012+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.013+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.014+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.015+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.016+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.017+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.018+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.019+08:00| 1.0| 2.0|
|1970-01-01T08:00:00.020+08:00| 1.0| 2.0|
+-----------------------------+---------------+---------------+
SQL for query:
select dtw(s1,s2) from root.test.d2Output series:
+-----------------------------+-------------------------------------+
| Time|dtw(root.test.d2.s1, root.test.d2.s2)|
+-----------------------------+-------------------------------------+
|1970-01-01T08:00:00.000+08:00| 20.0|
+-----------------------------+-------------------------------------+
This function is used to calculate the Pearson Correlation Coefficient.
Name: PEARSON
Input Series: Only support two input series. The types are both INT32 / INT64 / FLOAT / DOUBLE.
Output Series: Output a single series. The type is DOUBLE. There is only one data point in the series, whose timestamp is 0 and value is the Pearson Correlation Coefficient.
Note:
- If a row contains missing points, null points or
NaN, it will be ignored; - If all rows are ignored,
NaNwill be output.
Input series:
+-----------------------------+---------------+---------------+
| Time|root.test.d2.s1|root.test.d2.s2|
+-----------------------------+---------------+---------------+
|2020-01-01T00:00:02.000+08:00| 100.0| 101.0|
|2020-01-01T00:00:03.000+08:00| 101.0| null|
|2020-01-01T00:00:04.000+08:00| 102.0| 101.0|
|2020-01-01T00:00:06.000+08:00| 104.0| 102.0|
|2020-01-01T00:00:08.000+08:00| 126.0| 102.0|
|2020-01-01T00:00:10.000+08:00| 108.0| 103.0|
|2020-01-01T00:00:12.000+08:00| null| 103.0|
|2020-01-01T00:00:14.000+08:00| 112.0| 104.0|
|2020-01-01T00:00:15.000+08:00| 113.0| null|
|2020-01-01T00:00:16.000+08:00| 114.0| 104.0|
|2020-01-01T00:00:18.000+08:00| 116.0| 105.0|
|2020-01-01T00:00:20.000+08:00| 118.0| 105.0|
|2020-01-01T00:00:22.000+08:00| 100.0| 106.0|
|2020-01-01T00:00:26.000+08:00| 124.0| 108.0|
|2020-01-01T00:00:28.000+08:00| 126.0| 108.0|
|2020-01-01T00:00:30.000+08:00| NaN| 108.0|
+-----------------------------+---------------+---------------+
SQL for query:
select pearson(s1,s2) from root.test.d2Output series:
+-----------------------------+-----------------------------------------+
| Time|pearson(root.test.d2.s1, root.test.d2.s2)|
+-----------------------------+-----------------------------------------+
|1970-01-01T08:00:00.000+08:00| 0.5630881927754872|
+-----------------------------+-----------------------------------------+
This function is used to find all symmetric subseries in the input whose degree of symmetry is less than the threshold. The degree of symmetry is calculated by DTW. The smaller the degree, the more symmetrical the series is.
Name: PATTERNSYMMETRIC
Input Series: Only support a single input series. The type is INT32 / INT64 / FLOAT / DOUBLE
Parameter:
window: The length of the symmetric subseries. It's a positive integer and the default value is 10.threshold: The threshold of the degree of symmetry. It's non-negative. Only the subseries whose degree of symmetry is below it will be output. By default, all subseries will be output.
Output Series: Output a single series. The type is DOUBLE. Each data point in the output series corresponds to a symmetric subseries. The output timestamp is the starting timestamp of the subseries and the output value is the degree of symmetry.
Input series:
+-----------------------------+---------------+
| Time|root.test.d1.s4|
+-----------------------------+---------------+
|2021-01-01T12:00:00.000+08:00| 1.0|
|2021-01-01T12:00:01.000+08:00| 2.0|
|2021-01-01T12:00:02.000+08:00| 3.0|
|2021-01-01T12:00:03.000+08:00| 2.0|
|2021-01-01T12:00:04.000+08:00| 1.0|
|2021-01-01T12:00:05.000+08:00| 1.0|
|2021-01-01T12:00:06.000+08:00| 1.0|
|2021-01-01T12:00:07.000+08:00| 1.0|
|2021-01-01T12:00:08.000+08:00| 2.0|
|2021-01-01T12:00:09.000+08:00| 3.0|
|2021-01-01T12:00:10.000+08:00| 2.0|
|2021-01-01T12:00:11.000+08:00| 1.0|
+-----------------------------+---------------+
SQL for query:
select ptnsym(s4, 'window'='5', 'threshold'='0') from root.test.d1Output series:
+-----------------------------+------------------------------------------------------+
| Time|ptnsym(root.test.d1.s4, "window"="5", "threshold"="0")|
+-----------------------------+------------------------------------------------------+
|2021-01-01T12:00:00.000+08:00| 0.0|
|2021-01-01T12:00:07.000+08:00| 0.0|
+-----------------------------+------------------------------------------------------+
This function is used to calculate the cross correlation function of given two time series.
For discrete time series, cross correlation is given by
Name: XCORR
Input Series: Only support two input numeric series. The type is INT32 / INT64 / FLOAT / DOUBLE.
Output Series: Output a single series with DOUBLE as datatype.
There are
Note:
nullandNaNvalues in the input series will be ignored and treated as 0.
Input series:
+-----------------------------+---------------+---------------+
| Time|root.test.d1.s1|root.test.d1.s2|
+-----------------------------+---------------+---------------+
|2020-01-01T00:00:01.000+08:00| null| 6|
|2020-01-01T00:00:02.000+08:00| 2| 7|
|2020-01-01T00:00:03.000+08:00| 3| NaN|
|2020-01-01T00:00:04.000+08:00| 4| 9|
|2020-01-01T00:00:05.000+08:00| 5| 10|
+-----------------------------+---------------+---------------+
SQL for query:
select xcorr(s1, s2) from root.test.d1 where time <= 2020-01-01 00:00:05Output series:
+-----------------------------+---------------------------------------+
| Time|xcorr(root.test.d1.s1, root.test.d1.s2)|
+-----------------------------+---------------------------------------+
|1970-01-01T08:00:00.001+08:00| 0.0|
|1970-01-01T08:00:00.002+08:00| 4.0|
|1970-01-01T08:00:00.003+08:00| 9.6|
|1970-01-01T08:00:00.004+08:00| 13.4|
|1970-01-01T08:00:00.005+08:00| 20.0|
|1970-01-01T08:00:00.006+08:00| 15.6|
|1970-01-01T08:00:00.007+08:00| 9.2|
|1970-01-01T08:00:00.008+08:00| 11.8|
|1970-01-01T08:00:00.009+08:00| 6.0|
+-----------------------------+---------------------------------------+