Commit cefac8d

gururaj1512 and kgryte authored
feat: add stats/base/ndarray/scovarmtk
PR-URL: #7693
Co-authored-by: Athan Reines <[email protected]>
Reviewed-by: Athan Reines <[email protected]>
1 parent c2cf04d commit cefac8d

File tree

10 files changed

+1046
-0
lines changed

Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
1+
<!--
2+
3+
@license Apache-2.0
4+
5+
Copyright (c) 2025 The Stdlib Authors.
6+
7+
Licensed under the Apache License, Version 2.0 (the "License");
8+
you may not use this file except in compliance with the License.
9+
You may obtain a copy of the License at
10+
11+
http://www.apache.org/licenses/LICENSE-2.0
12+
13+
Unless required by applicable law or agreed to in writing, software
14+
distributed under the License is distributed on an "AS IS" BASIS,
15+
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
16+
See the License for the specific language governing permissions and
17+
limitations under the License.
18+
19+
-->
20+
21+
# scovarmtk
22+
23+
> Calculate the [covariance][covariance] of two one-dimensional single-precision floating-point ndarrays provided known means and using a one-pass textbook algorithm.
24+
25+
<section class="intro">
26+
27+
The population [covariance][covariance] of two finite populations of size `N` is given by
28+
29+
<!-- <equation class="equation" label="eq:population_covariance" align="center" raw="\operatorname{\mathrm{cov_N}} = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - \mu_x)(y_i - \mu_y)" alt="Equation for the population covariance."> -->
30+
31+
```math
32+
\mathop{\mathrm{cov_N}} = \frac{1}{N} \sum_{i=0}^{N-1} (x_i - \mu_x)(y_i - \mu_y)
33+
```
34+
35+
<!-- </equation> -->
36+
37+
where the population means are given by
38+
39+
<!-- <equation class="equation" label="eq:population_mean_for_x" align="center" raw="\mu_x = \frac{1}{N} \sum_{i=0}^{N-1} x_i" alt="Equation for the population mean for first array."> -->
40+
41+
```math
42+
\mu_x = \frac{1}{N} \sum_{i=0}^{N-1} x_i
43+
```
44+
45+
<!-- </equation> -->
46+
47+
and
48+
49+
<!-- <equation class="equation" label="eq:population_mean_for_y" align="center" raw="\mu_y = \frac{1}{N} \sum_{i=0}^{N-1} y_i" alt="Equation for the population mean for second array."> -->
50+
51+
```math
52+
\mu_y = \frac{1}{N} \sum_{i=0}^{N-1} y_i
53+
```
54+
55+
<!-- </equation> -->
56+
57+
Often in the analysis of data, the true population [covariance][covariance] is not known _a priori_ and must be estimated from samples drawn from population distributions. If one applies the formula for the population [covariance][covariance] to sample data, the result is a **biased sample covariance**. To compute an **unbiased sample covariance** for samples of size `n`,
58+
59+
<!-- <equation class="equation" label="eq:unbiased_sample_covariance" align="center" raw="\operatorname{\mathrm{cov_n}} = \frac{1}{n-1} \sum_{i=0}^{n-1} (x_i - \bar{x}_n)(y_i - \bar{y}_n)" alt="Equation for computing an unbiased sample covariance."> -->
60+
61+
```math
62+
\mathop{\mathrm{cov_n}} = \frac{1}{n-1} \sum_{i=0}^{n-1} (x_i - \bar{x}_n)(y_i - \bar{y}_n)
63+
```
64+
65+
<!-- </equation> -->
66+
67+
where sample means are given by
68+
69+
<!-- <equation class="equation" label="eq:sample_mean_for_x" align="center" raw="\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i" alt="Equation for the sample mean for first array."> -->
70+
71+
```math
72+
\bar{x} = \frac{1}{n} \sum_{i=0}^{n-1} x_i
73+
```
74+
75+
<!-- </equation> -->
76+
77+
and
78+
79+
<!-- <equation class="equation" label="eq:sample_mean_for_y" align="center" raw="\bar{y} = \frac{1}{n} \sum_{i=0}^{n-1} y_i" alt="Equation for the sample mean for second array."> -->
80+
81+
```math
82+
\bar{y} = \frac{1}{n} \sum_{i=0}^{n-1} y_i
83+
```
84+
85+
<!-- </equation> -->
86+
87+
The use of the term `n-1` is commonly referred to as Bessel's correction. Depending on the characteristics of the population distributions, other correction factors (e.g., `n-1.5`, `n+1`, etc.) can yield better estimators.
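
The difference between the two divisors can be illustrated with a minimal plain-JavaScript sketch. The helper `covariance` below is hypothetical (it is not part of stdlib and does not reflect how the package is implemented); it estimates the means from the data in a first pass and then applies a divisor of `N-c`.

```javascript
// Hypothetical helper for illustration only (not a stdlib API).
// Computes the covariance of two equal-length arrays with divisor `N-c`.
function covariance( x, y, c ) {
    var mx = 0.0;
    var my = 0.0;
    var N = x.length;
    var s = 0.0;
    var i;

    if ( N === 0 || N !== y.length || N-c <= 0 ) {
        return NaN;
    }
    // First pass: estimate the means from the data...
    for ( i = 0; i < N; i++ ) {
        mx += x[ i ];
        my += y[ i ];
    }
    mx /= N;
    my /= N;

    // Second pass: accumulate the products of the deviations...
    for ( i = 0; i < N; i++ ) {
        s += ( x[ i ]-mx ) * ( y[ i ]-my );
    }
    return s / ( N-c );
}

var x = [ 1.0, -2.0, 2.0 ];
var y = [ 2.0, -2.0, 1.0 ];

// Divisor `N` (population covariance):
var populationCov = covariance( x, y, 0 );

// Divisor `n-1` (Bessel's correction):
var sampleCov = covariance( x, y, 1 );

console.log( populationCov, sampleCov );
```

For the same data, the corrected estimate is larger in magnitude than the uncorrected one by a factor of `n/(n-1)`.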
88+
89+
</section>
90+
91+
<!-- /.intro -->
92+
93+
<section class="usage">
94+
95+
## Usage
96+
97+
```javascript
98+
var scovarmtk = require( '@stdlib/stats/base/ndarray/scovarmtk' );
99+
```
100+
101+
#### scovarmtk( arrays )
102+
103+
Computes the covariance of two one-dimensional single-precision floating-point ndarrays provided known means and using a one-pass textbook algorithm.
104+
105+
```javascript
106+
var Float32Array = require( '@stdlib/array/float32' );
107+
var scalar2ndarray = require( '@stdlib/ndarray/from-scalar' );
108+
var ndarray = require( '@stdlib/ndarray/base/ctor' );
109+
110+
var opts = {
111+
'dtype': 'float32'
112+
};
113+
114+
var xbuf = new Float32Array( [ 1.0, -2.0, 2.0 ] );
115+
var x = new ndarray( opts.dtype, xbuf, [ 3 ], [ 1 ], 0, 'row-major' );
116+
117+
var ybuf = new Float32Array( [ 2.0, -2.0, 1.0 ] );
118+
var y = new ndarray( opts.dtype, ybuf, [ 3 ], [ 1 ], 0, 'row-major' );
119+
120+
var correction = scalar2ndarray( 1.0, opts );
121+
var meanx = scalar2ndarray( 1.0/3.0, opts );
122+
var meany = scalar2ndarray( 1.0/3.0, opts );
123+
124+
var v = scovarmtk( [ x, y, correction, meanx, meany ] );
125+
// returns ~3.8333
126+
```
127+
128+
The function has the following parameters:
129+
130+
- **arrays**: array-like object containing the following ndarrays in order:
131+
132+
1. first one-dimensional input ndarray.
133+
2. second one-dimensional input ndarray.
134+
3. a zero-dimensional ndarray specifying the degrees of freedom adjustment. Setting this parameter to a value other than `0` has the effect of adjusting the divisor during the calculation of the [covariance][covariance] according to `N-c` where `c` corresponds to the provided degrees of freedom adjustment and `N` corresponds to the number of elements in each input ndarray. When computing the population [covariance][covariance], setting this parameter to `0` is the standard choice (i.e., the provided arrays contain data constituting entire populations). When computing the unbiased sample [covariance][covariance], setting this parameter to `1` is the standard choice (i.e., the provided arrays contain data sampled from larger populations; this is commonly referred to as Bessel's correction).
135+
4. a zero-dimensional ndarray specifying the mean of the first one-dimensional ndarray.
136+
5. a zero-dimensional ndarray specifying the mean of the second one-dimensional ndarray.
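
To make the roles of the five inputs concrete, the following is a minimal sketch of a one-pass calculation with known means, written against plain typed arrays rather than ndarrays. The function name `covarmtkSketch`, the use of `Math.fround` to emulate single-precision arithmetic, and the guard for a non-positive divisor are assumptions made for illustration; this is not the package's implementation.

```javascript
// Hypothetical sketch (not the stdlib implementation).
// Because the means are already known, a single pass over the data suffices.
function covarmtkSketch( x, y, correction, meanx, meany ) {
    var N = x.length;
    var n = N - correction;
    var C = 0.0;
    var dx;
    var dy;
    var i;

    // Assumption: a non-positive divisor is treated like empty input...
    if ( N <= 0 || n <= 0.0 ) {
        return NaN;
    }
    for ( i = 0; i < N; i++ ) {
        // Round intermediate results to single precision:
        dx = Math.fround( x[ i ] - meanx );
        dy = Math.fround( y[ i ] - meany );
        C = Math.fround( C + Math.fround( dx*dy ) );
    }
    return Math.fround( C / n );
}

var xs = new Float32Array( [ 1.0, -2.0, 2.0 ] );
var ys = new Float32Array( [ 2.0, -2.0, 1.0 ] );
var mu = Math.fround( 1.0/3.0 );

var cov = covarmtkSketch( xs, ys, 1.0, mu, mu );
console.log( cov );
// => ~3.8333
```

Supplying the means avoids the first pass needed to estimate them, at the cost of trusting that the provided values are accurate.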
137+
138+
</section>
139+
140+
<!-- /.usage -->
141+
142+
<section class="notes">
143+
144+
## Notes
145+
146+
- Both input ndarrays should have the same number of elements.
147+
- If provided empty one-dimensional ndarrays, the function returns `NaN`.
148+
149+
</section>
150+
151+
<!-- /.notes -->
152+
153+
<section class="examples">
154+
155+
## Examples
156+
157+
<!-- eslint no-undef: "error" -->
158+
159+
```javascript
160+
var discreteUniform = require( '@stdlib/random/array/discrete-uniform' );
161+
var ndarray = require( '@stdlib/ndarray/base/ctor' );
162+
var ndarray2array = require( '@stdlib/ndarray/to-array' );
163+
var scalar2ndarray = require( '@stdlib/ndarray/from-scalar' );
164+
var scovarmtk = require( '@stdlib/stats/base/ndarray/scovarmtk' );
165+
166+
// Define array options:
167+
var opts = {
168+
'dtype': 'float32'
169+
};
170+
171+
// Create one-dimensional ndarrays containing pseudorandom numbers:
172+
var xbuf = discreteUniform( 10, -50, 50, opts );
173+
var x = new ndarray( opts.dtype, xbuf, [ xbuf.length ], [ 1 ], 0, 'row-major' );
174+
console.log( ndarray2array( x ) );
175+
176+
var ybuf = discreteUniform( 10, -50, 50, opts );
177+
var y = new ndarray( opts.dtype, ybuf, [ ybuf.length ], [ 1 ], 0, 'row-major' );
178+
console.log( ndarray2array( y ) );
179+
180+
// Specify the degrees of freedom adjustment:
181+
var correction = scalar2ndarray( 1.0, opts );
182+
183+
// Specify the known means:
184+
var meanx = scalar2ndarray( 0.0, opts );
185+
var meany = scalar2ndarray( 0.0, opts );
186+
187+
// Calculate the sample covariance:
188+
var v = scovarmtk( [ x, y, correction, meanx, meany ] );
189+
console.log( v );
190+
```
191+
192+
</section>
193+
194+
<!-- /.examples -->
195+
196+
<!-- Section for related `stdlib` packages. Do not manually edit this section, as it is automatically populated. -->
197+
198+
<section class="related">
199+
200+
</section>
201+
202+
<!-- /.related -->
203+
204+
<!-- Section for all links. Make sure to keep an empty line after the `section` element and another before the `/section` close. -->
205+
206+
<section class="links">
207+
208+
[covariance]: https://en.wikipedia.org/wiki/Covariance
209+
210+
</section>
211+
212+
<!-- /.links -->
Lines changed: 115 additions & 0 deletions
@@ -0,0 +1,115 @@
1+
/**
2+
* @license Apache-2.0
3+
*
4+
* Copyright (c) 2025 The Stdlib Authors.
5+
*
6+
* Licensed under the Apache License, Version 2.0 (the "License");
7+
* you may not use this file except in compliance with the License.
8+
* You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing, software
13+
* distributed under the License is distributed on an "AS IS" BASIS,
14+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
* See the License for the specific language governing permissions and
16+
* limitations under the License.
17+
*/
18+
19+
'use strict';
20+
21+
// MODULES //
22+
23+
var bench = require( '@stdlib/bench' );
24+
var uniform = require( '@stdlib/random/array/uniform' );
25+
var isnan = require( '@stdlib/math/base/assert/is-nan' );
26+
var pow = require( '@stdlib/math/base/special/pow' );
27+
var ndarray = require( '@stdlib/ndarray/base/ctor' );
28+
var scalar2ndarray = require( '@stdlib/ndarray/base/from-scalar' );
29+
var pkg = require( './../package.json' ).name;
30+
var scovarmtk = require( './../lib' );
31+
32+
33+
// VARIABLES //
34+
35+
var options = {
36+
'dtype': 'float32'
37+
};
38+
39+
40+
// FUNCTIONS //
41+
42+
/**
43+
* Creates a benchmark function.
44+
*
45+
* @private
46+
* @param {PositiveInteger} len - array length
47+
* @returns {Function} benchmark function
48+
*/
49+
function createBenchmark( len ) {
50+
var correction;
51+
var meanx;
52+
var meany;
53+
var xbuf;
54+
var ybuf;
55+
var x;
56+
var y;
57+
58+
xbuf = uniform( len, -10.0, 10.0, options );
59+
x = new ndarray( options.dtype, xbuf, [ len ], [ 1 ], 0, 'row-major' );
60+
61+
ybuf = uniform( len, -10.0, 10.0, options );
62+
y = new ndarray( options.dtype, ybuf, [ len ], [ 1 ], 0, 'row-major' );
63+
64+
correction = scalar2ndarray( 1.0, options.dtype, 'row-major' );
65+
meanx = scalar2ndarray( 0.0, options.dtype, 'row-major' );
66+
meany = scalar2ndarray( 0.0, options.dtype, 'row-major' );
67+
68+
return benchmark;
69+
70+
function benchmark( b ) {
71+
var v;
72+
var i;
73+
74+
b.tic();
75+
for ( i = 0; i < b.iterations; i++ ) {
76+
v = scovarmtk( [ x, y, correction, meanx, meany ] );
77+
if ( isnan( v ) ) {
78+
b.fail( 'should not return NaN' );
79+
}
80+
}
81+
b.toc();
82+
if ( isnan( v ) ) {
83+
b.fail( 'should not return NaN' );
84+
}
85+
b.pass( 'benchmark finished' );
86+
b.end();
87+
}
88+
}
89+
90+
91+
// MAIN //
92+
93+
/**
94+
* Main execution sequence.
95+
*
96+
* @private
97+
*/
98+
function main() {
99+
var len;
100+
var min;
101+
var max;
102+
var f;
103+
var i;
104+
105+
min = 1; // 10^min
106+
max = 6; // 10^max
107+
108+
for ( i = min; i <= max; i++ ) {
109+
len = pow( 10, i );
110+
f = createBenchmark( len );
111+
bench( pkg+':len='+len, f );
112+
}
113+
}
114+
115+
main();
Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
1+
2+
{{alias}}( arrays )
3+
Computes the covariance of two one-dimensional single-precision floating-
4+
point ndarrays provided known means and using a one-pass textbook algorithm.
5+
6+
Both input ndarrays should have the same number of elements.
7+
8+
If provided empty one-dimensional ndarrays, the function returns `NaN`.
9+
10+
Parameters
11+
----------
12+
arrays: ArrayLikeObject<ndarray>
13+
Array-like object containing the following ndarrays in order:
14+
15+
- first one-dimensional input ndarray.
16+
- second one-dimensional input ndarray.
17+
- a zero-dimensional ndarray specifying the degrees of freedom
18+
adjustment. Setting this parameter to a value other than `0` has the
19+
effect of adjusting the divisor during the calculation of the
20+
covariance according to `N-c` where `c` corresponds to the provided
21+
degrees of freedom adjustment and `N` corresponds to the number of
22+
elements in each input ndarray. When computing the population
23+
covariance, setting this parameter to `0` is the standard choice (i.e.,
24+
the provided arrays contain data constituting entire populations). When
25+
computing the unbiased sample covariance, setting this parameter to `1`
26+
is the standard choice (i.e., the provided arrays contain data sampled
27+
from larger populations; this is commonly referred to as Bessel's
28+
correction).
29+
- a zero-dimensional ndarray specifying the mean of the first one-
30+
dimensional ndarray.
31+
- a zero-dimensional ndarray specifying the mean of the second one-
32+
dimensional ndarray.
33+
34+
Returns
35+
-------
36+
out: number
37+
The covariance.
38+
39+
Examples
40+
--------
41+
// Create the input ndarrays:
42+
> var xbuf = new {{alias:@stdlib/array/float32}}( [ 1.0, -2.0, 2.0 ] );
43+
> var ybuf = new {{alias:@stdlib/array/float32}}( [ 2.0, -2.0, 1.0 ] );
44+
> var dt = 'float32';
45+
> var sh = [ xbuf.length ];
46+
> var st = [ 1 ];
47+
> var oo = 0;
48+
> var ord = 'row-major';
49+
> var x = new {{alias:@stdlib/ndarray/ctor}}( dt, xbuf, sh, st, oo, ord );
50+
> var y = new {{alias:@stdlib/ndarray/ctor}}( dt, ybuf, sh, st, oo, ord );
51+
52+
// Specify the degrees of freedom adjustment:
53+
> var opts = { 'dtype': dt };
54+
> var correction = {{alias:@stdlib/ndarray/from-scalar}}( 1.0, opts );
55+
56+
// Specify the known means:
57+
> var meanx = {{alias:@stdlib/ndarray/from-scalar}}( 1.0/3.0, opts );
58+
> var meany = {{alias:@stdlib/ndarray/from-scalar}}( 1.0/3.0, opts );
59+
60+
// Calculate the sample covariance:
61+
> {{alias}}( [ x, y, correction, meanx, meany ] )
62+
~3.8333
63+
64+
See Also
65+
--------
66+
