stats.mstats_extras

Additional statistics functions with support for masked arrays.

Module Contents

Functions

hdquantiles(data,prob=list,axis=None,var=False) Computes quantile estimates with the Harrell-Davis method.
hdmedian(data,axis=None,var=False) Returns the Harrell-Davis estimate of the median along the given axis.
hdquantiles_sd(data,prob=list,axis=None) The standard error of the Harrell-Davis quantile estimates by jackknife.
trimmed_mean_ci(data,limits=(0.2, 0.2),inclusive=(True, True),alpha=0.05,axis=None) Selected confidence interval of the trimmed mean along the given axis.
mjci(data,prob=list,axis=None) Returns the Maritz-Jarrett estimators of the standard error of selected experimental quantiles of the data.
mquantiles_cimj(data,prob=list,alpha=0.05,axis=None) Computes the alpha confidence interval for the selected quantiles of the data, with Maritz-Jarrett estimators.
median_cihs(data,alpha=0.05,axis=None) Computes the alpha-level confidence interval for the median of the data.
compare_medians_ms(group_1,group_2,axis=None) Compares the medians from two independent groups along the given axis.
idealfourths(data,axis=None) Returns an estimate of the lower and upper quartiles.
rsh(data,points=None) Evaluates Rosenblatt’s shifted histogram estimators for each data point.
hdquantiles(data, prob=list, axis=None, var=False)

Computes quantile estimates with the Harrell-Davis method.

The quantile estimates are calculated as a weighted linear combination of order statistics.

data : array_like
Data array.
prob : sequence, optional
Sequence of quantiles to compute.
axis : int or None, optional
Axis along which to compute the quantiles. If None, use a flattened array.
var : bool, optional
Whether to return the variance of the estimate.
hdquantiles : MaskedArray
A (p,) array of quantiles (if var is False), or a (2,p) array of quantiles and variances (if var is True), where p is the number of quantiles.

See also: hdquantiles_sd
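
The "weighted linear combination of order statistics" can be made concrete with a short sketch. It assumes the functions are importable from scipy.stats.mstats and that the weights are increments of a Beta((n+1)p, (n+1)(1-p)) CDF on the grid i/n, which is the usual Harrell-Davis definition. The helper hd_by_hand and the sample values are illustrative only.

```python
import numpy as np
from scipy.stats import beta
from scipy.stats.mstats import hdquantiles

# Illustrative sample (made-up values).
data = np.array([1.2, 2.5, 3.7, 4.0, 5.1, 6.3, 7.0, 8.2, 9.4])
probs = [0.25, 0.5, 0.75]

# Library estimates, one per requested probability.
estimates = hdquantiles(data, prob=probs)

# With var=True, row 0 holds the estimates and row 1 the variances.
with_var = hdquantiles(data, prob=probs, var=True)


def hd_by_hand(x, p):
    """Hypothetical helper: Harrell-Davis estimate as a weighted sum."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    # CDF of Beta((n+1)p, (n+1)(1-p)) evaluated at 0, 1/n, ..., 1.
    cdf = beta.cdf(np.arange(n + 1) / n, (n + 1) * p, (n + 1) * (1 - p))
    weights = np.diff(cdf)  # one weight per order statistic
    return float(np.dot(weights, x))


manual = np.array([hd_by_hand(data, p) for p in probs])
print(estimates)
print(manual)
```

Because the weights are spread over all order statistics, the estimate is smoother than a quantile read off one or two sorted values.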

hdmedian(data, axis=None, var=False)

Returns the Harrell-Davis estimate of the median along the given axis.

data : ndarray
Data array.
axis : int, optional
Axis along which to compute the quantiles. If None, use a flattened array.
var : bool, optional
Whether to return the variance of the estimate.
hdmedian : MaskedArray
The median values. If var=True, the variance is returned in the same masked array; for example, for a 1-D input the shape changes from (1,) to (2,).
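
A minimal usage sketch, assuming hdmedian and hdquantiles are importable from scipy.stats.mstats. It checks that the Harrell-Davis median matches the 0.5 quantile from hdquantiles and that var=True adds one variance value. The sample values are made up.

```python
import numpy as np
from scipy.stats.mstats import hdmedian, hdquantiles

# Illustrative sample (made-up values).
data = np.array([3.1, 4.7, 5.0, 5.2, 6.8, 7.4, 9.9, 12.5])

median_only = hdmedian(data)            # estimate only
median_var = hdmedian(data, var=True)   # estimate followed by its variance
reference = hdquantiles(data, prob=[0.5])

# Flatten to plain arrays so the comparison does not depend on the exact
# shape of the returned masked arrays.
median_value = np.ma.getdata(median_only).ravel()[0]
reference_value = np.ma.getdata(reference).ravel()[0]
print(median_value, reference_value)
print(np.ma.getdata(median_var).ravel())
```
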
hdquantiles_sd(data, prob=list, axis=None)

The standard error of the Harrell-Davis quantile estimates by jackknife.

data : array_like
Data array.
prob : sequence, optional
Sequence of quantiles to compute.
axis : int, optional
Axis along which to compute the quantiles. If None, use a flattened array.
hdquantiles_sd : MaskedArray
Standard error of the Harrell-Davis quantile estimates.

See also: hdquantiles
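
A short sketch pairing each Harrell-Davis estimate with its jackknife standard error, assuming both functions are importable from scipy.stats.mstats. The sample values are made up.

```python
import numpy as np
from scipy.stats.mstats import hdquantiles, hdquantiles_sd

# Illustrative sample (made-up values).
data = np.array([1.2, 2.5, 3.7, 4.0, 5.1, 6.3, 7.0, 8.2, 9.4, 11.0])
probs = [0.25, 0.5, 0.75]

estimates = hdquantiles(data, prob=probs)
std_errors = hdquantiles_sd(data, prob=probs)

# Report each quantile together with its standard error.
for p, q, se in zip(probs, estimates, std_errors):
    print(f"p={p:.2f}: estimate={q:.3f}, standard error={se:.3f}")
```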

trimmed_mean_ci(data, limits=(0.2, 0.2), inclusive=(True, True), alpha=0.05, axis=None)

Selected confidence interval of the trimmed mean along the given axis.

data : array_like
Input data.
limits : {None, tuple}, optional

None or a two-item tuple giving the fraction of values to cut on each side of the array, relative to the number of unmasked data, as floats between 0. and 1. If n is the number of unmasked data before trimming, then the (n * limits[0]) smallest and the (n * limits[1]) largest values are masked, leaving n * (1. - sum(limits)) unmasked values after trimming. Either limit can be set to None to indicate an open interval.

Defaults to (0.2, 0.2).

inclusive : (2,) tuple of boolean, optional

If relative==False, tuple indicating whether values exactly equal to the absolute limits are allowed. If relative==True, tuple indicating whether the number of data being masked on each side should be rounded (True) or truncated (False).

Defaults to (True, True).

alpha : float, optional

Significance level of the intervals; the confidence level is 1 - alpha.

Defaults to 0.05.

axis : int, optional

Axis along which to cut. If None, uses a flattened version of data.

Defaults to None.

trimmed_mean_ci : (2,) ndarray
The lower and upper bounds of the confidence interval of the trimmed mean.
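
The effect of limits is easiest to see on a small sample with one outlier. The sketch below assumes trimmed_mean_ci is importable from scipy.stats.mstats. With ten values and limits=(0.2, 0.2), two values are cut on each side, so the interval is centered on the mean of the six middle values. The data are made up.

```python
import numpy as np
from scipy.stats.mstats import trimmed_mean_ci

# Ten illustrative values; 100 is a deliberate outlier.
data = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 100.0])

# Cut 20% of the values on each side.
ci = trimmed_mean_ci(data, limits=(0.2, 0.2), inclusive=(True, True), alpha=0.05)
lower, upper = ci

# Mean of the values that survive trimming (4 through 9).
trimmed_center = np.sort(data)[2:8].mean()
print(lower, upper, trimmed_center)
```

The outlier has no influence on the center of the interval because it is trimmed before the mean is taken.
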
mjci(data, prob=list, axis=None)

Returns the Maritz-Jarrett estimators of the standard error of selected experimental quantiles of the data.

data : ndarray
Data array.
prob : sequence, optional
Sequence of quantiles to compute.
axis : int or None, optional
Axis along which to compute the quantiles. If None, use a flattened array.
mquantiles_cimj(data, prob=list, alpha=0.05, axis=None)

Computes the alpha confidence interval for the selected quantiles of the data, with Maritz-Jarrett estimators.

data : ndarray
Data array.
prob : sequence, optional
Sequence of quantiles to compute.
alpha : float, optional
Significance level of the intervals; the confidence level is 1 - alpha.
axis : int or None, optional
Axis along which to compute the quantiles. If None, use a flattened array.
ci_lower : ndarray
The lower boundaries of the confidence interval. Of the same length as prob.
ci_upper : ndarray
The upper boundaries of the confidence interval. Of the same length as prob.
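
A usage sketch covering both Maritz-Jarrett functions, assuming mjci and mquantiles_cimj are importable from scipy.stats.mstats. The sample values are made up.

```python
import numpy as np
from scipy.stats.mstats import mjci, mquantiles_cimj

# Twenty illustrative values.
data = np.array([
    0.8, 1.1, 1.9, 2.4, 2.6, 3.0, 3.3, 3.9, 4.2, 4.8,
    5.1, 5.5, 6.0, 6.4, 7.1, 7.7, 8.2, 9.0, 9.6, 11.3,
])
probs = [0.25, 0.5, 0.75]

# Maritz-Jarrett standard errors, one per requested quantile.
std_errors = mjci(data, prob=probs)

# Interval bounds; each has the same length as probs.
ci_lower, ci_upper = mquantiles_cimj(data, prob=probs, alpha=0.05)

for p, lo, hi in zip(probs, ci_lower, ci_upper):
    print(f"p={p:.2f}: [{lo:.3f}, {hi:.3f}]")
```
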
median_cihs(data, alpha=0.05, axis=None)

Computes the alpha-level confidence interval for the median of the data.

Uses the Hettmansperger-Sheather method.

data : array_like
Input data. Masked values are discarded. The input should be 1D only, or axis should be set to None.
alpha : float, optional
Significance level of the interval; the confidence level is 1 - alpha.
axis : int or None, optional
Axis along which to compute the quantiles. If None, use a flattened array.
median_cihs
Alpha level confidence interval.
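
A minimal sketch, assuming median_cihs is importable from scipy.stats.mstats and that it returns the lower and upper bounds as a pair for 1-D input. The sample values are made up.

```python
import numpy as np
from scipy.stats.mstats import median_cihs

# Twenty illustrative values; the input is 1-D, as the function expects.
data = np.array([
    12.0, 15.5, 9.8, 14.1, 13.3, 18.9, 11.2, 16.4, 10.7, 17.0,
    13.9, 12.6, 15.0, 14.8, 19.5, 8.9, 16.1, 13.0, 11.9, 14.4,
])

lower, upper = median_cihs(data, alpha=0.05)
sample_median = np.median(data)
print(lower, sample_median, upper)
```
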
compare_medians_ms(group_1, group_2, axis=None)

Compares the medians from two independent groups along the given axis.

The comparison is performed using the McKean-Schrader estimate of the standard error of the medians.

group_1 : array_like
First dataset. Has to be of size >=7.
group_2 : array_like
Second dataset. Has to be of size >=7.
axis : int, optional
Axis along which the medians are estimated. If None, the arrays are flattened. If axis is not None, then group_1 and group_2 should have the same shape.
compare_medians_ms : {float, ndarray}
If axis is None, then returns a float, otherwise returns a 1-D ndarray of floats with a length equal to the length of group_1 along axis.
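
A usage sketch, assuming compare_medians_ms is importable from scipy.stats.mstats. The text above does not say how to interpret the returned float. The sketch assumes it is a p-value-like quantity between 0 and 1, with smaller values pointing to a larger difference between the medians. Both groups are made up and have at least 7 values each.

```python
import numpy as np
from scipy.stats.mstats import compare_medians_ms

# Two illustrative independent groups.
group_1 = np.array([4.1, 5.3, 5.9, 6.2, 6.8, 7.4, 8.0, 8.5, 9.1])
group_2 = np.array([6.0, 7.2, 7.9, 8.4, 9.0, 9.7, 10.3, 11.1, 12.4])

# With axis=None the arrays are flattened and a single value is returned.
result = float(compare_medians_ms(group_1, group_2))
print(result)
```
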
idealfourths(data, axis=None)

Returns an estimate of the lower and upper quartiles.

Uses the ideal fourths algorithm.

data : array_like
Input array.
axis : int, optional
Axis along which the quartiles are estimated. If None, the arrays are flattened.
idealfourths : {list of floats, masked array}
Returns the two internal values that divide data into four parts using the ideal fourths algorithm either along the flattened array (if axis is None) or along axis of data.
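
A minimal sketch for 1-D input, assuming idealfourths is importable from scipy.stats.mstats. With axis=None, two values are returned that bracket the median. The data are made up.

```python
import numpy as np
from scipy.stats.mstats import idealfourths

# Twelve illustrative values.
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0])

# Lower and upper quartile estimates on the flattened data.
lower_fourth, upper_fourth = idealfourths(data)

# The spread between the fourths is a robust measure of scale.
fourth_spread = upper_fourth - lower_fourth
print(lower_fourth, upper_fourth, fourth_spread)
```
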
rsh(data, points=None)

Evaluates Rosenblatt’s shifted histogram estimators for each data point.

Rosenblatt’s estimator is a centered finite-difference approximation to the derivative of the empirical cumulative distribution function.

data : sequence
Input data, should be 1-D. Masked values are ignored.
points : sequence or None, optional
Sequence of points at which to evaluate Rosenblatt’s shifted histogram. If None, use the data.
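
A usage sketch, assuming rsh is importable from scipy.stats.mstats. It evaluates the estimator at the data points themselves and then on a separate grid. The sample values and the grid are made up.

```python
import numpy as np
from scipy.stats.mstats import rsh

# Illustrative 1-D sample.
data = np.array([0.5, 1.1, 1.4, 2.0, 2.2, 2.9, 3.1, 3.8, 4.4, 5.0, 5.7, 6.3])

# points=None: evaluate the estimator at each data point.
density_at_data = rsh(data)

# Evaluate on an explicit grid instead.
grid = np.linspace(data.min(), data.max(), 5)
density_on_grid = rsh(data, points=grid)

print(np.asarray(density_at_data))
print(np.asarray(density_on_grid))
```

The estimates are density-like values, so they are non-negative at every evaluation point.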