R/binary-calibration-tests.R
gaffke_test.Rd
Test a null hypothesis about the mean of bounded i.i.d. samples. The test is based on Gaffke (2005); a more detailed analysis and exposition can be found in Learned-Miller & Thomas (2020).
number of bootstrap samples for the null distribution
the level of the test
the mean under the null hypothesis
the alternative for the test
a vector of observed values
the lower bound for x
the upper bound for x
gaffke_test returns an object of class htest; gaffke_p and gaffke_ci return just the p-value and the confidence interval, respectively, as numeric values for easier use in batch workflows.
The test is expected to be valid for any bounded distribution without further assumptions. Validity has been proven only for special cases, but no counterexample is known despite some efforts in the literature to find one. The test also provides substantially more power (and tighter confidence intervals) than other non-parametric tests; in fact, its power converges quite quickly to that of the t-test.
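To illustrate the idea behind the test, here is a minimal sketch of a one-sided Monte Carlo p-value in the spirit of Gaffke's bound as presented in Learned-Miller & Thomas (2020). It is not the package's implementation: the function name gaffke_upper_p_sketch and its argument names are hypothetical, and only the null hypothesis "mean >= mu" is covered. The upper bound is appended to the data as one extra pseudo-observation, and the null distribution is approximated by weighted means with uniform Dirichlet weights.

```r
# Hypothetical helper for illustration only; not part of the package API.
# Approximates a p-value for H0: mean(x) >= mu against H1: mean(x) < mu,
# assuming every element of x is at most ub.
gaffke_upper_p_sketch <- function(x, mu, B = 10000, ub = 1) {
  stopifnot(all(x <= ub))
  n <- length(x)

  # Append the upper bound as a pseudo-observation.
  z <- c(x, ub)

  # Draw B weight vectors from Dirichlet(1, ..., 1) of length n + 1 by
  # normalising i.i.d. exponentials; these have the same distribution as
  # the spacings of n sorted uniforms on [0, 1].
  e <- matrix(rexp(B * (n + 1)), nrow = B)
  w <- e / rowSums(e)

  # One randomly weighted mean per Monte Carlo draw.
  m <- as.vector(w %*% z)

  # Small values indicate evidence that the true mean lies below mu.
  mean(m >= mu)
}

# Usage on simulated binary data (values are illustrative):
set.seed(2024)
x <- rbinom(100, size = 1, prob = 0.3)
gaffke_upper_p_sketch(x, mu = 0.5)
```

The mirrored null hypothesis "mean <= mu" can be handled analogously by appending the lower bound instead of the upper bound and counting draws with m <= mu.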
In SBC the test is useful for testing the data-averaged posterior for binary variables.
Gaffke, N. (2005). “Three test statistics for a nonparametric one-sided hypothesis on the mean of a nonnegative variable.” Mathematical Methods of Statistics, 14(4): 451–467.
Learned-Miller, E. and Thomas, P. S. (2020). “A New Confidence Interval for the Mean of a Bounded Random Variable.” https://arxiv.org/abs/1905.06208