Published online by Cambridge University Press: 24 October 2008
In a recent paper we have discussed certain general principles underlying the determination of the most efficient tests of statistical hypotheses, but the method of approach did not involve any detailed consideration of the question of a priori probability. We propose now to consider more fully the bearing of the earlier results on this question and in particular to discuss what statements of value to the statistician in reaching his final judgment can be made from an analysis of observed data, which would not be modified by any change in the probabilities a priori. In dealing with the problem of statistical estimation, R. A. Fisher has shown how, under certain conditions, what may be described as rules of behaviour can be employed which will lead to results independent of these probabilities; in this connection he has discussed the important conception of what he terms fiducial limits. But the testing of statistical hypotheses cannot be treated as a problem in estimation, and it is necessary to discuss afresh in what sense tests can be employed which are independent of a priori probability laws.
* Neyman and Pearson, Phil. Trans. Roy. Soc. A, 231 (1933), 289.
† Fisher, Proc. Camb. Phil. Soc. 26 (1930), 528; Proc. Roy. Soc. A, 139 (1933), 343.
* This aspect of the error problem is very evident in a number of fields where tests must be used in a routine manner, and errors of judgment lead to waste of energy or financial loss. Such is the case in sampling inspection problems in mass-production industry.
* In problems dealing with grouped frequencies this may not be so, for if the chances of an individual unit falling into each of k alternative categories are p₁, p₂, …, pₖ, then C(H) may be allowed to include all possible sets of non-negative p's, subject to the sole condition Σ pₜ = 1. In this way the χ² tests, although based on certain mathematical approximations, are of wider application than those dealing with criteria based on the symmetric functions of continuous variables.
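The grouped-frequency setting of this note can be illustrated numerically. The following is a minimal sketch, not the authors' own procedure: it computes Pearson's χ² statistic for observed category counts against a hypothesised set of category chances satisfying Σ pₜ = 1. The function name `pearson_chi_square` and the counts used are hypothetical, chosen only for illustration.

```python
def pearson_chi_square(observed, probs):
    """Pearson's chi-square statistic for grouped frequencies.

    observed: counts falling into each of the k categories.
    probs:    hypothesised chances p_1, ..., p_k (illustrative values);
              they must be positive here so that every expected count
              is non-zero, and they must sum to 1.
    """
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if any(p <= 0 for p in probs):
        raise ValueError("hypothesised chances must be positive")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("hypothesised chances must sum to 1")

    n = sum(observed)  # total number of units sampled
    # Sum over categories of (observed - expected)^2 / expected.
    return sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs))


# Made-up example: 100 units in three categories, tested against
# hypothesised chances 1/4, 1/2, 1/4.
statistic = pearson_chi_square([20, 50, 30], [0.25, 0.5, 0.25])
print(statistic)
```

With k categories and a fully specified hypothesis, the statistic is referred approximately to the χ² distribution with k − 1 degrees of freedom; this approximation is the one the note alludes to.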
* The question, which is clearly important, of how far it matters accepting H₀ falsely when the true hypothesis differs only slightly from it, is referred to again below.
† It is true that we might remove from the summation in (6) certain of the H₃ differing only slightly from H₀, on the grounds that the consequence of accepting H₀ when they were true was not serious enough to be termed an “error”; and in this way we might find a region, w, for which ε was much less than ½. But even then it is not clear that a region satisfying (8) would exist.
* It would be a composite and not a simple hypothesis.
* Neyman and Pearson, Phil. Trans. Roy. Soc. Series A, 231 (1933), 289–337.
* We have shown (loc. cit.) that in this case there is no common best critical region, i.e. no test satisfying Definition B.
* Neyman and Pearson, Biometrika, 20 A (1928), 175 and 263; Bull. Acad. Polonaise Sci. Lettres, Série A (1930), p. 73.
* It is assumed that a region w₀, minimising II (w₀), exists.
* Phil. Trans. Roy. Soc. A, 231 (1933), 289.
* In other words, if a test T satisfies Definition D, it will be uniformly more powerful with regard to the class of alternatives C(h) than any other equivalent test; and vice versa.
† loc. cit.
Such is “Student's” test, also R. A. Fisher's tests for comparing the means and variances in two samples from normal populations.
* loc. cit. pp. 302–304.