Charles Darwin (1809-1882), author of The Origin of Species (1859), later investi-
gated the effect of cross-fertilization on the size of plants. Pairs of plants, one cross-
and one self-fertilized at the same time and whose parents were grown from the same
seed, were planted and grown in the same pot. The number of pairs was not large
because the time and care needed to carry out the experiments were substantial;
Darwin's experiments took 11 years. Darwin sent the data for several species to his
cousin, Francis Galton (1822-1911). Galton, an eminent statistician, was unaware of
any rigorous method for making an inference about the mean of a population when
its standard deviation was unknown, which was certainly the situation for the
differences in size within Darwin's pairs of plants. The results of one of
Darwin's experiments (given by R.A. Fisher) are presented in the datafile.

W.S. Gosset (1876-1937) was employed by the Guinness Brewing Company of Dublin.
Sample sizes available for experimentation in brewing were necessarily small, and
Gosset knew that a correct way of dealing with small samples was needed. He con-
sulted Karl Pearson (1857-1936) of University College in London about the problem.
Pearson told him the current state of knowledge was unsatisfactory. The following
year Gosset undertook a course of study under Pearson. An outcome of his study
was the publication in 1908 of Gosset's paper on "The Probable Error of a Mean,"
which introduced a form of what later became known as Student's t-distribution.
Gosset's paper was published under the pseudonym "Student." The modern form
of Student's t-distribution was derived by R.A. Fisher and first published in 1925.

The datafile Student includes a set of data given by Gosset as an example in his
1908 paper. Today's student will benefit from applying Student's t-distribution
to obtain a confidence interval or to test a null hypothesis, using Darwin's paired
plant-size data as well as the sleep-difference data given by Gosset. Given the
modest sample sizes, it is also instructive to apply single-sample nonparametric
methods for the median to the same sets of data.
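
As a rough illustration of the two kinds of analysis suggested above, the following Python sketch computes a one-sample t statistic and confidence interval for a set of paired differences, together with an exact two-sided sign test for the median. The sign test is only one possible choice of single-sample nonparametric method; the text does not name a specific one. The function names are hypothetical, the values in `example_diffs` are made-up placeholders (not Darwin's or Gosset's data), and the critical value 2.365 is the tabulated 97.5th percentile of the t-distribution with 7 degrees of freedom.

```python
import math
import statistics

# Hypothetical placeholder differences (NOT Darwin's or Gosset's data);
# substitute the paired differences from the datafile of interest.
example_diffs = [3.0, -1.0, 4.0, 2.0, 5.0, -2.0, 6.0, 1.0]


def t_statistic(diffs, mu0=0.0):
    """Return (t, degrees of freedom) for H0: mean difference = mu0."""
    n = len(diffs)
    mean = statistics.mean(diffs)
    # statistics.stdev uses the n - 1 divisor (sample standard deviation).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return (mean - mu0) / std_err, n - 1


def t_interval(diffs, t_crit):
    """Confidence interval for the mean difference.

    t_crit is the critical value read from a t table for n - 1 degrees
    of freedom at the desired confidence level (supplied by the caller).
    """
    n = len(diffs)
    mean = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean - t_crit * std_err, mean + t_crit * std_err


def sign_test_p_value(diffs, m0=0.0):
    """Exact two-sided sign test p-value for H0: median difference = m0.

    Differences equal to m0 are dropped. Under H0 the number of positive
    signs follows a Binomial(n, 1/2) distribution.
    """
    pos = sum(d > m0 for d in diffs)
    neg = sum(d < m0 for d in diffs)
    n = pos + neg
    k = min(pos, neg)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


t, df = t_statistic(example_diffs)
low, high = t_interval(example_diffs, t_crit=2.365)  # 95% level, 7 df
print(f"t = {t:.3f} on {df} degrees of freedom")
print(f"95% confidence interval for the mean: ({low:.3f}, {high:.3f})")
print(f"sign test p-value: {sign_test_p_value(example_diffs):.4f}")
```

With real data, the t statistic would be compared with the tabulated critical value for n - 1 degrees of freedom, and the sign test result would be set beside it to see whether the conclusion depends on the normality assumption.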