Fifty species of oak trees grow in the United States. Twenty-eight species of oak from the Atlantic region and 11 from the California region were studied. The size of each species' acorns was measured to see whether acorn size is related to geographic range. It is suggested that a plant's...

Methods: Outlier, Transformation, Regression,
Topics: Environment,
Datafile Name: SMSA

Researchers at General Motors collected data on 60 U.S. Standard Metropolitan Statistical Areas (SMSAs) in a study of whether air pollution contributes to mortality. The dependent variable for analysis is age-adjusted mortality (called "Mortality"). The data include variables measu...

Methods: ANCOVA, Distribution, Interaction, Transformation,
Topics: Economics,
Datafile Name: Billionaires 92

Fortune magazine publishes a list of the world's billionaires each year. The 1992 list includes 233 individuals. Their wealth, age, and geographic location (Asia, Europe, Middle East, United States, and Other) are reported.

The variable 'wealth' is right skewed, so the mean ...

Methods: ANOVA, Boxplot, Transformation,
Topics: Medical,
Datafile Name: Cancer Survival

Patients with advanced cancers of the stomach, bronchus, colon, ovary or breast were treated with ascorbate. The purpose of the study was to determine if patient survival differed with respect to the organ affected by the cancer.

A one-way ANOVA with Organ as the discrete factor and Sur...

Methods: Two Sample T-Test, Transformation, Boxplot,
Topics: Environment,
Datafile Name: Clouds

Clouds were randomly seeded or not with silver nitrate. Rainfall amounts were recorded from the clouds. The purpose of the experiment was to determine if cloud seeding increases rainfall.

The rainfall distributions are more nearly symmetric after a log transformation. The log transforma...
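
The effect of a log transformation on a right-skewed variable can be sketched with a few made-up numbers. The values below are hypothetical and are not taken from the Clouds datafile; `skewness` is an illustrative helper, not part of the original analysis.

```python
import math

def skewness(values):
    """Skewness as the third standardized moment (population form)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Hypothetical right-skewed "rainfall" amounts, constructed so that their
# logs are evenly spaced (and therefore symmetric) around 3.
rainfall = [math.exp(k) for k in (1.0, 2.0, 3.0, 4.0, 5.0)]
log_rainfall = [math.log(r) for r in rainfall]

raw_skew = skewness(rainfall)      # positive: long right tail
log_skew = skewness(log_rainfall)  # approximately zero: symmetric
```

A skewness near zero after the transformation is one rough indication that methods assuming symmetry, such as a two-sample t-test, are more reasonable on the log scale.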

Methods: MANOVA, Multivariate Normality, Plot Matrix, Post Hoc Test, Transformation,
Topics: Education,
Datafile Name: Colleges

This dataset contains information on six continuous dependent variables along with a single discrete variable School Type, which distinguishes between research universities and liberal arts institutions. Twenty-five of each type of school were surveyed. We will use MANOVA to determine if the scho...

Methods: ANOVA, Boxplot, Histogram, Median, Transformation,
Topics: Nutrition,
Datafile Name: Calories

"Let the buyer beware" is a phrase that comes to mind when buying a used car, not when buying food. However, Allison, Heshka, Sepulveda, and Heymsfield (1993) think that this phrase should apply to purchasing "diet" and "health" foods as well. They purchased 40 such ...

Methods: ANCOVA, ANOVA, Transformation,
Topics: Engineering, Health,
Datafile Name: Crash

Stock automobiles containing dummies in the driver and front passenger seats were crashed into a wall at 35 miles per hour. National Transportation Safety Board officials collected information on how the crash affected the dummies. The injury variables describe the extent of head injuries, chest decelera...

Methods: Correlation, Diagnostics, Paired T-Test, Regression, Transformation,
Topics: Consumer, Economics,
Datafile Name: Fish Prices

The price of fish varies by species and time. The average price received by fishermen and vessel owners for several species of fish increased from 41 cents per pound in 1970 to $1.10 per pound in 1980. A paired t-test shows that this increase is highly significant.
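
A paired t-test works on the per-species price differences rather than on the two years separately. The sketch below computes only the test statistic, using made-up prices (not the Fish Prices datafile); `paired_t_statistic` is a hypothetical helper name.

```python
import math
import statistics

def paired_t_statistic(before, after):
    """Return (t, degrees_of_freedom) for a paired t-test on after - before."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference
    se = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff / se, n - 1

# Hypothetical prices in cents per pound for five species (made-up values).
price_1970 = [20.0, 35.0, 50.0, 65.0, 30.0]
price_1980 = [55.0, 90.0, 130.0, 150.0, 85.0]

t_stat, df = paired_t_statistic(price_1970, price_1980)
```

The statistic is then compared with a t distribution on `df` degrees of freedom. The helper does not handle the degenerate case where every difference is identical.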

There is a strong cor...

Methods: ANCOVA, Transformation,
Topics: Economics,
Datafile Name: Companies

This dataset holds several facts about 77 companies selected from the Forbes 500 list for 1986. This is a 1/10 systematic sample from the alphabetical list of companies. The Forbes 500 includes all companies in the top 500 on any of the criteria, and thus has almost 800 companies in the list. Com...

Methods: Boxplot, Two Sample T-Test, Pooled T-Test, Transformation,
Topics: Psychology,
Datafile Name: Fusion Time

This dataset contains results from an experiment in visual perception using random dot stereograms, such as that shown below. Both images appear to be composed entirely of random dots. However, they are constructed so that a 3D image (of a diamond) will be seen, if the images are viewed with a ste...

Methods: Nonlinear Regression, Transformation, Regression,
Topics: Biology,
Datafile Name: Medflies

Data on Mediterranean fruit flies are used to examine Gompertz's 1825 theory that mortality rates increase exponentially with age (i.e., as an organism gets older, its chance of dying per unit of time increases exponentially). A total of 1,203,646 fruit flies comprised the population for this...

Methods: Residuals, Transformation,
Topics: Consumer, Engineering,
Datafile Name: Cars

In the United States, fuel efficiency for cars is typically measured in miles driven per gallon of fuel consumed (miles per gallon, MPG). This is not true in some other parts of the world where fuel efficiency is measured as the amount of fuel consumed while travelling a fixed distance (e.g. Gallon...
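
The two ways of measuring fuel efficiency are reciprocals of each other, which is why a transformation matters here. The sketch below assumes a unit of gallons per 100 miles; the source text is cut off, so that exact unit is an assumption, and `gallons_per_100_miles` is an illustrative name.

```python
def gallons_per_100_miles(mpg):
    """Convert miles per gallon to gallons consumed per 100 miles."""
    if mpg <= 0:
        raise ValueError("mpg must be positive")
    return 100.0 / mpg

# Equal steps in MPG are not equal steps in fuel consumed:
saving_10_to_20 = gallons_per_100_miles(10) - gallons_per_100_miles(20)  # 5.0
saving_20_to_40 = gallons_per_100_miles(20) - gallons_per_100_miles(40)  # 2.5
```

Doubling MPG from 10 to 20 saves twice as much fuel over the same distance as doubling it from 20 to 40, so a relation that is curved in MPG may be closer to linear in the reciprocal.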

Methods: Regression, Transformation,
Topics: Health, Biology,
Datafile Name: Mercury in Bass

Mercury contamination of edible freshwater fish poses a direct threat to our health. Largemouth bass were studied in 53 different Florida lakes to examine the factors that influence the level of mercury contamination. Water samples were collected from the surface of the middle of each lake in Aug...

Methods: Chi Square Test, Transformation, Confidence Interval,
Topics: Consumer, Economics,
Datafile Name: Montana Outlook Poll

The data contain the outcomes for two items in the Montana Economic Outlook Poll conducted in May 1992, with accompanying demographics for 209 out of 418 poll respondents. The items are whether the respondent feels his/her financial status is worse, the same, or better than a year ago, and whethe...

Methods: Scatterplot, Time Series, Transformation,
Topics: Economics, Energy,
Datafile Name: Oil Production

The increase in annual world crude oil production from 1880 to 1973 follows a pattern of exponential growth. In order to fit a linear model to these data, the oil production variable must be transformed by taking the natural log. A scatterplot of the log of oil production vs. year follows a strai...
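
Taking the natural log turns exponential growth into a straight line whose slope is the growth rate. The sketch below uses a synthetic, noise-free series (not the Oil Production datafile); the starting level of 30 and the rate of 0.07 are arbitrary assumptions, and `fit_line` is an illustrative helper.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Synthetic series: production = 30 * exp(0.07 * (year - 1880))
years = list(range(1880, 1974, 10))
production = [30.0 * math.exp(0.07 * (yr - 1880)) for yr in years]

# Fit a line to log(production) against year
log_production = [math.log(p) for p in production]
intercept, slope = fit_line(years, log_production)

# The slope on the log scale converts back to an annual growth rate
annual_growth = math.exp(slope) - 1
```

With real data the points scatter around the line, and departures from it on the log scale indicate periods where growth was not exponential.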

Methods: Regression, Transformation, Polynomial Regression,
Topics: Automotive, Engineering, Consumer,
Datafile Name: Passenger Car Mileage

Variation in gasoline mileage among makes and models of automobiles is influenced substantially by the weight and horsepower of the vehicles. When miles per gallon and horsepower are transformed to logarithms, the linearity of the regression is improved. A negative second order term is required t...

Methods: Regression, Dummy Variable, Interaction, Transformation,
Topics: Consumer, Engineering,
Datafile Name: Nambeware Polishing Times

The relation between polishing time and product diameter, as well as product type (casserole, other), is useful to the company for estimating the polishing time of new products that are designed or suggested for design and manufacture. A necessary regression assumption is that th...

Methods: Outlier, Regression, Residuals, Transformation, Nonlinear Regression, Dummy Variable,
Topics: Health, Medical, Social Science,
Datafile Name: Smoking and Cancer

Nevada and the District of Columbia are outliers in the distribution of cigarette consumption (sale) per capita by states in 1960. How the most extreme observation, Nevada, should be handled in the regressions of various cancer death rates on cigarette consumption, however, varies. In addition,...

Methods: Diagnostics, Regression, Outlier, Transformation,
Topics: Economics, Government, Consumer,
Datafile Name: Home Prices

How taxes change in response to the changing market value of homes concerns citizens both as a policy matter and as a personal financial matter. The tax data included in the datafile Home Prices permit an examination of this question for used homes in Albuquerque in 1993. The linea...

Methods: Dummy Variable, Diagnostics, Interaction, Outlier, Residuals, Transformation,
Topics: Economics, Education, Government,
Datafile Name: Teacher Pay by States

The scatter diagram below shows one potential influential observation, namely Alaska, which is an outlier in terms of spending per pupil. In addition, the question can be raised whether the level and slope of an appropriate regression line would be the same for the three regions of the country. S...

Methods: Transformation, Regression, Assumptions,
Topics: Miscellaneous,
Datafile Name: Transformations

Four sets of data are presented in which the relation between X and Y is not best described as linear in the original numbers. Therefore, some transformation of X, Y, or both is appropriate. Suggestions are log Y, log X, log Y and log X, 1/Y, and 1/X. Two of the sets can be described as learning ...
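
One rough way to compare the suggested transformations is to compute the squared correlation for each candidate pairing and see which is closest to linear. The sketch below uses hypothetical power-law data (not one of the four sets in the Transformations datafile); `pearson_r` and the labels in `candidates` are illustrative names.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data following a power law y = 2 * x**1.5
x = [1.0, 2.0, 4.0, 8.0, 16.0, 32.0]
y = [2.0 * v ** 1.5 for v in x]

log_x = [math.log(v) for v in x]
log_y = [math.log(v) for v in y]

# Each candidate maps a label to the (predictor, response) pair to correlate
candidates = {
    "Y vs X": (x, y),
    "log Y vs X": (x, log_y),
    "Y vs log X": (log_x, y),
    "log Y vs log X": (log_x, log_y),
    "1/Y vs X": (x, [1.0 / v for v in y]),
    "Y vs 1/X": ([1.0 / v for v in x], y),
}

r_squared = {name: pearson_r(xs, ys) ** 2 for name, (xs, ys) in candidates.items()}
best = max(r_squared, key=r_squared.get)
```

For this power-law example the log-log pairing is exactly linear. Squared correlations on differently transformed responses are only a screening device, so residual plots should still be checked before settling on a transformation.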

Methods: Regression, Polynomial Regression, Outlier, Transformation,
Topics: Economics,
Datafile Name: TV Ad Yields

The scatter diagram below suggests that the relation between advertising yield and spending is not linear. An alternative is to fit a regression line with a second order term, which is shown. However, the logic of the second order regression line, which turns down within the ...

