Ch. 10 - Correlation and Regression

Triola - Elementary Statistics 14th Edition

ISBN: 9780137366446

Problem 10.5.2

Super Bowl and $R^2$ Let x represent years coded as 1, 2, 3, ... for years starting in 1980, and let y represent the numbers of points scored in each annual Super Bowl beginning in 1980. Using the data from 1980 to the last Super Bowl at the time of this writing, we obtain the following values of $R^2$ for the different models: linear: 0.008; quadratic: 0.023; logarithmic: 0.0004; exponential: 0.027; power: 0.007. Based on these results, which model is best? Is the best model a good model? What do the results suggest about predicting the number of points scored in a future Super Bowl game?

Verified step by step guidance

Step 1: Understand the meaning of $R^2$ (coefficient of determination). It measures the proportion of variance in the dependent variable ($y$, points scored) that is explained by the independent variable ($x$, years) in the model. Values of $R^2$ range from 0 to 1, where values closer to 1 indicate a better fit of the model to the data.

Step 2: Compare the given $R^2$ values for each model: linear (0.008), quadratic (0.023), logarithmic (0.0004), exponential (0.027), and power (0.007). Identify which model has the highest $R^2$ value, as this model explains the most variance in the points scored.

Step 3: Evaluate whether the best model is a good model by considering the magnitude of its $R^2$ value. Since all $R^2$ values are very low (close to 0), even the best model explains only a small fraction of the variability in points scored, indicating a poor fit.

Step 4: Interpret what these low $R^2$ values imply about the predictability of points scored in future Super Bowl games. Low $R^2$ suggests that the year (time) is not a strong predictor of points scored, so predictions based on these models will likely be unreliable.

Step 5: Summarize the conclusion: the model with the highest $R^2$ (the exponential model, with $R^2 = 0.027$) is technically the best among those tested, but it explains only about 2.7% of the variation in points scored, so it is not a good model for prediction. The year alone tells us almost nothing about the points scored in a future Super Bowl; a prediction from any of these models is little better than simply using the mean of the past point totals.
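The comparison in Steps 2 through 5 can be sketched in a few lines of Python. The $R^2$ values are the ones given in the problem; the variable names are chosen here for illustration only.

```python
# R^2 values as stated in the problem
r_squared = {
    "linear": 0.008,
    "quadratic": 0.023,
    "logarithmic": 0.0004,
    "exponential": 0.027,
    "power": 0.007,
}

# The "best" model among the candidates is the one with the largest R^2.
best_model = max(r_squared, key=r_squared.get)
best_r2 = r_squared[best_model]

# Share of the variation in points that the best model leaves unexplained.
unexplained = 1 - best_r2

print(f"best model: {best_model} (R^2 = {best_r2})")
print(f"variation explained:   {best_r2:.1%}")
print(f"variation unexplained: {unexplained:.1%}")
```

Even the top-ranked model leaves roughly 97% of the variation in points unexplained, which is why "best" here does not mean "good".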

Key Concepts

Here are the essential concepts you must grasp in order to answer the question correctly.

Coefficient of Determination (R²)

R² measures the proportion of variance in the dependent variable explained by the independent variable(s) in a regression model. Values range from 0 to 1, where higher values indicate a better fit. A low R² suggests the model does not explain much of the variability in the data.
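As a minimal sketch of how this quantity is computed for a straight-line fit, the Python below uses R² = 1 - (unexplained variation) / (total variation). The helper names `fit_line` and `r_squared` and the small data set are made up for illustration; they are not the Super Bowl data.

```python
def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for y = b0 + b1*x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def r_squared(ys, predictions):
    """R^2 = 1 - (unexplained variation) / (total variation)."""
    y_bar = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - y_bar) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data, for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

b0, b1 = fit_line(xs, ys)
predictions = [b0 + b1 * x for x in xs]
print(round(r_squared(ys, predictions), 3))
```

For this toy data the fitted line explains 60% of the variation in y, far more than the 2.7% reached by the best Super Bowl model.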

Model Comparison and Selection

Comparing models involves evaluating metrics like R² to determine which model best fits the data. The model with the highest R² is typically preferred, but the difference must be meaningful. Other factors like simplicity and interpretability also influence model choice.

Predictive Power and Model Adequacy

A good model should not only fit existing data well but also reliably predict future outcomes. Low R² values indicate weak predictive power, suggesting caution when using the model for forecasting. This highlights the importance of assessing model adequacy beyond just fit statistics.
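One way to make "weak predictive power" concrete is to compare the model's typical error with that of always predicting the mean of y. The sketch below assumes R² is measured on the same scale as y and ignores degrees-of-freedom adjustments, so it is a rough illustration rather than an exact error calculation.

```python
import math

r2_best = 0.027  # exponential model, from the problem

# SS_res = (1 - R^2) * SS_tot, so the root-mean-square error of the model,
# relative to always predicting the mean of y, is sqrt(1 - R^2).
rmse_ratio = math.sqrt(1 - r2_best)
improvement = 1 - rmse_ratio

print(f"model error / mean-only error: {rmse_ratio:.3f}")
print(f"reduction in typical error:    {improvement:.1%}")
```

Under these assumptions the best model shrinks the typical prediction error by less than 2% compared with using the mean alone.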

Related Practice

Textbook Question

Finding the Best Model

In Exercises 5–16, construct a scatterplot and identify the mathematical model that best fits the given data. Assume that the model is to be used only for the scope of the given data, and consider only linear, quadratic, logarithmic, exponential, and power models.

Stock Market Listed below in order by row are the annual high values of the Dow Jones Industrial Average for each year beginning with 2000. Find the best model and then predict the value for the last year listed. Is the predicted value close to the actual value of 26,828.4?

Textbook Question

Making Predictions

In Exercises 5–8, let the predictor variable x be the first variable given. Use the given data to find the regression equation and the best predicted value of the response variable. Be sure to follow the prediction procedure summarized in Figure 10-5. Use a 0.05 significance level.

Bear Measurements Head widths (in.) and weights (lb) were measured for 20 randomly selected bears (from Data Set 18 “Bear Measurements” in Appendix B). The 20 pairs of measurements yield $\bar{x} = 6.9$ in., $\bar{y} = 214.3$ lb, $r = 0.879$, P-value = 0.000, and $\hat{y} = -212 + 61.9x$. Find the best predicted weight of a bear given that the bear has a head width of 6.5 in.

Textbook Question

Testing for a Linear Correlation

In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of α = 0.05. Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)

Taxis The table below includes data from New York City taxi rides (from Data Set 32 “Taxis” in Appendix B). The distances are in miles, the times are in minutes, the fares are in dollars, and the tips are in dollars. Is there sufficient evidence to support the claim that there is a linear correlation between the time of the ride and the tip amount? Does it appear that riders base their tips on the time of the ride?

Textbook Question

Finding the Best Model

CD Yields The table lists the value y (in dollars) of \$1000 deposited in a certificate of deposit at Bank of New York (based on rates currently in effect).

Textbook Question

Moore’s Law In 1965, Intel cofounder Gordon Moore initiated what has since become known as Moore’s law: The number of transistors per square inch on integrated circuits will double approximately every 18 months. In the table below, the first row lists different years and the second row lists the number of transistors (in thousands) for different years.

a. Ignoring the listed data and assuming that Moore’s law is correct and transistors per square inch double every 18 months, which mathematical model best describes this law: linear, quadratic, logarithmic, exponential, power? What specific function describes Moore’s law?

b. Which mathematical model best fits the listed sample data?

c. Compare the results from parts (a) and (b). Does Moore’s law appear to be working reasonably well?

Textbook Question

Correlation and Slope What is the relationship between the linear correlation coefficient $r$ and the slope $b_1$ of a regression line?