Ch. 10 - Correlation and Regression
Triola - Elementary Statistics 14th Edition (ISBN: 9780137366446)
Chapter 10, Problem 10.1.15

Testing for a Linear Correlation
In Exercises 13–28, construct a scatterplot, and find the value of the linear correlation coefficient r. Also find the P-value or the critical values of r from Table A-6. Use a significance level of α = 0.05. Determine whether there is sufficient evidence to support a claim of a linear correlation between the two variables. (Save your work because the same data sets will be used in Section 10-2 exercises.)
Taxis The table below includes data from New York City taxi rides (from Data Set 32 “Taxis” in Appendix B). The distances are in miles, the times are in minutes, the fares are in dollars, and the tips are in dollars. Is there sufficient evidence to support the claim that there is a linear correlation between the time of the ride and the tip amount? Does it appear that riders base their tips on the time of the ride?


Table showing taxi ride data: distance in miles, time in minutes, fare in dollars, and tip in dollars.

Verified step by step guidance
Step 1: Construct a scatterplot using the given data. Plot 'Time' (in minutes) on the x-axis and 'Tip' (in dollars) on the y-axis. Each pair of values from the table represents a point on the scatterplot.
Step 2: Calculate the linear correlation coefficient (r) using the formula: r = (nΣxy - ΣxΣy) / sqrt((nΣx² - (Σx)²)(nΣy² - (Σy)²)), where x represents 'Time' and y represents 'Tip'. Substitute the values from the table into the formula.
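Step 3: Determine the P-value or the critical values of r for a significance level of α = 0.05. Table A-6 lists the critical values of r by the number of data pairs n. If you use technology to find the P-value instead, it is based on the test statistic t = r · sqrt((n - 2) / (1 - r²)) with df = n - 2 degrees of freedom, where n is the number of data pairs.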
Step 4: Make the decision at the significance level α = 0.05. If the P-value is less than or equal to α, or if the absolute value |r| is greater than or equal to the critical value, there is sufficient evidence to support the claim of a linear correlation. Otherwise, there is not sufficient evidence to support the claim.
Step 5: Interpret the results. If there is sufficient evidence of a linear correlation, discuss whether the scatterplot and correlation coefficient suggest that riders base their tips on the time of the ride. If not, explain why the evidence does not support the claim.
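Steps 2 and 4 can be sketched in Python as follows. The time and tip values are placeholders invented for illustration, because the taxi table is not reproduced on this page; substitute the actual pairs from the table. The function names are hypothetical, and the critical value is an assumed lookup for n = 5 that should be confirmed in Table A-6.

```python
import math

def correlation_r(xs, ys):
    """r = (nΣxy - ΣxΣy) / sqrt((nΣx² - (Σx)²)(nΣy² - (Σy)²))."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

def supports_linear_correlation(r, critical_value):
    """Decision rule: compare |r| with the positive critical value."""
    return abs(r) >= critical_value

# Placeholder data (NOT the values from Data Set 32): time in minutes, tip in dollars.
times = [5, 10, 15, 20, 25]
tips = [1.0, 2.0, 2.5, 4.0, 4.5]

r = correlation_r(times, tips)
# Assumed critical value for n = 5 and alpha = 0.05; confirm it in Table A-6.
critical = 0.878
print(f"r = {r:.3f}, sufficient evidence: {supports_linear_correlation(r, critical)}")
```

Taking the absolute value in the decision rule means a strong negative correlation is treated the same way as a strong positive one.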


Key Concepts

Here are the essential concepts you must grasp in order to answer the question correctly.

Linear Correlation Coefficient (r)

The linear correlation coefficient, denoted as r, quantifies the strength and direction of the linear relationship between two variables. Its value ranges from -1 to 1, where 1 indicates a perfect positive linear correlation, -1 indicates a perfect negative linear correlation, and 0 indicates no linear correlation (the variables may still be related in a nonlinear way). In this context, r measures how strong the linear association between the time of the taxi ride and the tip amount is in the sample; whether that association is statistically significant is decided with the P-value or the critical values of r.
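As a quick illustration of this range, the Python sketch below computes r for three small made-up data sets (none of them from the taxi table): points on a rising line, points on a falling line, and points on the curve y = x². The function name `pearson_r` is hypothetical, and the formula is the deviations-about-the-means form, which is algebraically equivalent to the computational formula.

```python
import math

def pearson_r(xs, ys):
    """Linear correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Made-up x values, chosen only to illustrate the range of r.
xs = [-2, -1, 0, 1, 2]

r_up = pearson_r(xs, [2 * x + 1 for x in xs])      # points on a rising line
r_down = pearson_r(xs, [-3 * x + 4 for x in xs])   # points on a falling line
r_curve = pearson_r(xs, [x ** 2 for x in xs])      # points on a parabola

print(r_up, r_down, r_curve)
```

The third case gives r = 0 even though y is completely determined by x, which is why r = 0 is best read as "no linear correlation" rather than "no relationship."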

P-value

The P-value is the probability of obtaining a sample result at least as extreme as the one actually observed, assuming that the null hypothesis is true. Here the null hypothesis is that there is no linear correlation between ride time and tip amount (ρ = 0), so a small P-value means that a value of r this far from 0 would be unlikely if the two variables were not linearly correlated. The P-value is compared with α = 0.05 to decide whether the observed correlation is statistically significant.
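One way to see where such a P-value comes from is the sketch below, which converts r into a t test statistic with n - 2 degrees of freedom and then finds the two-tailed area numerically. It is a minimal standard-library illustration, not the method of any particular calculator or software package; the function names, the use of Simpson's rule, and the example values r = 0.5 and n = 10 are all assumptions made for illustration.

```python
import math

def t_statistic(r, n):
    """Test statistic for H0: rho = 0 (undefined when r is exactly 1 or -1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed P-value: 1 minus the area between -|t| and |t|.

    The area from 0 to |t| is approximated with Simpson's rule,
    so steps must be an even number.
    """
    b = abs(t)
    if b == 0:
        return 1.0
    h = b / steps
    total = t_pdf(0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    half_area = total * h / 3
    return max(0.0, 1 - 2 * half_area)

# Hypothetical example values (not computed from the taxi data).
r, n = 0.5, 10
t = t_statistic(r, n)
p_value = two_tailed_p(t, n - 2)
print(f"t = {t:.3f}, P-value = {p_value:.3f}, significant at 0.05: {p_value <= 0.05}")
```

In practice a statistics package or calculator reports this P-value directly; the numerical integration here only makes the definition concrete.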

Scatterplot

A scatterplot is a graphical representation of two quantitative variables, where each point represents an observation in the dataset. It helps visualize the relationship between the variables, making it easier to identify patterns, trends, or correlations. In this exercise, constructing a scatterplot of ride time versus tip amount will provide a visual context for the correlation analysis.
Related Practice
Textbook Question

Finding the Best Model

In Exercises 5–16, construct a scatterplot and identify the mathematical model that best fits the given data. Assume that the model is to be used only for the scope of the given data, and consider only linear, quadratic, logarithmic, exponential, and power models.

Dirt Cheap The Cherry Hill Construction company in Branford, CT sells screened topsoil by the “yard,” which is actually a cubic yard. Let the variable x be the length (yd) of each side of a cube of screened topsoil. The table below lists the values of x along with the corresponding cost (dollars).

Textbook Question

Finding the Best Model

In Exercises 5–16, construct a scatterplot and identify the mathematical model that best fits the given data. Assume that the model is to be used only for the scope of the given data, and consider only linear, quadratic, logarithmic, exponential, and power models.

Stock Market Listed below in order by row are the annual high values of the Dow Jones Industrial Average for each year beginning with 2000. Find the best model and then predict the value for the last year listed. Is the predicted value close to the actual value of 26,828.4?

Textbook Question

Making Predictions

In Exercises 5–8, let the predictor variable x be the first variable given. Use the given data to find the regression equation and the best predicted value of the response variable. Be sure to follow the prediction procedure summarized in Figure 10-5. Use a 0.05 significance level.


Bear Measurements Head widths (in.) and weights (lb) were measured for 20 randomly selected bears (from Data Set 18 “Bear Measurements” in Appendix B). The 20 pairs of measurements yield x̄ = 6.9 in., ȳ = 214.3 lb, r = 0.879, P-value = 0.000, and ŷ = -212 + 61.9x. Find the best predicted weight of a bear given that the bear has a head width of 6.5 in.

Textbook Question

Super Bowl and R² Let x represent years coded as 1, 2, 3, ... for years starting in 1980, and let y represent the numbers of points scored in each annual Super Bowl beginning in 1980. Using the data from 1980 to the last Super Bowl at the time of this writing, we obtain the following values of R² for the different models: linear: 0.008; quadratic: 0.023; logarithmic: 0.0004; exponential: 0.027; power: 0.007. Based on these results, which model is best? Is the best model a good model? What do the results suggest about predicting the number of points scored in a future Super Bowl game?

Textbook Question

Finding the Best Model

In Exercises 5–16, construct a scatterplot and identify the mathematical model that best fits the given data. Assume that the model is to be used only for the scope of the given data, and consider only linear, quadratic, logarithmic, exponential, and power models.

CD Yields The table lists the value y (in dollars) of $1000 deposited in a certificate of deposit at Bank of New York (based on rates currently in effect).

Textbook Question

Moore’s Law In 1965, Intel cofounder Gordon Moore initiated what has since become known as Moore’s law: The number of transistors per square inch on integrated circuits will double approximately every 18 months. In the table below, the first row lists different years and the second row lists the number of transistors (in thousands) for different years.

a. Ignoring the listed data and assuming that Moore’s law is correct and transistors per square inch double every 18 months, which mathematical model best describes this law: linear, quadratic, logarithmic, exponential, power? What specific function describes Moore’s law?

b. Which mathematical model best fits the listed sample data?

c. Compare the results from parts (a) and (b). Does Moore’s law appear to be working reasonably well?
