13. Chi-Square Tests & Goodness of Fit

Goodness of Fit Test

Multiple Choice
Which of the following lists the requirements that must be met to perform a $χ^2$ goodness-of-fit test?
Textbook Question
If the expected count of a category is less than 1, what can be done to the categories so that a goodness-of-fit test can still be performed?
Textbook Question
Exercises 1–5 refer to the sample data in the following table, which summarizes the frequencies of 500 digits randomly generated by Statdisk. Assume that we want to use a 0.05 significance level to test the claim that Statdisk generates the digits in a way that they are equally likely.

Is the hypothesis test left-tailed, right-tailed, or two-tailed?
Multiple Choice
In a chi-square goodness of fit test for absences across days of the week, if the observed number of absences on Monday is $O$ and the expected number is $E$, what is the contribution of the Monday absences to the calculation of the chi-square test statistic?
Textbook Question
True or False? In Exercises 5 and 6, determine whether the statement is true or false. If it is false, rewrite it as a true statement.

When the test statistic for the chi-square independence test is large, you will, in most cases, reject the null hypothesis.
Textbook Question
Cybersecurity The table below lists the frequency of leading digits of Internet traffic interarrival times for a computer, along with the percentages of each leading digit expected with Benford’s law.

b. Identify the observed and expected values for the leading digit of 2.
Multiple Choice
A scientist conducts a chi-square goodness of fit test with $4$ categories and obtains observed frequencies of $18$, $22$, $20$, and $20$. The expected frequency for each category is $20$. Which of the following values is closest to the chi-square value the scientist calculated?
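The arithmetic behind this kind of question can be checked numerically. What follows is a minimal sketch in plain Python that sums $(O - E)^2 / E$ over the categories; the function name `chi_square_statistic` is an illustrative choice and does not come from the exercise.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Frequencies taken from the question above.
observed = [18, 22, 20, 20]
expected = [20, 20, 20, 20]

chi2 = chi_square_statistic(observed, expected)
print(round(chi2, 3))  # 0.4
```

Only the first two categories deviate from their expected count, so each contributes $4/20 = 0.2$ and the other two contribute nothing.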
Textbook Question
In Section 10.2, we tested hypotheses regarding a population proportion using a z-test. However, we can also use the chi-square goodness-of-fit test to test hypotheses with k = 2 possible outcomes. In Problems 25 and 26, we test hypotheses with the use of both methods.
Living Alone? In 2000, 25.8% of Americans 15 years of age or older lived alone, according to the Census Bureau. A sociologist, who believes that this percentage is greater today, conducts a random sample of 400 Americans 15 years of age or older and finds that 164 are living alone.
b. Test the sociologist’s belief at the α=0.05 level of significance using the goodness-of-fit test.
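The link between the two methods can be sketched in a few lines. With k = 2 categories, the goodness-of-fit statistic equals the square of the one-proportion z statistic. The sketch below uses the figures from the exercise; the variable names are illustrative assumptions, and it is not presented as the textbook's worked solution.

```python
import math

n = 400      # sample size from the exercise
p0 = 0.258   # proportion living alone in 2000, used as the null value
x = 164      # number in the sample living alone

# One-proportion z statistic.
p_hat = x / n
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

# Goodness-of-fit statistic with two categories: living alone, not living alone.
observed = [x, n - x]
expected = [n * p0, n * (1 - p0)]
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(z, 2), round(z ** 2, 2), round(chi2, 2))
```

The last two printed values agree up to floating-point error. For 1 degree of freedom and α=0.05, the chi-square critical value is 3.841. Because the goodness-of-fit statistic squares the deviations, it does not by itself encode the direction of the sociologist's one-sided claim.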
Textbook Question
Testing for Normality Using a chi-square goodness-of-fit test, you can decide, with some degree of certainty, whether a variable is normally distributed. In all chi-square tests for normality, the null and alternative hypotheses are as listed below.

H₀: The variable has a normal distribution.

Hₐ: The variable does not have a normal distribution.

To determine the expected frequencies when performing a chi-square test for normality, first estimate the mean and standard deviation of the frequency distribution. Then, use the mean and standard deviation to compute the z-score for each class boundary. Then, use the z-scores to calculate the area under the standard normal curve for each class. Multiplying the resulting class areas by the sample size yields the expected frequency for each class.

In Exercises 17 and 18, (a) find the expected frequencies, (b) find the critical value and identify the rejection region, (c) find the chi-square test statistic, (d) decide whether to reject or fail to reject the null hypothesis, and (e) interpret the decision in the context of the original claim.

Test Scores At α=0.05, test the claim that the 400 test scores shown in the frequency distribution are normally distributed.
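The procedure for expected frequencies described above (z-score for each class boundary, area for each class, area times sample size) can be sketched as follows. The frequency table for this exercise is not reproduced here, so the class boundaries, mean, and standard deviation below are hypothetical placeholders, and the helper names `normal_cdf` and `expected_frequencies` are assumptions made for illustration.

```python
import math


def normal_cdf(z):
    """Cumulative area under the standard normal curve, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def expected_frequencies(boundaries, mean, std_dev, n):
    """Expected count for each class between consecutive boundaries."""
    z_scores = [(b - mean) / std_dev for b in boundaries]
    areas = [normal_cdf(hi) - normal_cdf(lo)
             for lo, hi in zip(z_scores, z_scores[1:])]
    return [n * area for area in areas]


# Hypothetical values for illustration only; these are NOT the exercise's data.
boundaries = [49.5, 58.5, 67.5, 76.5, 85.5, 94.5]
mean, std_dev, n = 72.0, 9.0, 400

expected = expected_frequencies(boundaries, mean, std_dev, n)
for lo, hi, e in zip(boundaries, boundaries[1:], expected):
    print(f"{lo}-{hi}: {e:.1f}")
```

In this sketch the area below the first boundary and above the last boundary is left out, so the expected counts sum to slightly less than the sample size.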
Textbook Question
Benford’s Law, Part I Our number system consists of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. The first significant digit in any number must be 1, 2, 3, 4, 5, 6, 7, 8, or 9 because we do not write numbers such as 12 as 012. Although we may think that each first digit appears with equal frequency so that each digit has a 1/9 probability of being the first significant digit, this is not true. In 1881, Simon Newcomb discovered that first digits do not occur with equal frequency. This same result was discovered again in 1938 by physicist Frank Benford. After studying much data, he was able to assign probabilities of occurrence to the first digit in a number as shown.
[Image]
Source: T. P. Hill, “The First Digit Phenomenon,” American Scientist, July–August, 1998.
The probability distribution is now known as Benford’s Law and plays a major role in identifying fraudulent data on tax returns and accounting books. For example, the following distribution represents the first digits in 200 allegedly fraudulent checks written to a bogus company by an employee attempting to embezzle funds from his employer.
a. Because these data are meant to prove that someone is guilty of fraud, what would be an appropriate level of significance when performing a goodness-of-fit test?
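The probability table referenced above did not survive as text, but Benford’s Law has a closed form: the probability that the first significant digit is $d$ is $\log_{10}(1 + 1/d)$. The sketch below computes those probabilities and the expected counts for the 200 checks mentioned in the exercise. The variable names are illustrative, and the textbook's table may list rounded values.

```python
import math

n_checks = 200  # number of checks stated in the exercise

# Benford probabilities for first digits 1 through 9.
benford = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

# Expected count of each first digit if the checks followed Benford's Law.
expected = {d: n_checks * p for d, p in benford.items()}

for d in range(1, 10):
    print(d, round(benford[d], 3), round(expected[d], 1))
```

The nine probabilities sum to 1, and with 200 checks every expected count is above 5.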
Textbook Question
Performing a Chi-Square Goodness-of-Fit Test
In Exercises 7–16, (c) find the chi-square test statistic.

Ways to Pay A financial analyst claims that the distribution of people’s preferences on how to pay for goods is different from the distribution shown in the figure. You randomly select 600 people and record their preferences on how to pay for goods. The table shows the results. At α=0.01, test the financial analyst’s claim. (Adapted from Travis Credit Union)
Textbook Question
Performing a Chi-Square Goodness-of-Fit Test
In Exercises 7–16, (a) identify the claim and state H₀ and Hₐ, (b) find the critical value and identify the rejection region, (c) find the chi-square test statistic, (d) decide whether to reject or fail to reject the null hypothesis, and (e) interpret the decision in the context of the original claim.

Homicides by Month A researcher claims that the number of homicide crimes in California by month is uniformly distributed. To test this claim, you randomly select 2000 homicides from a recent year and record the month when each happened. The table shows the results. At α=0.10, test the researcher’s claim. (Adapted from California Department of Justice)
Textbook Question
Game Boss In video games, a game boss is a powerful non-player character created by game developers as an opponent to players of the game. Suppose a game is set up where a player must defeat three bosses and the probability of defeating any boss is 0.20. Assuming each boss battle is independent, the probability distribution for the number of bosses defeated by a player is as follows:
Suppose the game is played by a random sample of 1000 players with the number of bosses defeated recorded. The results are shown below.
b. Compare the observed and expected counts for each number of defeats. What does this information tell you?
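The probability distribution described in this exercise is binomial with 3 trials and success probability 0.20, so the expected counts for 1000 players can be sketched as below. The observed counts from the exercise are not reproduced here, so the sketch covers only the expected side of the comparison; the variable names are illustrative.

```python
from math import comb

n_bosses = 3       # bosses each player must face
p_win = 0.20       # probability of defeating any one boss
n_players = 1000   # sample size from the exercise

# Binomial probability of defeating exactly k bosses, for k = 0, 1, 2, 3.
probabilities = [
    comb(n_bosses, k) * p_win ** k * (1 - p_win) ** (n_bosses - k)
    for k in range(n_bosses + 1)
]
expected = [n_players * p for p in probabilities]

for k, (p, e) in enumerate(zip(probabilities, expected)):
    print(k, round(p, 3), round(e, 1))
```

The expected counts work out to 512, 384, 96, and 8 for 0, 1, 2, and 3 defeats respectively.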
Multiple Choice
Which of the following is not a characteristic of the $χ^2$ distribution?
Textbook Question
Pedestrian Deaths A researcher wanted to determine whether pedestrian deaths were uniformly distributed over the days of the week. She randomly selected 300 pedestrian deaths, recorded the day of the week on which the death occurred, and obtained the following results (the data are based on information obtained from the Insurance Institute for Highway Safety). Test the belief that pedestrian fatalities occur with equal frequency on each day of the week at the α=0.05 level of significance.