Section 10-1 Goodness of Fit

Multinomial Experiment: a fixed number of independent trials in which there are more than two possible outcomes for each trial. The probability of each outcome is fixed, and each outcome is classified into categories.

Chi-Square Goodness-of-Fit Test: used to test whether a frequency distribution fits an expected distribution (the claim).

Two conditions must be met in order to conduct a chi-square goodness-of-fit test:
1) The sample must be randomly selected.
2) Each E (expected value) must be at least 5. You may need to combine categories to achieve this.

Guidelines for conducting the test
1) Write the hypotheses and identify the claim.
   H0: State the claimed percentage for each category.
   Ha: State that the distribution is different from the one listed in H0.
2) Identify α.
3) Identify the degrees of freedom (k - 1), where k is the number of categories.
4) STAT → Edit
   Enter the claimed percentages for each category into L1 (as decimals).
   Enter the observed frequencies for each category into L2.
   Highlight L3 and enter L1*n (use the actual number for n).
   Check to be sure that all values in L3 are at least 5.
5) Run STAT → TESTS → D (χ²GOF-Test).
   Enter L2 for Observed.
   Enter L3 for Expected.
   Enter the degrees of freedom (k - 1).
6) Make a decision to reject H0 or fail to reject H0.
   If p ≤ α, reject the null.
   If p > α, fail to reject the null.
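For reference, the quantities behind these steps can be written out as below. This is the standard chi-square goodness-of-fit statistic, not something stated on the slides. The symbols are notation assumed here: O_i is the observed frequency in category i (list L2), p_i the claimed proportion (list L1), E_i the expected frequency (list L3), n the sample size, and k the number of categories.

```latex
E_i = n\,p_i, \qquad
\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}, \qquad
\text{d.f.} = k - 1, \qquad
p\text{-value} = P\!\left(\chi^2_{k-1} \ge \chi^2\right)
```

A large χ² means the observed frequencies sit far from the expected ones, which gives a small p-value and leads to rejecting H0.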

Example 1 (Page 553)
A marketing executive randomly selects 500 radio music listeners from the broadcast region and asks each whether he or she prefers classical, country, gospel, oldies, pop, or rock music. The results are shown in the table. Find the observed frequencies and the expected frequencies for each type of music.

Survey Results (n = 500)
Type of Music   Claimed %   Observed   Expected
Classical           4%          8         20
Country            36%        210        180
Gospel             11%         72         55
Oldies              2%         10         10
Pop                18%         75         90
Rock               29%        125        145

The observed frequency for each type of music is the number of radio music listeners naming that particular type of music. This is what is given in the Observed column above.
The expected frequency for each type of music is the claimed percentage for that music type times the total number in the sample (n). These values are in the Expected column (far right).

CHECK to be sure that ALL values in the Expected column are at least 5!!

On the calculator, enter the Claimed % column into L1 in decimal form: .04, .36, .11, .02, .18, and .29
Enter the Observed column into L2: 8, 210, 72, 10, 75, and 125
Highlight L3 and type in L1*500; this will give you the results shown in the Expected column.

Example 2 (Page 555)
The music preferences of the listeners in a radio station's broadcast region are distributed as shown in the table from Example 1. You randomly select 500 radio music listeners from the broadcast region and ask each whether he or she prefers classical, country, gospel, oldies, pop, or rock music. The survey results are shown in the Observed column of the table from Example 1. Using α = 0.01, perform a chi-square goodness-of-fit test to test whether the distributions are different.

Write the hypotheses:
H0: The distribution of music preferences in the broadcast region is 4% classical, 36% country, 11% gospel, 2% oldies, 18% pop, and 29% rock.
Ha: The distribution of music preferences in the broadcast region differs from the claimed or expected distribution. (claim)

You've already put the data into STAT → Edit and confirmed that all expected values are at least 5, so it's time to run the test.
STAT → TESTS → D
Designate L2 for your Observed.
Designate L3 for your Expected.
Designate 5 as your degrees of freedom (6 categories minus 1).
χ² ≈ 22.713 (this is your standardized test statistic).
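The expected frequencies and the test statistic for this example can also be worked out away from the calculator. The following is a minimal Python sketch, not part of the original slides; the variable names (`claimed`, `observed`, `expected`, `contributions`, `chi_sq`) are chosen here for illustration and simply mirror lists L1, L2, and L3.

```python
# Example 2 data from this section (n = 500 listeners).
claimed = [0.04, 0.36, 0.11, 0.02, 0.18, 0.29]   # list L1: claimed proportions
observed = [8, 210, 72, 10, 75, 125]             # list L2: observed frequencies
n = sum(observed)                                # total sample size

# List L3: expected frequency for each category, E = n * p.
expected = [p * n for p in claimed]

# Condition check: every expected frequency must be at least 5.
if any(e < 5 for e in expected):
    raise ValueError("combine categories until every expected value is at least 5")

# Each category contributes (O - E)^2 / E to the test statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_sq = sum(contributions)
df = len(observed) - 1                           # k - 1

print("expected:", [round(e, 2) for e in expected])
print("contributions:", [round(c, 3) for c in contributions])
print("chi-square:", round(chi_sq, 3), "df:", df)
```

The per-category contributions show where the misfit comes from: classical, country, and gospel account for most of the statistic, while oldies (observed 10, expected 10) contributes nothing.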

p ≈ .000383
Since p ≤ α (.000383 < 0.01), reject the null hypothesis. This means that the actual distribution is different from the claimed distribution.
At the 1% significance level, there is enough evidence to conclude that the distribution of music preferences differs from the radio station's claimed or expected distribution.

Example 3 (Page 556)
The first table below shows two distributions describing opinions on what is more important to save for. You work for a financial services company and want to test the distribution describing men's opinions. To test the distribution, you randomly select 400 men and ask each which is more important: saving for retirement or saving for children's college education. The results are shown in the second table. At α = 0.05, test the claimed or expected distribution.

Opinions on what is more important to save for
          Retirement   Children's College   Not sure
Men          44%              40%              16%
Women        41%              46%              13%

Men's Survey Results
Retirement            186
Children's College    143
Not sure               71

Write the hypotheses:
H0: The distribution of men's opinions on saving is 44% retirement, 40% college, and 16% not sure. (claim)
Ha: The distribution of men's opinions on saving differs from the claimed or expected distribution.

STAT → Edit
L1: Enter the % claimed for each category, in decimal form (.44, .40, .16).
L2: Enter the observed values for each category (186, 143, 71).
L3: Highlight L3, then type in L1*400 (n in this survey). This gives you the expected values (176, 160, 64).
Check to be sure that all of these values are at least 5!!
Now, check to be sure that the chi-square goodness-of-fit test can be used: the sample must be random, and every expected value must be at least 5.

STAT → TESTS → D
Designate L2 for your Observed.
Designate L3 for your Expected.
Designate 2 as your degrees of freedom (3 categories minus 1).
χ² ≈ 3.140 (this is your standardized test statistic).
p ≈ 0.208
Since p > α (0.208 > 0.05), fail to reject the null hypothesis. This means the sample does not give significant evidence that the actual distribution is different from the claimed distribution.
At the 5% significance level, there is not enough evidence to dispute the claimed distribution of men's opinions on savings.

Example 4 (Page 558)
The manufacturer of M&M's candies claims that the number of different-colored candies in bags of dark chocolate M&M's is uniformly distributed. To test this claim, you randomly select a bag that contains 500 dark chocolate M&M's. The results are shown in the table. Using α = .10, perform a chi-square goodness-of-fit test to test the claimed or expected distribution. What can you conclude?

Color     Frequency
Brown        80
Yellow       95
Red          88
Blue         83
Orange       76
Green        78

The claim is that the distribution is uniform, so the expected frequencies of the colors are equal. To find each expected percentage, divide 1 by the number of categories. In this case, divide 1 by 6. The expected percentage for each color is 1/6.

The null and alternate hypotheses are as follows:
H0: The distribution of different-colored M&M's is uniform. (claim)
Ha: The distribution of different-colored M&M's is not uniform.

STAT → Edit
L1: Enter the % claimed for each category, in decimal form (1/6, 1/6, 1/6, 1/6, 1/6, 1/6).
L2: Enter the observed values for each category (80, 95, 88, 83, 76, 78).
L3: Highlight L3, then type in L1*500 (n in this survey). This gives you the expected values (about 83.33 for each color).

Because each expected frequency is at least 5, and the M&M's were randomly selected, you can use the chi-square goodness-of-fit test to test the claimed distribution.

STAT → TESTS → D
Designate L2 for your Observed.
Designate L3 for your Expected.
Designate 5 as your degrees of freedom (6 categories minus 1).
χ² ≈ 3.016 (this is your standardized test statistic).
p ≈ 0.698
Since p > α (0.698 > 0.10), fail to reject the null hypothesis. This means the sample does not give significant evidence that the actual distribution is different from the claimed distribution.
At the 10% significance level, there is not enough evidence to dispute the claim that the distribution of the different-colored candies in bags of dark chocolate M&M's is uniform.

Classwork: Page 560 #1-8 All
For #7 and 8, use the p-value and α to make your decision instead of rejection regions.

Homework: Pages 560-563 #9-17 All
Use the p-value and α to make your decision instead of rejection regions.
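For checking p-value decisions like the ones the classwork and homework ask for, a small helper can stand in for the calculator. The following is a minimal Python sketch under stated assumptions: it is not part of the slides, the function names `chi2_sf` and `gof_test` are hypothetical, and the p-value comes from a standard recurrence for the chi-square upper tail (valid for whole-number degrees of freedom) rather than from the calculator's routine. The data are the three worked examples from this section.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for a chi-square variable, integer df >= 1."""
    y = x / 2.0
    # Closed-form starting points: df = 2 gives exp(-y), df = 1 gives erfc(sqrt(y)).
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    # Each pass raises df by 2: Q(a + 1, y) = Q(a, y) + y^a * e^(-y) / Gamma(a + 1).
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q

def gof_test(observed, claimed, alpha):
    """Return (chi-square statistic, p-value, True if H0 is rejected)."""
    n = sum(observed)
    expected = [p * n for p in claimed]
    if any(e < 5 for e in expected):
        raise ValueError("every expected frequency must be at least 5")
    chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    p_value = chi2_sf(chi_sq, len(observed) - 1)
    return chi_sq, p_value, p_value <= alpha

# (observed, claimed proportions, alpha) for the worked examples in this section.
examples = {
    "Example 2": ([8, 210, 72, 10, 75, 125], [0.04, 0.36, 0.11, 0.02, 0.18, 0.29], 0.01),
    "Example 3": ([186, 143, 71], [0.44, 0.40, 0.16], 0.05),
    "Example 4": ([80, 95, 88, 83, 76, 78], [1 / 6] * 6, 0.10),
}

for name, (obs, claim, alpha) in examples.items():
    chi_sq, p_value, reject = gof_test(obs, claim, alpha)
    decision = "reject H0" if reject else "fail to reject H0"
    print(f"{name}: chi-square = {chi_sq:.3f}, p = {p_value:.6f}, {decision}")
```

With the data from this section, the sketch rejects H0 for Example 2 and fails to reject H0 for Examples 3 and 4, matching the decisions reached on the calculator.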