Discuss the use of two-way tables in general, and relate that discussion to the results obtained for the HMO study using proportions. The HMO study is repeated here for reference.
A study was designed to find reasons why patients leave a health maintenance organization (HMO). Patients were classified by whether or not they had filed a complaint with the HMO. We want to compare the proportion of complainers who leave the HMO with the proportion of patients who do not file complaints but also leave. In the year of the study, 639 patients filed complaints, and 54 of these patients left the HMO voluntarily. For comparison, the HMO chose an SRS of 743 patients who had not filed complaints. Twenty-two of these patients left voluntarily. Previously we looked at this business problem using proportions. Discuss how to use the chi-square distribution for this problem, present specific results, and compare them with the results obtained using proportions.
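As a starting point for the comparison, the sketch below sets up the HMO counts as a 2 x 2 table and computes both the pooled two-proportion z statistic and the chi-square statistic. It assumes the two-sided z test with a pooled standard error was the method used previously, which the text does not state. The variable names are illustrative only. For a 2 x 2 table the chi-square statistic equals the square of that z statistic, so the two P-values agree.

```python
import math

# Counts taken from the problem statement.
left_complaint, n_complaint = 54, 639        # filed a complaint
left_no_complaint, n_no_complaint = 22, 743  # SRS of non-complainers

# Two-way table of observed counts: rows = group, columns = (left, stayed).
observed = [
    [left_complaint, n_complaint - left_complaint],
    [left_no_complaint, n_no_complaint - left_no_complaint],
]

# Two-proportion z test with a pooled standard error (assumed earlier method).
p1 = left_complaint / n_complaint
p2 = left_no_complaint / n_no_complaint
pooled = (left_complaint + left_no_complaint) / (n_complaint + n_no_complaint)
se = math.sqrt(pooled * (1 - pooled) * (1 / n_complaint + 1 / n_no_complaint))
z = (p1 - p2) / se
p_z = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal P-value

# Chi-square test: expected count = row total * column total / grand total.
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)
chi_sq = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (count - expected) ** 2 / expected

# With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
p_chi = math.erfc(math.sqrt(chi_sq / 2))

print(f"z = {z:.3f}, z^2 = {z * z:.3f}, chi-square = {chi_sq:.3f}")
print(f"P-value (z) = {p_z:.2e}, P-value (chi-square) = {p_chi:.2e}")
```

The chi-square test is inherently two-sided, so it matches the z test only when the alternative hypothesis is two-sided.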
Case study:
To design an effective marketing strategy, you need to know your customers. What are the characteristics of people who use the World Wide Web to collect information on travel, and how do they differ from those who rely on other sources instead? A survey that collected data to address this question examined the responses of 1401 Web users (WWW Yes) and 1080 people who used other sources for this information (Other, WWW No). The following tables (Week06CaseStudy.xls) give counts of WWW Yes and WWW No respondents for various demographic characteristics. Note that the marginal sums are sometimes less than 1401 and 1080 because of missing data. Use the methods of this chapter to compare the two groups (WWW Yes and WWW No). Include graphical and numerical summaries along with the results of your significance tests. In some cases you may want to combine some categories for the demographic variables. Be sure to include a discussion of missing values. Write a report summarizing your work.
Construct a two-way table based on a business problem statement giving observed counts.
Determine the applicable degrees of freedom for the model.
Compute the expected counts for the two-way table.
Compute the chi-square test statistic for the two-way table.
Use the chi-square distribution for hypothesis testing involving categorical variables.
Compute the observed level of significance (p-value) based on the computed chi-square test statistic.
Prepare a business report presenting the conclusions of your statistical analysis.
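The computational steps listed above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a prescribed method. The function names are hypothetical, and the example table uses the Gender counts from the case study data. The P-value function relies on the closed-form recurrence for the chi-square upper tail at integer degrees of freedom.

```python
import math

def expected_counts(table):
    """Expected count for each cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1) for an r x c two-way table."""
    return (len(table) - 1) * (len(table[0]) - 1)

def chi_square_statistic(table):
    """Sum over all cells of (observed - expected)^2 / expected."""
    expected = expected_counts(table)
    return sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )

def chi_square_p_value(x, df):
    """Upper-tail probability P(X > x) for a positive integer df.

    Uses Q(a + 1, h) = Q(a, h) + h^a * exp(-h) / Gamma(a + 1) with h = x / 2,
    starting from Q(1, h) = exp(-h) (even df) or Q(1/2, h) = erfc(sqrt(h)) (odd df).
    """
    h = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2:
        q += h ** a * math.exp(-h) / math.gamma(a + 1)
        a += 1
    return q

# Gender table from the case study: rows = Female, Male; columns = WWW Yes, WWW No.
gender = [[709, 561], [423, 291]]
df = degrees_of_freedom(gender)
chi_sq = chi_square_statistic(gender)
p_value = chi_square_p_value(chi_sq, df)
print(f"df = {df}, chi-square = {chi_sq:.3f}, P-value = {p_value:.3f}")
```

The same functions apply unchanged to the larger tables (Age, Education, Occupation, Income, Race), since nothing in them depends on the table being 2 x 2.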
The data in file Week06CaseStudy.xls cover the following demographic categories:
• Age in years.
• Gender.
• Education.
• Occupational category.
• Household income (U.S. $).
• Race.
Age WWWUserA CountA
Under18 Yes 22
18to25 Yes 160
26to35 Yes 328
36to45 Yes 277
46to55 Yes 224
Over55 Yes 101
Under18 No 24
18to25 No 161
26to35 No 184
36to45 No 189
46to55 No 164
Over55 No 109
Gender WWWUserB CountB
Female Yes 709
Male Yes 423
Female No 561
Male No 291
Education WWWUserC CountC
Grammar Yes 4
HighSchool Yes 85
Vocational Yes 54
SomeCollege Yes 336
College Yes 357
PostGrad Yes 259
Professional Yes 27
Other Yes 10
Grammar No 15
HighSchool No 125
Vocational No 53
SomeCollege No 293
College No 218
PostGrad No 114
Professional No 17
Other No 17
Occupation WWWUserD CountD
Mgmt Yes 167
Prof Yes 264
Educ Yes 175
Computer Yes 309
Other Yes 217
Mgmt No 87
Prof No 156
Educ No 164
Computer No 164
Other No 281
Income WWWUserE CountE
Lt10K Yes 43
10to20K Yes 58
20to30K Yes 116
30to40K Yes 149
40to50K Yes 127
50to75K Yes 259
75to100K Yes 119
Over100K Yes 134
NoAnswer Yes 127
Lt10K No 65
10to20K No 78
20to30K No 102
30to40K No 127
40to50K No 105
50to75K No 129
75to100K No 71
Over100K No 47
NoAnswer No 128
Race WWWUserF CountF
Cauc Yes 1001
Afr Yes 20
Asian Yes 35
Hisp Yes 20
Other Yes 56
Cauc No 772
Afr No 16
Asian No 16
Hisp No 15
Other No 33
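Because the data are listed one count per line (category, WWW user, count), each variable has to be rearranged into a two-way table before testing. The sketch below does this for the Age data and also reports column percentages and the number of respondents missing from each group, using the group sizes 1401 and 1080 stated in the case description. The merged category label "25orUnder" is a hypothetical example of combining categories, not a recommendation from the source.

```python
# Age records copied from the listing: (category, WWW user, count).
age_records = [
    ("Under18", "Yes", 22), ("18to25", "Yes", 160), ("26to35", "Yes", 328),
    ("36to45", "Yes", 277), ("46to55", "Yes", 224), ("Over55", "Yes", 101),
    ("Under18", "No", 24), ("18to25", "No", 161), ("26to35", "No", 184),
    ("36to45", "No", 189), ("46to55", "No", 164), ("Over55", "No", 109),
]
group_sizes = {"Yes": 1401, "No": 1080}  # respondents surveyed in each group

def two_way_table(records, merge=None):
    """Build {category: {"Yes": n, "No": n}}, optionally merging categories."""
    merge = merge or {}
    table = {}
    for category, user, count in records:
        label = merge.get(category, category)
        cell = table.setdefault(label, {"Yes": 0, "No": 0})
        cell[user] += count
    return table

age_table = two_way_table(age_records)

# Column totals, and how many respondents gave no usable answer.
col_totals = {u: sum(row[u] for row in age_table.values()) for u in group_sizes}
missing = {u: group_sizes[u] - col_totals[u] for u in group_sizes}

# Column percentages: distribution of age within each group.
for label, row in age_table.items():
    pct = {u: 100 * row[u] / col_totals[u] for u in group_sizes}
    print(f"{label:8s} Yes {pct['Yes']:5.1f}%  No {pct['No']:5.1f}%")
print("Missing responses:", missing)

# Hypothetical example of combining the two youngest categories.
merged = two_way_table(age_records, merge={"Under18": "25orUnder", "18to25": "25orUnder"})
```

Comparing the column percentages side by side is one simple numerical summary. The missing counts show how far each marginal sum falls short of its group size.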