Case
A survey of teenagers aged 15-16 asked whether they had ever held a part-time job.
The data are tabulated below by race and gender:

                        White            Black
Part-time job       Male   Female    Male   Female
Yes                   43       26      29       22
No                   134      149      23       36
More Than One Predictor
βi is the effect of Xi on the log odds that Y = 1, holding the other X's fixed.
exp(β1) is the multiplicative effect on the odds of a one-unit increase in X1, holding the other X's fixed.
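For two predictors, the two statements above can be written out as a short worked derivation. The notation (β0 for the intercept, lower-case x1 and x2 for fixed predictor values) is an assumption chosen to match the fitted model used later in these slides.

```latex
\[
\log\frac{P(Y=1)}{1-P(Y=1)} = \beta_0 + \beta_1 X_1 + \beta_2 X_2
\]
\[
\frac{\mathrm{odds}(X_1 = x_1 + 1,\; X_2 = x_2)}{\mathrm{odds}(X_1 = x_1,\; X_2 = x_2)}
= \frac{e^{\beta_0 + \beta_1 (x_1 + 1) + \beta_2 x_2}}{e^{\beta_0 + \beta_1 x_1 + \beta_2 x_2}}
= e^{\beta_1}
\]
```

The ratio does not depend on x1 or x2, which is why exp(β1) can be read as a single odds ratio for X1.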
Entering the data in SAS
Variable “white” (1 = white, 0 = black),
variable “gender” (1 = male, 0 = female; named male in the SAS code), and variable “part-time job” (1 = yes, 0 = no; named job in the SAS code).
We want to model the odds of having a part-time job with race and gender as predictors.
(Enter the code on the next slide into SAS)
SAS Syntax
DATA job;
  INPUT white male job count;
  DATALINES;
1 1 1 43
1 1 0 134
1 0 1 26
1 0 0 149
0 1 1 29
0 1 0 23
0 0 1 22
0 0 0 36
;
RUN;
PROC LOGISTIC DATA = job descending;
  weight count;
  MODEL job = white male / rsquare lackfit;
RUN;
“descending” models the probability that part-time job = 1 (yes) rather than = 0 (no).
“weight count” weights each data line by its cell count.
“rsquare” requests the generalized R2 value from SAS; it is read in roughly the same way as the R2 from linear regression, although the analogy is only approximate.
“lackfit” requests the Hosmer and Lemeshow Goodness-of-Fit Test, which indicates whether the fitted model is a good fit for the data.
SAS Output: R2
Interpreting the R2 value
The R2 value is 0.9907, meaning that 99.07% of the variation in the response (part-time job) is explained by gender and race in the model.
Comparing Models to Identify Influential Factors
Hypothesis 1
H0: βW = βB = 0
This hypothesis states that, within the same gender, the probability that a teenager works part time is independent of race (skin color).
Hypothesis 2
H0: βM = βF = 0
This hypothesis states that, within the same race (skin color), the probability that a teenager works part time is independent of gender.
PROC LOGISTIC Output
Notice that the race and gender terms are both statistically significant (p < 0.0001 and p = 0.0040, respectively).
The odds of having a part-time job are 73.1% (1 − 0.269) lower for whites than for blacks, since exp(−1.3135) = 0.269.
The logistic regression model is:
log(odds) = β0 + β1(white) + β2(male)
log(odds) = −0.4555 − 1.3135(white) + 0.6478(male)
The odds of having a part-time job are 1.911 times as high for males as for females, since exp(0.6478) = 1.911.
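As a cross-check outside SAS, the reported coefficients can be reproduced by maximizing the grouped-data binomial likelihood directly. The following is a minimal Python sketch, not a description of what PROC LOGISTIC does internally: Newton-Raphson is assumed as the optimizer, and the helper names solve3 and fit_logistic are hypothetical.

```python
import math

# Cell counts from the survey table: (white, male, yes_count, no_count)
cells = [
    (1, 1, 43, 134),
    (1, 0, 26, 149),
    (0, 1, 29, 23),
    (0, 0, 22, 36),
]

def solve3(a, b):
    """Solve the 3x3 system a x = b by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def fit_logistic(cells, iterations=25):
    """Newton-Raphson for log(odds) = b0 + b1*white + b2*male on grouped counts."""
    beta = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        score = [0.0] * 3
        info = [[0.0] * 3 for _ in range(3)]
        for white, male, yes, no in cells:
            x = (1.0, white, male)
            n = yes + no
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(3):
                score[i] += x[i] * (yes - n * p)            # gradient of log-likelihood
                for j in range(3):
                    info[i][j] += x[i] * x[j] * n * p * (1.0 - p)  # Fisher information
        step = solve3(info, score)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

beta0, beta_white, beta_male = fit_logistic(cells)
odds_ratio_white = math.exp(beta_white)  # odds ratio, whites versus blacks
odds_ratio_male = math.exp(beta_male)    # odds ratio, males versus females
print(beta0, beta_white, beta_male, odds_ratio_white, odds_ratio_male)
```

The printed values should agree with the SAS estimates and odds ratios to about three decimal places.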
Suppose we want to compare the odds of having a part-time job for black males versus white females:
log(odds) black males = β0 + β1(0) + β2(1)
log(odds) white females = β0 + β1(1) + β2(0)
log(OR) = β0 + β2 − [β0 + β1] = β2 − β1
log(OR) = 0.6478 − (−1.3135) = 1.9613
OR = exp(1.9613) = 7.11
The odds of having a part-time job for black males are 7.11 times those for white females.
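The same comparison can be computed for any pair of groups by differencing the fitted log odds. This is a small Python sketch using the coefficient estimates quoted in these slides; the function name log_odds is hypothetical.

```python
import math

# Estimates taken from the PROC LOGISTIC output quoted in the slides
b0, b_white, b_male = -0.4555, -1.3135, 0.6478

def log_odds(white, male):
    """Fitted log odds of a part-time job for a given race/gender coding."""
    return b0 + b_white * white + b_male * male

# Black males (white=0, male=1) versus white females (white=1, male=0)
log_or = log_odds(white=0, male=1) - log_odds(white=1, male=0)
odds_ratio = math.exp(log_or)
print(f"log(OR) = {log_or:.4f}, OR = {odds_ratio:.2f}")  # log(OR) = 1.9613, OR = 7.11
```

The intercept cancels in the difference, so only the race and gender coefficients matter for the odds ratio.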
Hosmer and Lemeshow Goodness-of-Fit Test
Interpreting the Hosmer-Lemeshow Goodness-of-Fit Test
Hypotheses:
H0: the model is a good fit
Ha: the model is NOT a good fit
With this test, we want to FAIL to reject the null hypothesis, because that means our model is a good fit (this is different from most of the hypothesis testing you have seen).
Look for a p-value > 0.10 in the H-L GOF test. This indicates the model is a good fit.
In this case, the p-value = 0.2419, so we do NOT reject the null hypothesis, and we conclude the model is a good fit.
Illustration: Crab Data
• Response: number of satellites (modeled here as Y = 1 if the crab has at least one satellite)
• Predictors: shell width and shell color
• Color has five categories: light, medium light, medium, medium dark, dark (the older the crab, the darker its color)
Note: the data contain no crabs with color “light”.
Regression model
The crab color is dark (category 4) when c1 = c2 = c3 = 0.
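The model equation itself does not appear in the text. A form consistent with the indicator coding c1, c2, c3 and with the later hypothesis β1 = β2 = β3 = 0 is sketched below. The symbols α and β4, and the assignment of c1, c2, c3 to medium light, medium, and medium dark respectively, are assumptions.

```latex
\[
\mathrm{logit}\,[P(Y = 1)] = \alpha + \beta_1 c_1 + \beta_2 c_2 + \beta_3 c_3 + \beta_4 x
\]
% x   : shell width in cm
% c_1 : 1 for medium-light crabs, 0 otherwise (assumed coding)
% c_2 : 1 for medium crabs, 0 otherwise (assumed coding)
% c_3 : 1 for medium-dark crabs, 0 otherwise (assumed coding)
% dark crabs are the baseline, c_1 = c_2 = c_3 = 0
```

Under this coding, α is the intercept for dark crabs and α + β1 is the intercept for medium-light crabs.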
SAS Output
• For each color, a 1 cm increase in width has a multiplicative effect of exp(0.468) = 1.60 on the odds that Y = 1.
• A dark crab of average width (26.3 cm) has estimated probability exp[−12.715 + 0.468(26.3)]/{1 + exp[−12.715 + 0.468(26.3)]} = 0.399.
• A medium-light crab of average width has estimated probability exp[−11.385 + 0.468(26.3)]/{1 + exp[−11.385 + 0.468(26.3)]} = 0.715.
• The difference in color parameter estimates between medium-light crabs and dark crabs is 12.715 − 11.385 = 1.330.
• At any given width, the estimated odds that a medium-light crab has a satellite are exp(1.330) = 3.8 times the estimated odds for a dark crab.
• Using the probabilities at width 26.3, the odds equal 0.399/0.601 = 0.66 for a dark crab and 0.715/0.285 = 2.51 for a medium-light crab, for which 2.51/0.66 = 3.8.
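The arithmetic in these bullets can be checked with a short Python sketch. The intercepts and width coefficient are the rounded values quoted above, so the probabilities may differ from the slide values in the third decimal place; the function name prob_satellite is hypothetical.

```python
import math

WIDTH_COEF = 0.468                      # effect of a 1 cm increase in width
INTERCEPT = {"dark": -12.715,           # intercepts quoted for each color
             "medium light": -11.385}
MEAN_WIDTH = 26.3                       # average width in cm

def prob_satellite(color, width):
    """Estimated P(Y = 1) from the fitted logit for a given color and width."""
    eta = INTERCEPT[color] + WIDTH_COEF * width
    return math.exp(eta) / (1.0 + math.exp(eta))

p_dark = prob_satellite("dark", MEAN_WIDTH)
p_ml = prob_satellite("medium light", MEAN_WIDTH)

# Odds ratio computed two ways: from the probabilities and from the coefficients
ratio_from_probs = (p_ml / (1.0 - p_ml)) / (p_dark / (1.0 - p_dark))
ratio_from_coefs = math.exp(INTERCEPT["medium light"] - INTERCEPT["dark"])

print(f"dark: {p_dark:.3f}, medium light: {p_ml:.3f}")
print(f"odds ratio: {ratio_from_probs:.2f} (probs), {ratio_from_coefs:.2f} (coefs)")
```

Both routes give the same odds ratio because the width term cancels when the two logits are differenced.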
Parallel Curves: No Interaction
Are certain terms needed in a model?
• To test this, we can compare the maximized log-likelihood values for that model and the simpler model without those terms.
To test whether color contributes to model (4.11)
• H0: β1 = β2 = β3 = 0 (controlling for width, the probability of a satellite is independent of color)
• Test statistic: −2(L0 − L1) = 7.0, approximately chi-squared with df = 3
• P-value = 0.07
• Conclusion: color has an effect on the probability of having a satellite, although with P = 0.07 the evidence is modest
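The P-value for this likelihood-ratio statistic can be reproduced without a statistics package, because the chi-squared upper tail with 3 degrees of freedom has a closed form. The sketch below is standard-library Python only; the function name chi2_sf_df3 is hypothetical and the formula applies to df = 3 only.

```python
import math

def chi2_sf_df3(x):
    """Upper-tail probability P(X > x) for a chi-squared variable with 3 df."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))

lr_statistic = 7.0                 # -2(L0 - L1) from the slide
p_value = chi2_sf_df3(lr_statistic)
print(f"P-value = {p_value:.2f}")  # P-value = 0.07
```

For other degrees of freedom a general chi-squared routine would be needed.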
ORDINAL COLOR???