ANALYSIS OF VARIANCE

Given a continuous variable X that is normally distributed: $X \in N(\mu, \sigma)$.
Given, in addition, the discrete variables $Y_1, Y_2, \dots, Y_k$ (factors).

$H_0$: $Y_1$ has no effect on X.

$Q_{total} = Q_1 + Q_2 + \dots + Q_k + Q_{12} + Q_{13} + \dots + Q_{error}$

• $Q_{total} = \sum_i (x_i - \bar{x})^2$: the total sum of squares of the sample
• $Q_1$: the part explained by $Y_1$
• $Q_{12}$: the part belonging to the interaction of the first two factors
• $Q_{error}$: the part caused by random error
One-way classification

X: the employee's salary
Y: the employee's position (clerk, security guard, manager)

$H_0$: the position has no effect on the salary.

The three samples, indexed t (clerk), v (security guard) and m (manager):
$\{x_1^{(t)}, x_2^{(t)}, \dots, x_{n_t}^{(t)}\}$, $\{x_1^{(v)}, x_2^{(v)}, \dots, x_{n_v}^{(v)}\}$, $\{x_1^{(m)}, x_2^{(m)}, \dots, x_{n_m}^{(m)}\}$
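One-way classification

Group means:
$\bar{x}^{(t)} = \frac{1}{n_t} \sum_{j=1}^{n_t} x_j^{(t)}$
$\bar{x}^{(v)} = \frac{1}{n_v} \sum_{j=1}^{n_v} x_j^{(v)}$
$\bar{x}^{(m)} = \frac{1}{n_m} \sum_{j=1}^{n_m} x_j^{(m)}$

Sums of squares ($\bar{x}$ is the mean of the whole sample):
$Q_{total} = \sum_{j=1}^{n_t} (x_j^{(t)} - \bar{x})^2 + \sum_{j=1}^{n_v} (x_j^{(v)} - \bar{x})^2 + \sum_{j=1}^{n_m} (x_j^{(m)} - \bar{x})^2$
$Q_k = n_t (\bar{x}^{(t)} - \bar{x})^2 + n_v (\bar{x}^{(v)} - \bar{x})^2 + n_m (\bar{x}^{(m)} - \bar{x})^2$ (between groups)
$Q_b = \sum_{j=1}^{n_t} (x_j^{(t)} - \bar{x}^{(t)})^2 + \sum_{j=1}^{n_v} (x_j^{(v)} - \bar{x}^{(v)})^2 + \sum_{j=1}^{n_m} (x_j^{(m)} - \bar{x}^{(m)})^2$ (within groups)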
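The three sums of squares can be computed directly from their definitions. The sketch below uses small made-up salary samples (the numbers are hypothetical, chosen only for illustration) and checks that the total sum of squares splits into a between-group and a within-group part.

```python
# Hypothetical salary samples for the three positions (illustrative numbers only).
groups = {
    "t": [30.0, 32.0, 31.0, 29.0],          # clerks
    "v": [28.0, 27.0, 30.0],                # security guards
    "m": [55.0, 60.0, 58.0, 62.0, 57.0],    # managers
}


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


all_values = [x for sample in groups.values() for x in sample]
grand_mean = mean(all_values)

# Q_total: deviations of every observation from the grand mean.
q_total = sum((x - grand_mean) ** 2 for x in all_values)

# Q_k (between groups): group size times squared deviation of the group mean.
q_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in groups.values())

# Q_b (within groups): deviations of observations from their own group mean.
q_within = sum((x - mean(s)) ** 2 for s in groups.values() for x in s)

print("Q_total     =", round(q_total, 6))
print("Q_k + Q_b   =", round(q_between + q_within, 6))
```

The two printed values agree up to floating-point rounding, whatever samples are plugged in.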
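One-way classification

$Q_{total} = Q_k + Q_b$, with $n = n_t + n_v + n_m$.

If $H_0$ holds:
$F = \frac{Q_k / (3 - 1)}{Q_b / (n - 3)}$ follows an F distribution with (2, n-3) degrees of freedom.

If $H_1$ holds, the difference of two group means (here manager and clerk) lies in the interval
$\bar{x}^{(m)} - \bar{x}^{(t)} \pm t_\varepsilon \cdot \sqrt{\frac{Q_b}{n - 3}} \cdot \sqrt{\frac{n_m + n_t}{n_m \cdot n_t}}$
where $t_\varepsilon$ is the critical value of the Student distribution with n-3 degrees of freedom.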
Descriptives: Miles per Gallon

            N     Mean    Std. Deviation   Std. Error   95% CI Lower   95% CI Upper   Minimum   Maximum
American    248   20,13   6,377            ,405         19,33          20,93          10        39
European     70   27,89   6,724            ,804         26,29          29,49          16        44
Japanese     79   30,45   6,090            ,685         29,09          31,81          18        47
Total       397   23,55   7,792            ,391         22,78          24,32          10        47

(95% CI: 95% Confidence Interval for Mean)
Test of Homogeneity of Variances: Miles per Gallon

Levene Statistic   df1   df2   Sig.
,106               2     394   ,900
ANOVA: Miles per Gallon

                 Sum of Squares   df    Mean Square   F        Sig.
Between Groups   7984,957         2     3992,479      97,969   ,000
Within Groups    16056,415        394   40,752
Total            24041,372        396
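The F value in the ANOVA table follows from the two sums of squares and their degrees of freedom. A minimal check, using the printed values with decimal points instead of decimal commas (variable names are illustrative):

```python
# Values copied from the ANOVA table for Miles per Gallon.
q_between = 7984.957    # Between Groups sum of squares (Q_k)
q_within = 16056.415    # Within Groups sum of squares (Q_b)
n_groups = 3            # American, European, Japanese
n = 397                 # total number of cars

df_between = n_groups - 1      # 2
df_within = n - n_groups       # 394

ms_between = q_between / df_between
ms_within = q_within / df_within
f_stat = ms_between / ms_within

print("Mean squares:", round(ms_between, 3), round(ms_within, 3))
print("F =", round(f_stat, 3))
```

Up to rounding, this reproduces the Mean Square and F columns of the table.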
Report: Miles per Gallon

Country of Origin   Mean    N     Std. Deviation
American            20,13   248   6,377
European            27,89    70   6,724
Japanese            30,45    79   6,090
Total               23,55   397   7,792
Multiple Comparisons (LSD). Dependent Variable: Miles per Gallon

(I) Country   (J) Country   Mean Difference (I-J)   Std. Error   Sig.   95% CI Lower   95% CI Upper
American      European      -7,763*                 ,864         ,000   -9,46          -6,06
American      Japanese      -10,322*                ,825         ,000   -11,94         -8,70
European      American      7,763*                  ,864         ,000   6,06           9,46
European      Japanese      -2,559*                 1,048        ,015   -4,62          -,50
Japanese      American      10,322*                 ,825         ,000   8,70           11,94
Japanese      European      2,559*                  1,048        ,015   ,50            4,62

*. The mean difference is significant at the .05 level.
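The standard errors of the pairwise differences can be traced back to the within-group mean square and the group sizes, using the factor $\sqrt{Q_b/(n-3)} \cdot \sqrt{(n_a + n_b)/(n_a n_b)}$ from the one-way classification. The sketch below assumes that this is how the Std. Error column was obtained; the helper name is hypothetical.

```python
import math

# Within-group mean square Q_b / (n - 3), taken from the ANOVA table.
ms_within = 40.752

# Group sizes from the Descriptives table.
sizes = {"American": 248, "European": 70, "Japanese": 79}


def lsd_std_error(n_a, n_b):
    """Standard error of the difference of two group means."""
    return math.sqrt(ms_within * (n_a + n_b) / (n_a * n_b))


se_am_eu = lsd_std_error(sizes["American"], sizes["European"])
se_am_jp = lsd_std_error(sizes["American"], sizes["Japanese"])
se_eu_jp = lsd_std_error(sizes["European"], sizes["Japanese"])

print(round(se_am_eu, 3), round(se_am_jp, 3), round(se_eu_jp, 3))
```

The three values agree with the Std. Error column to three decimals.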
REGRESSION ANALYSIS

$Y$: dependent variable
$X_1, X_2, \dots, X_p$: independent variables

$Y \approx f(X_1, X_2, \dots, X_p)$, $f \in F$

Estimate: the function $f^* \in F$ for which
$E\big(Y - f^*(X_1, X_2, \dots, X_p)\big)^2 = \min_{f \in F} E\big(Y - f(X_1, X_2, \dots, X_p)\big)^2$

The method of least squares:
$h(a, b, c, \dots) = \sum_{i=1}^{n} \big(Y_i - f(X_{1i}, X_{2i}, \dots, X_{pi}, a, b, c, \dots)\big)^2 \to \min_{a, b, c, \dots}$
• Linear regression
  $f(X) = B_0 + B_1 X$
• Multiple linear regression
  $f(X_1, X_2, \dots, X_p) = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_p X_p$
• Polynomial regression
  $f(X) = B_0 + B_1 X + B_2 X^2 + \dots + B_p X^p$, that is, multiple linear regression with $X_1 = X, X_2 = X^2, \dots, X_p = X^p$
• Two-parameter regression (reducible to linear), e.g.
  $Y = f(X) = B_0 \cdot e^{B_1 X} \Rightarrow \ln Y = B_1 X + \ln B_0$
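The reduction to a linear problem can be sketched in a few lines: take the logarithm of Y, fit a straight line by least squares, and transform the intercept back. The data below are synthetic and noise-free (generated with $B_0 = 2$, $B_1 = 0.5$), and the helper `fit_line` is a hypothetical name introduced only for this illustration.

```python
import math


def fit_line(xs, ys):
    """Least-squares line ys ~ intercept + slope * xs (closed form)."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    s_xy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    s_xx = sum((x - x_mean) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Synthetic data following Y = 2 * exp(0.5 * X) exactly.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.5 * x) for x in xs]

# Fit ln Y = ln B0 + B1 * X, then undo the logarithm for B0.
intercept, slope = fit_line(xs, [math.log(y) for y in ys])
b0 = math.exp(intercept)
b1 = slope

print("B0 =", round(b0, 6), "B1 =", round(b1, 6))
```

With noisy data the same steps apply, but the result minimizes the squared error of ln Y rather than of Y itself.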
Two-parameter regression (reducible to linear): curve types

quadratic     $y = b_0 + b_1 x + b_2 x^2$
cubic         $y = b_0 + b_1 x + b_2 x^2 + b_3 x^3$
compound      $y = b_0 \cdot b_1^{x}$
growth        $y = \exp(b_0 + b_1 x)$
logarithmic   $y = b_0 + b_1 \ln x$
S             $y = \exp(b_0 + b_1 / x)$
exponential   $y = b_0 \cdot \exp(b_1 x)$
inverse       $y = b_0 + b_1 / x$
power         $y = b_0 \cdot x^{b_1}$
logistic      $y = \frac{1}{1/u + b_0 \cdot b_1^{x}}$
• Nonlinear regression

asymptotic I           $f(X) = B_1 + B_2 \exp(B_3 X)$
asymptotic II          $f(X) = B_1 - B_2 \cdot B_3^{X}$
density                $f(X) = (B_1 + B_2 X)^{-1/B_3}$
Gauss                  $f(X) = B_1 \cdot (1 - B_3 \exp(B_2 X^2))$
Gompertz               $f(X) = B_1 \cdot \exp(-B_2 \exp(-B_3 X))$
Johnson-Schumacher     $f(X) = B_1 \cdot \exp(-B_2 / (X + B_3))$
log-modified           $f(X) = (B_1 + B_3 X)^{B_2}$
log-logistic           $f(X) = B_1 - \ln(1 + B_2 \exp(-B_3 X))$
Mitscherlich           $f(X) = B_1 + B_2 \exp(-B_3 X)$
Michaelis-Menten       $f(X) = B_1 X / (X + B_2)$
Morgan-Mercer-Flodin   $f(X) = (B_1 B_2 + B_3 X^{B_4}) / (B_2 + X^{B_4})$
Pearl-Reed             $f(X) = B_1 / (1 + B_2 \exp(-B_3 X + B_4 X^2 + B_5 X^3))$
ratio of cubics        $f(X) = (B_1 + B_2 X + B_3 X^2 + B_4 X^3) / (B_5 X^3)$
ratio of quadratics    $f(X) = (B_1 + B_2 X + B_3 X^2) / (B_4 X^2)$
Richards               $f(X) = B_1 / (1 + B_3 \exp(B_2 X))^{1/B_4}$
Verhulst               $f(X) = B_1 / (1 + B_3 \exp(B_2 X))$
Von Bertalanffy        $f(X) = (B_1^{1 - B_4} - B_2 \exp(-B_3 X))^{1/(1 - B_4)}$
Weibull                $f(X) = B_1 - B_2 \exp(-B_3 X^{B_4})$
yield density          $f(X) = 1 / (B_1 + B_2 X + B_3 X^2)$
• Piecewise linear regression
• Polygonal regression
• Multiple linear regression with a categorical variable
• Logistic regression

Y is dichotomous:
$Y = 1$ if event A occurs, $Y = 0$ if event A does not occur.

Examples of the event A:
• The voter will vote
• The patient will have a heart attack
• The deal will be concluded

$X_1, X_2, \dots, X_p$: independent variables measured at ordinal level. For the three examples above:
• number of elections attended so far, age, education, income
• cigarettes per day, drinks per day, age, stress
• price, quantity, market turnover, inventory
• Logistic regression

$P(Y = 1) = P(A) \approx \frac{1}{1 + e^{-Z}}$, where $Z = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_p X_p$

$ODDS = \frac{P(A)}{1 - P(A)} \approx e^{Z} \Rightarrow \log(ODDS) = Z = B_0 + B_1 X_1 + B_2 X_2 + \dots + B_p X_p$
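The link between the probability, the odds and Z can be checked numerically. The sketch below assumes the logistic form $P(A) = 1/(1 + e^{-Z})$, which is the form consistent with $ODDS = e^{Z}$; the function names and the value of Z are illustrative only.

```python
import math


def probability(z):
    """Logistic function: P(A) = 1 / (1 + e^(-Z))."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)


z = 0.8                      # an arbitrary value of the linear predictor Z
p = probability(z)

print("P(A)      =", round(p, 4))
print("ODDS      =", round(odds(p), 4), "  e^Z =", round(math.exp(z), 4))
print("log(ODDS) =", round(math.log(odds(p)), 4))
```

Raising one predictor $X_k$ by one unit adds $B_k$ to Z and therefore multiplies the odds by $e^{B_k}$.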
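• Logistic regression: the principle of maximum likelihood

$L(e_1, e_2, \dots, e_n) = P(Y_1 = e_1, Y_2 = e_2, \dots, Y_n = e_n) = P(Y_1 = e_1) \cdot P(Y_2 = e_2) \cdots P(Y_n = e_n)$

When every $e_i = 1$:
$L \approx \frac{1}{1 + e^{-Z_1}} \cdot \frac{1}{1 + e^{-Z_2}} \cdots \frac{1}{1 + e^{-Z_n}}$

$\ln L(e_1, e_2, \dots, e_n) = \sum_{i=1}^{n} \ln \frac{1}{1 + \exp\big(-(B_0 + B_1 X_{1i} + B_2 X_{2i} + \dots + B_p X_{pi})\big)}$

For an observation with $e_i = 0$ the corresponding factor is $1 - \frac{1}{1 + e^{-Z_i}}$.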
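To make the likelihood concrete, the sketch below evaluates the log-likelihood for a tiny made-up data set with one predictor. It uses the logistic probability $1/(1 + e^{-Z_i})$ for observations with outcome 1 and its complement for observations with outcome 0. The data, the coefficient values and the function name are hypothetical; a real fit would search for the coefficients that maximize this quantity.

```python
import math


def log_likelihood(coeffs, rows, outcomes):
    """ln L for coefficients [B0, B1, ..., Bp], predictor rows and 0/1 outcomes."""
    total = 0.0
    for x, e in zip(rows, outcomes):
        z = coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], x))
        p = 1.0 / (1.0 + math.exp(-z))          # P(Y_i = 1)
        total += math.log(p) if e == 1 else math.log(1.0 - p)
    return total


# Hypothetical data: one predictor, four observations.
rows = [[1.0], [2.0], [3.0], [4.0]]
outcomes = [0, 0, 1, 1]

flat = log_likelihood([0.0, 0.0], rows, outcomes)      # every P(Y_i = 1) is 0.5
better = log_likelihood([-5.0, 2.0], rows, outcomes)   # Z rises with X

print("ln L, all coefficients zero:", round(flat, 4))
print("ln L, B0 = -5, B1 = 2:      ", round(better, 4))
```

The second coefficient vector gives a higher log-likelihood because it assigns high probability to the outcomes that were actually observed.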
• Linear regression

The linear relationship has a distinguished role:
(1) it is the simplest and the most common;
(2) for a two-dimensional normal distribution the relationship cannot be anything else: it is either linear or there is none at all.
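• Linear regression

The total sum of squares $Q$, the residual sum of squares $Q_{res}$ and the regression sum of squares $Q_{reg}$:

$Q = Q_{res} + Q_{reg}$

[Figure: the regression line $\hat{y}_i = B_0 + B_1 x_i$ through the point of means $(\bar{x}, \bar{y})$, with an observed point $(x_i, y_i)$ and the fitted point $(x_i, \hat{y}_i)$; the segments are labeled "res" and "reg".]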
Linear regression

Decomposition of the total sum of squares: $Q = Q_{res} + Q_{reg}$

• $Q_{res}$ has $f_{res} = n - 2$ degrees of freedom: the sum has n terms, but two relationships hold among them.
• $Q_{reg}$ has only $f_{reg} = 1$ degree of freedom, because the mean is constant.

If there is no linear regression, the ratio of the variances follows an F distribution with (1, n-2) degrees of freedom:

$F = \frac{s_{reg}^2}{s_{res}^2} = \frac{Q_{reg} / f_{reg}}{Q_{res} / f_{res}} = \frac{Q_{reg} \, (n - 2)}{Q_{res}}$
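The decomposition and the F ratio can be reproduced on a small example. The sketch below fits a straight line by least squares to made-up data (the numbers are hypothetical) and then computes the three sums of squares and the F value with 1 and n-2 degrees of freedom.

```python
# Hypothetical observations (x_i, y_i), for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
n = len(xs)

x_mean = sum(xs) / n
y_mean = sum(ys) / n

# Least-squares estimates of B1 (slope) and B0 (intercept).
b1 = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys)) / sum(
    (x - x_mean) ** 2 for x in xs
)
b0 = y_mean - b1 * x_mean
fitted = [b0 + b1 * x for x in xs]

# Total, residual and regression sums of squares.
q = sum((y - y_mean) ** 2 for y in ys)
q_res = sum((y - f) ** 2 for y, f in zip(ys, fitted))
q_reg = sum((f - y_mean) ** 2 for f in fitted)

# F ratio with f_reg = 1 and f_res = n - 2 degrees of freedom.
f_stat = (q_reg / 1) / (q_res / (n - 2))

print("Q =", round(q, 6), " Q_res + Q_reg =", round(q_res + q_reg, 6))
print("F =", round(f_stat, 2))
```

The identity $Q = Q_{res} + Q_{reg}$ holds only for the least-squares line; for any other line a cross term remains.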
Linear regression

The principle of the method of least squares:

[Figure: five observed points $(x_1, y_1), \dots, (x_5, y_5)$ around the line $\hat{y}_i = B_0 + B_1 x_i$, with the residuals $e_1, \dots, e_5$ marked.]

Searching for a regression relationship between variables
Model Summary

Model   R (Country of Origin = American, selected)   R Square   Adjusted R Square   Std. Error of the Estimate
1       ,920 (a)                                     ,846       ,845                38,866

a. Predictors: (Constant), Vehicle Weight (lbs.)

ANOVA (b, c)

Model 1      Sum of Squares   df    Mean Square   F          Sig.
Regression   2079737          1     2079737,024   1376,806   ,000 (a)
Residual     379148,4         251   1510,552
Total        2458885          252

a. Predictors: (Constant), Vehicle Weight (lbs.)
b. Dependent Variable: Engine Displacement (cu. inches)
c. Selecting only cases for which Country of Origin = American

Coefficients (a, b)

Model 1                 B          Std. Error   Beta   t         Sig.
(Constant)              -140,192   10,736              -13,058   ,000
Vehicle Weight (lbs.)   ,115       ,003         ,920   37,105    ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Engine Displacement (cu. inches)
b. Selecting only cases for which Country of Origin = American
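The figures in this output are tied together by the formulas for simple linear regression: $R^2 = Q_{reg}/Q$, $F = Q_{reg}(n-2)/Q_{res}$, and, with a single predictor, F equals the square of the slope's t value. A quick check with the printed values (decimal commas written as points; variable names are illustrative):

```python
# Values copied from the ANOVA and Coefficients tables (American cars).
q_reg = 2079737.024     # regression sum of squares (from the Mean Square, df = 1)
q_res = 379148.4        # residual sum of squares
df_res = 251            # n - 2
t_slope = 37.105        # t value of Vehicle Weight

q_total = q_reg + q_res
r_squared = q_reg / q_total
f_stat = q_reg / (q_res / df_res)

print("R Square =", round(r_squared, 3))
print("F        =", round(f_stat, 3))
print("t^2      =", round(t_slope ** 2, 3))
```

F and t² differ only in the last digits, because t is printed with three decimals.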
Model Summary

Model   R (Country of Origin = European, selected)   R Square   Adjusted R Square   Std. Error of the Estimate
1       ,895 (a)                                     ,801       ,798                10,045

a. Predictors: (Constant), Vehicle Weight (lbs.)

ANOVA (b, c)

Model 1      Sum of Squares   df   Mean Square   F         Sig.
Regression   28872,390        1    28872,390     286,154   ,000 (a)
Residual     7163,774         71   100,898
Total        36036,164        72

a. Predictors: (Constant), Vehicle Weight (lbs.)
b. Dependent Variable: Engine Displacement (cu. inches)
c. Selecting only cases for which Country of Origin = European

Coefficients (a, b)

Model 1                 B        Std. Error   Beta   t        Sig.
(Constant)              10,275   5,980               1,718    ,090
Vehicle Weight (lbs.)   ,041     ,002         ,895   16,916   ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Engine Displacement (cu. inches)
b. Selecting only cases for which Country of Origin = European
Model Summary

Model   R (Country of Origin = Japanese, selected)   R Square   Adjusted R Square   Std. Error of the Estimate
1       ,841 (a)                                     ,708       ,704                12,585

a. Predictors: (Constant), Vehicle Weight (lbs.)

ANOVA (b, c)

Model 1      Sum of Squares   df   Mean Square   F         Sig.
Regression   29570,727        1    29570,727     186,703   ,000 (a)
Residual     12195,577        77   158,384
Total        41766,304        78

a. Predictors: (Constant), Vehicle Weight (lbs.)
b. Dependent Variable: Engine Displacement (cu. inches)
c. Selecting only cases for which Country of Origin = Japanese

Coefficients (a, b)

Model 1                 B         Std. Error   Beta   t        Sig.
(Constant)              -32,235   9,977               -3,231   ,002
Vehicle Weight (lbs.)   ,061      ,004         ,841   13,664   ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Engine Displacement (cu. inches)
b. Selecting only cases for which Country of Origin = Japanese
Variables Entered/Removed (b, c)

Model   Variables Entered                                                      Variables Removed   Method
1       Engine Displacement (cu. inches), Time to Accelerate from 0 to 60     .                   Enter
        mph (sec), Horsepower, Vehicle Weight (lbs.) (a)

a. All requested variables entered.
b. Dependent Variable: Miles per Gallon
c. Models are based only on cases for which Country of Origin = American

Model Summary

Model   R (Country of Origin = American, selected)   R Square   Adjusted R Square   Std. Error of the Estimate
1       ,865 (a)                                     ,748       ,744                3,244

a. Predictors: (Constant), Engine Displacement (cu. inches), Time to Accelerate from 0 to 60 mph (sec), Horsepower, Vehicle Weight (lbs.)
ANOVA (b, c)

Model 1      Sum of Squares   df    Mean Square   F         Sig.
Regression   7482,899         4     1870,725      177,730   ,000 (a)
Residual     2515,631         239   10,526
Total        9998,529         243

a. Predictors: (Constant), Engine Displacement (cu. inches), Time to Accelerate from 0 to 60 mph (sec), Horsepower, Vehicle Weight (lbs.)
b. Dependent Variable: Miles per Gallon
c. Selecting only cases for which Country of Origin = American

Coefficients (a, b)

Model 1                                     B        Std. Error   Beta    t        Sig.
(Constant)                                  46,620   2,498                18,661   ,000
Horsepower                                  -,019    ,015         -,117   -1,259   ,209
Vehicle Weight (lbs.)                       -,003    ,001         -,342   -3,642   ,000
Time to Accelerate from 0 to 60 mph (sec)   -,429    ,129         -,183   -3,315   ,001
Engine Displacement (cu. inches)            -,034    ,007         -,529   -4,802   ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Miles per Gallon
b. Selecting only cases for which Country of Origin = American
Variables Entered/Removed (a, b)

Model   Variables Entered                           Variables Removed   Method
1       Vehicle Weight (lbs.)                       .                   Stepwise
2       Engine Displacement (cu. inches)            .                   Stepwise
3       Time to Accelerate from 0 to 60 mph (sec)   .                   Stepwise

Stepwise criteria for all three models: Probability-of-F-to-enter <= ,050, Probability-of-F-to-remove >= ,100.
a. Dependent Variable: Miles per Gallon
b. Models are based only on cases for which Country of Origin = American
Model Summary

Model   R (Country of Origin = American, selected)   R Square   Adjusted R Square   Std. Error of the Estimate
1       ,845 (a)                                     ,713       ,712                3,442
2       ,858 (b)                                     ,736       ,734                3,308
3       ,864 (c)                                     ,747       ,744                3,248

a. Predictors: (Constant), Vehicle Weight (lbs.)
b. Predictors: (Constant), Vehicle Weight (lbs.), Engine Displacement (cu. inches)
c. Predictors: (Constant), Vehicle Weight (lbs.), Engine Displacement (cu. inches), Time to Accelerate from 0 to 60 mph (sec)
ANOVA (d, e)

Model                Sum of Squares   df    Mean Square   F         Sig.
1       Regression   7131,610         1     7131,610      601,987   ,000 (a)
        Residual     2866,919         242   11,847
        Total        9998,529         243
2       Regression   7360,992         2     3680,496      336,298   ,000 (b)
        Residual     2637,538         241   10,944
        Total        9998,529         243
3       Regression   7466,203         3     2488,734      235,869   ,000 (c)
        Residual     2532,326         240   10,551
        Total        9998,529         243

a. Predictors: (Constant), Vehicle Weight (lbs.)
b. Predictors: (Constant), Vehicle Weight (lbs.), Engine Displacement (cu. inches)
c. Predictors: (Constant), Vehicle Weight (lbs.), Engine Displacement (cu. inches), Time to Accelerate from 0 to 60 mph (sec)
d. Dependent Variable: Miles per Gallon
e. Selecting only cases for which Country of Origin = American
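The R Square, Adjusted R Square and F values of the three stepwise models can be recomputed from the regression sums of squares. The sketch assumes the usual adjustment $1 - (1 - R^2)(n - 1)/(n - p - 1)$ with p predictors, and n = 244 (total df 243 plus one); the function name is hypothetical.

```python
n = 244                 # number of cases: total df (243) + 1
q_total = 9998.529      # total sum of squares, identical for all three models

# (number of predictors p, regression sum of squares) for models 1 to 3.
models = [(1, 7131.610), (2, 7360.992), (3, 7466.203)]


def summarize(p, q_reg):
    """Return (R Square, Adjusted R Square, F) for a model with p predictors."""
    q_res = q_total - q_reg
    r2 = q_reg / q_total
    adj_r2 = 1 - (1 - r2) * (n - 1) / (n - p - 1)
    f_stat = (q_reg / p) / (q_res / (n - p - 1))
    return r2, adj_r2, f_stat


results = [summarize(p, q_reg) for p, q_reg in models]
for (p, _), (r2, adj_r2, f_stat) in zip(models, results):
    print(p, round(r2, 3), round(adj_r2, 3), round(f_stat, 3))
```

Each added predictor raises R Square, but by less each time, which is why the adjusted value and the standard error of the estimate improve only slightly from model 2 to model 3.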
Coefficients (a, b)

Model                                               B        Std. Error   Beta    t         Sig.
1       (Constant)                                  43,104   ,964                 44,715    ,000
        Vehicle Weight (lbs.)                       -,007    ,000         -,845   -24,535   ,000
2       (Constant)                                  39,642   1,196                33,148    ,000
        Vehicle Weight (lbs.)                       -,004    ,001         -,490   -5,811    ,000
        Engine Displacement (cu. inches)            -,025    ,005         -,386   -4,578    ,000
3       (Constant)                                  44,713   1,989                22,476    ,000
        Vehicle Weight (lbs.)                       -,003    ,001         -,377   -4,176    ,000
        Engine Displacement (cu. inches)            -,038    ,007         -,580   -5,626    ,000
        Time to Accelerate from 0 to 60 mph (sec)   -,336    ,107         -,143   -3,158    ,002

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
a. Dependent Variable: Miles per Gallon
b. Selecting only cases for which Country of Origin = American
Excluded Variables (d)

Model                                               Beta In     t        Sig.   Partial Correlation   Tolerance
1       Horsepower                                  -,140 (a)   -2,243   ,026   -,143                 ,301
        Time to Accelerate from 0 to 60 mph (sec)   ,009 (a)    ,226     ,822   ,015                  ,794
        Engine Displacement (cu. inches)            -,386 (a)   -4,578   ,000   -,283                 ,154
2       Horsepower                                  ,059 (b)    ,752     ,453   ,049                  ,180
        Time to Accelerate from 0 to 60 mph (sec)   -,143 (b)   -3,158   ,002   -,200                 ,513
3       Horsepower                                  -,117 (c)   -1,259   ,209   -,081                 ,122

Tolerance is the collinearity statistic.
a. Predictors in the Model: (Constant), Vehicle Weight (lbs.)
b. Predictors in the Model: (Constant), Vehicle Weight (lbs.), Engine Displacement (cu. inches)
c. Predictors in the Model: (Constant), Vehicle Weight (lbs.), Engine Displacement (cu. inches), Time to Accelerate from 0 to 60 mph (sec)
d. Dependent Variable: Miles per Gallon
Model Summary and Parameter Estimates. Dependent Variable: Miles per Gallon

Equation      R Square   F         df1   df2   Sig.   Constant   b1
Linear        ,595       572,709   1     390   ,000   39,855     -,157
Logarithmic   ,658       751,882   1     390   ,000   108,452    -18,536
Inverse       ,659       754,263   1     390   ,000   3,963      1808,017
Power         ,705       933,576   1     390   ,000   1023,877   -,836
Exponential   ,669       788,834   1     390   ,000   47,300     -,007
Logistic      ,669       788,834   1     390   ,000   ,021       1,007

The independent variable is Horsepower.
Model Summary and Parameter Estimates. Dependent Variable: Miles per Gallon

Equation   R Square   F         df1   df2   Sig.   Constant   b1
Power      ,705       933,576   1     390   ,000   1023,877   -,836

The independent variable is Horsepower.
ANOVA

             Sum of Squares   df    Mean Square   F         Sig.
Regression   31,889           1     31,889        933,576   ,000
Residual     13,321           390   ,034
Total        45,210           391

The independent variable is Horsepower.
Coefficients

                 B          Std. Error   Beta    t         Sig.
ln(Horsepower)   -,836      ,027         -,840   -30,554   ,000
(Constant)       1023,877   128,800              7,949     ,000

B and Std. Error are the unstandardized coefficients; Beta is the standardized coefficient.
The dependent variable is ln(Miles per Gallon).
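The Power row can be read as the curve $y = b_0 \cdot x^{b_1}$ with $b_0 = 1023.877$ and $b_1 = -0.836$, fitted as a straight line between ln(Miles per Gallon) and ln(Horsepower). The sketch below assumes that the printed Constant is $b_0$ itself (already transformed back from the logarithmic scale) and uses it to predict fuel economy; the function name and the horsepower values are illustrative.

```python
import math

# Parameter estimates of the Power model (decimal commas written as points).
b0 = 1023.877    # Constant, assumed to be on the original (not logarithmic) scale
b1 = -0.836      # coefficient of ln(Horsepower)


def predict_mpg(horsepower):
    """Predicted Miles per Gallon from the power curve y = b0 * x^b1."""
    return b0 * horsepower ** b1


# R Square recomputed from the ANOVA table of the log-log fit.
r_squared = 31.889 / 45.210

# Intercept of the straight line ln(y) = ln(b0) + b1 * ln(x).
ln_intercept = math.log(b0)

for hp in (75, 100, 150):
    print(hp, "hp ->", round(predict_mpg(hp), 1), "mpg")
print("R Square =", round(r_squared, 3))
```

Because $b_1$ is negative, the predicted consumption figure falls as horsepower rises, and it falls more slowly at high horsepower than a straight line would.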