ADLN Perpustakaan Universitas Airlangga
COMPARISON BETWEEN GENERALIZED CROSS VALIDATION METHOD AND GENERALIZED MAXIMUM LIKELIHOOD METHOD IN NONPARAMETRIC SPLINE REGRESSION TO ESTIMATE THE LEUCOCYTE OF AVIAN INFLUENZA SUSPECT IN EAST JAVA
SUMMARY
In a nonparametric regression model, the regression curve $f$ is assumed only to be smooth, in the sense that it belongs to the Sobolev space $W_2^m[a,b] = \{\, f : f, f', \ldots, f^{(m-1)} \text{ absolutely continuous and } \int_a^b [f^{(m)}(x)]^2\,dx < \infty \,\}$. The data are expected to determine the form of the estimate themselves, without being influenced by the researcher's subjectivity (Eubank, 1988). The nonparametric regression approach is therefore highly flexible (Khair, 2006).
There are several approaches to obtaining an estimator of the regression curve in nonparametric regression. Among them are the histogram approach (Green and Silverman, 1994), the kernel approach (Hardle, 1990), splines (Wahba, 1990), orthogonal series estimators or Fourier regression (Eubank, 1998), K-nearest neighbours (Hardle, 1990) and wavelet analysis (Antoniadis et al., 1994). According to Khair (2006), a spline approach with a truncated polynomial basis, solved by least squares optimization, can be a better choice. A truncated polynomial spline is the sum of a polynomial function and a truncated function (Sutarsi, 2008). Splines have the advantage of handling data patterns that rise or fall sharply, with the help of knot points, and the resulting curve is relatively smooth (Hardle, 1990). A spline of order $m$ with knots at $k_1, k_2, \ldots, k_K$ is defined as a function of the form
$$f(x) = \sum_{i=0}^{m-1} \alpha_i x^{i} + \sum_{j=1}^{K} \beta_j (x - k_j)_+^{m-1},$$
where $\alpha$ and $\beta$ are parameters and $(u)_+ = \max(u, 0)$.
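In selecting $\lambda_{\text{optimal}}$, the GCV criterion is defined as
$$\mathrm{GCV}(\lambda) = \frac{\mathrm{MSE}(\lambda)}{\left(1 - n^{-1}\sum_{i=1}^{n} a_{ii}\right)^{2}},$$
with
$$\mathrm{MSE}(\lambda) = n^{-1}\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2},$$
where $a_{ii}$ is the $i$-th diagonal element of the hat matrix. $\lambda_{\text{optimal}}$ is chosen as the value that gives the minimum GCV.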
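The GCV selection rule can be sketched in code as follows. This is a minimal illustration, not the thesis implementation: it assumes a linear truncated spline with a single knot, treats the candidate knot as the quantity being selected, and uses synthetic data. The function names `linear_spline_design`, `gcv_score` and `select_knot` are hypothetical.

```python
import numpy as np

def linear_spline_design(x, knot):
    """Columns 1, x, (x - knot)_+ of a one-knot linear truncated spline."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([np.ones_like(x), x, np.maximum(x - knot, 0.0)])

def gcv_score(x, y, knot):
    """GCV = MSE / (1 - trace(A)/n)^2 for the least-squares fit at this knot."""
    y = np.asarray(y, dtype=float)
    n = y.size
    X = linear_spline_design(x, knot)
    A = X @ np.linalg.pinv(X)              # hat matrix: y_hat = A @ y
    mse = np.mean((y - A @ y) ** 2)
    return mse / (1.0 - np.trace(A) / n) ** 2

def select_knot(x, y, candidates):
    """Return the candidate knot with the smallest GCV score."""
    scores = [gcv_score(x, y, k) for k in candidates]
    return candidates[int(np.argmin(scores))]

# Synthetic illustration only (assumed data, not the thesis sample).
rng = np.random.default_rng(0)
x_demo = np.arange(0.0, 16.0)
y_demo = (8004 - 587 * x_demo + 611 * np.maximum(x_demo - 8, 0)
          + rng.normal(0, 5, x_demo.size))
print(select_knot(x_demo, y_demo, [4, 5, 6, 7, 8, 9, 10, 11, 12]))
```

Because every candidate here gives a design matrix of the same rank, the denominator is identical across candidates and the choice is driven by the MSE; with differing numbers of knots the trace term would penalize the larger models.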
vii Tesis
Perbandingan Metode Generalized Maximum Likelihood Spline …
Dewi Noor Hidayati
The GML method, in turn, is obtained from
$$\mathrm{GML}(\lambda) = \frac{y^{T} Q \left(Q^{T} S(\lambda)\, Q\right)^{-1} Q^{T} y}{\left| Q^{T} S(\lambda)\, Q \right|^{-1/(n-m)}}.$$
The value of $\lambda_{\text{optimal}}$ is the one that minimizes $\mathrm{GML}(\lambda)$. The GML method is quite popular and works well for correlated data (Wang, 1998). In the study, which used leukocyte samples from suspected avian influenza cases in East Java, the GCV method with a linear spline produced the smoothest curve compared with the GML method. This conclusion is based on the smallest MSE and the largest $R^2$. The knot point obtained with this method is 8, and the resulting model is
$$\hat{y} = 8004 - 587x + 611(x - 8)_+ .$$
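The reported model, 8004 - 587x + 611(x - 8)_+, can be evaluated directly, as in the short sketch below. The summary does not name the predictor, so `x` is simply whatever predictor the thesis used, and the function name `predicted_leukocytes` is hypothetical.

```python
import numpy as np

def predicted_leukocytes(x, knot=8.0):
    """Evaluate the reported linear truncated spline with one knot."""
    x = np.asarray(x, dtype=float)
    # (x - knot)_+ is zero to the left of the knot and x - knot to the right.
    return 8004.0 - 587.0 * x + 611.0 * np.maximum(x - knot, 0.0)

print(predicted_leukocytes([0, 8, 10]))
```

The truncated term only contributes to the right of the knot, so the fitted curve is continuous at x = 8 but changes slope there.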
COMPARISON BETWEEN GENERALIZED CROSS VALIDATION METHOD AND GENERALIZED MAXIMUM LIKELIHOOD METHOD IN NONPARAMETRIC SPLINE REGRESSION TO ESTIMATE THE LEUCOCYTE OF AVIAN INFLUENZA SUSPECT IN EAST JAVA
SUMMARY
In nonparametric regression, the shape of the regression curve $f$ is assumed only to be smooth, that is, to belong to the Sobolev space $W_2^m[a,b] = \{\, f : f, f', \ldots, f^{(m-1)} \text{ absolutely continuous and } \int_a^b [f^{(m)}(x)]^2\,dx < \infty \,\}$. The data are expected to find their own form of the estimate, without being influenced by the subjectivity of the researcher (Eubank, 1988). Nonparametric regression is therefore highly flexible (Khair, 2006).
There are several approaches to estimating the regression curve in nonparametric regression, such as histogram estimation (Green and Silverman, 1994), kernel estimation (Hardle, 1990), splines (Wahba, 1990), orthogonal series estimators or Fourier regression (Eubank, 1998), K-nearest neighbours (Hardle, 1990) and wavelet analysis (Antoniadis et al., 1994). According to Khair (2006), a spline with a truncated polynomial basis, solved by least squares optimization, can be the better choice. A truncated polynomial spline is the sum of a polynomial function and a truncated function (Sutarsi, 2008). With the help of knot points, splines handle data whose pattern rises or falls sharply, and the resulting curve is relatively smooth (Hardle, 1990). A spline of order $m$ with knots $k_1, k_2, \ldots, k_K$ is defined as a function of the form
$$f(x) = \sum_{i=0}^{m-1} \alpha_i x^{i} + \sum_{j=1}^{K} \beta_j (x - k_j)_+^{m-1},$$
where $\alpha$ and $\beta$ are parameters and $(u)_+ = \max(u, 0)$.
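The truncated polynomial spline defined above, fitted by least squares, can be sketched as follows. This is an illustration under assumptions rather than the thesis code: the names `truncated_power_basis` and `fit_spline` are hypothetical, "order m" is taken to mean polynomial degree m - 1 (so order 2 is a linear spline), and the data are synthetic.

```python
import numpy as np

def truncated_power_basis(x, knots, order=2):
    """Design matrix with columns 1, x, ..., x^(order-1), then
    (x - k_j)_+^(order-1) for each knot k_j."""
    if order < 2:
        raise ValueError("this sketch assumes order >= 2 (linear or higher)")
    x = np.asarray(x, dtype=float)
    poly = [x ** i for i in range(order)]
    trunc = [np.maximum(x - k, 0.0) ** (order - 1) for k in knots]
    return np.column_stack(poly + trunc)

def fit_spline(x, y, knots, order=2):
    """Least-squares estimates of (alpha_0..alpha_{m-1}, beta_1..beta_K)."""
    X = truncated_power_basis(x, knots, order)
    coef, *_ = np.linalg.lstsq(X, np.asarray(y, dtype=float), rcond=None)
    return coef

# Synthetic, noise-free illustration (assumed data, not the thesis sample).
x_demo = np.arange(0.0, 16.0)
y_demo = 8004 - 587 * x_demo + 611 * np.maximum(x_demo - 8, 0)
print(fit_spline(x_demo, y_demo, knots=[8.0]))
```

The truncated power basis is easy to read because each beta_j is the change in the leading coefficient at knot k_j; for many knots it can become ill-conditioned, in which case a B-spline basis is a common alternative.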
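To choose $\lambda_{\text{optimal}}$, the GCV criterion is defined as
$$\mathrm{GCV}(\lambda) = \frac{\mathrm{MSE}(\lambda)}{\left(1 - n^{-1}\sum_{i=1}^{n} a_{ii}\right)^{2}},$$
with
$$\mathrm{MSE}(\lambda) = n^{-1}\sum_{i=1}^{n} (y_i - \hat{y}_i)^{2},$$
where $a_{ii}$ is the $i$-th diagonal element of the hat matrix. $\lambda_{\text{optimal}}$ is chosen as the value with the minimum GCV. The GML method is obtained from: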
$$\mathrm{GML}(\lambda) = \frac{y^{T} Q \left(Q^{T} S(\lambda)\, Q\right)^{-1} Q^{T} y}{\left| Q^{T} S(\lambda)\, Q \right|^{-1/(n-m)}}.$$
The value of $\lambda_{\text{optimal}}$ is the one that minimizes $\mathrm{GML}(\lambda)$. The GML method is popular and works well for correlated data (Wang, 1998). Using leukocyte samples from suspected avian influenza cases in East Java, the study found that the GCV method with a linear spline gives a smoother curve than the GML method. This conclusion is based on the regression curves, as described by the smallest MSE and the highest $R^2$. The knot point found with this method is 8, and the fitted truncated polynomial spline model is
$$\hat{y} = 8004 - 587x + 611(x - 8)_+ .$$
ABSTRACT
COMPARISON BETWEEN GENERALIZED CROSS VALIDATION METHOD AND GENERALIZED MAXIMUM LIKELIHOOD METHOD IN NONPARAMETRIC SPLINE REGRESSION TO ESTIMATE THE LEUCOCYTE OF AVIAN INFLUENZA SUSPECT IN EAST JAVA
Suppose that response variables $y_1, y_2, \ldots, y_n$ have been observed at design points $x_1, x_2, \ldots, x_n$, following the regression model
$$y_i = f(x_i) + \varepsilon_i, \qquad i = 1, 2, \ldots, n,$$
where $f$ is an unknown regression function and $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ are zero-mean, uncorrelated random errors with variance $\sigma^2$. The shape of the regression curve is unknown and is assumed only to be smooth, that is, to belong to the Sobolev space $W_2^m[a,b] = \{\, f : f, f', \ldots, f^{(m-1)} \text{ absolutely continuous and } \int_a^b [f^{(m)}(x)]^2\,dx < \infty \,\}$.
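This paper studies how to estimate the regression curve from the sample. The estimator of the regression curve is expressed as a truncated polynomial spline, which is relatively simpler than other nonparametric approximations of the regression curve. For a spline of order $m$ with knots $k_1, \ldots, k_K$ and coefficients $\alpha$ and $\beta$, the regression curve is estimated by least squares optimization, that is, by minimizing
$$\sum_{i=1}^{n}\left( y_i - \sum_{j=0}^{m-1} \alpha_j x_i^{j} - \sum_{l=1}^{K} \beta_l (x_i - k_l)_+^{m-1} \right)^{2}.$$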
The Generalized Cross Validation (GCV) and Generalized Maximum Likelihood (GML) methods are then used to assess the smoothness of the regression curve and to determine which of the two gives the smoothest curve. The resulting estimator is used to estimate the leukocyte count of suspected avian influenza cases in East Java. The study found that the GCV method with a linear spline gives the smoothest curve compared with the GML methods, a conclusion based on the smallest MSE value. This method also gives a knot point at 8, and the model is
$$\hat{y} = \begin{cases} 8004 - 587x, & x < 8, \\ 8004 - 587x + 611(x - 8), & x \ge 8. \end{cases}$$
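As a short worked check, the branch to the right of the knot can be simplified to show how the slope changes at $x = 8$. Only the coefficients reported in the abstract are used:

```latex
\begin{aligned}
x \ge 8:\quad \hat{y} &= 8004 - 587x + 611(x - 8) \\
                      &= (8004 - 4888) + (611 - 587)\,x \\
                      &= 3116 + 24x, \\
\text{at } x = 8:\quad 8004 - 587 \cdot 8 &= 3308 = 3116 + 24 \cdot 8 .
\end{aligned}
```

The two branches agree at the knot, so the fitted curve is continuous there. The slope changes from $-587$ to the left of the knot to $+24$ to the right.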
Keywords: truncated polynomial spline, GCV, GML, leucocytes, avian influenza.