Volume 2 Number 2 Fall 2010
Internetworking Indonesia Journal The Indonesian Journal of ICT and Internet Development
ISSN: 1942-9703 IIJ © 2010 www.InternetworkingIndonesia.org
Editors’ Introduction
1
The Effect of IPv6 Packet Size on the Implementation of CRC Extension Header by Supriyanto, Iznan H. Hasbullah & Rahmat Budiarto
3
A Modified Weighted Clustering Algorithm for Stable Clustering using Mobility Prediction Scheme by S. Muthuramalingam, R. Viveka, B. Steffi Diana & R. Rajaram
9
Mengamankan Single Identity Number (SIN) Menggunakan QR Code dan Sidik Jari by Fridh Zurriyadi Ridwan, Hariyo Santoso & Wiseto Agung
17
EnsoTracker: Mouse Controller Device using Head Movement by Gunawan, Tri Kurniawan Wijaya, Indra Maryati & Edwin Seno Dwihapsoro
21
UTAUT Model for Understanding Learning Management System by I Gusti Nyoman Sedana & St. Wisnu Wijaya
27
Star Schema Design for Concept Hierarchy in Attribute Oriented Induction by Spits Warnars
33
Internetworking Indonesia Journal
The Indonesian Journal of ICT and Internet Development ISSN: 1942-9703 Internetworking Indonesia Journal is a semi-annual peer-reviewed electronic journal devoted to the timely study of Information & Communication Technology (ICT) and Internet development in Indonesia. The journal follows the open-access philosophy and seeks high-quality manuscripts on the challenges and opportunities presented by information technology and the Internet in Indonesia. Journal mailing address:
Internetworking Indonesia Journal, PO Box 397110 MIT Station, Cambridge, MA 02139, USA.
Co-Editors Thomas Hardjono, PhD (MIT Kerberos Consortium, MIT, USA)
Budi Rahardjo, PhD (ITB, Indonesia)
Kuncoro Wastuwibowo, MSc (PT. Telkom, Indonesia)
Editorial Advisory Board Prof. Edy Tri Baskoro, PhD (ITB, Indonesia) Mark Baugher, MA (Cisco Systems, USA) Lakshminath Dondeti, PhD (Qualcomm, USA) Paul England, PhD (Microsoft Research, USA) Prof. Svein Knapskog, PhD (NTNU, Norway) Prof. Merlyna Lim, PhD (Arizona State University, USA)
Prof. Bambang Parmanto, PhD (University of Pittsburgh, USA) Prof. Wishnu Prasetya, PhD (Utrecht University, The Netherlands) Graeme Proudler, PhD (HP Laboratories, UK) Prof. Jennifer Seberry, PhD (University of Wollongong, Australia) Prof. Willy Susilo, PhD (University of Wollongong, Australia) Prof. David Taniar, PhD (Monash University, Australia)
Technical Editorial Board Sulfikar Amir, PhD (NTU, Singapore) Moch Arif Bijaksana, MSc (IT Telkom, Indonesia) Teddy Surya Gunawan, PhD (IIUM, Malaysia) Dwi Handoko, PhD (BPPT, Indonesia) Mira Kartiwi, PhD (IIUM, Malaysia) Bobby Nazief, PhD (UI, Indonesia) Anto Satriyo Nugroho, PhD (BPPT, Indonesia)
Yanuar Nugroho, PhD (Manchester University, UK) Bernardi Pranggono, PhD (University of Leeds, UK) Bambang Prastowo, PhD (UGM, Indonesia) Bambang Riyanto, PhD (ITB, Indonesia) Andriyan Bayu Suksmono, PhD (ITB, Indonesia) Henri Uranus, PhD (UPH, Indonesia) Setiadi Yazid, PhD (UI, Indonesia)
Manuscript Language: The Internetworking Indonesia Journal accepts and publishes papers in Bahasa Indonesia and English. Manuscript Submission: • Manuscripts should be submitted according to the IEEE Guide for authors, and will be refereed in the standard way. • Manuscript pages should not exceed 7 pages of the IEEE 2-column format. It should be submitted as a Microsoft-Word file, using the IIJ Template document which can be found at the www.InternetworkingIndonesia.org website. • Manuscripts submitted to the IIJ must not have been previously published or committed to another publisher under a copyright transfer agreement, and must not be under consideration by another journal. • Papers previously published at conferences can be submitted to the IIJ, but must be revised so that it has significant differences from the conference version. • Authors of accepted papers are responsible for the Camera Ready Copy formatted using the same IIJ Template format. • Authors are advised that no revisions of the manuscript can be made after acceptance by the Editor for publication. The benefits of this procedure are many, with speed and accuracy being the most obvious. • Please email your papers (or questions) to:
[email protected] Submission Guidelines: Please review the descriptions below and identify the submission type best suited to your paper. • Research Papers: Research papers report on results emanating from research projects, both theoretical and practical in nature. • Short papers: Short research papers provide an introduction to new developments or advances regarding on-going work. • Policy Viewpoints: Policy Viewpoints explore competing perspectives in the Indonesian policy debate that are informed by academic research. • Teaching Innovation: Teaching Innovation papers explore creative uses of information technology tools and the Internet to improve learning and education in Indonesia. • Book Reviews: A review of a book, or other book-length document, such as a government report or foundation report.
Vol. 2/No. 2 (2010)
INTERNETWORKING INDONESIA JOURNAL
1
Editors’ Introduction
W
to the Fall/Winter 2010 issue of the Internetworking Indonesia Journal. We are pleased to bring you this regular issue of the IIJ which contains a broad range of papers, include those written by graduate students. As described in the goals of the IIJ, among others the journal seeks to be a publishing venue for graduate students (such as Masters/S2 and PhD/S3 students) as well as working academics in the broad field of ICT and Internet development. This includes graduate students from Indonesian universities and those studying abroad. Consistent with another goal of the IIJ, the current issue of the journal has papers written in Bahasa Indonesia. From the outset the IIJ has sought to promote the culture of writing and of excellent authorship in Indonesia. It is for this reason that the IIJ is “bilingual” in that it accepts and publishes papers in both English and Bahasa Indonesia. We believe that by publishing papers in Bahasa Indonesia, we provide students and other authors the opportunity to develop formal writing skills for technical papers in Bahasa Indonesia. The first paper in the current issue deals with the use of a cyclic redundancy check code (CRC code) in the extension header of an IPv6 packet. The paper proposes the introduction of a new IPv6 header extension whose contents would be a CRC code computed over the relevant fields of the packet. One aim of this approach is to obviate the need for error detection at the underlying Data Link layer of the intermediate nodes. The paper reports a number of results from tests using a simple IP network topology. Although introducing this new extension header costs additional computation at the sender and receiver, the authors claim that overall benefits is derived by from the intermediate nodes (routers) not needing to perform error detection at their Data Link layer prior to routing/forwarding packets. Mobile ad hoc networks (MANET) is the topic of the second paper. More specifically, the paper deals with the issue of cluster-head selection in a stable manner. The dynamic nature of the members of a wireless ad hoc network poses a number of interesting questions, including that of reliable routing. With each node in an ad hoc network acting also as wireless router, the nodes need to have the ability to selforganize into a given network architecture, such as a cluster network architecture. This in-turn necessitates the selection of certain nodes to become cluster-heads. This paper proposes a new approach to the selection of such cluster-heads. The approach is based on the well-known weighted clustering algorithm (WCA), and introduces a mobility prediction scheme which looks not only at the mobility of a given node ELCOME
but also at the relative mobility of its neighbors. The prediction algorithm uses the node link expiration time as the basis for the prediction scheme. The third paper, written in Bahasa Indonesia, focuses on the security issues around the planned use (by the Indonesian government) of the 16-digit Single identity Number (SIN) for every resident of Indonesia. The aim of the SIN is, among others, to expedite access by residents to government services by replacing non-digital credentials and other identity documents. The paper looks at the possible use of a twodimensional bar code called the Quick Response (QR) code to conveniently represent the SIN. This barcode would then be easily readable (eg. at local government offices) by using code-reader devices that are attachable to PC computers or mobile phones. The paper also considers the use of biometric scans to authenticate the user as the legitimate holder of a given SIN. Finally, the paper offers some broad suggestions regarding the implementation of such a SIN-based system, including the creation of the SIN values and its associated card, and the verification of the card and SIN at the point of service. Providing affordable computer-accessibility to sufferers of quadriplegia is the topic of the fourth paper. For handicapped persons in developing nations, the cost of devices and related equipment represents a major issue. As such, this paper looks into the possible development of affordable devices using cheap off-the-shelf components. The particular device in this case is a mouse controller, which can be operated by detecting the user’s head movement. The fifth paper, which is also written in Bahasa Indonesia, addresses quite a different topic from the previous papers, namely that of a learning management system. Following the unified theory of acceptance and use of technology, the paper reports some research results from experiments conducted at the Sanata Dharma University using the Exelsa system. The research found that performance expectancy, social influence and facilitating conditions represented significant factors influencing the behavioral intention of the students employing the Exelsa system. The last paper is one written by a graduate student on the topic attribute oriented induction and its influence in the design of databases within data mining. The paper provides an illustration of the steps required to arrive at database tables from concept hierarchies as the point of departure. The paper provides a basic algorithm to convert a non-rule based concept hierarchy into database tables.
The editors can be reached individually at the following email addresses. Thomas Hardjono is at
[email protected], Budi Rahardjo is at
[email protected], while Kuncoro Wastuwibowo is at
[email protected].
ISSN: 1942-9703 / © 2010 IIJ
Thomas Hardjono Budi Rahardjo Kuncoro Wastuwibowo
2
INTERNETWORKING INDONESIA JOURNAL
ISSN: 1942-9703 / © 2010 IIJ
Vol. 2/No. 2 (2010)
INTERNETWORKING INDONESIA JOURNAL
Vol.2/No.2 (2010)
3
The Effect of IPv6 Packet Size on Implementation of CRC Extension Header Supriyanto Electrical Engineering Department University of Sultan Ageng Tirtayasa (UNTIRTA) Indonesia
Iznan H. Hasbullah National Advanced IPv6 Centre Universiti Sains Malaysia (USM) Malaysia
Abstract—The traditional TCP/IP stack requires verification and regeneration of the CRC code in every router to detect transmission errors. For IPv6 packet transmission over high speed networks, this task is redundant because of the very rare transmission error and thus introduces unnecessary latency. The CRC Extension Header (CEH) was proposed to eliminate the unnecessary duplication of error detection. Since IPv6 packets are transmitted over Ethernet, the packet size ranges from 1280 up to 1500 bytes. In the near future, the size will increase with the advance of underlying technology such as 100 Gigabit Ethernet. This paper studies the effect of various IPv6 packet sizes with the implementation of CEH on the processing time and network latency. Experiments were done by transmitting various IPv6 packet sizes over high speed networks. The results showed that higher packet sizes increases processing time and network latency.
Rahmat Budiarto School of Computer Sciences Universiti Sains Malaysia (USM) Malaysia
delivered up to the Network layer for forwarding and hop limit updating. The Network layer within a router on an IPv6 network does a simpler task compared to the network layer of a IPv4 router. In an IPv6 router there are no error detection processing (header checksum checking) and fragmentation. In addition, the size of an IPv6 header is fixed. This means that IPv6 packets are processed faster on router. However, the Data Link layer of the router not only needs to perform CRC calculation for the received frame but also for each frame that needs to be forwarded to the next hop. On a large network that with numerous routers this double CRC calculation will be repeated in every router, as shown in Figure 1.
Index Terms—Routing, TCP/IP, IPv6, error correction.
I. INTRODUCTION
R
OUTERS in a computer network have the important role in the packet forwarding process in order to ensure that each packet reaches its correct destination. An IPv6 packet transmitted through the Internet has to pass through many routers along the network. A router can be defined as three of the five layers of the TCP/IP stack, namely the Physical layer, the Data Link layer and the Network layer. In general, a router needs to decide to which node a packet has to be forwarded next after the IPv6 header processing are done. The decision is made in the Network layer of the router. However, before the packet reaches the Network layer, it must pass through the error detection process in the Data Link layer. The Data Link layer always verifies the CRC code attached with the frame received to ensure that the frame from the previous hop is free from transmission error. Only the correct frame will be Manuscript received October 26, 2010. This work was supported by the Ministry of National Education of the Republic of Indonesia. Supriyanto is with the Electrical Engineering Department, University of Sultan Ageng Tirtayasa, Cilegon, Banten, Indonesia (e-mail: supriyanto@ ftuntirta.ac.id). Iznan H. Hasbullah is with the National IPv6 Centre (NAv6) Universiti Sains Malaysia, Penang, Malaysia. (e-mail:
[email protected]). Rahmat Budiarto is with the School of Computer Sciences, Universiti Sains Malaysia, Penang, Malaysia (e-mail:
[email protected]).
Figure 1: CRC Code Generation and Verification Figure 1 shows a small network with three routers as intermediate nodes. In this network, there will be eight CRC calculations: four CRC code generations and four CRC code verifications. For IPv6 packet transmission over high speed networks which has an extremely small probability of transmission error, those repeated tasks are unnecessary. This is more so in the case of a reliable medium such as fiber optic which is not influenced by either the magnetic field or electric field. Errors within this kind of medium will be infinitesimal that practically we can say it has a near zero probability of occurring. The IPv6 protocol has been designed with features that provide better performance than the former IPv4 protocol,
ISSN: 1942-9703 / © 2010 IIJ
4
INTERNETWORKING INDONESIA JOURNAL
including the extensibility of IPv6 extension header. IPv6 allows the definition of a new extension header in the future without impacting the existing infrastructure. In this paper we propose a CRC Extension Header (CEH) as a new IPv6 extension to perform error control within the Network layer and eliminating error control from Data Link layer [1]. In the near future, the IPv6 packet size will be increased [2]. Implementation of a CEH for a larger packet size is still debatable. This paper attempts to understand the effect of varying IPv6 packet sizes on the implementation the IPv6 packet transmission over high speed networks. II. OVERVIEW OF CRC EXTENSION HEADER The CEH is a new extension header that is proposed to reduce the bottleneck in IPv6 packet transmission over high speed networks by eliminating the CRC verification and regeneration in each and every router, as shown in Figure 1. This idea is entirely different from the existing method that conducts error detection and correction in the Data Link layer of a router. This new mechanism does not require the Data Link layer to verify and regenerate the CRC code for error detection. However, the concept of CEH does not eliminate overall CRC code generation and verification during IPv6 packet transmission. Generation of CRC code is done in the sender’s machine while the verification of CRC code will be done at Network layer of the destination node by processing the CEH inside the received IPv6 packet. Routers are only required to process IPv6 packet at their Network layer in the usual manner, which consists simply of a forwarding decision and hop limit updating. There are three major rationales of the idea of the CEH: Firstly, we would like to optimally utilize one of the advantages of IPv6 features, specifically the IPv6 extension header. IPv6 offers the opportunity to develop new extension headers for IPv6 packet transmission improvement in the future. Furthermore, it is not limited to 40 bytes only. In addition, applying a new extension header within an existing network will not disturb current IPv6 network operations. Secondly, within new high speed networks medium — such as fiber optic — the probability of transmission errors occurring is very small. This allows the error detection function in higher layer to be bypassed. Thus, the duplication of CRC calculation could be eliminated from the Data Link layer. Thirdly, the high transfer rate of current underlying technology, especially gigabit Ethernet, allows the transmission rates of more than 100 Gbps. Eliminating the error detection process in the intermediate nodes will decrease round trip delay from the sender to receiver. Hence, low latency of IPv6 packet transmission over high speed networks could be achieved. Since all the existing extension headers already have specific functions, designing a CEH cannot make use of any existing extension header. Therefore we need to define a new one. The aim is to still utilize the advantages of the current error control but at the same time avoid its weaknesses. The approach uses the widely deployed CRC-32 error detection
SUPRIYANTO ET AL.
mechanism to detect transmission errors. To obtain an optimal result, we select an appropriate generator polynomial to be implemented in the CEH [3]. In terms of error correction, the CEH applies retransmission procedure to overcome erroneous packets, such as Automatic Repeat reQuest (ARQ). Details of the CEH format and mechanism design will be presented in the next section. A. Format of IPv6 with CEH Currently, there is no standard format for IPv6 extension headers. The existing IPv6 extension headers have different formats compared to one another. Thus, the format of the CEH follows the draft of the generic IPv6 extension header (GIEH) proposed in [4]. The CEH is placed after the destination address field of the IPv6 header, as shown in Figure 2. It has three main fields: the Next Header field, the Reserved field and CRC-32 code field. The next header is an 8-bit indicator to point to other extension headers following the CEH or upper layer data. As a new extension header, the CEH has not obtained specific number from Internet Assigned Numbers Authority (IANA). The Reserved field is allocated for future use instead of specific extension header type. The CRC-32 code field is the important field for the CEH. It contains the CRC code generated by the sender machine. This code is used only by the destination node for verification.
Figure 2: IPv6 Packet with CRC Extension Header B. CEH Generation The CRC-32 code field is the most important field of the CEH. CEH generation is determined by CRC-32 code generation. The code is generated from entire IPv6 packet excluding hop limit field. The CRC-32 code generation requires a generator polynomial [5]. The generator polynomial used in the CEH was selected from three candidates, namely the generator polynomial standardized in IEEE802.3 [6], the generator polynomial proposed by Guy Castagnoli [7] and the generator polynomial proposed by Philip Koopman [8]. The test conducted showed that Castagnoli’s CRC-32C is the most suitable generator polynomial implement for the CEH [3]. The generation process of the CRC-32 code follows the usual algorithm used in the Data Link layer, which is already adapted for the Network layer as shown in Figure 3 [3]. The
ISSN: 1942-9703 / © 2010 IIJ
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
sender generates a CRC-32 code and then inserts the code into the CRC-32 code field of the CEH. When the CEH is created in an IPv6 packet, the extension header field is first initialized to zero except for the next header field. A complete IPv6 packet includes a main header with next header value indicating the type of CEH, an extension header which is CEH and a payload. The packet is now ready for transmission along the network path until it reaches its destination.
5
III. EXPERIMENTS The CEH is a new concept for error detection for the entire IPv6 packet which is done at the Network layer. To verify the acceptability of the new concept we performed an experiment using the network topology depicted in Figure 5. The experiment has two scenarios: the transmission of IPv6 packets with the CEH for error detection in Network layer and the transmission of IPv6 packets with Frame Check Sequence (FCS) for error detection in Data Link layer. The first experiment simulates the new concept, while the later simulates the existing system. The two scenarios yielded an adequate data to justify the performance gain of the new error detection approach compared to the current error detection.
Figure 3: CEH Generation
Figure 5: Network Topology for IPv6 Packet Transmission
C. CEH Verification The verification of CEH will be done at the final destination by following the algorithm depicted in Figure 4 [3]. Once the receiver receives an IPv6 packet from previous node, it will check the next header field of the packet. When the value indicates there is a CEH inside the packet, it extracts the CEH and generates a new CRC-32 code from the entire IPv6 packet (omitting the hop limit field). The CRC-32 code generated by the receiver will be compared to the CRC-32 code found inside the IPv6 packet, which is located within the CRC-32 code field of the CRC extension header. If both CRC-32 code the same, then there is no transmission error in the IPv6 packet received. The receiver then forwards the packet into higher layer. Otherwise it will discard the packet and wait for retransmission.
The topology in Figure 5 has five nodes: Device 1 as a sender, Router 1, Router 2, Router 3 and Device 2 are the receivers. All nodes are PC computers equipped with Core 2 Duo processors that were configured as a mini IPv6 network. The sender is an end system installed with an IPv6 packet generator program written in JAVA that is able to generate both IPv6 packet with the CEH and IPv6 packet without the CEH. Router 1, 2 and 3 represent the intermediate system of the network whose task is to forward the packets. In the first experiment, the routers simply forwarded the IPv6 packets with the CEH rather than checking each and every packet for error detection. In the second experiment the intermediate system performs error checking for each packet before forwarding the packet to next node in the path. The last node is the receiver which is a PC installed with the same program as the sender to verify all packets received. The sender generates an IPv6 packet by adding the IPv6 main header to a TCP segment received from the upper layer. The Network layer then generates the CRC-32 code from both the IPv6 header and the payload using algorithm in Figure 3. The CRC code obtained is then inserted as the CRC extension header into the IPv6 packet. In the first scenario the Data Link layer just adds the link layer header. There is no trailer in the Data Link layer frame that usually performs the error detection task. The frame without frame check sequence (FCS) generated by the sender is then transmitted through the experiment network. When the IPv6 packet reaches the intermediate nodes (namely Router 1, 2 and 3) no CRC code calculation will be performed by the routers for error detection. The ingress frame is just verified for its frame header by the incoming port of router and passed to the Network layer to determine the next
Figure 4: CEH Verification
ISSN: 1942-9703 / © 2010 IIJ
6
INTERNETWORKING INDONESIA JOURNAL
path (forwarding process). The egress frame also does not have a Data Link layer trailer. It simply obtains the link layer header. The only node that will verify the IPv6 packet is the destination node (receiver), as indicated by destination address field of the IPv6 packet. The verification follows algorithm in Figure 4 to check whether the packet received is free from error. Thus, the receiver node’s task is to receive the packet in Data Link layer, release the Data Link layer header without computing the CRC code, and then to deliver the packet to the Network layer for verification. In the second experiment which is a simulation of the current error detection mechanism, the sender generates an IPv6 packet without the extension header. The packet is then encapsulated by Data Link layer by adding Data Link layer header and trailer. We refer to this frame as the IPv6 packet with FCS. Each intermediate node will receive the packet and process it as usual. They process the Data Link layer header including CRC code computation to detect transmission errors for the corresponding hop. Only the packets that are verified as free from error will be delivered up to Network layer for forwarding process without processing any extension header. Before the IPv6 packet is forwarded to the next hop, the packet will be encapsulated again in Data Link layer of the outgoing port of the router including CRC code regeneration. The receiver node in the second experiment captures the IPv6 packet that was addressed to it and then verifies the packet for error detection. The correct packet will be passed to Network layer. Otherwise, it will discard the erroneous packet and wait for retransmission. In order to know the effects of IPv6 packet size with the implementation of the CEH, we measure two main metrics: processing time and error detection capability. Processing time is the packet processing time in each node including the CRC32 code computation, packet generation and packet verification. Here it is very important to know the network latency, which is the most important aspect of network performance with various packet sizes. With regards to error detection capability, it is also important to know the new error detection capabilities to detect transmission error on varying sizes of IPv6 packets.
SUPRIYANTO ET AL.
sizes on the performance of the two error detection mechanisms. Table 1: Correlation between Processing Time and Packet Size at the Sender Size CEH FCS Δ
64 0.689 0.682 0.007
128 0.718 0.686 0.132
256 0.772 0.778 0.006
512 0.883 0.788 0.105
1024 1.039 0.834 0.205
1280 1.067 0.876 0.191
1492 1.133 0.909 0.224
Average 0.900 0.793 0.107
Both the IPv6 packet with CEH and the packet with FCS require a certain amount of time to process at the sender as well as at the receiver. The processing time of the packet at the sender increases with the increase in packet size. This is because CRC code generation was done byte-per-byte of the packet. Thus more bits in a packet means longer time to perform processing of the IPv6 packet. Table 1 also shows that the average of processing time for all packet sizes for the transmission of IPv6 packet with CEH is 13.5% higher than the transmission of IPv6 packet with FCS. This is due to the fact that the IPv6 packet with CEH generation at the Network layer (shown in Figure 3) is more complex compared to FCS generation at the Data Link layer. The process to generate the CRC code from the whole IPv6 packet requires firstly the exclusion of the hop limit and then secondly the insertion of the CRC code as the extension header. In contrast the FCS generation simply divides the whole frame with the generator polynomial and followed by appending the CRC code in the last part of the frame. Similar to the sender, the receiver’s processing time of the IPv6 packet which is listed in Table 2 showed that the processing time is affected by the packet size. The larger the packet, the higher it’s processing time. On average the receiver’s processing time of IPv6 packet transmission with CEH is 16.3% higher than IPv6 packet transmission with FCS. Note that the percentage value is higher than in the sender. This is because the average processing time for both IPv6 with CEH and IPv6 with FCS at the receiver is less. In other words, the processing time of IPv6 packet at the receiver is faster than at the sender. Table 2: Correlation between Processing Time and Packet Size at the Receiver
IV. RESULTS AND DISCUSSION The sender’s processing time is the time required to generate an IPv6 packet with the CEH using the algorithm in Figure 3. While the receiver’s processing time is the time required to verify the IPv6 packet received and the CEH verification using the algorithm in Figure 4. Experiments of IPv6 packets with the CEH, using varying packet sizes from 64 bytes to 1500 bytes yielded the results as shown in Table 1. The table shows the relationship between the sender’s processing time and packets size of the two error detection mechanisms on TCP/IP, namely the transmission of IPv6 packets with the CEH as a proposed mechanism and the IPv6 packet transmission with FCS (existing system) as a comparison. This is done to learn about the impact of packet
Size CEH FCS Δ
64 0.568 0.495 0.073
128 0.568 0.491 0.077
256 0.583 0.512 0.071
512 0.600 0.514 0.086
1024 0.642 0.548 0.094
1280 0.662 0.566 0.096
1492 0.670 0.565 0.085
Average 0.613 0.527 0.086
The processing times of IPv6 packets with the CEH as listed in Table 1 and Table 2 is based on the algorithm in Figure 3 and Figure 4. Even though the process of generating the CEH at the sender and at the receiver is similar, the sender has to generate the IPv6 packet first before it can generate the CEH. In contrast the receiver simply has to receive IPv6 packets without needing to generate any packets. However, note that the transmission of IPv6 packets
ISSN: 1942-9703 / © 2010 IIJ
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
exploiting the CEH as an error detection method in the Network layer has eliminated the need for CRC code calculation and regeneration in every router. This resulted in the faster processing of IPv6 packets in every router in the network, where a router only processes the packets at the Network layer (routing and forwarding). Hence, although processing time at the end systems (sender and receiver) is higher, the total network latency of IPv6 with CEH transmission is visibly lower than IPv6 with FCS. Figure 6 shows the correlation between network latency in milliseconds (ms) and packet sizes in bytes. In the figure, network latency of the IPv6 packet with CEH transmission is 68 % lower than IPv6 with FCS. The network latency as shown in Figure 6 increases when the packet size increases. Compared to FCS, the increasing network latency of IPv6 with the CEH is lower as shown by the two graphs in Figure 6.
7
V. CONCLUSION In this paper we have proposed a new concept of handling transmission errors by exploiting the IPv6 extension header. The new concept utilizes the CRC extension header (CEH) as an error control method at the Network layer and eliminates error control at the Data Link layer of the intermediate nodes. Typically, the extension header is generated by the sender and is processed by the receiver. Thus, the CEH that contains the CRC code is also processed by the receiver. Using the CEH as a new IPv6 extension header for error detection at the Network layer increases the processing time of IPv6 packets both at the sender and at the receiver. The increase in processing time is influenced by the packet size. The higher the packet size the longer it will take to process the packet. However, our proposed new concept has eliminated error detection routine at the Data Link layer. Elimination of error detection at the Data Link layer means eliminating CRC code computation and regeneration in each and every router. This reduces the total network latency on IPv6 packet transmission. The results show that overall network latency decrease by 68% compared with normal latency. It also reduces the Data Link layer frame size due to the elimination of 4 bytes of frame check sequence field. The proposed error control mechanism also shows good ability to detect transmission error inside the transmitted packet. The experiment showed that all IPv6 packets sent with errors are successfully detected by the receiver.
REFERENCES [1]
Figure 6: Network Latency Another measurement is the error detection capability. Error detection capability of an error control mechanism is the ability to catch transmission errors during transmission from the sender to the receiver. Recent high speed network technologies have made the occurrence of transmission errors very rare. Employing more error detection tools is inefficient and also redundant. However, we cannot eliminate the tools entirely because if an error occurred during the communications then data will be corrupted. Proposing to exploit the CEH as an error detection tool is the correct way. It can reduce the redundant error control process but at the same time it does not ignore the error. Error is unpredictable. Thus an error control mechanism should be able to detect the error inside the packet transmitted and correct it. In order to know ability of CEH to detect transmission errors, we send erroneous IPv6 packet with the CEH. The result illustrated that all erroneous IPv6 packets sent are successfully detected by the receiver.
[2]
[3]
[4]
[5] [6]
[7]
[8]
Supriyanto, Raja Kumar Murugesan, Rahmat Budiarto, Sureswaran Ramadass, CRC Extension Header (CEH): A New Model to Handle Transmission Error for IPv6 Packets over Fiber Optic Links, Proceedings of IEEE Symposium on Industrial Electronics and Applications (ISIEA 2009), October 4-6, 2009, pp. 569 – 573. Supriyanto, Iznan H. Hasbullah, Rahmat Budiarto, Exploiting IPv6 Extension Header to Handle Transmission Error for IPv6 Packets Over High Speed Networks, Proceedings of International Conference on Advanced Computer Science and Information System, 7 – 8 December 2009. Supriyanto, Abidah M. Taib, Rahmat Budiarto. Selecting a Cyclic Redundancy Check (CRC) Generator Polynomial for CEH (CRC Extension Header), Proceedings of International Conference on Quality in Research, Jakarta. 3 – 6 August 2009, ISSN 114-1284 pp. 245 – 250 S. Krishnan, et. al, An uniform format for IPv6 extension headers, Internet Draft IETF, 2008. http://ietfreport.isoc.org/idref/draft-krishnanipv6-exthdr/ A.S. Tanenbaum, Computer Networks, Fourth Edition, Prentice Hall of India. New Delhi, 2006. ANSI/IEEE Standard for Local Area Networks, Carrier Sense Multiple Access with Collision Detection (CSMA/CD) Access Method and Physical Layer Specifications. 1984. http://standards.ieee.org/getieee802/802.3.html. Castagnoli, G., Brauer, S. & Herrmann, M. Optimization of cyclic redundancy-check codes with 24 and 32 parity bits. IEEE Transactions on Communications, 1993, pp. 883-892. Koopman, P. 32-bit cyclic redundancy codes for Internet applications, Proceedings. International Conference on Dependable Systems and Networks (DSN). 2002. http://www.ece.cmu.edu/~koopman/networks/dsn02/dsn
ISSN: 1942-9703 / © 2010 IIJ
8
INTERNETWORKING INDONESIA JOURNAL
Supriyanto received the B.Eng. degree in Electrical Engineering from Brawijaya University Malang, Indonesia in 1999, and and M.Sc in Computer Science from Universiti Sains Malaysia (USM) in 2010. Currently, he is a lecturer and also department secretary in the Electrical Engineering Department at the University of Sultan Ageng Tirtayasa (UNTIRTA). He is a researcher in the Computer Network Laboratory at the university. His research interest includes computer networks especially, IPv6, IPTV over overlay network and wireless communication. Iznan H. Hasbullah graduated with the B.Sc degree in electrical engineering in 1998 from Rensselaer Polytechnic Institute, Troy, New York. He is currently a Research Officer with National Advanced IPv6 Centre (NAv6) at University Sains Malaysia, Pulau Pinang. He is the Domain Head of HD Video Conference for IPv6 group under Next Generation Multimedia & Telemedicine Cluster at NAv6. Prior to joining NAv6 in 2009, he was an R&D Consultant with MLABS System Berhad (MLABS) seconded to JMCS Sdn. Bhd. to spearhead a project titled High Definition Video Conferencing for IPv6 Networks. He held the post of Chief Technology Officer (CTO) while with JMCS Sdn. Bhd. He was involved in two network security audit exercise commissioned by Malaysian Communications and Multimedia Commission (MCMC) in the year 2003 and 2004. He was also part of a team that brought Advance Military Mobile Conferencing System (AMMCS) to the Maritime Langkawi International Maritime & Aerospace Exhibition (LIMA'03) in 2003. His area of expertise includes multimedia conferencing application for computer supported cooperative work (CSCW), software architecture, Graphical User Interface (GUI), Component Object Modeling (COM), Microsoft Foundation Classes (MFC) and C++ Language.
Rahmat Budiarto received the B.Sc. degree from Bandung Institute of Technology (ITB) in 1986, and the M.Eng and Dr.Eng degree in Computer Science from Nagoya Institute of Technology in 1995 and 1998 respectively. Currently, he is an associate professor at the School of Computer Sciences at the Universiti Sains Malaysia (USM) as well as the advisor of Indonesia IPv6 Task Force. His research interest includes IPv6, network security and Intelligent Systems. He was chairman of APAN security working group.
ISSN: 1942-9703 / © 2010 IIJ
SUPRIYANTO ET AL.
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
9
A Modified Weighted Clustering Algorithm for Stable Clustering using Mobility Prediction Scheme S. Muthuramalingam, R. Viveka, B. Steffi Diana and R. Rajaram Department of Information Technology, Thiagarajar College of Engineering, Madurai, India Abstract—This paper proposes an idea for selecting stable cluster heads using a modified Weighted Clustering Algorithm and combining it with Link Expiration Time calculation. The mobile ad hoc network consists of nodes that move freely and communicate with each other. One way to support efficient communication between nodes is to partition ad hoc networks into clusters. Many clustering schemes have been proposed to form clusters. The WCA has improved performance compared with other previous clustering algorithms. However, the high mobility of nodes will lead to high frequency of re-affiliation which will increase the network overhead. To solve this problem, we propose a time-based WCA which can enhance the stability of cluster formation followed by stable cluster head selection. Then duration of nodes that are alive is considered. Meanwhile for forthcoming nodes the duration of link between them and the Cluster head is calculated. This is the Link Expiration time and it is calculated based on the three factors, namely position, speed and direction of nodes. If the calculated link expiration time is greater than the threshold value it is allowed to join the respective cluster. This is done to form stable clusters and to reduce the re-affiliation frequency. Index Terms—Adhoc Networks, Stable clustering, Weighted clustering algorithm, re affiliation, mobility prediction, link expiration time.
I
I. INTRODUCTION
N recent years, the development in wireless communication, and the widespread use of mobile and handheld devices, has resulted in the increasing popularity of mobile ad hoc networks (MANET). A mobile ad hoc network is a collection of wireless mobile nodes that dynamically form a network without the need for any pre-existing network infrastructure. Every node in the network is serving as a router, which means that every node is able to forward data to other nodes in the same transmission range. When wireless nodes are in an area that is not covered by any existing infrastructure, one of the possible solutions to achieve the ubiquitous computing is to enable wireless nodes to operate in the ad hoc mode and selforganize themselves into cluster based network architecture.
S. Muthuramalingam can be reached at
[email protected]. R. Viveka can be reached at
[email protected]. B. Steffi Diana can be reached at
[email protected]. R. Rajaram can be reached at
[email protected]@tce.edu.
Ad-hoc networks do not use specialized routers for path discovery and traffic routing develops the wireless backbone architecture. This means that certain nodes must be selected to form the backbone. One of the general approaches to build up a cluster based network architecture is to design an algorithm to self-organize themselves into cluster based network architecture. This is achieved by partitioning ad hoc networks into clusters. Certain nodes, known as clusterheads, would be responsible for the formation of cluster and maintenance of the topology of the network, and also for the resource allocation to all the nodes belonging to their clusters. The clustering algorithm is usually performed in two phases: clustering formation and clustering maintenance [3]. In the clustering formation phase, the election of a cluster-head among the nodes in the network is conducted. The functions of cluster heads are the coordination of the clustering process and relaying routers in data packet delivery. After electing clusterheads, some of nodes move out from the current cluster and are attached to another cluster. Other cluster-heads are downed. This leads to the second phase, namely, clustering maintenance. In light of this, this paper aims to avoid excessive computation in the cluster maintenance, and current cluster structure should be preserved as much as possible. In this paper, we propose an enhancement on a weighted clustering algorithm [1, 5] (WCA) by maintaining stable clusters. In WCA, a node is selected to be the cluster-head when it has the minimum weighted sum of four indices: the number of potential members, the sum of the distances to other nodes is its radio distance, the node‟s average moving speed and the battery life of the node. However, only the individual nodes‟ mobility is measured in WCA, whereas the relative mobility of a node and its neighbors is more important to the cluster stability. The ignorance of the relative mobility will lead to the situation that the high mobility of nodes causes a high frequency of re-affiliations, which will increase the network overhead. To solve this problem, we propose an enhanced weight-based clustering algorithm using mobility prediction scheme (MWCAMP). This can enhance the stability of the network by taking the relative mobility of a node and its neighbors into consideration for the clustering formation [1, 2, 5] and maintenance, by predicting the mobility of a node using the link expiration time [12]. The remainder of this paper is organized as follows. In Section 2, we review the several relevant clustering algorithms proposed previously and their limitations. Section 3 presents
ISSN: 1942-9703 / © 2009 IIJ
10
INTERNETWORKING INDONESIA JOURNAL
the proposed algorithm for ad hoc networks. Section 4 discusses the proposed algorithm. Section 5 discusses the simulation results and analysis. Finally, Section 6 concludes this paper. II. RELATED WORKS AND CURRENT MOTIVATION Recently a number of clustering algorithms have been proposed to choose cluster-heads based on the speed and direction, mobility, energy, position, and the number of neighbors of a given node. These efforts present advantages but also have some drawbacks, such as the high computational overhead for both clustering algorithm execution and update operations. The Highest Degree Algorithm is also known as connectivity-based algorithm [3]. This algorithm is based on the degree of nodes assumed to be the number of neighbors of a given node. Whenever the election procedure is needed, nodes broadcast their Identifier (ID) which is assumed to be unique in the same network. According to the number of received IDs every node computes its degree and the one having the maximum degree (no of adjacent nodes) becomes cluster-head (CH). Major drawbacks of this algorithm include the situation where the degree of a node changes very frequently, and thus the CHs are not likely to play their role as cluster-heads for very long. In addition, as the number of ordinary nodes in a cluster is increased, the throughput drops and system performance degrades. All these drawbacks occur because this approach does not have any restrictions on the upper bound on the number of nodes in a cluster. The Lowest-Identifier (LID) is also known as identifierbased clustering algorithm [4]. This approach chooses the node with the lowest ID as a cluster-head. The system performance is better than Highest-Degree in terms of throughput. Major drawbacks of this algorithm are its bias towards nodes with smaller IDs which may lead to the battery drainage of certain nodes, and it does not attempt to balance the load uniformly across all the nodes. The Distributed Clustering Algorithm (DCA) [6, 7] and Distributed Mobility Adaptive Clustering Algorithm (DMAC). In this approach, each node is assigned weights (a real number above zero) based on its suitability of being a cluster-head. A node is chosen to be a cluster-head if its weight is higher than any of its neighbor‟s weight; otherwise, it joins a neighboring cluster-head. The smaller weighted node ID is chosen in case of a tie. The DCA makes an assumption that the network topology does not change during the execution of the algorithm. To verify the performance of the system, the nodes were assigned weights which varied linearly with their speeds but with negative slope. Results proved that the number of updates required is smaller than the Highest-Degree and Lowest-ID heuristics. Since node weights were varied in each simulation cycle, computing the cluster-heads becomes very expensive and there are no optimizations on the system parameters such as throughput and power control. The Weighted Clustering Algorithm (WCA) [5] obtains 1-hop clusters with one cluster-head. The election of the
MUTHURAMALINGAM ET AL.
cluster-head is based on the weight of each node. It takes four factors into consideration and makes the selection of clusterhead and maintenance of cluster more reasonable. The four factors are node degree (number of neighbors), distance summation to all its neighboring nodes, mobility and remaining battery power. Although WCA has proved better performance than all the previous algorithms, it has a drawback in needing to know the weights of all the nodes before starting the clustering process and in draining the CHs rapidly. As a result, the overhead induced by WCA is very high. III. PROPOSED WORK In this section, we present the proposed Enhanced Weighted Clustering Algorithm using mobility prediction scheme (MWCAMP). MWCAMP consists of the clustering formation and clustering maintenance phases. Before describing our clustering algorithm in detail, we make the following assumptions: I. The network topology is static during the execution of the clustering algorithm. II. Each mobile node joins exactly one cluster-head. A. Phase-I: Clustering Formation In WCA, the goal is to minimize the value of the sum of all the cluster heads‟ weighted costs. Here a node is selected as cluster head when it minimize a function of four criteria such as degree (the number of direct link to its neighbors), sum of distance between cluster head and other nodes, mobility of nodes and battery power of the nodes. WCA has high reaffiliation frequency [7, 8] which will increase the communication overhead. Thus, we present an enhanced algorithm to reduce the amount of re-affiliation frequency corresponding to the number of nodes that is not „„proper‟‟ in current cluster. It corresponds to nodes that detach from current cluster and move to another cluster. In order to reduce the re-affiliation frequency, we should decrease the number of improper nodes. When the idea reflects on the cluster head election, the key is to choose „„stable‟‟ nodes as cluster heads as possible. The stability of a node is relative to its neighbors. In WCA, the running average speed for every node till current time T, it is computed [9, 10] as
Mi
1 1 T 2 2 2 (( X X ) ( Y Y ) ) t t 1 t t 1 T t 1
where (Xt, Yt) and (Xt-1, Yt-1) are the coordinates of the node i at time t and (t -1), respectively. It is shown that is the absolute mobility of node and cannot indicate information of stability between node and its neighbors. Therefore we present an improved parameter ̅ (relative mobility parameter) to substitute (absolute mobility parameter). Here ̅ is inversely proportional to stability factor which is measured in sense of relative mobility.
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
The stability [13] of node “i” is computed from two aspects: i : The relative average speed of i and its neighbors. I. The larger i is, the more unstable the node is, comparing to its neighbors. II. NCi : The change of node‟s neighbors. The value of NCi can indicate the change of topology surrounding node i. We also use a combined function of i and NCi to compute Si . Another improvement of our algorithm is that we compute the average relative distances instead of the sum of distances. The proposed MWCAMP described as follows
wi w1 i w2 Di w3 M i w4 Pi where I.
i is the degree of each node i (i.e., nodes within its transmission range)
II. III.
Di is the average relative distances each node i with all its neighbors. M i is the relative mobility of node i
Pi is the battery power of node i. IV. where w1, w2 , w3 and w4 are the weighting factors for the corresponding system parameters.
Figure 1
11
B. Phase-II: Clustering Maintenance Using Mobility prediction The mobility of nodes coupled with the transient nature of wireless media often results in a highly dynamic network topology. Due to mobility some nodes will detach from the current cluster and attach itself to some other cluster. The process of joining a new cluster is known as re-affiliation. If the re-affiliation fails, the whole network will recall the cluster head election routine. One disadvantage of WCA is high reaffiliation frequency. High frequency of re-affiliation will increase the communication overhead. Thus, reducing the amount of re-affiliation is necessary in ad hoc networks. To prevent this we go for mobility prediction schemes. The impact of mobility prediction schemes on the temporal stability of the clusters obtained using a mobility-aware clustering framework. We propose a simple framework for a mobility prediction-based clustering to enhance the cluster stability. One way to predict the mobility of nodes is using the Link Expiration Time [13]. The impact of mobility prediction schemes on the stability of the clusters obtained using a mobility-aware clustering framework. Compute the Link Expiration Time (LET) to predict the duration of a wireless link between two nodes in the network. The approach assumes that the direction and speed of motion of the mobile nodes does not change during the prediction interval. C. Link Expiration Time (LET) The Link Expiration Time (LET) is a simple prediction scheme that determines the duration of a wireless link between two mobile nodes. Dynamic clustering in ad hoc networks has also been extensively studied in the literature. Several distributed clustering algorithms for MANETs have been proposed. While some schemes try to balance the energy consumption for mobile nodes, others aim to minimize the clustering-related maintenance costs. Combined metrics based clustering schemes take a number of metrics into account for cluster configuration. The Weighted Clustering Algorithm (WCA) [5] is one such scheme, where four parameters are considered for the cluster head election procedure, which are representative of the degree, the sum of the distances to other nodes in its radio distance, mobility, and battery power of the mobile nodes. Here we propose an enhanced WCA which can enhance the stability of the network. Such a scheme can be tuned flexibly the parameters to suit to different scenarios. To calculate the duration of link between two mobile nodes, we assume that their location, speed and direction of movement remain constant. Here let: Location of node i and node j at time t be given by (xi , yi ) and (xj ,yj). Vi and V j be the speeds, θi and θj be the directions of the nodes i and j respectively.
Figure 2
ISSN: 1942-9703 / © 2009 IIJ
12
INTERNETWORKING INDONESIA JOURNAL
MUTHURAMALINGAM ET AL.
If the transmission range of the nodes is r, then the link expiration time Dt is given by the formula given below
Dt
(ab cd ) (a 2 c 2 )r 2 (ad bc)2 (a 2 c 2 )
where a vi cos i v j cos j b xi x j
c vi sin i v j sin j d yi y j
The LET gives an upper bound on the estimate of the residence time of a node in a cluster. In the proposed clustering framework, when LET-based prediction is used, a node is allowed to join a cluster only if the predicted LET of the link between the node and the cluster head is greater than the cluster‟s admission criteria Tj [13, 14]. For every node N that detach from current cluster we check whether the node is a Cluster Head (or ) Cluster member. I. If it is a Cluster Head then call for cluster head election within the particular cluster and form a new cluster. II. If it is a Cluster member then calculate Link Expiration Time with Cluster Head of each cluster and the node that reaffiliate must be within transmission range of cluster head where transmission range is fixed. Check whether LET is greater than threshold value (Tj), Here Tj is average of all LET, and if it is greater then the Node is eligible to join the particular cluster which shares greater LET.
Figure 4
IV. ENHANCED WEIGHTED CLUSTERING ALGORITHM USING MOBILITY PREDICTION Consider Node Ni in transmission range r, positions of node Ni be ( Xi Yi ), Nj be nodes that detach from current cluster, where 1 ≤ i ≤ n , 1 ≤ j ≤ n and 1 ≤ k ≤ n. 1: calculate degree of node Ni
i (Cij )
where Cij =1, if nodes Ni and Nj are connected else Cij =0 . (Where Cij be connection between node Ni and Nj, Where 1 ≤ i ≤ n , 1 ≤ j ≤ n).
2: calculate relative sum of distance Di from nodes members,
Di
N j Ni
N i to its
{dist ( Ni , N j )} i
where 1 ≤ i ≤ n, Nj is the node the establish link with the node Ni.
3: M i is the running average speed for every node current time T is calculated using formula 1 1 T 2 2 2 M i (( X t X t 1 ) (Yt Yt 1 ) ) T t 1
Figure 3
N i till
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
4: For every node N i , compute the average speed formula
Ai
1 i
N j Ni
Ai using
13
cluster members and they are grouped together to form a cluster. Thus cluster is formed by
Mi
[
Where 1 ≤ i ≤ n, Nj is the node that establish link with the node Ni
{
|
}]
{
}
{
}
where 5: compute the speed-difference
i
as
are previously formed cluster, where 1=k=n. Repeat step 1 to 8 until every node is a member of cluster.
i M i Ai where 1 ≤ i ≤ n. 6: For every node Ni compute the neighbor change NCi at time t.
NC i
{N it 1 } {N it 2 } {N it 1 } {N it 2 }
7: Compute the stability factor S i as 1
si a NCi 1i 1
LET i be the duration of the link between the two nodes is given by the formula given above in phase II and III.
where a is an adjustable integer which is decided by the stability of the node. 8: Thus we can get the improved parameter M i as
Mi
12: For every node N that detach from current cluster we check whether the node is a Cluster Head (or Cluster member). I. If it is a Cluster Head then call for cluster head election within the particular cluster and form a new cluster. II. If it is a Cluster member then calculate LET (Link Expiration Time) with Cluster Head CHi of each cluster Ci and Nj must be within transmission range of CHi .where transmission range is fixed.
b Si
where b is an integer decided by the speed of node. Both a and b are constant to make Si more reasonable, avoiding to be too large or less. 9: Calculate weight value
wi w1 i w2 Di w3 M i w4 Pi where Pi is the battery power of node, and w1= 0.7, w2= 0.05, w3= 0.05, and w4= 0.2. 10: The node with minimum weight value is elected as Cluster Head. That is
CH i min{wi ( N i )} .
13: Admission Criteria: Check whether LET is greater than threshold value (Tj). If it is greater then the Node Nj is eligible to join the particular cluster Cj which shares greater LET (where Tj is average of all LET calculated). Repeat steps 10 and 11 for every node detach from current cluster. Else END.
V. EXPERIMENTAL SETUP AND RESULTS In this section, we present the performance of the proposed WCA simulated by NS2.We simulate a system of N nodes on an 1500 m × 1500 m area. The value of N is varied between 10 and 70 and transmission range is 500m. The nodes move randomly in all directions with a maximum speed varied between 20m/s and 50 m/s. The power for each node is varied between 40 to 100 units. Because the mobility of node is an important parameter in our case, the corresponding weight is chosen to be high. The values of the weights used in our simulation are: w1 = 0.7, w2 = 0.05, w3 = 0.05 and w4 = 0.2. Note that the values may be adjusted according to the system requirements. The performance of algorithms is measured in terms of the number of re-affiliation count and no of clusters formed.
11: Now cluster members are added to the cluster head. The nodes that establish link with that cluster head are called
ISSN: 1942-9703 / © 2009 IIJ
14
INTERNETWORKING INDONESIA JOURNAL
MUTHURAMALINGAM ET AL.
clustering algorithm, nodes form into clusters and elect cluster head and the node which detach from current cluster are joined by predicting their mobility using link expiration time. Here future prediction (mobility) is done on basis of direction and speed of nodes and so re-affiliation frequency is reduced. As the number of nodes increased, the increasing rate of reaffiliation slowed down. According to the result, MWCAMP gives less re-affiliation than WCA especially for the high node density. Thus we form a stable cluster with MWCAMP. Here the number of clusters formed for a given number of nodes is plotted. The number clusters formed should be reduced. Here no of clusters formed are comparatively reduced. Comparison between WCA and our proposed work on basis of throughput and simulation time is given in figure 7 and 8. Figure 7 represents the throughput of forwarding message in WCA. Figure 8 represents the throughput of forwarding message in our proposed work MWCAMP. Figure 5: Average re-affiliation frequency vs node number
Figure 6: Comparison between WCA and our proposed work on basis of no of clusters formed.
Figure 5 depicts the average re-affiliation count versus the total number of nodes in the network where maximum speed is 50m/s and the transmission range is 500m. Here the nodes are organized into clusters and cluster head is elected. In previously designed clustering framework using weighted clustering algorithm, re-affiliation time for a given number of nodes are determined and plotted. In our enhanced weighted
Figure 7: This graph represent WCA output. The Transmission time is from 1sec to 2 sec. Here throughput of forwarding message is decreasing and it is not stable.
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
[4]
[5]
[6] [7]
[8] [9]
[10] [11]
[12]
Figure 8: This graph represent MWCAMP output. The Transmission time is from 1sec to 2 sec. Here throughput of forwarding message is increasing and stable.
[13]
[14]
15
Networks”. IEEE International Symposium on Industrial Electronics. (June 30 2008-July 2 2008). Yang Tao, Jian Wang, a-Li Wang; Tao Sun. “An Enhanced Maximum Stability Weighted Clustering Algorithm in Ad Hoc Network”. 4th International Conference on Wireless Communications, Networking and Mobile Computing. (12-14 Oct. 2008). M. Chatterjee, S. K. Das, and D. Turgut, “WCA: A Weighted Clustering Algorithm for Mobile Ad Hoc Networks,” Journal of Cluster Computing, Special issue on Mobile Ad hoc Networking, pp. 193 – 204, (5, 2002) Roxana Zoican, “An Enhanced Performance Clustering Algorithm for MANET”. 15th IEEE Mediterranean Electro technical conference 26-28 April 2010. Xiqing Zhao, Xifeng Guo, Zhaohao Sun, Changming Ren. “An Intelligent Weighted Clustering Algorithm (IWCA) for Ad Hoc”. WRI World Congress on Software Engineering, Volume 3. (19-21 May 2009). Chegin M, Fathy M. “Optimized Routing Based on Mobility Prediction in Wireless Mobile Adhoc Networks for Urban Area”. Fifth International Conference on Information Technology. (7-9 April 2008). Hruschka E.R, Campello R.J.G.B, Freitas A.A, de Carvalho A.C.P.L.F. “A Survey of Evolutionary Algorithms for clustering”. IEEE Transactions on Systems, Man, and Cybernetics, Volume 39, Issue 2. (March 2009). Rui Xu, Wunsch, “Survey of clustering algorithms”. IEEE Transactions on Neural Networks, Volume 16, Issue 3. (May 2007). Yingpei Zeng, Jiannong Cao, Shanqing Guo, Kai Yang, Li Xie. “A Secure Weighted Clustering Algorithm in Wireless Ad Hoc Networks”. IEEE conference on Wireless Communications and Networking Conference. (5-8 April, 2009). (Bricard-Vieu.V, Nasser. N, and Mikou, NA. “Mobility Prediction-based Weighted Clustering Algorithm Using Local Cluster-heads Election for QoS in MANETs”. IEEE International Conference on Wireless and Mobile Computing, Networking and Communications. (2006). Andronache, A, Rothkugel, S. “NLWCA Node and Link Weighted Clustering Algorithm for Backbone-Assisted Mobile Ad Hoc Networks”. Seventh International Conference on Networking. (13-18 April 2008). Bouk S.H, Sasase I. “Energy Efficient and Stable Weight Based Clustering for mobile ad hoc networks”. 2nd International Conference on Signal Processing and Communication Systems. (15-17 Dec. 2008).
VI. CONCLUSION In this paper we have presented an enhanced weight based clustering algorithm using mobility prediction (MWCAMP) that can be applied in MANETs to improve upon their stability and to reduce re-affiliation of the nodes. MWCAMP mainly focuses on reducing the instability caused by high-speed moving nodes, by taking relative mobility of node and its neighbors into consideration. Since WCA support stable cluster head election and the disadvantage is reaffiliation of nodes which is reduced by mobility prediction, it results in stable clustering. The performance of the proposed MWCAMP demonstrated that it outperforms WCA in terms of re-affiliation count. REFERENCES [1]
[2]
[3]
Chang Li, Yafeng Wang, Fan Huang, Dacheng Yang. “A Novel Enhanced Weighted Clustering Algorithm for Mobile Networks”. 5th International Conference on Wireless Communications, Networking and Mobile Computing. (24-26 Sept 2009). Likun Zou, Qishan Zhang, Jianwei Liu. “An Improved Weight-Based Clustering Algorithm in MANETs”.4th International Conference on Wireless Communications, Networking and Mobile Computing. (12-14 Oct. 2008). Hussein A.H, Abu Salem A.O, Yousef S. “A flexible weighted clustering algorithm based on battery power for Mobile Ad hoc
ISSN: 1942-9703 / © 2009 IIJ
S. Muthuramalingam has over 6 years of teaching experience and is currently working as a Asst Prof in Thiagarajar college of Engineering. He finished his bachelor of Engineering (computer science) at Shanmugha college of engineering, Thanjavur in the year 1997. And he completed his M.E in Kalasalingam University in 2002. He published two papers in International journal on applications of graph theory in wireless ad hoc networks and sensor networks, and IJCEE. He has published papers in the national and IEEE international conferences such as IEEE AMS 2008 IEEE Netcomm2010, NCCN‟09,VIT, NCACT,SAATRA and NCIM 2007. He is pursuing PhD degree in the field of Mobile communication at Anna University Chennai, India.
16
INTERNETWORKING INDONESIA JOURNAL
R. Rajaram completed his PhD in TCE, Madurai Kamaraj University in the year 1979, and the MTech degree in IIT Karagpur in the year 1971. He has published about 100 papers in a number of journals such as the International journal of Computational Engineering Science, Journal of Optics and Journal of the Institute of Engineers. He is a life member of Institution of engineers, Computational Society of India, India Society for Tech Education. Currently working as the Dean CSE/IT & Head of the Department of IT in Thiagarajar college of engineering..
MUTHURAMALINGAM ET AL.
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
17
Mengamankan Single Identity Number (SIN) Menggunakan QR Code dan Sidik Jari Fridh Zurriyadi Ridwan, Hariyo Santoso, dan Wiseto P. Agung PT Telekomunikasi Indonesia Abstract—The Indonesian government is planning to implement the Single Identity Number (SIN) as the primary identification method for residents of Indonesia. With this SIN implementation, it is hoped that a number of residential administrative issues, such as an individual having multiple identity cards (KTP), can be resolved. Furthermore, residents will have the advantage of only needing one identity form and not having to remember multiple identity information. It is hoped that just by using a single SIN a resident can obtain various government services more efficiently. However, on the other hand the use of the SIN introduces the potential for security attacks through the misuse of a SIN by people other than its owner or assignee. As such additional security mechanisms are needed to ensure that a SIN is used only by its proper owner or assignee. This paper aims at describing a proposal for a mechanism to secure the SIN, based on the combined use fingerprints and the QR Code. Pemerintah Indonesia telah merencanakan bahwa pada tahun 2012 akan diimplementasikan Single Identity Number (SIN) sebagai identitas tunggal penduduk Indonesia. Dengan adanya SIN ini maka diharapkan permasalahan-permasalahan administrasi kependudukan seperti misalnya satu orang memiliki lebih dari satu buah KTP dapat diatasi. Selain itu bagi penduduk juga akan diperoleh kemudahan lainnya yaitu tidak perlunya seseorang untuk memiliki banyak identitas dan mengingat banyak ID. Cukup dengan menggunakan SIN, penduduk dapat menggunakan berbagai macam layanan pemerintahan yang berbeda-beda. Dibalik semua kemudahan ini, terdapat suatu potensi ancaman keamanan jika SIN disalahgunakan oleh orang lain. Untuk itu diperlukan suatu mekanisme pengamanan agar SIN memang diberikan dan dimiliki oleh orang yang tepat. Paper ini bertujuan untuk menjelaskan suatu usulan mekanisme pengamanan SIN dengan menggunakan sidik jari dan QR Code. Index Terms—Computer Security, Identity, Authentication.
Fridh Zurriyadi Ridwan is with the R&D Centre of PT Indonesia (PT TELKOM) in Jakarta, Indonesia. He can
[email protected]. Hariyo Santoso is with the Consumer Directorate of PT Indonesia (PT TELKOM) in Jakarta, Indonesia. He can
[email protected]. Dr. Wiseto Agung is with the R&D Centre of PT Indonesia (PT TELKOM) in Jakarta, Indonesia. He can
[email protected].
Telekomunikasi be reached at: Telekomunikasi be reached at: Telekomunikasi be reached at:
P
I. PENDAHULUAN
emerintah Indonesia telah mencanakan bahwa pada tahun 2012 akan diimplementasikan sistim Single Identity Number (SIN) sebagai identitas tunggal penduduk Indonesia. SIN ini adalah 16 digit angka yang digunakan sebagai identitas unik dari seluruh penduduk Indonesia. Seperti banyak diketahui, saat ini masyarakat mengantongi banyak nomor identitas, mulai dari nomor Kartu Tanda Penduduk (KTP), nomor Surat Ijin Mengemudi (SIM) dan lainnya. Penerapan nomor identitas tunggal ini juga dimaksudkan untuk memudahkan seseorang dalam setiap pengurusan dokumen dan berbagai keperluan lainnya.[1] Dengan diterapkannya SIN, diharapkan akan sangat banyak manfaat yang diperoleh seperti diantaranya: tidak perlunya seseorang untuk memiliki banyak identitas dan mengingat banyak ID. Cukup dengan menggunakan SIN, penduduk sudah bisa mendapatkan berbagai macam layanan pemerintahan yang berbeda-beda. Selain itu permasalahan KTP ganda, informasi pajak yang terintegrasi dan lain-lain juga dapat disolusikan dengan menggunakan SIN. Akan tetapi dibalik semua kemudahan ini, terdapat suatu potensi ancaman keamanan jika SIN disalahgunakan oleh orang lain. Untuk itu diperlukan suatu mekanisme pengamanan agar SIN memang diberikan kepada dan dimiliki oleh orang yang tepat. Pada bagian selanjutnya akan dibahas mengenai QR code dan sidik jari yang diusulkan untuk digunakan sebagai pengaman dari SIN. II. QR CODE Quick Response (QR) code adalah sebuah kode matriks dalam bentuk dua dimensi yang dikembangkan oleh perusahaan Jepang Denso-Wave pada tahun 1994. Gambar 1 menunjukkan contoh bentuk QR Code. Beberapa kelebihan dalam menggunakan QR Code adalah sebagai berikut: Dibandingkan dengan barcode dua dimensi lain yang ada (misalnya: Datamatrix, PDF417 dan lain-lain) QR Code mampu menyimpan informasi cukup banyak (maksimum 4296 karakter untuk alfanumerik atau 7089 digit numerik) [3]. Sebagai ilustrasi, barcode satu dimensi standar yang banyak digunakan hanya mampu menampung informasi sebanyak 20 digit [4]. Memiliki kemampuan error correction. Data dapat diperbaiki meskipun QR code mengalami kerusakan atau
ISSN: 1942-9703 / © 2010 IIJ
INTERNETWORKING INDONESIA JOURNAL
18
kotor sebagian. Dapat dicetak di KTP dengan ukuran optimum sekitar 1,5cm x 1,5cm, dimana terdapat 57 x 57 modul yang dapat menyimpan data hingga 400 karakter (alphanumerik). Proses pembacaan yang cepat karena tidak harus dibaca dalam posisi sudut tertentu seperti halnya barcode satu dimensi [4]. Standard (ISO/IEC18004) dan memiliki banyak reader untuk terminal ponsel dan PC [5]. Mudah dibawa-bawa. Lebih murah jika dibandingkan media penyimpanan smartcard berbasis chip.
Gambar 1. QR Code QR code saat ini banyak digunakan di Jepang. Di Indonesia penggunaan QR code belum terlalu populer. Akan tetapi aplikasi QR reader untuk berbagai macam tipe ponsel dan PC cukup banyak tersedia untuk diunduh secara gratis melalui Internet. Di Indonesia QR Code digunakan di beberapa artikel dari harian Kompas untuk menyimpan URL dari artikel terkait yang dapat diakses melalui aplikasi browser di ponsel.
RIDWAN ET AL.
IV. METODA IDENTIFIKASI PENDUDUK Salah satu bentuk usulan pengamanan atas SIN adalah dengan melakukan pemetaan informasi-informasi unik dari seseorang dengan SIN yang diberikan. Informasi unik dari seseorang biasanya menggunakan data biometrik yang umum digunakan seperti sidik jari atau retina. Sidik jari saat ini lebih banyak digunakan karena banyak reader yang tersedia di pasaran dengan harga yang relatif murah. Informasi-informasi unik yang akan digunakan dalam solusi ini diantaranya adalah: Data sidik jari (fingerprint). Nama. Tempat & tanggal lahir. Golongan Darah. URL alamat dari data sidik jari orang tersebut. Semua data-data tersebut harus dapat direlasikan 1:1 dengan pemilik SIN dan dapat diakses dengan mudah untuk melakukan verifikasi. Untuk mendukung kebutuhan ini, maka sebagian data-data tersebut disimpan dalam bentuk barcode 2 dimensi yaitu menggunakan QR Code. Sedangkan khusus data sidik jari akan disimpan dalam bentuk basis data. Untuk lebih mengamankan informasi yang disimpan di dalam QR Code, maka sebagian informasi (misalnya URL alamat data sidik jari) dapat dienkripsi dengan menggunakan algoritma yang dibuat khusus yang hanya dapat dibaca oleh aplikasi yang dikembangkan khusus pula. V. KONFIGURASI SISTEM
III. SIDIK JARI Sidik jari setiap manusia sifatnya adalah unik, sehingga sidik jari merupakan alat yang dapat dipercaya untuk memastikan bahwa SIN yang digunakan adalah dimiliki oleh orang yang benar. Dengan menggunakan sidik jari, pengguna juga tidak perlu mengingat SIN-nya atau kata kunci jika dibutuhkan. Cukup dengan menyediakan informasi sidik jarinya melalui mesin pembaca sidik jari, maka proses otentikasi atas SIN dapat dilakukan dengan mudah. Sidik jari juga dapat dikonversi ke dalam bentuk digital, sehingga dapat disimpan dalam bentuk basis data RDBMS biasa. Dalam menyimpan data sidik jari, biasanya sistem mentranslasikan data sidik jari ke dalam bentuk numerik, sehingga dapat disimpan dalam basis data. Beberapa sistem mentranslasikan sidik jari ke dalam numerik dalam jumlah yang cukup banyak. Untuk meringkasnya, dapat diterapkan mekanisme pengkompresian dengan algoritma tertentu. Sidik jari cukup sensitif terhadap akurasi pembacaan mesin pembaca sidik jari, sehingga mungkin dibutuhkan beberapa kali proses pembacaan (scanning) untuk melakukan proses otentikasi. Saat ini sudah banyak mesin pembaca sidik jari yang menggunakan terminal USB dan bentuknya cukup kecil, sehingga memudahkan dalam proses instalasi dan mobilitas jika dibutuhkan dengan menggunakan notebook sebagai terminal pengganti PC.
Di dalam konfigurasi sistem yang diusulkan, QR Code digunakan untuk menyimpan SIN dan informasi-informasi unik lainnya seperti tanggal lahir, golongan darah dan URL, dengan parameter yang unik untuk setiap penduduk, untuk mengakses basis data SIN. Ada dua cara penambahan QR code yang diusulkan bagi KTP yang saat ini digunakan penduduk. Cara pertama adalah pembuatan QR Code dilakukan di setiap kantor Kelurahan dan dicetak dengan ukuran yang cukup kecil agar dapat ditempelkan/ditambahkan di kartu SIN/KTP. Cara kedua adalah pada saat pembuatan kartu identitas baru, sudah menyertakan QR Code yang telah dibuat. Selanjutnya QR Code ini dapat dibaca dengan menggunakan ponsel yang memiliki kamera atau PC dengan menggunakan web camera yang sebelumnya telah diinstal aplikasi QR Reader. Di dalam konfigurasi sistem yang diusulkan, data sidik jari disimpan di dalam suatu basis data khusus yang dapat diakses melalui jaringan privat virtual (VPN). Jaringan privat virtual digunakan untuk menghubungkan kantor Kelurahan dengan pusat data SIN. Dengan menggunakan jaringan privat virtual, maka dapat dijaga keamanan komunikasi antara Kelurahan dan pusat data SIN yang mungkin terpisah secara fisik dari ancaman penyadapan informasi oleh pengguna yang tidak terdaftar di dalam sistem SIN. Sidik jari ini diambil dari setiap penduduk di Kantor kelurahan terdekat dan kemudian direlasikan dengan SIN penduduk tersebut. Idealnya data sidik jari ini dapat diakses secara nasional sehingga dapat
ISSN: 1942-9703 / © 2010 IIJ
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
membantu proses verifikasi apakah seseorang sudah atau belum pernah memiliki SIN atau identitas lainnya yang dikeluarkan dari dearah lain. Gambar 2 berikut menunjukkan konfigurasi dari sistem yang diusulkan.
4.
5. 6.
Gambar 2. Konfigurasi Sistem Konten dari basis data SIN juga harus dijamin keamanannya. Untuk itu informasi sidik jari yang disimpan di dalam basis data harus disimpan di dalam format yang terenkripsi (encrypted data). Sedangkan hubungan antara kantor kelurahan dengan pusat data SIN direkomendasikan menggunakan jaringan privat virtual. Untuk menjamin bahwa hanya petugas dari kantor kelurahan yang berwenang saja yang dapat mengakses pusat data SIN, maka proses otentikasi dapat menyertakan juga MAC address dari terminal PC yang digunakan oleh petugas Kelurahan. VI. ALIRAN PROSES Untuk mengimplementasikan sistem ini, aliran proses yang diusulkan adalah sebagai berikut: A. Pembuatan Kartu Identitas 1. Penduduk harus datang ke kantor Kelurahan dimana ia tinggal untuk melakukan pengambilan data sidik jari. Di kantor Kelurahan disediakan perangkat pembaca sidik jari. Untuk orang-orang yang tidak mampu datang ke kantor Kelurahan karena sakit atau cacat maka petugas Kelurahan yang akan mendatangi orangorang tersebut dengan membawa komputer notebook dan perangkat pembaca sidik jari. 2. Petugas akan memeriksa ke basis data SIN apakah orang tersebut sudah memiliki SIN dengan mencocokkan sidik jari yang diperolehnya dengan basis data sidik jari yang sudah ada. 3. Jika ditemukan sidik jari yang cocok dan orang tersebut sudah memiliki SIN, maka petugas Kelurahan tidak dapat membuat kartu identitas baru dengan SIN baru untuk orang tersebut.
19
Jika tidak ditemukan sidik jari yang cocok, maka data sidik jari yang diperoleh kemudian disimpan di dalam basis data SIN dan direlasikan dengan SIN orang tersebut. Petugas kependudukan kemudian membuat QR code yang berisi informasi-informasi unik dari orang tersebut. Selanjutnya petugas kependudukan membuat kartu identitas yang menyertakan QR code tersebut.
B. Pengecekan Kartu Identitas 1. Penduduk diharuskan untuk menunjukkan kartu identitas untuk mengajukan suatu layanan masyarakat (misalnya untuk mengajukan permohonan IMB). 2. Dengan menggunakan aplikasi QR Reader yang diinstal di PC atau di ponsel, Petugas Kelurahan atau petugas dari Instansi terkait akan membaca QR code yang ada pada kartu identitas untuk mendapatkan informasi unik yang dibutuhkan. 3. Jika dibutuhkan untuk otentikasi lebih lanjut, maka petugas dapat meminta orang tersebut untuk mencocokkan sidik jarinya dengan basis data sidik jari. Aplikasi QR Reader akan membaca data URL pada QR code milik orang tersebut yang berisi alamat untuk mengakses informasi sidik jari dari orang tersebut. VII. BISNIS PROSES PENUNJANG Untuk mengoptimalkan pemanfaatan QR code dan sidik jari dalam sistem SIN, diperlukan beberapa hal lain sehingga secara kesisteman SIN dapat digunakan untuk banyak keperluan dengan mudah dan tidak membutuhkan proses bisnis yang rumit. QR code yang berisi SIN dan sidik jari harus dicetak dalam Kartu Tanda Penduduk (KTP). Tersedia QR code dalam bentuk softcopy, sehingga transaksi dapat dilakukan secara online atau dengan menggunakan perangkat portabel seperti ponsel. Dengan semakin mudahnya penyimpanan dan proses yang tidak rumit, masyarakat akan lebih terbiasa dengan pemanfaatan SIN. Dapat diintegrasikan dengan sistem pelayanan satu atap. VIII. PENUTUP Penggunaan QR Code dan data sidik jari diharapkan dapat membantu pengamanan SIN yang akan diimplementasikan agar tidak disalahgunakan oleh orang yang tidak berhak untuk mengakses data-data atau layanan-layanan pribadi dari pemilik SIN. Penggunaan QR code merupakan salah satu solusi yang relatif murah jika dibandingkan dengan menggunakan smartcard berbasis chip. Untuk implementasi dalam skala nasional, perlu dipertimbangkan agar konfigurasi pusat data SIN didesain dengan arsitektur terdistribusi,
ISSN: 1942-9703 / © 2010 IIJ
INTERNETWORKING INDONESIA JOURNAL
20
sehingga waktu yang dibutuhkan untuk melakukan proses pencocokan sidik jari dari tempat-tempat yang lokasinya cukup jauh dengan pusat data SIN tetap seminimal mungkin. QR Code juga dapat digunakan untuk menyimpan informasi-informasi lainnya seperti untuk menyimpan beberapa ID lain yang mungkin saja masih digunakan bersama-sama dengan SIN selama masa transisi (misalnya nomor KTP lama, NPWP atau SIM). Pengembangan lebih lanjut yang dapat dilakukan adalah pembacaan QR Code dan proses otentikasi sidik jari dapat dilakukan melalui area publik dengan memanfaatkan jaringan Internet yang tersedia atau melalui jaringan komunikasi nirkabel yang tersedia (GSM, CDMA). Untuk hal ini perlu dikaji lebih mendalam mengenai aspek pengamanan transaksi data yang dilakukan. Pengembangan lebih lanjut lainnya adalah jika metode ini digunakan pada smartcard berbasis chip. Chip dapat digunakan untuk menyimpan data-data sidik jari secara offline, sehingga dapat mempercepat proses otentikasi sidik jari jika dibandingkan dengan harus melakukan pengecekan melalui jaringan privat karena tidak perlu melakukan koneksi dan proses otentikasi ke jaringan privat terlebih dahulu. DAFTAR PUSAKA [1] [2]
[3] [4] [5] [6] [7]
“Kementrian PAN Mewujudkan Single Identity Number”, [Online]. Available: http://bataviase.co.id/detailberita-10574703.html, 29 Januari 2010, diakses: 6 Apr il 2010. “2010, Jakarta Terapkan „Single Identity Number‟" , [Online]. Available: http://megapolitan.kompas.com/read/2009/09/25/12381886/2010..Jakart a.Terapkan.Single.Identity.Number., 25 September 2009, diakses: 6 April 2010 “QR Code”, [Online]. Available: http://en.wikipedia.org/wiki/QR_Code. About QR-Codes, [Online]. Available: http://www.mobilebarcodes.com/about-qr-codes/#barcodes-vs-qr-codes “QR Code Standardization”, [Online]. Available: http://www.densowave.com/qrcode/qrstandard-e.html, diakses: 6 April 2010 “QR Code Features”, [Online]. Available: http://www.densowave.com/qrcode/qrfeature-e.html, diakses: 7 April 2010 “Fingerprint”, [Online]. Available: http://en.wikipedia.org/wiki/Fingerprint
Fridh Zurriyadi Ridwan received the Informatical Engineering degree from the TELKOM School of Technology in Bandung, Indonesia in 1997 and a Master of Information Technology degree from the Computer Science Faculty of the University of Indonesia (UI) in 2008. From 1997 to 2008 he worked as an engineer in the R&D Center of PT Telekomunikasi Indonesia (TELKOM). He is currently working as a researcher in the TELKOM R&D Center. His research interest are mostly related with telecommunication services and products. He is also actively involved in several Asia Pacific Telecommunity (APT) joint research work with researchers from Japan.
ISSN: 1942-9703 / © 2010 IIJ
RIDWAN ET AL.
Hariyo Santoso received his Electrical Engineering degree from the TELKOM School of Technology in Bandung, Indonesia in 1997 and a Master of Information Technology degree from the Computer Science Faculty of the University of Indonesia (UI) in 2009. From 1997 to 2009 he worked as an engineer and researcher within the R&D Center of PT Telekomunikasi Indonesia (TELKOM). He is currently working in the TELKOM Consumer Directorate as a Senior Officer of the Device & CPE Product Development and Deployment. He also involved in several Asia Pacific Telecommunity (APT) joint research work with researchers from Japan.
Dr Wiseto Agung received the BSc degree in Telecommunications from Institut Teknologi Bandung (ITB), Indonesia in 1987. He also received an MSc degree in Telematics (in 1994) and a PhD in Multimedia Communication (in 2002) from the University of Surrey, UK. He has been with PT Telekomunikasi Indonesia since 1988 in various engineering divisions, and he is currently working in the TELKOM R&D Centre as a Senior Manager of Service and Product R&D. Within the Asia Pacific Telecommunity (APT) Wireless Forum (AWF) he holds the responsibility of the Convergence Working Group Chairman.
INTERNETWORKING INDONESIA JOURNAL
Vol.2/No.2 (2010)
21
EnsoTracker: Mouse Controller Device using Head Movement Gunawan Institut Teknologi Sepuluh Nopember FX. Ferdinandus, Tri Kurniawan Wijaya, Indra Maryati and Edwin Seno Dwihapsoro Sekolah Tinggi Teknik Surabaya Abstract—In this research work we have developed a tool and a program that can be used to replace the behavior of a mouse. Our results have shown that it can be used successfully to replace a mouse that is usually operated by hand. The system we developed in this research can be used to help people with quadriplegia (paralyzed) to operate a computer. In order to be highly accessible and affordable, the tool we built should be inexpensive and can be purchased easily. Thus we have selected an input method using a series of LEDs and a webcam. There are two LED used on the circuit due to the need to have two light sources in order to replace the mouse behavior. The image captured by the webcam is then filtered using an image segmentation technique (thresholding). The first light source is used as the head-tilt detector that will be used as the replacement for the left-button and right-button of the mouse. The second light source is used as a head-motion detector, to replace the mouse movement. Our head movement tracking device uses two pieces of LEDs, wires, several resistors and energy source (battery).
prices (particularly for developing nations), the low availability of these instruments or items, and the regulations pertaining to the buying and selling of electronic goods across states. One example of a detector that recently became available is the “Ocular Mouse” which is an eye muscle detector. However, the tool is only available to certain communities and it is expensive. There is also a head movement tracking device called TrackIR that is relatively well known compared to other tracer devices. This tool is well known for its primary function to play games. However, such a tool that can detect head movements can also be used to help people who suffer from quadriplegia to operate a computer. However, similar to other head movement tracking devices on the market, the price of this tool is also relatively expensive and therefore unaffordable the majority of people in developing nations, including Indonesia.
Index Terms— head movement tracking device, image segmentation, mouse controlling, thresholding.
A
I. INTRODUCTION
person who suffers quadriplegia is unable move his or her arms and legs to operate a mouse and keyboard [7]. Therefore, in order for such a person to operate today’s computers there is a need for special tools to aid that person operating a computer. Recently, there has been a number of special equipment and software that has been developed to help a people who suffer from quadriplegia to interact with a computer. However, most of these specialized equipment and software is difficult to obtain. The difficulty is mostly due to their high FX. Ferdinandus, Tri Kurniawan Wijaya, and Indra Maryati are with the Department of Computer Science, Sekolah Tinggi Teknik Surabaya, Surabaya 60284, Indonesia They can be contacted at email: {ferdi, tritritri}@stts.edu, and
[email protected]. Gunawan is with the Department of Electrical Engineering, Faculty of Industrial Technology, Institut Teknologi Sepuluh Nopember, Surabaya 60111 Indonesia. He can be contacted at email:
[email protected]. Edwin Seno Dwihapsoro is with the Department of Computer Science, Sekolah Tinggi Teknik Surabaya, Surabaya 60284 Indonesia.
II. IMAGE SEGMENTATION Thresholding is a simple method of image segmentation. A thresholding method allows the creation of a binary image from a gray scale image [2]. A binary image is an image in which every pixel can have of one of two possible values. The values that are usually chosen are the values that represent black and white, although the values that represent other colors can also be used. In the binary image, a pixel color value is used as the foreground/object color while the value of the other colors is used as the background. The ultimate purpose of the threshold method is to simplify the representation of the image so that the image can be more easily analyzed. The reason a thresholding method was chosen is to allow us to more easily detect and differentiate between the features of tracked objects and the background. Since the thresholding method is a simple method, it is expected that required computational processes will not overload the system and that the basic specifications required for the system can be as minimum as possible.
ISSN: 1942-9703 / © 2010 IIJ
INT TERNETWORK KING INDONE ESIA JOURNAL L
22
III. HEAD MOVEMEN NT TRACKING DEVICE There are sev veral targets that t are used as reference in maaking the head d movements trracking system m. The goal is the devvelopment of a system thatt has a high accessibility and a afffordability. Th he factors th hat became a benchmark of acccessibility at th he end of this taask were, amon ng others: 1.
22.
3.
Cost: In order for the mo otion tracking system s to be ussed by many societies, the overall o cost off the componeents b the system to t work well must m be reduced d to required by be as miniimal as possiblle. Componen nt availability: The availabiliity of the required componen nts in an impo ortant factor. The componeents needed for the motion trracking system m must be publiicly available. This is so thaat the prospectiive users have no difficulty in finding thee components required by the system. h movemen nt tracking systtem Flexibilityy of use: The head was desig gned with the highest possiible flexibility in mind. This can be seen from f the many alternative forrms ponents that caan be used to build b the tracking and comp device. Siimilarly, the so oftware can bee configured in nto many arraangements which can be mo odified according to the userr’s wishes.
In order for the tracking system to achieve the abo ove de a minimal as a possible. One O tarrgets, the cost had to be mad item m whose cost could be reducced was the viideo input deviice. Geenerally for thee system to tracck an object, th he system mustt be ablle to clearly distinguish d the special featurres of the objeect, parrticularly wheen compared to t another objject in the saame im mage. L dle grade videeo input devicees generally haave Low and midd poor quality of the t captured im mages (such ass darkness and d a nerally there arre settings thatt can improve the lott of noise). Gen quality of the cap ptured image. However, H usuaally in these inp put ge quality is inv versely proporttional to the leevel devvices the imag of images/framess that can be captured each second s (i.e. fraame perr second or FP PS) by the videeo input device. A reduction n in thee FPS has an impact i on the response timee in the detection andd tracking of movements. m As A such, in seleecting the dev vice wee decided on a general devicee in its class wh hich could redu uce thiis impact. IV. RELA ATED WORK In this projecct we developeed a head mo ovement tracking sysstem instead of o an eye mo ovement track king system. The T reaason was becaause head mov vement is easiier to detect th han eyee movement, particularly using u the low--cost and limiited avaailable tools. Our goal waas to reduce the cost of the devveloped system m, to develop a system with high h accessibiliity, andd to develop a system that is relatively moree practical to use. u E nt tracking deviices with video o input capabillity Eye movemen gennerally have a high manufaccturing cost, laarge size and less l praactical for usee. Figure 1 (a)) shows an ex xample of an eye e moovement trackiing devices thaat used an earrly version of the
GUNAW WAN ET AL.
video innput device w where the deviices are still leess accurate. Howeveer, even thesee tools still reqquire high-quaality camera. The sizze of the eye m movement traccking device iis also bulky and noot practical. Figure 1 (b) sshows the eyee movement trackingg device usingg the latest video inputs. Thhis tool was developped through tthe cooperatioon of several companies, includinng Vision S Systems Interrnational, Elbbit systems, Rockweell Collins, andd Helmet Integgrated Systems..
Figgure 1: Some reccent eye movem ment tracking devvices.
The eeye movementt tracking devicce shown in Fiigure 1 (b) is used byy pilots of com mbat aircraft oof the United S States in the latest-geeneration fighhters (e.g. tthe F-35 Ligghtning II). Howeveer, the eye m movement trackking device w with a video input ddevice (even w without the hhelmet) is still large, and require high-cost com mponents. ently a new eyee tracking movvement devicee from Brazil Recen called ““Ocular Mousee” has been prooposed. The O Ocular Mouse tool waas developed during 5 yearrs by a team of Brazilian scientistts headed by Professor Manuel Cardosoo, who is a graduatte of Electric E Engineering off the Federal U University of Rio de Janeiro (UFRJJ) [5]. The devvelopment of tthis tool was med Paulo Feeitoza Brazil sponsorred by the orgganization nam Foundaation (PFF). T The general fuunctions of thee device are almost tthe same with those we deveeloped in this rresearch, and the aim m is also to help people withh paralysis quaadriplegia to interactt easily with coomputers. Howeever, in contraast the Ocularr Mouse does not use any video innput device, buut instead it usees a special sennsor that can detect eeye muscle moovements. Sensors are placedd around the user's eyyes, with a tottal of at least ssix (6) sensors that need to be attacched to a speciaal section the uuser's head. The sensors are then pluugged into a tool that is capable of processing data generateed from the eeye muscle m movement. Thee instrument used to obtain the daata requires a PC with an R RS-232 serial connecttion. The cost of the device iis about US$ 2200 and thus may havve limited acceessibility to people in developping nations. V. SYSTEM ARCH HITECTURE The bblock diagram of our main syystem is shownn in Figure 2 below. The system gets its input fr from the head marker tool. The heaad marker tooll used in this ssystem emits liight which is then capptured by the vvideo input devvice.
ISSN: 194 42-9703 / © 20110 IIJ
Vool.2/No.2 (2010)
INT TERNETWORK KING INDONES SIA JOURNAL
L Light captured by the vid deo input device is processsed phyysically (by th he video input sensor s devices) and temporarrily stoored in the deevice in a dataa format repreesenting a digital im mage (such as the JPEG forrmat). To be able to produ uce mooving images, the t webcam caaptures images generally seveeral tim mes per second d (FPS) depend ding on the specification of the weebcam and the software settin ngs allowed.
23
For example, input ddevice send daata to the outpput device. F webcam m data can be accessed befoore it reaches the monitor, and souund card data can be accesssed before it reaches the speakerrs. The program acceesses the imagge captured byy a webcam throughh DSPack2344 which getss the image data from DirectS Show [17]. (DSSPack234 in thhis research is a collection mponents and aadditional classes which cann be used to of comp connectt the Delphi w with DirectShow w). The imagee is captured and theen processed too obtain the reelevant information so that m of mouse the proogram can proovide the outpput in the form movem ments and actioons emulationn. In addition,, the mouse ments and actionns emulation ccan also be useed to activate movem and ooperate the Windows onscreen keybboard. The combinnation of the ooutput from thhe program annd Windows onscreeen keyboard can then be ussed to activatee or operate other appplications succh as the internnet browser foor surfing the Internett. V VI. ENSOTRA ACKER In thhis research w we named ourr system Ensootracker. As previouusly explainedd, Ensotrackeer gets the iimage from Window ws DirectShow w API. The souurce image itself is caught by videeo input devicee in the form off a digital repreesentation of the imaage. Figure 3 thhe programs obbtaining the orriginal image mage that is capturedd by a webcaam via DSPacck234. The im caught bby visual inpuut devices usedd in Ensotrackeer systems is distribuuted using the W Windows DirecctShow API.
Figure 2: Block diagram of o the main sysstem
In our system m the video inp put device useed has to supp port deo input devicce is connected d to Miicrosoft DirecttShow. The vid thee operating sy ystem through a driver. In this context the driiver is a comp puter program m that essentiaally connects the harrdware with the t operating system [11]. Drivers conn nect harrdware and so oftware throug gh the compu uter bus or other com mmunication subsystem co onnected to the hardware. A com mputer bus iss a subsystem m that transfeers data betweeen com mputer components insidee the computter, or betweeen com mputers. D DirectShow iss the API (A Application Pro ogram Interfaace) devveloped by Microsoft M to peerform variouss operations with w meedia streams orr files. One off the functions of DirectShow w is to help software developers in n variety operaations in termss of streaming mediaa [13]. Here the API is a set of classses, proocedures, and functions f of an n operating sysstem, libraries,, or serrvices created to help link a program with h other prograams [144]. M Microsoft DirectShow caan be used d by Windo ows appplications to interact i with and control th he input devices bassed on Windows media [16]]. Examples of devices inclu ude cam mcorders, web bcams, DVD Drives, D TV tu uners, and analog viddeo input devices. DirectSh how can also be used to play p meedia files. y DirectShow is i due to its modular m approaach. The flexibility Auudio and video o files are treeated as data streams, and the sofftware modulees can controll the streams when the meedia
Figgure 3: Diagram of Ensotracker processes
Afterr the program rreceives the orriginal image, the image is then seegmented bassed on the llight intensityy using the threshollding segmenttation method. In this methood the initial image iis turned into a grayscale iimage. Next, tthe image is transforrmed into a bbinary image, using a threeshold value predeterrmined by the user. In Ensotracker, the traansformation process (grayscale andd binary imagee) is done from m the top-left
ISSN: 194 42-9703 / © 20110 IIJ
24
INT TERNETWORK KING INDONE ESIA JOURNAL L
cooordinate to botttom-right coorrdinate of the im mage. E Ensotracker receives r the light l source's position as the proogram transforrms the graysccale image into o a binary imaage. Duuring the transfformation proccess, when the program findss an objject pixel, the program takess the pixel posiition and stores it as the first light source positio on. After that the t program do oes nott take further object pixel positions until the image thatt is proocessed by thee program has reached a certtain distance. The T deffault setting requires r the distance to be 20 pixels do own (veertical). When the program had covered thee distance, if th here is another objectt pixel position n, that position n is stored as the his process co ontinues until the nexxt light sourcee position. Th im mage acquired from the videeo input devicce is stopped or Ennsotracker prog gram is closed.
GUNAW WAN ET AL.
be giveen to the difference in the vvalue of Y, beecause it can lead to “division by zero” error. F Formula 180/ppi is used to convertt degree to raddians, because we need the vvalue of θ in degree, but the defaault result of the arctan caalculation is radians..
Fiigure 5: Angle C Calculation
VIII. TESTIING Figure 4: Head H Movement Tracking Devicce Ensotracker
A After the progrram gets the fiirst light sourcee position and the seccond, the pro ogram can peerform the mouse m emulation. Hoowever Ensotrracker cannot perform the mouse m emulation whhen the light so ource is more or o less than two o sources. Thiss is beccause the progrram is designed to use two (2 2) light sourcess as thee input (see Fiigure 4). The first f light sourrce is used as the heaad-tilt angle, while w the seco ond light sourcce is used as the heaad position deetector. The reaason why the mouse emulation cannnot be done with w more thaan two light so ources is becau use thee position and condition c of th he equipment at the head marker capptured image cannot c be kno own. This is because there is i a possibility that the first light source s position n and the seco ond aree not of the heaad position marrker tool, but iss instead a noisse.
VII. ANGLE CALCULATIONSS A As can be seen n in Figure 5, the t angle θ is obtained o from the defflection of the first light sourrce (box 1) of the t triangle A and a thee second light source (box 2)). The formula used to calcullate thee angle θ, if th he known valu ues are the X and a Y position n of thee box 1 and box x 2 (X1, X2, Y1, Y Y2) is: θ = arctan ((X2 2-X1) / (Y2-Y1)) * 180/pi (1)) The above forrmula can lead d to an error if we do not cheeck nd Y of the box x 1 and box 2. Calculations with w thee value of X an thee formula abo ove has a requirement thaat the differen nce bettween X and Y is at least 1. Special attenttion also needss to
The testing of thee system is doone on severaal cases: the mouse click, Interneet browsing, aand the runniing of other applicattions using Enssotracker. Testin ing the internet browsing iis started by opening the Internett browser appllication, enteriing the URL, and clicking on a linnk. In the testting process, tthe URL used was simply “www.ggoogle.com” an and the link useed was the “Abbout Google” link. To teest the runningg of other appliications using Ensotracker, we testeed a notepad and a game ccalled “Smashiing”. On the notepadd we asked usser to type the letters “ABC CDEFGHIJ” (using tthe on-screen keyboard). Inn smashing gam me, the user plays gaames similar too Arkanoid, w which is a simpple game that requiress user to movee a board or pllank to the righht and to the left of tthe screen. In the game a baall bounces coontinually on the fourr edges/sides oof the screen. The object off the game is to prevvent the ball ffrom droppingg down the boottom of the screen. The movablee board is locaated on the boottom of the screen aand is controllled by the useer to prevent thhe ball from falling bbelow the screeen. The user is able to movve the board left or rright, and uponn hitting the board the ball boounces up the screen aagain. A nuumber of resspondents werre asked to pperform the activitiees in the abovee test, providingg us with somee test results. The resspondents werre asked to usse the Ensotraacker system only annd were not alloowed to use eiither hands or feet in a test run. The results of thee test allowedd us to concluude that the without using system can help peoople browse thhe Internet w either hhands or feett. Thus it cann also be conncluded that Ensotraacker system can help peoople with quaadriplegia to browse through the Innternet.
ISSN: 194 42-9703 / © 20110 IIJ
INT TERNETWORK KING INDONES SIA JOURNAL
Vool.2/No.2 (2010)
The responden nts had no diffficulty in eitheer opening a text t filee or playing “Smashing” “ gaame. When try ying notepad, the resspondents are able to type in i the letters “ABCDEFGH HIJ” witthout any prob blems. In trying the game “S Smashing” whiich, thee respondents did not expeerience probleems in operating refflective boardss on the screen n by using the mouse pointter. Thhus from the reesults it can bee concluded thaat the system can c inddeed help peo ople to run and a operate other applicatio ons bessides the Intern net browser, without using th heir hands or feeet. IX. CONCLUSION AND A FUTURE WORK In addition to creating the head movementt tracking deviice, wed that a threesholding meth hod in this research we also show g method for the cann be used as the detecting and tracking position of the ob bject/light. However, in ordeer for this meth hod s constrain nts that must be to work correctlly there are some perating enviro onment. Notab bly ambient light appplied to its op andd the backgro ound should be b darker than n the object/light beiing tracked. g a In this researrch, we have succeeded in implementing devvice that can replace r the mou use input using g a head tracking moovement devicce in an inex xpensive way. The device has h strong potentialls primarily for people suffering from owever, the resolution r and d image quallity quadriplegia. Ho prooduced by the video input device is very strrongly influencced thee quality of thee tracking of th he head movem ments and by the moouse emulation n. F For the futuree developments of our systeem, the aim iss to devvelop more co onvenient toolss for the users, such as smalller ledd, better electro onic cables, ad djustable ear po ositioning hand dle, andd others. In ad ddition, the lig ght positioning g process can be im mproved using a more robustt algorithm thaat can handle the noise easily and can use a larg ger image resollution that allo ows nt of the mouse pointer. forr a more refined the movemen
25
[11] Devvice Driver Wikipeedia, http://en.wikiipedia.org/wiki/Deevice_driver. 200 8. Holmes, Mark A. The Advanced [12] Brayy, Andrew C.; Dicckens, Adrian C.; H Useer Guide for the BB BC Microcomputeer. Cambridge, UK K: Cambridge Miccrocomputer Centrre. 1983, pp. 442-4443. [13] MSD DN, DirectShow ddocumentation. htttp://msdn.microsoft.com/enus/liibrary/ms783323.aaspx. 2007. [14] Orennstein, David. “QuuickStudy: Appliccation Programminng Interface (AP PI)". Computerworrld. httpp://www.computerw rworld.com/action//article.do?commaand=viewArticl eBaasic&articleId=434487. 2000. [15] Toppik Augmented Reeality Wikipedie, httpp://en.wikipedia.orrg/wiki/Augmentedd_reality. 2008. [16] Pagge 1 Introduction too DirectShow, httpp://www.vwlowen..co.uk/directshow//page01.htm. 20088. [17] Thee DSPack Project. pprodigy.com. httpp://www.progdigy..com/?page_id=4. 2008.
G Gunawan receiveed the S.Kom degrree in Computer S Science in 19911 and M.Kom in Information T Technology in 20005 from Sekolahh Tinggi Teknik S Surabaya, Indonesia. Currently hee is pursuing a ddoctoral degree att the Institut Tekknologi Sepuluh N Nopember, in Surrabaya, Indonesia. He is currently w working as Vice D Dean of Academiic Affairs and a S Senior Lecturer at Sekolah T Tinggi Teknik S Surabaya. His research interests iinclude Natural L Language Proceessing (especiallyy in Bahasa IIndonesia), Artificcial Intelligence, Data and Web M Mining, Neural Network, Fuzzyy, and Genetic A Algorithm. F F. X. Ferdinandu us received the S..Kom degree in C Computer Sciencee in 1992 from Sekolah Tinggi T Teknik Surabaya, Indonesia, and thee M.T degree in C Computer Sciencce in 1997 from m the Institut T Teknologi Sepulluh Nopember, in Surabaya, IIndonesia. He is cuurrently working aas Vice Dean of S Student Affairs annd a Senior Lectuurer at Sekolah T Tinggi Teknik S Surabaya. His ressearch interests iinclude Networkinng, and Data Encryyption.
REFER RENCES [1]
CoderSource.neet, Binary Image, http://www.cod dersource.net/csharrp_color_image_to o_binary.aspx. 200 08. olding. In Digital [2] Gonzalez, Rafael C. & Woods, Richard E., Thresho Image Processin ng. Pearson Educaation. 2002, pp. 595-611. c Wikipedia, [3] Topik Primary color http://en.wikipeedia.org/wiki/Prim mary_color. 2008. S Wikipediia, [4] Topik Visible Spectrum http://en.wikipeedia.org/wiki/Visib ble_Spectrum. 200 08. ouse ocular para tetraplégicos t - Terrra [5] Redação Terra, Brasileiro cria mo oftware, Hardware & So http://tecnologiaa.terra.com.br/inteerna/0,,OI1124734 4-EI4801,00.html. 2008. [6] Tania Orsi, ScieenceNET, http://www.scieencenet.com.br/bacckup/english/scien ncenews/ed_48/48_ _oc ularmouse.htm. 2008. Q [7] Quadriplegia - Quadriplegia, http://www.spin nalinjury.net/quadrriplegia.htm. 2008 8. nd Human Rights, [8] UN Enable - Reelationship betweeen Development an http://www.un.o org/disabilities/deffault.asp?navid=35 5&pid=33. 2008. or browsers, operatting systems and search engines, [9] Market share fo http://marketshaare.hitslink.com/reeport.aspx?qprid=1 10. 2008. W http://www w.digitalmoviebox.ccom/mycamcar/ho ow[100] Jow Webcams Work, web-cams-work k/. 2008.
ISSN: 194 42-9703 / © 20110 IIJ
T Tri Kurniawan Wijaya receivedd his bachelor ddegree in Computter Science in 20007 from Sekolah T Tinggi Teknik Surrabaya, Indonesia.. He was a fullttime computer labboratory assistant ssince his second yyear of study in this university. He joined the D Department of Coomputer Science, at the Sekolah T Tinggi Teknik Suurabaya, as a facuulty member in D December 2007. C Currently, he is a ggraduate student iin the faculty of IInformatics, Viennna University of T Technology. His rresearch interests iinclude machine llearning, neural networks, deccision support ssystems, intelligennt agent, data minning, and game ttheory. IIndra Maryati rreceived the S.K Kom degree in C Computer Sciencee in 2010 from the Sekolah Tinggi T Teknik Surabayaa, Indonesia. Cuurrently she is ppursuing a master’s degree iin Information T Technology at the Sekolah T Tinggi Teknik S Surabaya. Additioonally, she is workking as a Junior L Lecturer at the Sekkolah Tinggi Teknnik Surabaya, in IIndonesia.
26
INT TERNETWORK KING INDONE ESIA JOURNAL L
Edwin Seno Dwihapsoro reeceived the S.K Kom omputer Science in i 2008 from Seko olah degree in Co Tinggi Tekn nik Surabaya, Ind donesia, and the S.E degree in Ecconomics in 2008 from the Universsitas Wijaya Putraa, Surabaya. He iss currently working in the IT Manaagement division at Magna Karsa Mu ulya Contactor Co ompany.
ISSN: 194 42-9703 / © 20110 IIJ
GUNAW WAN ET AL.
INTERNETWORKING INDONESIA JOURNAL
Vol.2/No.2 (2010)
27
UTAUT Model for Understanding Learning Management System I Gusti Nyoman Sedana Universitas Sanata Dharma Yogyakarta, Indonesia
St. Wisnu Wijaya Universitas Sanata Dharma Yogyakarta, Indonesia
Abstract—Sanata Dharma University has been developing a web based learning management system, named Exelsa (Experiential E-Learning of Sanata Dharma University) since 2008. Exelsa provides a number of learning facilities, including an online discussion board, lecture materials, course content management, a course calendar/schedule, information announcement, online test, auto-marked quizzes and exams. The research aims to analyze the most dominant factors that underlie the acceptance and use of Exelsa among the students of Sanata Dharma University by adopting UTAUT model (Venkatesh et al. in 2003). The data is gathered by distributing questionnaires to many course members and by collecting data from the database of Exelsa. After the data was tabulated then it was analyzed using Partial Least Square (PLS). The results shows that performance expectancy, social influence and facilitating conditions have significant influence (α = 0.05) on behavioral intention. It was also shown from two predictor of use behavior (behavioral intention and facilitating conditions), the behavioral intention has a significant influence on use behavior. The research model explained 27.3% of the variance in user intention to use Exelsa. Based on this results, it can be concluded that the UTAUT model is insufficient in explaining students’ intentions in using LMS. Index Terms—Exelsa, LMS, PLS, UTAUT.
S
I. INTRODUCTION
ejak diperkenalkannya Learning Management System (LMS), cara staf pengajar bekerja telah berubah pesat. Apakah pelajaran diajarkan sepenuhnya online atau apakah menggunakan paduan pendekatan tradisional dan online, sebagian besar staf pengajar di universitas harus merancang dan mengembangkan materi online, membuat dan memelihara situs web pembelajaran dan LMS telah menjadi sarana komunikasi yang dominan dengan siswa bagi banyak staf pengajar [12]. Definisi dari LMS adalah infrastruktur dimana e-Learning dapat dibangun dan diantarkan [10]. Sementara e-Learning adalah proses pembelajaran yang memanfaatkan teknologi I Gusti Nyoman Sedana adalah staf quality assurance di PT. Sigma Cipta Caraka. Penulis dapat dihubungi melalui email,
[email protected]. St. Wisnu Wijaya adalah Dosen Teknik Infromatika Universitas Sanata Dharma Yogyakarta. Penulis dapat dihubungi di
[email protected].
informasi dan komunikasi secara sistematis dengan mengintegrasikan semua komponen pembelajaran, termasuk interaksi pembelajaran lintas ruang dan waktu, dengan kualitas yang terjamin [17]. Pengembangan sistem pembelajaran berbasis digital (LMS) merupakan salah satu kegiatan untuk meningkatkan kualitas pembelajaran di Universitas Sanata Dharma [18]. Kegiatan ini diadakan untuk mendukung salah satu sasaran jangka pendek Universitas Sanata Dharma yaitu, peningkatkan efisiensi dan produktifitas. Untuk mencapai sasaran ini, sejak tahun 2008 Universitas Sanata Dharma telah mengembangkan dan mengimplementasikan sebuah LMS yang diberi nama Experiential E-Learning of Sanata Dharma University, yang lebih dikenal dengan nama Exelsa [13]. Proyek ini diikuti oleh kebijakan Universitas dengan mengadakan pelatihan penggunaan Exelsa dan pengembangan materi pembelajaran digital bagi para dosen [18]. Disamping itu, juga dilakukan sosialisasi mengenai keberadaan Exelsa melalui penyebaran brosur-brosur di lingkungan Universitas Sanata Dharma. Sebagai sebuah media pembelajaran, Exelsa harus dapat diterima dan digunakan oleh para penggunanya sehingga dapat meningkatkan produktivitas. Salah satu model terbaru untuk menjelaskan penerimaan pengguna (user acceptance) dalam bidang Sistem Informasi dikembangkan oleh Venkatesh, dkk. [20]. Model ini diberi nama the Unified Theory of Acceptance and Use of Technology (UTAUT). UTAUT menunjukkan bahwa niat untuk berperilaku (behavioral intention) dan perilaku untuk menggunakan suatu teknologi (use behavior) dipengaruhi oleh persepsi orangorang terhadap ekspektansi kinerja (performance expectancy), ekspektansi usaha (effort expectancy), pengaruh sosial (social influence) dan kondisi yang membantu (facilitating conditions) yang dimoderatori oleh jenis kelamin (gender), usia (age), pengalaman (experience) dan kesukarelaan (voluntariness). Teori ini menyediakan alat bagi para manajer untuk menilai kemungkinan keberhasilan pengenalan teknologi baru dan membantu mereka memahami penggerak penerimaan dengan tujuan untuk proaktif mendesain intervensi (termasuk pelatihan, sosialisasi, dll.) yang ditargetkan pada populasi pengguna yang mungkin cenderung kurang untuk mengadopsi dan menggunakan sistem baru [20]. Model asli UTAUT divalidasi menggunakan data yang dikumpulkan dari lingkungan non akademik. Meskipun demikian, beberapa peneliti telah menerapkan model ini di lingkungan akademik. Dasgupta, dkk. [4] menerapkan model UTAUT untuk memahami persepsi mahasiswa terhadap
ISSN: 1942-9703 / © 2010 IIJ
INTERNETWORKING INDONESIA JOURNAL
28
penerimaan dan penggunaan Case tools. Hasilnya effort expectancy tidak berpengaruh terhadap behavioral intention. Marchewka, dkk. [11], Sedana dan Wijaya [14] juga melaporkan adanya perbedaan dengan model asli UTAUT ketika mereka menerapkan UTAUT di lingkungan akademik. Menurut laporan Marchewka, dkk. [11], performance expectancy tidak memiliki hubungan yang signifikan dengan behavioral intention. Sementara dalam laporan studi pendahuluan yang dilakukan oleh Sedana dan Wijaya [14], effort expectancy tidak memiliki hubungan yang signifikan dengan behavioral intention. Sedana dan Wijaya [14] juga melaporkan bahwa facilitating conditions memiliki hubungan yang signifikan dengan behavioral intention. Meskipun hasil penelitian-penelitian dengan model UTAUT di lingkungan akademik sedikit berbeda dengan model aslinya, kami percaya dengan mengadopsi model UTAUT akan membantu kami untuk memahami faktor-faktor yang mendasari penerimaan dan penggunaan Exelsa. Namun untuk menyesuaikan dengan situasi dan kondisi lingkungan penelitian, dalam penelitian ini kami tidak menggunakan variabel moderator. II. EXELSA Experiential E-Learning of Sanata Dharma University (Exelsa) adalah sebuah Learning Management System (LMS) berbasis web yang dikembangkan untuk meningkatkan kualitas pembelajaran dan pendidikan di Universitas Sanata Dharma [13]. Sebagai sebuah LMS, Exelsa menyediakan sejumlah fasilitas pembelajaran seperti, tugas online, tes online, bahan kuliah, chating, menjawab kuesioner, melihat pengumuman, forum mata kuliah, kalender kegiatan dan sebagainya. Lebih lanjut, Exelsa diharapkan dapat meningkatkan efektifitas dan kualitas komunikasi pembelajaran dengan pendekatan knowledge management diantara berbagai pihak seperti dosen, mahasiswa, program studi, biro administrasi akademik, penyedia media pembelajaran serta berbagai pihak lainnya yang berkepentingan [13]. Penggunaan Exelsa di Universitas Sanata Dharma tidak bersifat wajib. Meski demikian, beberapa jurusan seperti, Teknik Informatika, Teknik Mesin, Teknik Elektro, Pendidikan Akuntansi dan Bimbingan dan Konseling telah memanfaatkannya. Para mahasiswa dari jurusan-jurusan ini memiliki pengalaman yang relatif sama dalam menggunakan Exelsa. Usia mereka pun tidak terpaut jauh antara satu dengan yang lainnya. Alasan-alasan inilah yang menyebabkan kami tidak menggunakan variabel moderator dalam penelitian ini.
SEDANA & WIJAYA
innovation diffusion theory (IDT) dan social cognitive theory (SCT). UTAUT terbukti lebih berhasil dibandingkan kedelapan teori yang lain dalam menjelaskan hingga 70 persen varian niat (intention) [14].
Gambar 1. Model UTAUT. Sumber: Venkatesh, dkk. [20].
III. UTAUT The Unified Theory of Acceptance and Use of Technology (UTAUT) merupakan salah satu model penerimaan teknologi terkini yang dikembangkan oleh Venkatesh, dkk. [20]. UTAUT mensintesis elemen-elemen pada delapan model penerimaan teknologi terkemuka untuk memperoleh kesatuan pandangan mengenai penerimaan pengguna. Kedelapan teori terkemuka yang disatukan di dalam UTAUT adalah theory of reasoned action (TRA), technology acceptance model (TAM), motivational model (MM), theory of planned behavior (TPB), combined TAM and TPB, model of PC utilization (MPTU),
Model UTAUT memiliki empat konstruk yang memainkan peran penting sebagai determinan langsung dari behavioral intention dan use behavior yaitu, performance expectancy, effort expectancy, social influence, dan facilitating conditions (definisinya dapat dilihat pada tabel 1). Disamping itu terdapat pula empat moderator: gender, age, voluntariness, dan experience yang diposisikan untuk memoderasi dampak dari konstruk-konstruk pada behavioral intention dan use behavior.
ISSN: 1942-9703 / © 2010 IIJ
INTERNETWORKING INDONESIA JOURNAL
Vol.2/No.2 (2010)
Gambar 1 menampilkan keterkaitan antara determinandeterminan dan moderator-moderator ini. IV. METODOLOGI Penelitian ini menggunakan model UTAUT yang telah dimodifikasi sedemikian rupa hingga menjadi lebih sederhana seperti terlihat pada gambar 2. Performance Expectancy Effort Expectancy
Behavioral Intention
Use Behavior
Social Influence Facilitating Condition
Gambar 2. Model Penelitian. Teknik pengambilan data dalam penelitian ini adalah survei dan pengambilan data melalui basis data Exelsa. Pengambilan sampel menggunakan metode purposive sampling dengan kriteria: (1) kelas kuliah yang dipilih adalah kelas yang memanfaatkan Exelsa dalam proses pembelajaran, (2) responden adalah mahasiswa Universitas Sanata Dharma yang mengikuti kuliah yang telah ditentukan pada saat penelitian. Data yang diperoleh dari survei adalah data primer sedangkan, pengambilan data perilaku penggunaan Exelsa dari basis data menghasilkan data sekunder. Data primer adalah data yang diperoleh langsung di lapangan. Sedangkan data sekunder adalah data yang diperoleh dari sumber-sumber yang telah ada [6]. Instrumen penelitian yang dipakai adalah skala UTAUT. Skala UTAUT diadaptasi dari skala penelitian Venkatesh dkk.[20]. Skala ini mencakup lima aspek yaitu, performance expectancy (PE), effort expectancy (EE), social influence (SI), facilitating conditions (FC), dan behavioral intention (BI). Model penilaian pernyataan terdiri dari lima alternatif jawaban yang diberi skor berkisar dari 1 sampai 5. Klasifikasi jawabannya adalah Sangat Tidak Setuju (STS), Tidak Setuju (TS), Netral (N), Setuju (S) dan Sangat Setuju (SS). Sementara itu, konstruk use behavior diukur menggunakan frekuensi login mahasiswa dalam 5 minggu. Frekuensi login diperoleh dengan cara mengolah data dari basis data Exelsa milik Lembaga Penjaminan Mutu Universitas Sanata Dharma. Sejumlah 232 mahasiswa dari berbagai program studi mengisi kuesioner, dari jumlah tersebut 214 kuesioner dinyatakan layak untuk dianalisis sedangkan sisanya tidak layak untuk dianalisis karena tidak valid. Pengambilan data kemudian dilanjutkan dengan mengambil data dari basis data Exelsa berdasarkan nomor induk mahasiswa (NIM) yang telah mengisi kuesioner dengan valid. Data ini kemudian diolah sehingga diperoleh hasil berupa frekuensi login mahasiswa selama lima minggu. Rentang waktu lima minggu tersebut merupakan masa dimana mahasiswa akan, sedang, dan sudah menempuh ujian tengah semester.
29
V. HASIL PENELITIAN DAN PEMBAHASAN Dari 214 responden yang telah mengisi kuesioner dengan valid, 107 responden adalah laki-laki dan sisanya adalah perempuan. Data yang telah terkumpul kemudian ditabulasi lalu dianalisis dengan menggunakan partial least square (PLS) dengan bantuan perangkat lunak SmartPLS versi 2.0 M3. Chin (1998) dalam Henseler, dkk. [7] memberikan kriteria untuk mengevaluasi model PLS dengan 2 langkah, yang mencakup (1) evaluasi terhadap outer model dan (2) evaluasi terhadap inner model. A. Evaluasi Terhadap Outer Model Outer model dengan indikator refleksif dievaluasi dengan convergent dan discriminant validity dari indikatornya dan composite reliability untuk blok indikator. Convergent validity dinilai berdasarkan korelasi antara skor item dengan skor konstruk yang dihitung dengan PLS [5], [7]. Tabel 2 memperlihatkan bahwa semua item memiliki loading factor di atas 0,70 sehingga memiliki convergent validity yang baik [5], [7]. Tabel 2 Loading Factor dari SmartPLS.
Tabel 3 Composite Reliability. BI
Composite Reliability 0,953
EE
0,847
FC
0,842
PE
0,869
SI
0,844
Tabel 4 Korelasi Variabel Laten dan Akar AVE.
ISSN: 1942-9703 / © 2010 IIJ
INTERNETWORKING INDONESIA JOURNAL
30
Discriminant validity dinilai dengan membandingkan nilai square root of average variance extracted (akar kuadrat AVE) setiap konstruk dengan korelasi antara konstruk dengan konstruk lainnya dalam model. Dari tabel 4 dapat diketahui bahwa nilai discriminant validity tergolong baik karena nilai akar kuadrat AVE setiap konstruk lebih besar daripada nilai korelasi antara konstruk dengan konstruk lainnya dalam model [5], [7]. Masing-masing konstruk dalam penelitian ini memiliki reliabilitas yang tinggi. Hal ini dapat dilihat dari tabel 3 dimana composite reliability memiliki nilai yang tinggi (di atas 0,80) [5], [7]. B. Evaluasi Terhadap Inner Model Pengujian terhadap inner model dilakukan dengan melihat nilai R-square yang merupakan uji goodness-fit model. Dari tabel 5 dapat diketahui bahwa model penelitian ini memberikan nilai R-square sebesar 0,273 dan 0,055. Hasil ini dapat diinterpretasikan sebagai berikut, (1) besarnya varian konstruk behavioral intention yang dijelaskan oleh varian konstruk performance expectancy, effort expectancy, social influence, dan facilitating conditions adalah 27,3% sedangkan sisanya dijelaskan oleh variabel lain di luar yang diteliti. (2) besarnya varian konstruk use behavior (USE) yang dijelaskan oleh varian konstruk facilitating conditions dan behavioral intention adalah 5,5% sedangkan sisanya dijelaskan oleh variabel lain di luar yang diteliti. Tabel 5 R Square. Variabel Variabel Independen Dependen PE EE BI SI FC FC BI
USE
R Square 0,273
0,055
Pengujian kemudian dilanjutkan dengan melihat signifikansi pengaruh variabel independen terhadap variabel dependen dengan melihat nilai signifikansi statistik t dan nilai koefisien parameter. Dari tabel 6 dapat diketahui bahwa performance expectancy, social influence, dan facilitating conditions memiliki pengaruh signifikan terhadap behavioral intention karena nilai t hitung lebih besar daripada nilai t tabel (1,97) pada taraf signifikansi sebesar 0,05. Behavioral intention juga memiliki pengaruh signifikan terhadap use behavior. Sementara effort expectancy tidak memiliki pengaruh yang signifikan terhadap behavioral intention begitu pula facilitating conditions tidak memiliki pengaruh yang signifikan terhadap use behavior. Nilai koefisien parameter pengaruh performance expectancy, social influence, dan facilitating conditions terhadap behavioral intention secara berturut-turut adalah
SEDANA & WIJAYA
0.147, 0.188, 0.322. Nilai koefisien parameter pengaruh behavioral intention terhadap use behavior adalah 0,244. Tabel 6 Path Coefficients (Mean, STDEV, T-Values).
C. Pembahasan Temuan bahwa performance expectancy memberikan pengaruh yang signifikan pada behavioral intention mendukung hasil penelitian Venkatesh, dkk. [20], Dasgupta, dkk. [4]. Responden menganggap bahwa penggunaan Exelsa dapat menolongnya untuk mendapatkan keuntungankeuntungan kinerja di pekerjaannya seperti, lebih mudah dan cepat dalam mengerjakan dan menyelesaikan tugas-tugas kuliah. Anggapan ini akan meningkatkan niatnya untuk menggunakan Exelsa. Tingkat kemudahan terkait penggunaan suatu sistem akan mempengaruhi niat untuk menggunakan sistem tersebut [20], [21]. Penelitian ini menemukan hasil yang berbeda yaitu, tingkat kemudahan terkait penggunaan suatu sistem (effort expectancy) tidak memberikan pengaruh yang signifikan pada behavioral intention. Dasgupta, dkk. [4], yang meneliti tentang penerimaan pengguna terhadap Case Tools dalam menganalisis dan mendesain suatu sistem, juga menemukan pengaruh yang tidak signifikan seperti ini. Pengaruh yang tidak signifikan ini dapat disebabkan karena Exelsa relatif mudah digunakan dan berdasarkan hasil wawancara singkat diperoleh informasi bahwa sebagian besar responden telah memperoleh pengetahuan mengenai teknologi informasi dan komunikasi sejak duduk di bangku Sekolah Menengah Atas. Sehingga, responden tidak menganggap bahwa kemudahan dalam menggunakan Exelsa akan mempengaruhi niatnya untuk menggunakan Exelsa. Social Influence memiliki pengaruh yang signifikan terhadap behavioral intention. Temuan ini mendukung hasil penelitian Venkatesh, dkk. [20] dan dasgupta, dkk. [4]. Hasil ini menunjukkan bahwa lingkungan sosial di sekitar responden seperti teman kuliah, dosen, dan universitas mempengaruhi niat mereka untuk menggunakan Exelsa. Variabel facilitating conditions ternyata memiliki pengaruh signifikan terhadap behavioral intention namun tidak memiliki pengaruh yang signifikan terhadap use behavior. Dasgupta, dkk. [4] juga menemukan bahwa facilitating conditions memiliki pengaruh yang signifikan pada behavioral intention. Temuan ini berbeda dengan hasil yang dilaporkan oleh Venkatesh, dkk. [20] dimana dalam laporannya dikatakan bahwa facilitating conditions tidak memiliki pengaruh yang signifikan terhadap behavioral intention namun memiliki pengaruh yang signifikan terhadap
ISSN: 1942-9703 / © 2010 IIJ
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
use behavior. Tampaknya ketersediaan infrastruktur teknik dan organisasional mendukung niat untuk menggunakan Exelsa tapi tidak mendukung perilaku penggunaan Exelsa. Lebih lanjut, hasil ini dapat disebabkan oleh lingkungan dimana penelitian ini diadakan. Penelitian ini diadakan di lingkungan akademik sedangkan model asli UTAUT divalidasi dengan data yang diperoleh dari lingkungan non akademik. Mengenai tidak signifikannya pengaruh facilitating conditions terhadap use behavior dapat disebabkan karena tidak digunakannya moderator experience dan age. Seperti yang diungkapkan oleh Venkatesh, dkk. [20] konstruk facilitating conditions apabila dimoderasi oleh experience dan age maka akan memiliki pengaruh yang signifikan pada use behavior. Namun karena responden dalam penelitian ini tergolong dalam satu kelompok usia yang sama, dan semuanya memiliki pengalaman yang relatif sama dalam menggunakan Exelsa maka kedua moderator ini tidak digunakan. Hal inilah yang dapat menyebabkan tidak signifikannya konstruk facilitating conditions dalam mempengaruhi use behavior. Penelitian ini juga menemukan bahwa behavioral intention memiliki pengaruh yang positif dan signifikan terhadap use behavior. Temuan ini sesuai dengan konsep dasar dari modelmodel penerimaan pengguna yaitu, niat untuk menggunakan teknologi informasi akan mempengaruhi penggunaan sebenarnya teknologi informasi tersebut [20].
UCAPAN TERIMA KASIH Penulis mengucapkan terima kasih kepada pihak-pihak yang telah membantu sehingga penelitian ini dapat diselesaikan, terutama kepada Ir. Ig. Aris Dwiatmoko, M.Sc., Prof. Diogenes de Souza Bido, dan Dr. Christian M. Ringle atas diskusinya mengenai metode statistik. Terakhir, namun tidak mengurangi rasa hormat, kami berterima kasih kepada Puspaningtyas Sanjoyo Adi, S.T., M.T., PH Prima Rosa, S.Si., M.Sc., I Gusti Ketut Puja S.T., M.T., dan Damar Widjaja, S.T., M.T., atas izinnya sehingga kami dapat mengambil data untuk penelitian ini. DAFTAR PUSTAKA [1] [2] [3] [4]
[5] [6] [7]
VI. KESIMPULAN DAN SARAN Berdasarkan hasil analisis empirik dan pembahasan yang telah dilakukan dalam penelitian ini, maka dapat disimpulkan bahwa (1) Variabel performance expectancy, social influence dan facilitating conditions terbukti signifikan mempengaruhi behavioral intention Mahasiswa Universitas Sanata Dharma dalam menggunakan Exelsa. Sementara variabel effort expectancy terbukti tidak signifikan. Variabilitas behavioral intention dapat diterangkan 27,3% dengan menggunakan variabel performance expectancy, social influence, effort expectancy dan facilitating conditions. (2) Variabel behavioral intention terbukti signifikan mempengaruhi use behavior. Sementara variabel facilitating conditions terbukti tidak signifikan. Variabilitas use behavior dapat diterangkan 5,5% dengan menggunakan variabel behavioral intention dan facilitating conditions. (3) Hasil penelitian menunjukkan bahwa model UTAUT belum bisa menjelaskan dengan baik penerimaan dan penggunaan Learning Management System di kalangan mahasiswa Universitas Sanata Dharma. Agar pemanfaatan Exelsa lebih meningkat, peneliti menyarankan hal-hal sebagai berikut (1) Cara mengajar sebaiknya dirancang sedemikian rupa hingga, mahasiswa percaya bahwa penggunaan Exelsa akan secara positif mempengaruhi performa belajarnya. (2) Para dosen sebaiknya mendorong mahasiswanya untuk menggunakan Exelsa karena mahasiswa percaya bahwa orang-orang yang penting baginya, misalnya dosen menganggap bahwa dia sebaiknya menggunakan Exelsa. (3) Universitas Sanata Dharma sebaiknya memberikan dukungan yang memadai terhadap ketersediaan infrastruktur teknis dan organisasional yang mendukung penggunaan Exelsa.
31
[8] [9] [10] [11]
[12] [13] [14]
[15] [16]
[17]
[18]
Arikunto, S., 2002, “Prosedur Penelitian: Suatu Pendekatan Praktek”, Edisi Revisi V, Rineka Cipta, Jakarta. Azwar, S., 2007, “Penyusunan Skala Psikologi”, Pustaka Pelajar, Yogyakarta. Azwar, S., 2008, “Reliabilitas dan Validitas”, Edisi ke-3, Pustaka Pelajar, Yogyakarta. Dasgupda, S., Haddad, M., Weiss, P., dan Bermudez, E., 2007, “User Acceptance of Case Tools in System Analysis and Design: an Empirical Study”, Journal of Informatics Education Research, Vol. 9, No. 1, pp. 51-78. Ghozali, Imam, 2008, “Struktural Equation Modeling Metode Alternatif dengan Partial Least Square (PLS)”, Edisi ke-2, Badan Penerbit Universitas Diponegoro, Semarang. Hasan, M.I., 2002, “Pokok-Pokok Materi Metodologi Penelitian dan Aplikasinya”, Ghalia Indonesia, Jakarta. Henseler, J., Ringle, C.M., dan Sinkovics, R.R., “The Use of Partial Least Squares Path Modeling in International Marketing”, dalam Sinkovics, R.R., Ghauri, P.N.(eds.), Advances in International Marketing, Vol. 20, Bingley 2009, pp. 277-320. Hesterberg, T., Monaghan, S., Moore, D.S., Clipson, A., dan Epstein, R., 2003, “Bootstrap Methods and Permutation Tests”, W. H. Freeman and Company. New York. Jogiyanto, HM, 2008, “Metodologi Penelitian Sistem Informasi: Pedoman dan Contoh Melakukan Penelitian di Bidang Sistem Teknologi Informasi”, Penerbit Andi, Yogyakarta. Lundy, J., dan Filho, W.A.A, 2004, “Magic Quadrant for Learning Management Systems”, http://mediaproducts.gartner.com/reprints/ wbtsystems/120167.html, Diakses tanggal 12 November 2009. Marchewka, J.T., Liu, C., dan Kostiwa, K., 2007, “An Application of the UTAUT Model for Understanding Student Perceptions Using Course Management Software”, Communications of the IIMA, Vol. 7, No. 2, pp. 93-104. McGill, T., Klobas, J., dan Renzi, S., 2008, “The Relationship between LMS Use and Teacher Performance: The Role of Task-Technology Fit”, Australasian Conference on Information Systems, pp. 648-657. P3MP, ”Manual Exelsa Mahasiswa”, http://www.exelsa. usd.ac.id/ goDownload.php?file=./uploads/tutorial/ManualExelsaMahasiswa.chm, Diakses tanggal 12 April 2009. Sedana, IGN., dan Wijaya, W., 2009, “Applying UTAUT Model to Reach Better Understanding on The Acceptance and Use of Learning Management System Case Study: Experiential E-Learning of Sanata Dharma University”, Proceedings of the International Conference on Advance Computer Science and Information Systems, pp 415-420. Sugiyono, 2009, “Metode Penelitian Kuantitatif, Kualitatif, dan R&D”, Edisi ke-6, Alfabeta, Bandung. Tibendarana, P.K.G., dan Ogao, P.J., 2008, “Information Communication Technologies Acceptance and Use Among University Communities in Uganda: A Model for Hybrid Library Services EndUsers”, International Journal of Computing and ICT Research, Vol. 1, No. 1, pp. 60-75. Tim Penyusun, 2007, “Pedoman Penjaminan Mutu Penyelenggaraan eLearning”, http://www.clr.ui.ac.id/files/pedoman%20penjaminan %20mutu%20 e-learning %20UI.pdf, Diakses tanggal 17 Desember 2009. Tim Penyusun, 2008, “Rencana Strategis 2008-2012”, Universitas Sanata Dharma, Yogyakarta.
ISSN: 1942-9703 / © 2010 IIJ
32
INTERNETWORKING INDONESIA JOURNAL
[19] Venkatesh, V, 2000, “Determinants of Perceived Ease of Use: Integrating Control, Intrinsic Motivation, and Emotion into the Technology Acceptance Model”, Information Systems Research, Vol. 11, No. 4, pp. 342-365. [20] Venkatesh, V., Morris, M.G., Davis, G.B., dan Davis, F.D., 2003, “User Acceptance of Information Technology: Toward a Unified View”, MIS Quarterly, Vol. 27, No. 3, pp. 425-478. [21] Wahid, F., 2007, “Using Technology Adoption Model to Analyze Internet Adoption and Use Among Men and Women in Indonesia”, The Electronic Journal on Information System in Developing Countries, Vol. 32, No. 6, pp. 1-8.
I Gusti Nyoman Sedana received his Bachelor degree from the Informatics Engineering Department, University of Sanata Dharma, Indonesia, in 2010. His previous research includes user acceptance models. He is currently a quality assurance staff member at the Sigma Cipta Caraka Corporation.
St. Wisnu Wijaya received his Bachelor degree from the Electrical Engineering Department, Gadjah Mada University and his Master of Informatics degree from the Department of Informatics, Bandung Institute of Technology (ITB), Indonesia. His current appointments are: lecturer of Department of Informatics and researcher at the Center for Information Technology Studies, Sanata Dharma University, Indonesia. Since 2009, he has been an expert staff for the Institute for Migrant Workers. His research interest is in the information system discipline, especially the role of information technology in social change. He has published numerous papers in peer-reviewed journals and conference proceeding. Mr St. Wisnu Wijaya has won numerous research grants, including research grants from the Government of Indonesia, Sanata Dharma University and International Donors.
ISSN: 1942-9703 / © 2010 IIJ
SEDANA & WIJAYA
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
33
Star Schema Design for Concept Hierarchy in Attribute Oriented Induction Spits Warnars Manchester Metropolitan University, United Kingdom Abstract—The Concept Hierarchy in attribute oriented induction is a powerful tool for saving the knowledge hierarchy in data, which will be then used to generalize mining rules for data mining. Database design influences the performance applications when reading records in database. When building the database table the allocation of fields will be decided by the lowest and the highest concept within the concept tree. Additionally, the number of hierarchy concept levels in a concept tree will decide the quantity of fields in the table. The number of record tables will be decided by the number of the lowest concepts in concept tree. For numeric values the treatment is different. For efficiency reasons the number of created table records will instead depend on the amount of concepts at the next generalization. All the tables from concept trees will become dimensional tables and the data table will become a fact table. All these tables in fact create a star schema. The Star schema is recognized as data warehouse schema for multidimensional analysis, and will give value add for attribute oriented induction for multidimensional purposes.
Index Terms—Data Mining, concept hierarchy, attribute oriented induction, rule, database.
T
I. INTRODUCTION
HE attribute oriented induction method integrates a machine learning paradigm – especially learning-fromexamples techniques [3] – with database operations, extracts generalized rules from an interesting set of data and discovers high level data regularities [13]. The attribute oriented induction method has been implemented in a data mining system prototype called DBMINER [16,17] which was previously called DBLearn [12,14], and has been tested successfully against a large relational database. The attribute oriented induction approach was developed for learning different kinds of knowledge rules such as characteristic rules, discrimination or classification rules, quantitative rules, data evolution regularities [15], qualitative rules [9], association rules and cluster description rules [13]. Attribute oriented induction has the concept hierarchy as an advantage, where a concept hierarchy as background knowledge can be provided by knowledge engineers or Manuscript received March 17, 2010. Spits Warnars is a doctoral student at the Department of Computing and Mathematics, Manchester Metropolitan University, John Dalton Building, Chester Street, Manchester M15GD United Kingdom. (e-mail:
[email protected]).
domain experts [8,10,13]. Concepts are ordered in a concept hierarchy by levels from specific or low level concepts into general or higher level concepts. Generalization is achieved by ascending to the next higher level concepts along the paths of concept hierarchy [11]. The most general concept is the null description as the most specific concepts correspond to the specific values of attributes in the database described as ANY. The concept hierarchy can be balanced or unbalanced, but an unbalanced hierarchy must then be converted to a balanced one. There are 3 Types of concept generalization on concept hierarchy [6] and in [5] number (2)-(4) as 3 types of rule based concept hierarchy and number (1) as a non rule based concept hierarchy. These are: 1) Unconditional concept generalization: the rule associated with the unconditional IS-A type rules. A concept is generalized to a higher level concept because of the subsumption relationship indicated in the concept hierarchy. 2) Conditional/deductive rule generalization: the rule associated with a generalization path as a deduction rule where the type of rules is conditional and can only be applied to generalize a concept if the corresponding condition can be satisfied. For example, form : A(x) Λ B(x) C(x) has the meaning that for a tuple x, the concept(attribute value) A can be generalized to concept C if condition B can be satisfied by x. Or concept C can be generalized if it can be satisfied by concept A and B. 3) Computational rule generalization: each rule is represented by a condition which is value-based and can be evaluated against an attribute or a tuple or the database by performing some computation. The truth value of the condition would then determine whether a concept can be generalized via the path. 4) Hybrid rule-based concept generalization: a hierarchy can have paths associated with all the above 3 different types of rules. It has a powerful representation capability and is suitable for many kinds of application. An implementation needs a database for storing records, while the application needs data mining tools for the user to read records from database and get the rules in for presentation. The database can be implemented using any kind of database management system, such as Microsoft Access, SQL Server, Oracle, MySQL and many other database management systems. The application can be implemented using any kind of language software, such as Java, Active
ISSN: 1942-9703 / © 2010 IIJ
34
INTERNETWORKING INDONESIA JOURNAL
Server Pages, Visual Basic and others. The choice of the database and application will depend on a number of factors, including on the organizational rules, monetary resources, and the technology. In considering databases, one important factor is the performance of the database system since it influences the performance of the application as a whole. A good database design will increase the application performance, while a bad database design will reduce the application performance. An additional factor is whether the database is normalized. On one hand the normalization of a database provides for the best performance and is useful for transactional processing systems with input, edit and delete operations. On the other hand, an unnormalized database will provide the best approach for online analytical processing where records need only to be read. Moreover, the number of SQL join operations will also influence performance as a whole. Typically, more join operations will need to be made in a normalized database compared to an unnormalized database. The expert knowledge in the area of database design contributes towards obtaining the best design and performance. Additionally, a distributed database can also be used for handling large distributed concept hierarchies. In this paper we transform the concept hierarchy into tables resulting in a new database design. The concept tree will be used as data input and the rules will be built based on the queries [23,26,28]. Using queries to build rules presents an efficient mechanism for understanding the mined rules ([22,25]). The use of the threshold as a control over the maximum number of tuples (within the target class in the final generalized relation) will no longer be needed. Instead, group by operator in the SQL Select statement will limit the final result generalization. Setting different thresholds will generate different generalized tuples (needed of global picture of induction). Doing this repeatedly is time-consuming and tedious work [27]. Instead, all the interesting generalized tuples as a multiple rule can be generated for the global picture of induction by using group by operator in the SQL Select statement. In converting the concept hierarchy into a database table and in order to obtain the best database design, there are a number of questions that need to be answered. These questions include how to perform the conversion (from the concept hierarchy into a table), how many tables will be created and based on which criteria. By answering these questions, we hope that a standard method will emerge for the conversion of concept hierarchies into good database designs. In the first instance the first step will be to explore how to convert the non-rule based concept hierarchy (as an unconditional concept hierarchy). Future work will address the conversion of from rule based concept hierarchy. In this paper we discuss only the conversion of the non-rule based concept hierarchy into tables in the star schema design. The implementation of the current and proposed attribute oriented induction, including the differentiation between them, can be found in [29].
WARNARS
II. CONVERTING FROM CONCEPT HIERARCHY INTO TABLE In order to aid the discussion and provide a link between the current work with our previous research, we will use the concept hierarchy based on data examples found in [2,8]. Figure 1 shows an example of a concept hierarchy from university database. Comp Math Physic Biology ... Literature Music History ...
Science ANY(major) Art
Burnaby British Columbia Victoria ... Canada Edmonton Alberta ... ... Bombay ...
India
Nanjing ...
China
ANY(Birthplace) Foreign
... Freshman Sophomore Junior Senior
Undergraduate
MA MS PhD
Graduate
0.0-1.99 2.0-2.99 3.0-3.49 3.5-4.0
Poor Average Good Excellence
ANY(Category)
ANY(GPA)
Fig. 1. Example of a Concept hierarchy
Based on concept hierarchy in Figure 1 we will build concept tree that represents a taxonomy of concepts of the values in an attribute domain. The concept tree will be built based on the most general concept which is described as ANY in concept hierarchy. Based on Figure 1 there are four (4) most general concepts with ANY, resulting in concept trees [24]. These are: 1) (science, art) ANY(major) 2) (graduate, undergraduate) ANY(category) 3) (Foreign, Canada) ANY(birthplace) 4) (poor, average, good, excellence) ANY(GPA) The dotted-line symbol (…) in Figure 1 shows the
ISSN: 1942-9703 / © 2010 IIJ
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
possibility of other concepts. For example, there is the possibility to extend the width of concept tree by extending the “Major” beyond “Science” and “Art”. Similarly, the “Birthplace” is not limited to just only “Canada” and “Foreign”. For the database implementation all the concept trees will be implemented as tables in the database. The concept tree for “Major” will be implemented as the table entitled hierarchy_major (see Table 1 and its explanation in Figure 2). The lowest level concept tree for “Major” in Figure 2 like comp, math, physics, biology, literature, music and history become the first field name Major and has varchar data type with length 15. The next and the last level concept tree for major in figure 2 like science and art become the next field name StudyProg and have varchar data type with length 15. TABLE I
35
As another example, the concept tree for “Category” will be implemented as table hierarchy_cat (see Table 3 and Figure 3). The lowest level concept tree for “Major” in Figure 3 such as “Freshman”, “Sophomore”, “Junior”, “Senior”, “MS”, “MA” and “PhD” become the first field named “Category” and has varchar data type with length 15. The last level concept tree for “Category” in Figure 3, such as “Undergraduate” and “Graduate” becomes the next field name “Study” and has varchar data type with length 15. TABLE III HIERARCHY_CAT TABLE Field Name Type Category varchar(15) Study varchar(15)
HIERARCHY_MAJOR T ABLE
Field Name Major StudyProg
Type varchar(15) varchar(15)
Field Name Type Category varchar(15) Study varchar(15)
Field Name Type Major varchar(15) StudyProg varchar(15) Fig. 2. The transformation of the concept tree for Major into hierarchy_major table
For example, if there are 10 lowest level concepts in the concept tree “Major” (Figure 2) then 10 records or tuples will be created in the table hierarchy_major based on field “Major” as the first field in that table. Each of the records will fill the next field studyprog based on generalization the lowest level concept tree major. For example for the record where the major field is “Computing”, its next field studyprog will be filled with “Science” because the computing as the lowest level concept tree major in Figure 2 has a generalization into Science. Table 2 is the result from the data from the concept tree “Major” in Figure 2. TABLE II RECORDS FOR HIERARCHY_MAJOR TABLE Major Computing Math Biology Chemistry Statistics Physics Music History Literal Arts Literature
StudyProg Science Science Science Science Science Science Art Art Art Art
Fig. 3. The transformation of the concept tree for Category into hierarchy_cat table
In Figure 3 there are seven (7) lowest level concepts from concept tree “Category”, which will create seven (7) records or tuples in table hierarchy_cat based on field Category as the first field in the table. The records filling next field “Study” will be based on the generalization the lowest level concept tree “Category”. For example the record where the category field contains “Freshman” will fill the next field (“Study”) with “Undergraduate” because the concept “Freshman” as the lowest level concept tree (in “Category” in Figure 3) has a generalization into “Undergraduate”. Table 4 shows this based on the data from concept tree “Category” in Figure 3. TABLE IV RECORDS FOR HIERARCHY_CAT TABLE Category Freshman Sophomore Junior Senior MS MA PhD
Study undergraduate undergraduate undergraduate undergraduate graduate graduate graduate
The concept tree for “Birthplace” will be implemented as the table hierarchy_birth (see Table 5 and Figure 4). The lowest level concept tree for birthplace in Figure 4 (such as Burnaby, Victoria, Edmonton, Bombay and Nanjing) will become the first field name “Birthplace” and has a varchar data type with length 15. The next level concept tree for birthplace in Figure 4 (namely British Columbia, Alberta, India and China) will become the next field name “City” and ISSN: 1942-9703 / © 2010 IIJ
36
INTERNETWORKING INDONESIA JOURNAL
has varchar data type with length 20. Note that this tree is different from the previous concept due to the fact that it has three (3) levels. As such, the next and the last level concept tree for birthplace in Figure 4 (namely Canada and Foreign) will become the next field name “Country” and has a varchar data type with length 10. Thus, the number of hierarchy levels will determine the number of the fields that will be created. In the case of the previous concept tree, because there are two (2) hierarchy levels, therefore two fields will be created for each concept tree. Similarly, since the concept tree “Birthplace” has three (3) hierarchy levels, therefore three fields will be created in the table. TABLE V HIERARCHY_BIRTH T ABLE
Field Name Birthplace City Country
Type varchar(15) varchar(20) varchar(10)
Field Name Type Birthplace varchar(15) City varchar(20) Country varchar(10)
field “Country” (which is the last field) with have the value “Foreign” because the concept “India” has a generalization into “Foreign” concept. Table 6 shows the resulting data from concept tree “Birthplace” as shown in Figure 4. The concept tree “GPA” will be implemented as table hierarchy_gpa (see Table 7 and Figure 5). Different to the previous concept trees, here there are a large range data for hierarchy levels. For example, the generalization for concept “Poor” comes from the value range between 0 and 1.99, and thus there will be 199 values possible (covering from 0.00 to 1.99). For efficiency reasons we just record the first value and last value in the range for each of hierarchy levels. As result, we will add one field containing the last value implying that “The number of hierarchy levels will decide the number of fields will be created”. We use this notation here because although the concept tree “GPA” (Figure 5) has only two levels (and thereby creating two fields in the database), adding 199 records to each field will result in a table with a total of 400 records. Thus for efficiency reasons we treat concept trees with numeric values differently [11,21,19,20]. Therefore, here we have created only 3 fields with 4 records. In the case of the GPA, the first field is GPA_start and will have a range a values using the float(3,2) data type. Next is field name GPA_fin that will also have a range a values using the float(3,2) data type. The last field name “Range” will be the same as the other concept trees where it has “the highest level before the most general concept ANY is the last field”. The field range has varchar data type with length 15.
Fig. 4. The transformation of the concept tree for Birthplace into hierarchy_birth table
Following the previous example, there are eleven (11) lowest level concepts in the concept tree “Birthplace” as shown in Figure 4. This will create eleven records or tuples in table hierarchy_birth based on field “Birthplace” as the first field in the table. In each record the next field entry will be based on the generalization on concept tree “Birthplace”.
Birthplace Bombay Burnaby Calgary Edmonton Nanjing Ottawa Richmond Shanghai Toronto Vancouver Victoria
TABLE VI RECORDS FOR HIERARCHY_BIRTH TABLE City Country India Foreign British Columbia Canada Alberta Canada Alberta Canada China Foreign Ontario Canada British Columbia Canada China Foreign Ontario Canada British Columbia Canada British Columbia Canada
WARNARS
TABLE VII HIERARCHY_ GPA TABLE Field Name GPA_start GPA_fin range
Type float(3,2) float(3,2) varchar(15)
Fig. 5. The transformation of the concept tree for GPA into hierarchy_gpa table
For example the record where the birthplace field contains the value “Bombay” will have the next field (“City”) filled with the value “India” because the concept “Bombay” as the lowest level in the concept tree “Birthplace” (see Figure 4) has a generalization into the India concept. Furthermore, the next
Because GPA concept tree in Figure 5 handles numeric values, then the number of created records will not depend on the number of concepts at the lowest level in concept tree. Instead for efficiency it will depend on the number of concepts at the next generalization. Now at the next generalization (after the lowest level concept) there are four (4) concepts, namely “Poor”, “Average”, “Good”, and “Excellent”. Different to the other cases, the handling of numeric values in the concept tree will be via specialization. For example the first data at the level after the first low level is “Poor” and we assign the GPA_start value of 0 and the GPA_fin with the
ISSN: 1942-9703 / © 2010 IIJ
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
value of 1.99. We do this also for the other fields. The result is shown in Table 8 using the concept tree in Figure 5.
37
concept slice, dice and pivot [4, 18]. The aggregate count function and group-by-operator in the SQL Select statement can represent the roll up process [1, 7].
TABLE VIII RECORDS FOR HIERARCHY_GPA TABLE GPA_start 0.0 2.0 3.0 3.5
GPA_fin 1.99 2.99 3.49 4.0
range Poor Average Good Excellent
Based on the previous explanation, below are the summary of the assumptions that we have used to convert a concept hierarchy into a table: 1) The lowest level concept tree maps to the first field and the highest level (before the most general concept ANY) is mapped to the last field. 2) The number of hierarchy levels in a concept tree will decide the number of fields in the table, with the exception of numeric values for efficiency reasons. 3) The number of concepts at the lowest level in a concept tree will become the number of records or tuples in the table, with the exception of numeric values for efficiency reasons. 4) For the efficiency handling numeric values in a concept tree, the number of created table records will not depend on the number of concepts at the lowest level in concept tree, but instead it will depend on the number of concepts at the next generalization. III. LOGICAL DATA MODEL Based on data examples found in [2,8], Table 9 shows the corresponding student table.
Name Anton Andi Amin Anil Ayin Amir Acai Abdi Afun Agung Ahing Akuan
TABLE IX STUDENT TABLE Category Major M.A. History Junior Math Junior Liberal arts M.S. Physics Ph.D. Math Sophomore Chemistry Senior Computing Ph.D. Biology Sophomore Music Ph.D. Computing M.S. Statistics Freshman literature
Birthplace Vancouver Calgary Edmonton Ottawa Bombay Richmond Victoria Shanghai Burnaby Victoria Nanjing Toronto
GPA 3.5 3.7 2.6 3.9 3.3 2.7 3.5 3.4 3.0 3.8 3.2 3.9
All the tables that represent a concept hierarchy can be connected to the student table. Figure 6 shows a class diagram with the connectivity between tables. In the context of the Data Warehouse concept, Figure 6 shows a star schema, where the student table is viewed as a fact table and the other tables from the concept tree are viewed as a dimensional table. As a result, the multi dimensional concept in Data Warehouse can be applied here, where the data can be roll up and drill down and the data can be viewed in multiple dimensions with
Fig. 6. A Logical data model
To improve the performance the logical data model in Figure 6, it can be changed to become one (1) table (as only one fact table) as shown in Table 10. The data as the representative all the data in Figure 6 is shown in Table 11 and 12. The performance can be improved by using unnormalized tables. This is because using one table is faster rather than using 5 tables through the decrease in the number of needed Join operations. However, the limitation of using only one fact table is that it poorly represents the concept hierarchy. For efficiency needs these two options can be used together, where queries will be directed at the single fact table and the concept hierarchy is used as the basis for the logical data model as exemplified by Figure 6. TABLE X ONE FACT TABLE Student Name Category Study Major Studyprog Birthplace City Country GPA Range TABLE XI DATA IN ONE FACT TABLE Name Anton Andi Amin Anil Ayin Amir Acai
Category
Study
Major
StudyProg
MA
graduate
History
art
Junior
undergraduate
Math
science
Junior
undergraduate
Literal Arts
art
MS
graduate
Physics
science
PhD
graduate
Math
science
Sophomore
undergraduate
Chemistry
science
Senior
undergraduate
Computing
science
ISSN: 1942-9703 / © 2010 IIJ
38 Abdi Afun Agung Ahing Akuan
INTERNETWORKING INDONESIA JOURNAL
PhD
graduate
Biology
science
Sophomore
undergraduate
Music
art
PhD
graduate
Computing
science
MS
graduate
Statistics
science
Freshman
undergraduate
Literature
art
TABLE XII CONTINUE DATA IN O NE FACT TABLE Birthplace
City
Country
GPA
Range
Vancouver
British Columbia
Canada
3.5
Excellent
Calgary
Alberta
Canada
3.7
Excellent
Edmonton
Alberta
Canada
2.6
Average
Ottawa
Ontario
Canada
3.9
Excellent
Bombay
India
Foreign
3.3
Good
Richmond
British Columbia
Canada
2.7
Average
Victoria
British Columbia
Canada
3.5
Excellent
Shanghai
China
Foreign
3.4
Good
Burnaby
British Columbia
Canada
3
Good
Victoria
British Columbia
Canada
3.8
Excellent
Nanjing
China
Foreign
3.2
Good
Toronto
Ontario
Canada
3.9
Excellent
Another option for efficiency is by the appropriate coding of the student data which can provide storage space savings, better query performance and easier data management. The following is an example of assigning codes to the field entries: 1) Coding for entries in the field studyprog: science 01, art 02. 2) Coding for entries in the field major: computing01, Math02, Biology03, Chemistry04, statistics05, physics06, Music07, History08, Literal arts09, Literature 10. 3) Coding for the field Study: undergraduate01, graduate02. 4) Coding for the field Category: Freshman01, Sophomore02, Junior03, Senior04, MS05, MA06, PhD07. 5) Coding for the field Birthplace: Bombay01, Burnaby02, Calgary03, Edmonton04, Nanjing05, Ottawa06, Richmond07, Shanghai08, Toronto09, Vancouver10, Victoria11. 6) Coding for the field City: India01, British Columbia02, Alberta03, China04, Ontario05. 7) Coding for the field Country: Canada01, foreign02. Note that the table representing hierarchy_gpa cannot be coded because of its numeric data. Table 13 shows the coding result for hierarchy_birth table, Table 14 for the hierarchy_cat table, Table for the student table, and Table 16 shows the coding result for hierarchy_major table.
WARNARS
TABLE XIII CODING FOR H IERARCHY_BIRTH TABLE BirthID 010102 020201 030301 040301 050402 060501 070201 080402 090501 100201 110201
Birthplace Bombay Burnaby Calgary Edmonton Nanjing Ottawa Richmond Shanghai Toronto Vancouver Victoria
City India British Columbia Alberta Alberta China Ontario British Columbia China Ontario British Columbia British Columbia
Country Foreign Canada Canada Canada Foreign Canada Canada Foreign Canada Canada Canada
TABLE XIV CODING FOR H IERARCHY_CAT TABLE CatID 0101 0102 0103 0104 0205 0206 0207
Category Freshman Sophomore Junior Senior MS MA PhD
Study undergraduate undergraduate undergraduate undergraduate graduate graduate graduate
TABLE XV CODING FOR STUDENT TABLE name Anton Andi Amin Anil Ayin Amir Acai Abdi Afun Agung Ahing Akuan
Category 0206 0103 0103 0205 0207 0102 0101 0207 0102 0207 0205 0101
Major 0208 0102 0209 0106 0102 0104 0101 0103 0207 0101 0105 0210
Birthplace 100201 030301 040301 060501 010102 070201 110201 080402 020201 110201 050402 090501
TABLE XVI CODING FOR H IERARCHY_MAJOR TABLE MajorID Major StudyProg 0101 Computing Science 0102 Math Science 0103 Biology Science 0104 Chemistry Science 0105 Statistics Science 0106 Physics Science 0207 Music Art 0208 History Art 0209 Literal Arts Art 0210 Literature Art
ISSN: 1942-9703 / © 2010 IIJ
Vol.2/No.2 (2010)
INTERNETWORKING INDONESIA JOURNAL
IV. CONCLUSION In order to convert a non-rule based concept hierarchy, there are some guidance which can help in the conversion. First, the lowest level concept is mapped to the first field and the highest level before the most general concept ANY is mapped to the last field. Secondly, the number of hierarchy levels will determine the number of fields except for numeric value (for efficiency reasons). Thirdly, the number of concepts at the lowest level in concept tree will become the number of records or tuples in the table (again with the exception of numeric values for efficiency reasons). Finally, for the efficiency handling numeric values in a concept tree, the number of created table records will not depend on the number of concepts at the lowest level in concept tree, but instead on the number of concepts at the next generalization. From an implementation perspective, there is the option of using one fact table or several combined tables, and there is also the option of using coding to improve efficiency and performance. REFERENCES [1] [2] [3] [4] [5]
[6] [7]
[8]
[9] [10] [11]
[12] [13]
[14] [15]
Alves, R. and Belo, O. (2007), „Multidimensional Data Mining„, SDDI‟07 – Doctoral Symposium. Cai, Y. (1989), „Attribute-oriented induction in relational databases‟, Master thesis, Simon Fraser University. Cai, Y., Cercone, N. and Han, J. (2007),‟Learning in relational databases: an attribute-oriented approach‟, Computational Intelligence, 7(3),119-132. Chen,M., Han, J. and Yu, P.S. (1996), „Data Mining: An overview from database perspective‟, IEEE Trans. Knowledge and Data Eng, 8(6),866883. Cheung, D.W., Fu, A.W. and Han, J. (1994), „Knowledge discovery in databases: A rule-based attribute-oriented approach‟, in Proceedings of Intl Symp on Methodologies for Intelligent Systems, Charlotte, North Carolina,164 -173. Cheung, D.W., Hwang, H.Y., Fu, A.W. and Han, J. (2000), „Efficient rule-based attribute-oriented induction for data mining‟, Journal of Intelligent Information Systems, 15(2), 175-200. Gray, J., Chaudhuri, S., Bosworth, A., Layman, A., Reichart, D. and Venkatrao, M. (1997), „Data Cube: A Relational Aggregation Operator Generalizing Group-By, Cross-Tab, and Sub-Totals‟, Data Mining and Knowledge Discovery, 1, 29-53. Han, J., Cai, Y. and Cercone, N. (1992), „Knowledge discovery in databases: An attribute-oriented approach‟, in Proceedings 18th International Conference Very Large Data Bases, Vancouver, British Columbia, 547-559. Han,J., Cai, Y., and Cercone, N. (1993), „Data-driven discovery of quantitative rules in relational databases‟, IEEE Trans on Knowl and Data Engin, 5(1), 29-40. Han,J. (1994), „Towards efficient induction mechanisms in database systems‟, Theoretical Computer Science, 133(2), 361-385. Han, J. and Fu, Y.(1994),‟Dynamic Generation and Refinement of Concept Hierarchies for Knowledge Discovery in Databases‟, in Proceedings of AAAI Workshop on Knowledge Discovery in Databases, 157-168. Han, J., Fu, Y., Huang, Y., Cai, Y., and Cercone, N. (1994), „DBLearn: a system prototype for knowledge discovery in relational databases‟, ACM SIGMOD Record, 23(2), 516. Han,J. and Fu, Y. (1995), „Exploration of the power of attribute-oriented induction in data mining‟, in U. Fayyad, G. Piatetsky-Shapiro, P. Smyth and R. Uthurusamy, eds. Advances in Knowledge Discovery and Data Mining, 399-421, AAAI/MIT Press. Han, J., Fu, Y., and Tang, S. (1995), „Advances of the DBLearn system for knowledge discovery in large databases‟, in Proceedings of the 14th international Joint Conference on Artificial intelligence, 2049-2050. Han, J., Cai, Y., Cercone, N. and Huang, Y. (1995), „Discovery of Data Evolution Regularities in Large Databases‟, Journal of Computer and Software Engineering, 3(1), 41-69.
39
[16] Han,J., Fu, Y.,Wang, W., Chiang, J., Gong, W., Koperski, K., Li,D., Lu, Y., Rajan,A., Stefanovic,N., Xia,B. and Zaiane,O.R. (1996), „DBMiner:A system for mining knowledge in large relational databases‟, in Proceedings Int'l Conf. on Data Mining and Knowledge Discovery, 250-255. [17] Han, J., Chiang, J. Y., Chee, S., Chen, J., Chen, Q., Cheng, S., Gong, W., Kamber, M., Koperski, K., Liu, G., Lu, Y., Stefanovic, N., Winstone, L., Xia, B. B., Zaiane, O. R., Zhang, S., and Zhu, H. (1997), „DBMiner: a system for data mining in relational databases and data warehouses‟, In Proceedings of the 1997 Conference of the Centre For Advanced Studies on Collaborative Research, 8. [18] Han,J., Lakshmanan, L.V.S. and Ng, R.T. (1999), „Constraint-based, multidimensional data mining‟, IEEE Computer, 32(8), 46-50. [19] Hsu, C. (2004),‟Extending attribute-oriented induction algorithm for major values and numeric values‟, Expert Systems with Applications, 27, 187-202. [20] Hu, X. (2003), „DB-HReduction: A Data Preprocessing Algorithm for Data Mining Applications‟, Applied Mathematics Letters,16(6),889-895. [21] Huang, Y. and Lin, S. (1996), „An Efficient Inductive Learning Method for Object-Oriented Database Using Attribute Entropy‟, IEEE Transactions on Knowledge and Data Engineering, 8(6),946-951 [22] Imielinski, T. and Virmani, A. (1999), „MSQL: A Query Language for Database Mining‟, in Proceedings of Data Mining and Knowledge Discovery, 3, 373-408 [23] Meo, R., Psaila,G. and Ceri,S. (1998),‟An Extension to SQL for Mining Association Rules‟, in Proceedings of Data Mining and Knowledge Discovery,2,195-224. [24] Muyeba,M.K. and Keane,J.A. (1999), „Extending attribute-oriented induction as a key-preserving data mining method‟, in Proceedings 3rd European Conference on Principles of Data Mining and Knowledge Discovery, Lecture Notes in Computer science, 1704, 448-455. [25] Muyeba, M. (2005),‟On Post-Rule Mining of Inductive Rules using a Query Operator‟, in Proceedings of Artificial Intelligence and Soft Computing. [26] Muyeba, M. and Marnadapali, R. (2005),‟A framework for Post-Rule Mining of Distributed Rules Bases‟, in Proceeding of Intelligent Systems and Control. [27] Wu, Y., Chen, Y. and Chang, R. (2009), „Generalized Knowledge Discovery from Relational Databases‟, International Journal of Computer Science and Network, 9(6),148-153. [28] Zaiane, O.R. (2001), „Building Virtual Web Views‟, Data and Knowledge Engineering, 39, 143-163. [29] Warnars, S. (2010),‟Attribute oriented induction with star schema‟,International Journal of Database Management Systems (IJDMS),2(2),20-42.
ISSN: 1942-9703 / © 2010 IIJ
Spits Warnars received his bachelor‟s degree in Information System from the University of Budi Luhur, Indonesia in 1995. He finished a master degree in Information Technology from university of Indonesia, in January 2007. Since 1995 he has been a full time lecturer at the Information Technology faculty, at University of Budi Luhur, in Indonesia. He has also been an auxiliary lecturer since January 2008 at the Bina Nusantara University in Indonesia. Spits Warnars has been completing his PhD degree in computer science at Manchester Metropolitan University, UK, since Sept 2008. His area of interest is data mining. He has published 3 papers in international journals, 6 papers at international conferences, 10 papers in Indonesian journals, and 7 papers at national conferences in Indonesia. He has been an ACM student member since 2008. He has been a reviewer for the World Scientific and Engineering Academy and Society since 2006. Since July 2010 he has been on the editorial board of the Journal of Computing and Applications (JCA), International Association for Computer Scientists and Engineers (IACSE) and Journal of Global Research in Computer Science (JGRCS).
Internetworking Indonesia Journal
The Indonesian Journal of ICT and Internet Development ISSN: 1942-9703
About the Internetworking Indonesia Journal The Internetworking Indonesia Journal (IIJ) was established out of the need to address the lack of an Indonesia-wide independent academic and professional journal covering the broad area of Information and Communication Technology (ICT) and Internet development in Indonesia. The broad aims of the Internetworking Indonesia Journal (IIJ) are therefore as follows: • Provide an Indonesia-wide independent journal on ICT: The IIJ seeks to be an Indonesia-wide journal independent from any specific institutions in Indonesia (such as universities and government bodies). Currently in Indonesia there are numerous university-issued journals that publish papers only from the respective universities. Often these university journals experience difficulty in maintaining sustainability due to the limited number of internally authored papers. Additionally, most of these university-issued journals do not have an independent review and advisory board, and most do not have referees and reviewers from the international community. • Provide a publishing venue for graduate students: The IIJ seeks also to be a publishing venue for graduate students (such as Masters/S2 and PhD/S3 students) as well as working academics in the broad field of ICT. This includes graduate students from Indonesian universities and those studying abroad. The IIJ provides an avenue for these students to publish their papers to a journal which is international in its reviewer scope and in its advisory board. • Improve the quality of research & publications on ICT in Indonesia: One of the long term goals of the IIJ is to promote research on ICT and Internet development in Indonesia, and over time to improve the quality of academic and technical publications from the Indonesian ICT community. Additionally, the IIJ seeks to be the main publication venue for various authors worldwide whose interest include Indonesia and its growing area of information and communication technology. • Provide access to academics and professionals overseas: The Editorial Advisory Board (EAB) of the IIJ is intentionally composed of international academics and professionals, as well as those from Indonesia. The aim here is to provide Indonesian authors with access to international academics and professionals who are aware of and sensitive to the issues facing a developing nation. Similarly, the IIJ seeks to provide readers worldwide with easy access to information regarding ICT and Internet development in Indonesia. • Promote the culture of writing and authorship: The IIJ seeks to promote the culture of writing and of excellent authorship in Indonesia within the broad area of ICT. It is for this reason that the IIJ is bilingual in that it accepts and publishes papers in English and Bahasa Indonesia. Furthermore, the availability of an Indonesia-wide journal with an international advisory board may provide an incentive for young academics, students and professionals in Indonesia to develop writing skills appropriate for a journal. It is hoped that this in-turn may encourage them to subsequently publish in other international journals in the future.
Focus & Scope: The Internetworking Indonesia Journal (IIJ) aims to become the foremost publication for practitioners, teachers, researchers and policy makers to share their knowledge and experience in the design, development, implementation, and the management of ICT and the Internet in Indonesia. Topics of Interest: The journal welcomes and strongly encourages submissions based on interdisciplinary approaches focusing on ICT & Internet development and its related aspects in the Indonesian context. These include (but not limited to) information technology, communications technology, computer sciences, electrical engineering, and the broader social studies regarding ICT and Internet development in Indonesia. A list of topics can be found on the journal website at www.InternetworkingIndonesia.org. Open Access Publication Policy: The Internetworking Indonesia Journal provides open access to all of its content on the principle that making research freely available to the public supports a greater global exchange of knowledge. This follows the philosophy of the Open Journal Systems (see the Public Knowledge Project at pkp.sfu.ca). The journal is published electronically (PDF format) and there are no subscription fees. Such access is associated with increased readership and increased citation of an author’s work.
Internetworking Indonesia Journal ISSN: 1942-9703 www.InternetworkingIndonesia.org