Study Program: Manajemen Bisnis Telekomunikasi & Informatika (Telecommunication and Informatics Business Management)
Course: Big Data and Data Analytics
By: Teaching Team
CONCEPT/FRAMEWORK BIG DATA ANALYTICS ACTIVITIES
Fakultas Ekonomi dan Bisnis
Program Studi:
Dosen:
School Economic and Business
MANAJEMEN BISNIS TELEKOMUNIKASI & INFORMATIKA
Yudi Priyadi, M.T.
Telkom University
Outline
o Review Data Analytics & Big Data (last week's topics)
o Understanding Data
o Activity / Storytelling Based on Data Type (Model Based)
o Asking the Questions to Data
Creating the great business leaders
Definitions
Analytics:
“the systematic computational analysis of data or statistics” (Google Definition)
“the method of logical analysis” (Merriam-Webster)
Analysis of data is a process of inspecting, cleaning, transforming, and modeling data with the goal of discovering useful information, suggesting conclusions, and supporting decision-making.
Data science is an interdisciplinary field about processes and systems to extract knowledge or insights from data in various forms, either structured or unstructured,[1][2] which is a continuation of some of the data analysis fields such as statistics, data mining, and predictive analytics (from Wikipedia, citing many references), and many more.
1. Dhar, V. (2013). "Data science and prediction". Communications of the ACM. 56 (12): 64. doi:10.1145/2500499.
2. Leek, J. (2013-12-12). "The key word in 'Data Science' is not Data, it is Science". Simply Statistics.
Data Science
Based on the aforementioned definitions, we can conclude that Data Analytics includes:
• Data engineering
• Scientific method
• Math
• Statistics
Data Engineering may include:
• Data Gathering
• Data Mining
• Data Transformation
• Data Cleansing
• etc.
Ref: many sources
Recall
Our approach to Big Data
Big Data Approach Framework
Some people use 3Vs, 6Vs, 7Vs, or even 12Vs to explain big data, but the original measures of "bigness" are volume, velocity, and variety. For example, the 7Vs are:
1. Volume
2. Velocity
3. Variety
4. Variability
5. Veracity
6. Visualization
7. Value
Data Analytics Workflow
[Figure: Data Set → Data Analytics Methods → Knowledge]
Big Data Analytics Constructors
UNDERSTANDING DATA : High Dimensional Data
Dimension / Attributes / Properties:

| Name | Address | Occupation | Age | Blood Type | Marital Status | ... | Sex |
|---|---|---|---|---|---|---|---|
| Agus | Jl. Mawar 1 | Artist | 30 | A | Married | ... | Male |
| Andry | Jl. Kucing 50 | Lawyer | 32 | O | Married | ... | Male |
| Beatrice | Jl. Raya 27 | Student | 21 | O | Single | ... | Female |
| Ben | Jl. Diponegoro 12 | Driver | 37 | AB | Married | ... | Male |
| ... | ... | ... | ... | ... | ... | ... | ... |
| Zorro | Jl. Dago 34 | Student | 18 | B | Single | ... | Male |
High-dimensional data adds complexity to Big Data Analytics.
Curse: a large search space, and the need for summarization and dimensionality reduction (e.g., PCA)
Blessing: comprehensive knowledge of the data
Ref: Donoho (2000), High-Dimensional Data Analysis: The Curses and Blessings of Dimensionality
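UNDERSTANDING DATA : Network vs Non Network Data

Non Network Data

| Name | Sex | Age | Number of Friends |
|---|---|---|---|
| Agus | Male | 25 | 2 |
| Cecep | Male | 23 | 3 |
| Dita | (not specified) | 21 | 2 |
| Rina | (not specified) | 22 | 1 |

Network Data (1 = friends, 0 = not friends)

| | Agus | Cecep | Dita | Rina |
|---|---|---|---|---|
| Agus | - | 1 | 1 | 0 |
| Cecep | 1 | - | 1 | 1 |
| Dita | 1 | 1 | - | 0 |
| Rina | 0 | 1 | 0 | - |

[Figure: friendship graph with nodes Agus, Dita, Cecep, Rina]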
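
The link between the two views can be sketched in a few lines of Python: the "Number of Friends" attribute in the non-network table is the row sum of the network (adjacency) matrix. This is only an illustrative sketch; encoding the diagonal "-" entries as 0 and the variable names are assumptions, not part of the slides.

```python
# Adjacency matrix from the Network Data table (1 = friends, 0 = not friends).
# Assumption: the "-" entries on the diagonal are encoded as 0.
names = ["Agus", "Cecep", "Dita", "Rina"]
adjacency = [
    [0, 1, 1, 0],  # Agus
    [1, 0, 1, 1],  # Cecep
    [1, 1, 0, 0],  # Dita
    [0, 1, 0, 0],  # Rina
]

# Number of friends per person = sum of that person's row.
friend_count = {name: sum(row) for name, row in zip(names, adjacency)}

# Edge list of the friendship graph (each undirected pair listed once).
edges = [
    (names[i], names[j])
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if adjacency[i][j] == 1
]

print(friend_count)
print(edges)
```

The row sums reproduce the friend counts listed for each person, which shows that the non-network table is a summary of the network data, while the edge list keeps who is connected to whom.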
UNDERSTANDING DATA : Structured vs UnStructured Data
Characteristics

| Structured Data | Unstructured Data |
|---|---|
| Well-defined content | Structure not obvious |
| Easily understood | Must be processed to be understood |
| Stored in RDBMS | RDBMS not a good fit |
| Easy to enter, store, and analyze | Difficult and costly to analyze |
| Example: data in database tables (customer data, sales data, sensor data) | Example: email, video files, audio files, web pages, presentations, social media feeds |
UNDERSTANDING DATA : SQL vs NoSQL
[Figure: SQL vs NoSQL]
*NoSQL: Not only SQL
MODELLING FRAMEWORK
Case Studies : Data Analytics Common Roles
1. Estimation
2. Prediction
3. Classification
4. Clustering
5. Association
1. Estimation
Estimate Pizza Delivery Time

| Customer | Number of Orders (O) | Number of Traffic Lights (TL) | Distance (D) | Delivery Time (T) |
|---|---|---|---|---|
| 1 | 3 | 3 | 3 | 16 |
| 2 | 1 | 7 | 4 | 20 |
| 3 | 2 | 4 | 6 | 18 |
| 4 | 4 | 6 | 8 | 36 |
| | 2 | 4 | 2 | 12 |
| ... | | | | |
| 1000 | | | | |

Label: Delivery Time (T)
Learning with Estimation Methods (Linear Regression)
Knowledge: Delivery Time (T) = 0.48O + 0.23TL + 0.5D
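
Once a regression model has been learned, using it is just a matter of plugging new attribute values into the formula. The sketch below applies the slide's illustrative coefficients (0.48, 0.23, 0.5) as given; they are not refit here, so the outputs need not match the sample rows in the table. The function name is hypothetical.

```python
def estimate_delivery_time(orders, traffic_lights, distance):
    """Apply the linear model T = 0.48*O + 0.23*TL + 0.5*D.

    The coefficients are copied from the slide as an illustration;
    a real model would be fitted to the full data set.
    """
    return 0.48 * orders + 0.23 * traffic_lights + 0.5 * distance


# Estimate for a new, unseen order (made-up input values).
new_estimate = estimate_delivery_time(orders=2, traffic_lights=5, distance=4)
print(round(new_estimate, 2))
```

The point of the example is the workflow: the data set is used once to learn the coefficients, and the resulting formula is the reusable "knowledge" for estimating new cases.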
Output/Pattern/Model/Knowledge
1. Formula/Function (regression formula or function)
• DELIVERY TIME = 0.48 + 0.6 DISTANCE + 0.34 TRAFFIC LIGHT + 0.2 ORDER
2. Decision Tree
3. Correlation and Association
4. Rule
• IF GPA (IPK) > 3.5 THEN graduate cum laude
5. Cluster
2. Prediction
Predict Stock Price
The stock price data set is in the form of a time series.
[Table: stock price time series, with the Label column marked]
Learning with Prediction Methods (Neural Network)
2. Prediction
Predict Stock Price
Knowledge in the form of a Neural Network Model
[Figure: Prediction Plot]
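3. Classification
Classify Student Graduation Time

| Student Number | Sex | National Final Score | School Origin | IPS1 | IPS2 | IPS3 | IPS4 | ... | Graduation Status |
|---|---|---|---|---|---|---|---|---|---|
| 10001 | L | 28 | SMAN 2 | 3.3 | 3.6 | 2.89 | 2.9 | ... | On Time |
| 10002 | P | 27 | SMA DK | 4.0 | 3.2 | 3.8 | 3.7 | ... | Late |
| 10003 | P | 24 | SMAN 1 | 2.7 | 3.4 | 4.0 | 3.5 | ... | Late |
| 10004 | L | 26.4 | SMAN 3 | 3.2 | 2.7 | 3.6 | 3.4 | ... | On Time |
| | L | 23.4 | SMAN 5 | 3.3 | 2.8 | 3.1 | 3.2 | ... | On Time |
| ... | | | | | | | | | |
| 11000 | | | | | | | | | |

Sex: L = male, P = female. IPS1 to IPS4 = semester GPA for semesters 1 to 4.
Label: Graduation Status
Learning with Classification Methods (C4.5)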
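
Decision tree learners such as C4.5 choose which attribute to split on by measuring how much a split reduces the entropy of the label. The sketch below computes entropy and plain information gain on the five visible rows of the student table only, which is far too little data for a real conclusion. Note also that C4.5 itself normalizes this quantity into a gain ratio, which is not shown here; the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting."""
    total = len(labels)
    remainder = 0.0
    for value in set(feature_values):
        subset = [lab for f, lab in zip(feature_values, labels) if f == value]
        remainder += len(subset) / total * entropy(subset)
    return entropy(labels) - remainder


# The five visible rows of the table: Sex and Graduation Status.
sex = ["L", "P", "P", "L", "L"]
graduation_status = ["On Time", "Late", "Late", "On Time", "On Time"]

status_entropy = entropy(graduation_status)
gain_by_sex = information_gain(sex, graduation_status)
print(round(status_entropy, 3), round(gain_by_sex, 3))
```

On these five rows the Sex attribute happens to separate the two classes perfectly, so its gain equals the full label entropy; with the complete data set the ranking of attributes could look quite different.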
3. Classification
Classify Student Graduation Time
Knowledge in the form of a Decision Tree Model
3. Classification
Golf Playing Time Recommendation
Input: [Table: weather data set]
Output (Rules)
If outlook = sunny and humidity = high then play = no
If outlook = rainy and windy = true then play = no
If outlook = overcast then play = yes
If humidity = normal then play = yes
If none of the above then play = yes
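
The rules above form an ordered decision list: they are checked from top to bottom and the first one that matches decides the outcome. A minimal Python sketch of that reading follows; the function name and the use of a boolean for `windy` are assumptions made for illustration.

```python
def recommend_play(outlook, humidity, windy):
    """Evaluate the golf rules in order; the first matching rule wins."""
    if outlook == "sunny" and humidity == "high":
        return "no"
    if outlook == "rainy" and windy:
        return "no"
    if outlook == "overcast":
        return "yes"
    if humidity == "normal":
        return "yes"
    return "yes"  # default rule: none of the above matched


# A rainy, windy day is rejected by rule 2 even though humidity is normal,
# because rule 2 is checked before rule 4.
print(recommend_play("rainy", "normal", True))
print(recommend_play("sunny", "normal", False))
```

The order matters: swapping the second and fourth rules would change the answer for a rainy, windy day with normal humidity.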
3. Classification
Golf Playing Time Recommendation
Output (Decision Tree)
3. Classification
Contact Lens Recommendation
Input
Output
4. Clustering
Finding Iris Flower Cluster
Input: Dataset without Label
Learning with Clustering Methods (K-Means)
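
K-Means groups unlabeled rows by repeating two steps: assign every point to its nearest centroid, then move each centroid to the mean of its assigned points. The sketch below is a bare-bones version in plain Python. The six two-dimensional points are made-up flower measurements rather than the data set from the slide, and choosing two of the points as initial centroids is an assumption made to keep the run deterministic.

```python
from math import dist


def k_means(points, centroids, iterations=10):
    """Plain k-means: alternate assignment and centroid-update steps."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for each point.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical (petal length, petal width) measurements, no labels attached.
points = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.3), (4.7, 1.4), (4.5, 1.5), (4.9, 1.5)]
labels, centroids = k_means(points, centroids=[points[0], points[3]])
print(labels)
```

In practice the result depends on the number of clusters and on the initial centroids, so library implementations usually run several random initializations and keep the best one.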
4. Clustering
Finding Iris Flower Cluster
Output (Distance Plot)
5. Association
Association Product Sold
Learning with Association Method (FP-Growth)
5. Association
Association Product Sold
Output (Association Rules)
5. Association
• The objective of an association rule algorithm is to find attributes (items) that occur together.
• Example: on a Thursday night, 1000 customers made purchases; 200 of them bought Soap, and 50 of those 200 also bought Fanta.
• The resulting association rule is "If buy Soap, then buy Fanta", with support = 50/1000 = 5% (the share of all transactions that contain both Soap and Fanta) and confidence = 50/200 = 25% (the share of Soap buyers who also bought Fanta). Soap on its own has a support of 200/1000 = 20%.
• Some association rule algorithms are: the Apriori algorithm, the FP-Growth algorithm, and the GRI algorithm.
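
The arithmetic behind the Soap and Fanta example can be checked with a short Python sketch. It uses the common definition in which the support of a rule is the share of all transactions that contain both items, and it also reports the support of the antecedent (Soap) alone; the variable names are illustrative.

```python
# Counts from the example: 1000 customers, 200 bought Soap,
# and 50 of those 200 also bought Fanta.
total_transactions = 1000
soap_transactions = 200
soap_and_fanta_transactions = 50

# Support of Soap alone (the antecedent of the rule).
antecedent_support = soap_transactions / total_transactions

# Support of the rule "If buy Soap, then buy Fanta":
# share of all transactions containing both items.
rule_support = soap_and_fanta_transactions / total_transactions

# Confidence of the rule: share of Soap buyers who also bought Fanta.
rule_confidence = soap_and_fanta_transactions / soap_transactions

print(f"support(Soap)         = {antecedent_support:.0%}")
print(f"support(Soap, Fanta)  = {rule_support:.0%}")
print(f"confidence(Soap -> Fanta) = {rule_confidence:.0%}")
```

Algorithms such as Apriori and FP-Growth automate this counting over every frequent item combination and keep only the rules whose support and confidence exceed chosen thresholds.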
Assignment (in the Class)
o Find a Case Study of Big Data Implementation / Application for Business or others
o State the objective, problems, solution idea
o State the methodology used (explain)
o State the model, measurement, accuracy
Assignment (at home)
o Find a Case Study of Big Data Implementation / Application for Business or others
o State the objective, problems, solution idea
o State the methodology used (explain)
o State the model, measurement, accuracy, evaluation
o Learn Big Data online free course (www.bigdatauniversity.com)