Wetenschappelijk Onderzoek- en Documentatiecentrum Ministerie van Veiligheid en Justitie
Privacy in the BIG DATA ERA
Sunil Choenni
19 November 2014
Content
• Introduction
• Privacy
• Three Cases: Public Safety, Health Care, and Data Sharing
• Conclusions & Further Research
Privacy in the big data era |
BIG OPPORTUNITIES (in the service of people)
• Automatically signalling what needs to go in your refrigerator
• Simulating that you are at home while you are on holiday
• Cars that communicate with each other
• Better and more up-to-date predictions
• Helping the elderly navigate
• Safety
• Serious games that approximate reality
BIG CHALLENGES: Processing touches many core elements of computer science:
• file systems (Google File System)
• new approaches to managing data (BigTable)
• advanced reasoning techniques: reasoning with uncertainty, incompleteness, and ignorance
• new programming paradigms (MapReduce)
• new programming languages (for example Sawzall)
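To make the MapReduce paradigm named above concrete, here is a minimal single-machine sketch of its three phases (map, shuffle, reduce) applied to word counting. It only illustrates the programming model; the function names are hypothetical, and a real MapReduce system distributes these phases over many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run the three phases in sequence over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "big challenges"])
print(counts["big"])  # 2
```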
BIG CHALLENGES: THE SOFT SIDE
• Which data may you combine with each other?
• How do you interpret the resulting combinations?
• How do you preserve the relation with actual reality?
• Privacy becomes an issue; price differentiation is already happening, also at AH
BIG CHALLENGES: Privacy
• Because of the large data flows and the combination of data, the chance of a privacy violation is greater
• Suppose a school publishes average exam grades, broken down by boys and girls; from social media we know that only two girls take a certain subject, and that one of them scored a 9. The grade of the other girl is then known as well
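The school example can be checked with simple arithmetic: a published mean over a group of two, plus one known grade, fixes the other grade. The sketch below assumes a published girls' average of 8.0, a figure invented purely for illustration (the slide only gives the known grade of 9); the function name is hypothetical as well.

```python
def infer_unknown_grade(published_mean, group_size, known_grades):
    """Recover the single unknown grade in a group from its published mean."""
    if group_size - len(known_grades) != 1:
        raise ValueError("exactly one grade in the group must be unknown")
    # mean * size is the sum of all grades; subtract the ones already known
    return published_mean * group_size - sum(known_grades)


# Assumed figures: girls' average 8.0 (invented), two girls, one known 9
print(infer_unknown_grade(8.0, 2, [9.0]))  # 7.0
```

The same subtraction works for any group size as long as all but one grade is known, which is why small groups in published aggregates are risky.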
• (all) information that is laid down in the world of law and legislation in documents/texts such as laws, contracts, protocols, covenants, treaties, summonses, judgments, etc. has been digitised,
• including metadata and the social media communication relating to it, and
• including the results of the digital search behaviour of citizens and organisations.
This gives rise to Legal Big Data, which in turn
*) can be analysed with the help of software/machines/algorithms, and
*) lead to exciting, relevant results that are transparent and enlightening.
Human Values / Privacy
• Open to different interpretations depending on context
• A value is a belief that a specific mode of conduct or end state is personally or socially preferable to an opposite or converse mode
• Values refer to desirable goals and are ordered by their importance
• The relative importance of multiple values guides the actions of people
Human Values / Privacy
• Health care: privacy concerns the fact that people are not willing to be watched unnecessarily by caretakers
• Public safety: privacy pertains to the exposure of the identity of individuals
Privacy by Design: Three examples
Public Safety
Goal: provide policy makers with a tool that lets them create mashups, i.e., combine data from different sources and create their own content.
Requirement: prevent undesired effects
• violation of privacy
• misinterpretation of statistics
• disclosure of the identity of a group of individuals
Information Need
Gathered by means of two workshops: about 30 people participated, ranging from junior policy makers to directors
Some individual meetings after the workshops
Results
• Three types of questions
• Contextual data is required as well
• Requirements for the tool
Sketch of the phenomenon of crime, as presented at the workshops
Three types of questions
Simple queries. For example, how many people in a region within a time period responded in a specific way to a specific survey question?
Context of a quantifier. For example, how does the growth or decline of a specific figure in a geographical region relate to another figure? A growth in bicycle thefts in a neighbourhood can turn into a relative decline when local population growth exceeds the growth in thefts. This contextualisation must be considered in order to understand the public safety data.
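The bicycle-theft example can be worked through numerically. The sketch below uses invented figures (100 thefts among 10,000 residents, then 110 thefts among 12,000 residents) and hypothetical helper names; it shows an absolute increase that becomes a decrease once it is related to population size.

```python
def rate_per_1000(count, population):
    """Number of incidents per 1,000 residents."""
    return 1000 * count / population


def relative_change(before, after):
    """Fractional change from 'before' to 'after' (0.1 means +10%)."""
    return (after - before) / before


# Invented figures for one neighbourhood in two consecutive years
thefts = (100, 110)
population = (10_000, 12_000)

absolute_growth = relative_change(*thefts)
rate_growth = relative_change(
    rate_per_1000(thefts[0], population[0]),
    rate_per_1000(thefts[1], population[1]),
)

print(absolute_growth > 0)  # True: more thefts in absolute terms
print(rate_growth < 0)      # True: fewer thefts per resident
```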
Similarity queries, i.e. looking for regions that share in some respect the same context. After querying for a specific data set in which some numbers stand out in some way, the user can query for other regions that show similar numbers or trends.
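One possible reading of a similarity query is sketched below: regions are compared by their year-over-year relative changes, and those within a chosen distance of the target region are returned. The distance measure (Euclidean), the threshold, the region names, and the figures are all assumptions for illustration; the slides do not state how similarity is defined in the tool.

```python
import math


def trend(series):
    """Year-over-year relative changes of a series of non-zero figures."""
    return [(b - a) / a for a, b in zip(series, series[1:])]


def similar_regions(data, target, max_distance):
    """Regions whose trend lies within max_distance of the target's trend."""
    reference = trend(data[target])
    matches = []
    for region, series in data.items():
        if region == target:
            continue
        distance = math.dist(reference, trend(series))
        if distance <= max_distance:
            matches.append((region, distance))
    # Closest regions first
    return sorted(matches, key=lambda item: item[1])


# Invented yearly figures per region
figures = {
    "region_a": [100, 110, 121],
    "region_b": [50, 55, 60],
    "region_c": [200, 180, 150],
}
for region, distance in similar_regions(figures, "region_a", 0.05):
    print(region)  # region_b
```

Comparing relative changes rather than raw counts keeps regions of different size comparable, which matches the contextualisation point made for the second question type.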
[Figure: architecture diagram. Labels: source; ETL process (data); Data Warehouse; data access layer (store/retrieve data, set queries); mashup_to_SQL; translator 1, translator 2, ..., translator n; query results; mashed up data; interface layer with mashup module (defined mashup) and presentation module.]
Architecture: to prevent violation of privacy
• Only attributes that are in line with the Dutch Personal Data Protection Act are stored in the system, i.e., no data about someone's religion or life conviction, political conviction, health, sexual orientation, or ethnic origin
• Only aggregated data are stored
• Mashups that contain results that may violate privacy are not shown by the presentation module (e.g., if there are only 2 convicted persons for a crime type X, this is not shown; likewise, if 90% of the people in a region are involved in crime, this is not shown)
• (An extensive explanation module to facilitate interpretation)
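The suppression rule applied by the presentation module can be sketched as a simple check on each aggregated cell. The two examples on the slide (2 convicted persons; 90% of a region) motivate the thresholds below, but the exact cut-off values and the function name are assumptions, not the tool's actual settings.

```python
MIN_COUNT = 3    # assumed: counts below this could single out individuals
MAX_SHARE = 0.9  # assumed: shares at or above this expose almost a whole group


def may_show(count, population):
    """Return True if an aggregated cell is safe to present."""
    if count < MIN_COUNT:
        return False  # too few persons, e.g. only 2 convicted for crime type X
    if population > 0 and count / population >= MAX_SHARE:
        return False  # nearly everyone in the region is covered by the figure
    return True


print(may_show(2, 5000))    # False: small count
print(may_show(900, 1000))  # False: 90% of the region
print(may_show(40, 5000))   # True
```

Both checks are needed: a small count identifies individuals directly, while a very high share reveals something about nearly every member of a group.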
Health Care
• Small-scale housing for dementia patients is gaining interest
• Goal: to increase the quality of life by offering substitutes for traditional nursing homes
• Houses are equipped with infrared sensors to alert staff if patients need assistance, e.g., falling from bed may lead to fractures (from which recovery is expensive for the elderly)
• Many false alarms due to falling blankets and/or eiderdowns
Health care
[Figure 1: Design of a health care system. Labels: Sensor 1, ..., Sensor n (sensor data); camera (images); microphone (sound); server; media (e.g., PDA).]
Health care: Privacy and Trust
• Privacy objections were removed, since the server decides whether it sends the movements of patients to the caretakers
• Whenever a caretaker switches on the camera in a house, this is logged by the system and passed to higher management
• Trust between patient and caretaker remains a value, since patients consider the system an extension of the staff and not a replacement
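The logging rule for camera use can be sketched as follows: every switch-on is recorded and forwarded to management before anything else happens. The class, field names, and identifiers are hypothetical; the slides do not describe how the actual system stores or forwards these records.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable


@dataclass
class CameraAuditLog:
    """Records every camera activation and forwards it to management."""
    notify_management: Callable[[dict], None]
    entries: list = field(default_factory=list)

    def switch_on(self, caretaker: str, house: str) -> dict:
        entry = {
            "caretaker": caretaker,
            "house": house,
            "time": datetime.now(timezone.utc).isoformat(),
        }
        self.entries.append(entry)     # logged by the system
        self.notify_management(entry)  # passed to higher management
        return entry


# A plain list stands in for management's inbox in this sketch
management_inbox = []
audit = CameraAuditLog(notify_management=management_inbox.append)
audit.switch_on("caretaker-17", "house-3")
print(len(management_inbox))  # 1
```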
Feedback Procedure
Parties: WODC (data controller); DANS (data processor); researcher (data requester, or a next data processor)
0: data
1: data request
2: feedback
3: offline negotiation (optional)
4: deny/grant access
revised policy
5: data
Implicit Feedback
Parties: WODC (data controller); DANS (data processor); researcher (data requester, or a next data processor)
5: data
6: implicit feedback
scientific article
Big Challenges: the soft side
• Which data may you combine with each other?
• How do you interpret the resulting combinations?
• Distinction between access to data (Access Control) and use of data (Use Control)
• Monitoring of data use
• Use levels (specify per field how it may be used)
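The distinction between access control and use control, together with per-field use levels and monitoring, can be sketched as below. The field names, the set of uses ("aggregate", "publish", "link"), and the policy table are invented for illustration; the slides only state that a use level is specified per field.

```python
# Hypothetical policy: for each field, the uses that are permitted
FIELD_USE_POLICY = {
    "age": {"aggregate"},
    "region": {"aggregate", "publish"},
    "case_id": {"link"},
}

usage_log = []  # monitoring: every attempted use is recorded


def has_access(user, granted_users):
    """Access control: may this user obtain the data at all?"""
    return user in granted_users


def use_field(user, field_name, intended_use, granted_users):
    """Use control: is this particular use of this field permitted?"""
    allowed = (
        has_access(user, granted_users)
        and intended_use in FIELD_USE_POLICY.get(field_name, set())
    )
    usage_log.append((user, field_name, intended_use, allowed))
    return allowed


granted = {"researcher_1"}
print(use_field("researcher_1", "age", "aggregate", granted))  # True
print(use_field("researcher_1", "age", "publish", granted))    # False
```

In this sketch a user with access can still be refused a specific use, which is the point of separating the two controls; the log keeps refused attempts as well as permitted ones.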