Privacy-preserving Data Analytics: Problems, Solutions and Challenges
InfoSec Summer School on Information Security

7 August 2015
Bilbao, Spain



Big data analytics has the potential to solve many of the world's pressing problems and to create exciting new opportunities for individuals, corporations and governments. Application examples include finding treatments and cures for diseases, streamlining the world's transportation systems, securing people and infrastructure against acts of terrorism, deploying sustainable energy sources in smart grids, driving customer-centric businesses in the internet age, and many more. However, the data requirements of big data analytics appear, almost fundamentally, to be in conflict with the idea of privacy. Indeed, much of big data analytics today involves indiscriminate information gathering with scant regard for individual privacy. How can expressive data analysis be conducted while protecting the privacy of the people about whom that data is collected? The objective of this course is to present a systematic study of the area of privacy-preserving analytics, and to encourage an understanding of its capabilities, limitations, tradeoffs and challenges. We will motivate the need for privacy-aware analytics and specify the privacy requirements of the various players in the big data analytics setting. Next, we will introduce privacy-preserving technologies that can be brought to bear on this problem, including beautiful results from cryptography (homomorphic encryption, secure multiparty computation, verifiable computing) and statistical privacy mechanisms (k-anonymity and its variants, differential privacy). We will then highlight the gap between what existing privacy technologies have achieved and the demands of privacy-preserving analytics. This gap creates several interesting challenges that should fuel future research in this area.