Security and Privacy in Data Science
30 March 2022
Presented by
Florian Kerschbaum
(University of Waterloo)
Abstract
Data science spans the process from the collection of data to the use of new insights gained from that data. It is at the core of the big data and machine learning revolution fueling the digitization of our economy. The integration of data science and machine learning into digital and cyber-physical processes, together with the often sensitive nature of the personally identifiable data involved, exposes the data science process to security and privacy threats. In this talk, I will review three exemplary security and privacy problems in different phases of the data science lifecycle and show potential countermeasures. First, I will show how to enhance the privacy of data collection using secure multi-party computation and differential privacy. Second, I will show how to protect data outsourced to a cloud database system while still performing efficient queries, using keyword PIR and homomorphic encryption. Last, I will show that differential privacy does not protect against membership inference attacks as expected.
See video on YouTube