Privacy Preserving Machine Learning

CCS 2019 Workshop
London, November 15

Scope

This one-day workshop focuses on privacy-preserving techniques for training, inference, and disclosure in large-scale data analysis, in both the distributed and centralized settings. We have observed increasing interest from the ML community in leveraging cryptographic techniques such as Multi-Party Computation (MPC) and Homomorphic Encryption (HE) for privacy-preserving training and inference, as well as Differential Privacy (DP) for disclosure. Simultaneously, the systems security and cryptography communities have proposed various secure frameworks for ML. We encourage both theory- and application-oriented submissions exploring a range of approaches, including:

  • secure multi-party computation techniques for ML
  • homomorphic encryption techniques for ML
  • hardware-based approaches to privacy preserving ML
  • centralized and decentralized protocols for learning on encrypted data
  • differential privacy: theory, applications, and implementations
  • statistical notions of privacy including relaxations of differential privacy
  • empirical and theoretical comparisons between different notions of privacy
  • trade-offs between privacy and utility

We believe it will be very valuable to have a forum that unifies these different perspectives and starts a discussion about the relative merits of each approach. The workshop will also serve as a venue for connecting people from different communities interested in this problem, and will hopefully foster fruitful long-term collaborations.

Call For Papers & Important Dates

Submission deadline: July 1, 2019 (11:59pm AoE) [Extended]
Notification of acceptance: August 7, 2019
CCS early registration deadline: October 1, 2019 (11:59PM BST)
Workshop: November 15, 2019

Submission Instructions

Submissions in the form of extended abstracts must be at most 4 pages long (not including references) and use the CCS format. We do accept submissions of work recently published or currently under review. Submissions should be anonymized. The workshop will not have formal proceedings, but authors of accepted abstracts can choose to have a link to an arXiv version or a PDF published on the workshop webpage.

Poster Instructions

All accepted papers have a slot at the poster session. Please print your poster at a size of up to A0 (841 × 1189 mm) and bring it to the conference.

Travel Grants

Thanks to our generous sponsors, we are able to provide a limited number of travel grants of up to $800 to help partially cover the expenses of PPML attendees who have not received other travel support from CCS this year. To apply, please send an email to ppml19@easychair.org with the subject “PPML19 Travel Grant Application”, including a half-page statement of purpose and a summary of anticipated travel expenses. If you are an undergraduate or graduate student, we ask that a half-page recommendation letter supporting your application be sent to the same email address by the deadline. The deadline for applications is Sep 23, 2019 (11:59pm AoE); notifications will be sent by Sep 30. Please feel free to send us an email if you have any questions.

Invited Speakers

  • Kobbi Nissim — Legal Theorems of Privacy
  • Rachel Cummings — Differential Privacy for Dynamic Databases
  • Vitaly Shmatikov — Unwanted Machine Learning
  • Kim Laine — From Homomorphic Encryption to Private AI: Successes, Challenges, and Opportunities

Schedule

8:50 Welcome
9:00 Invited talk: Kobbi Nissim — Legal Theorems of Privacy
Significant gaps between legal and technical thinking around data privacy make it hard to exactly understand how legal standards apply to economic mechanisms that use personal information. Such mechanisms may apply technical solutions constructed under frameworks such as differential privacy and k-anonymity. However, there is a lot of uncertainty regarding how these mathematical concepts meet legal standards. As a result, arguments that mechanisms apply sufficient technical privacy measures for satisfying legal privacy often lack rigor, and their conclusions are uncertain. The uncertainty is exacerbated by a litany of successful privacy attacks on privacy measures thought to meet legal expectations but then shown to fall short of doing so. We examine strategies for introducing mathematical rigor into the analysis, so as to make formal claims and prove “legal theorems” that technical privacy measures meet legal expectations. For that, we explore some of the gaps between these two very different approaches, and present initial strategies towards bridging these gaps considering examples from US and EU law. Based on joint works with Aaron Bembenek, Mark Bun, Aloni Cohen, Marco Gaboardi, Urs Gasser, David R. O’Brien, Thomas Steinke, Alexandra Wood, and Salil Vadhan.
9:45 Secure parallel computation on national scale volumes of data (contributed talk)
Sahar Mazloom, Le Phi Hung, Samuel Ranellucci and S. Dov Gordon
In this work, we revisit secure computation of graph parallel algorithms, simultaneously leveraging all three of the following advances: we assume four computation servers (with an honest majority, and one malicious corruption), allow differentially private leakage during computation, and, exploiting the parallelism that this affords, we construct an MPC protocol that can perform at national scales. Concretely, we compute histograms on 300 million inputs in 4.18 minutes, and we perform sparse matrix factorization, which is used in recommendation systems, on 20 million inputs in under 6 minutes. These problems have broad, real-world applications, and, at this scale, we could imagine supporting the Census Bureau, or a large company such as Amazon. For comparison, the largest experiments in GraphSC [6] and OblivGraph [18] had 1M inputs, and required 13 hours and 2 hours of runtime, respectively, while using 4 times the number of processors that we employ. End-to-end, our construction is 320X faster than OblivGraph.
10:05 Poster Session and Coffee Break
10:45 Invited talk: Rachel Cummings — Differential Privacy for Dynamic Databases
Privacy concerns are becoming a major obstacle to using data in the ways we want. How can data scientists make use of potentially sensitive data, while providing rigorous privacy guarantees to the individuals who provided data? Over the last decade, differential privacy has emerged as the de facto gold standard of privacy preserving data analysis. Differential privacy ensures that an algorithm does not overfit to the individuals in the database by guaranteeing that if any single entry in the database were to be changed, then the algorithm would still have approximately the same distribution over outputs.
In this talk, we will focus on recent advances in differential privacy for dynamic databases, where the content of the database evolves over time as new data are acquired. First, we will see how to extend differentially private algorithms for static databases to the dynamic setting, with relatively small loss in the privacy-accuracy tradeoff. Next, we will see algorithms for privately detecting changes in data composition. We will conclude with a discussion of open problems in this space, including the use of differential privacy for other types of data dynamism.
(based on joint works with Sara Krehbiel, Kevin Lai, Yuliia Lut, Yajun Mei, Uthaipon (Tao) Tantipongpipat, Rui Tuo, and Wanrong Zhang.)
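For reference, the guarantee described informally above corresponds to the standard (ε, δ)-differential privacy definition; the formulation below is the usual one and is not taken from the talk itself:

    \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta
    \quad \text{for all neighboring databases } D, D' \text{ and all sets of outputs } S.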
11:30 Garbled Neural Networks are Practical (contributed talk)   
Marshall Ball, Brent Carmer, Tal Malkin, Mike Rosulek and Nichole Schimanski
We show that garbled circuits offer a practical choice for secure evaluation of neural network classifiers, comparable with complex, specialized protocols using less robust assumptions, many rounds of interaction, and/or tailor-made neural networks. In particular, we develop a scheme for garbling "off-the-shelf" pre-trained neural networks, where the only model preprocessing required is a mild discretization step, as opposed to requiring a specialized SFE-friendly model to be independently trained. Moreover, as our solution is a garbling scheme, it inherits a much more diverse range of applications than non-garbling-based solutions, perhaps most notably efficient compilers for the malicious setting. At the protocol level, we start with the garbling scheme of Ball, Malkin, and Rosulek (ACM CCS 2016) for arithmetic circuits and introduce new optimizations for modern neural network activation functions. We develop fancygarbling, the first implementation of the BMR16 garbling scheme along with our new optimizations, as part of a heavily optimized garbled-circuits tool that is driven by a TensorFlow classifier description. We evaluate our constructions on a wide range of neural networks. We find that our approach is up to 100x more efficient than straightforward boolean garbling. It is also roughly 40% more efficient than DeepSecure (Rouhani et al., DAC 2018), a recent garbled-circuit-based approach for secure neural network evaluation, which incorporates significant optimization techniques for boolean circuits. Furthermore, our approach provides competitive performance tradeoffs (efficiency and latency vs. communication) when compared with non-garbled-circuit approaches as well.
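As an illustration of the kind of preprocessing the abstract refers to, here is a minimal sketch of fixed-point discretization of pre-trained weights; the scale factor and rounding rule are illustrative assumptions, not the paper's exact scheme:

    import numpy as np

    def discretize_weights(weights, num_bits=8):
        """Map floating-point weights onto a signed fixed-point integer grid."""
        # Choose a scale so the largest-magnitude weight fits in num_bits.
        scale = (2 ** (num_bits - 1) - 1) / max(np.max(np.abs(weights)), 1e-12)
        # Round to the nearest representable integer; the scale is returned so
        # downstream layers can interpret the integers consistently.
        return np.round(weights * scale).astype(np.int64), scale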
11:50 Learning Rate Adaptation for Federated and Differentially Private Learning (contributed talk)   
Antti Koskela and Antti Honkela
We propose an algorithm for adapting the learning rate of stochastic gradient descent (SGD) that avoids the need for a validation set. The idea for the adaptiveness comes from the technique of extrapolation: to estimate the error against the gradient flow which underlies SGD, we compare the result obtained by one full step with that obtained by two half-steps. The algorithm is applied in two separate frameworks: federated and differentially private learning. Using examples of deep neural networks, we empirically show that the method works robustly in the case of federated learning, unlike commonly used optimisation methods. We also show that the adaptive algorithm is competitive with state-of-the-art hyperparameter search methods and commonly used optimisation methods for differentially private training.
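A minimal sketch of the full-step versus two-half-steps comparison described above, assuming a generic stochastic gradient oracle grad(w); the tolerance and the grow/shrink rule are illustrative and not the authors' exact algorithm:

    import numpy as np

    def extrapolation_sgd_step(w, grad, lr, tol=0.1):
        g = grad(w)
        w_full = w - lr * g                             # one full step
        w_half = w - 0.5 * lr * g                       # first half-step
        w_two_half = w_half - 0.5 * lr * grad(w_half)   # second half-step
        # The gap between the two estimates approximates the local error of the
        # Euler step against the underlying gradient flow.
        err = np.linalg.norm(w_full - w_two_half)
        lr = lr * 1.1 if err < tol else lr * 0.5        # grow or shrink the rate
        return w_two_half, lr                           # keep the more accurate iterate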
12:10 Lightning Talks
1. Nitin Agrawal, Ali Shahin Shamsabadi, Matthew Kusner and Adrià Gascón. QUOTIENT: Secure Two-Party Neural Network Training and Prediction via Oblivious Transfer
2. Yuantian Miao, Ben Zi Hao Zhao, Minhui Xue, Chen Chao, Lei Pan, Jun Zhang, Dali Kaafar and Yang Xiang. The Audio Auditor: Participant-Level Membership Inference in Internet of Things Voice Services
3. Harsh Chaudhari, Ashish Choudhury, Arpita Patra and Ajith Suresh. ASTRA: High Throughput 3PC over Rings with Application to Secure Prediction
4. Brendan Avent, Javier Gonzalez, Tom Diethe, Andrei Paleyes and Borja Balle. Automatic Discovery of Privacy-Utility Pareto Fronts
5. Marco Romanelli, Konstantinos Chatzikokolakis and Catuscia Palamidessi. Optimal Obfuscation Mechanisms via Machine Learning and Applications to Location Privacy
6. Niek J. Bouman and Niels de Vreede. A Practical Approach to the Secure Computation of the Moore-Penrose Pseudoinverse over the Rationals
7. James Bell, Aurélien Bellet, Adria Gascon and Tejas Kulkarni. Private Protocols for U-statistics in the Local Model and Beyond
8. Mark Abspoel, Niek J. Bouman, Berry Schoenmakers and Niels de Vreede. Fast Secure Comparison for Medium-Sized Integers and Its Application in Binarized Neural Networks
9. Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock and Anderson Nascimento. Secret Sharing based Private Text Classification
10. Qingrong Chen, Chong Xiang, Minhui Xue, Bo Li, Nikita Borisov, Dali Kaafar and Haojin Zhu. Differentially Private Data Sharing: Sharing Models versus Sharing Data
11. Antti Koskela, Joonas Jälkö and Antti Honkela. Computing Exact Guarantees for Differential Privacy
12. Thijs Veugen, Bart Kamphorst, Marie Beth van Egmond and Natasja van de L'Isle. Privacy-Preserving Coupling of Vertically-Partitioned Databases and Subsequent Training with Gradient Descent
13. Sebastian P. Bayerl, Ferdinand Brasser, Christoph Busch, Tommaso Frassetto, Patrick Jauernig, Jascha Kolberg, Andreas Nautsch, Korbinian Riedhammer, Ahmad-Reza Sadeghi, Thomas Schneider, Emmanuel Stapf, Amos Treiber and Christian Weinert. Privacy-Preserving Speech Processing via STPC and TEEs
14. Ranya Aloufi, Hamed Haddadi and David Boyle. Emotionless: Privacy-Preserving Speech Analysis for Voice Assistants
15. Lukas Burkhalter, Alexander Viand, Anwar Hithnawi and Hossein Shafagh. Robust Secure Aggregation for Privacy-Preserving Federated Learning with Adversaries
16. Anders Dalskov, Daniel Escudero and Marcel Keller. Secure Evaluation of Quantized Neural Networks
17. Mohammad Yaghini, Bogdan Kulynych and Carmela Troncoso. Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning
12:40 Poster Session and Lunch Break
14:00 Invited talk: Vitaly Shmatikov — Unwanted Machine Learning
Modern machine learning models exhibit amazing accuracy on tasks from image classification to natural-language processing, but accuracy does not tell the entire story of what these models have learned. Does a model memorize and leak its training data? Does it “accidentally” learn privacy-violating tasks uncorrelated with its training objective? Can it hide a backdoor introduced by an adversary? All of these are examples of unwanted learning, which we need to understand and mitigate in order to solve security and privacy problems in today's AI.
14:45 On Inferring Training Data Attributes in Machine Learning Models (contributed talk)   
Benjamin Zi Hao Zhao, Hassan Jameel Asghar, Raghav Bhaskar and Mohamed Ali Kaafar
A number of recent works have demonstrated that API access to machine learning models leaks information about the dataset records used to train the models. Further, the work of [9] shows that such membership inference attacks (MIAs) may be sufficient to construct a stronger breed of attribute inference attacks (AIAs), which, given a partial view of a record, can guess the missing attributes. In this work, we show (to the contrary) that MIAs may not be sufficient to build a successful AIA. This is because the latter requires the ability to distinguish between similar records (differing only in a few attributes), and, as we demonstrate, the current breed of MIAs is unsuccessful in distinguishing member records from similar non-member records. We thus propose a relaxed notion of AIA, whose goal is only to approximately guess the missing attributes, and argue that such an attack is more likely to be successful if MIAs are used as a subroutine for inferring training record attributes.
15:05 Poster Session and Coffee Break
15:45 Invited talk: Kim Laine — From Homomorphic Encryption to Private AI: Successes, Challenges, and Opportunities
In this talk the audience will learn about modern homomorphic encryption and how it can be used to bring strong privacy guarantees to specific machine learning applications. We will also discuss limitations of homomorphic encryption in general, and specifically in the context of private machine learning, and get a glimpse of how some new exciting research directions may help resolve these challenges in the future.
16:30 Efficient Secure Ridge Regression from Randomized Gaussian Elimination (contributed talk)   
Frank Blom, Niek J. Bouman, Berry Schoenmakers and Niels de Vreede
In this paper we present a practical protocol for secure ridge regression. We develop the necessary secure linear algebra tools, using only basic arithmetic over prime fields. In particular, we will show how to solve linear systems of equations and compute matrix inverses efficiently, using appropriate secure random self-reductions of these problems. The distinguishing feature of our approach is that the use of secure fixed-point arithmetic is avoided entirely, while circumventing the need for rational reconstruction at any stage as well. We demonstrate the potential of our protocol in a standard setting for information-theoretically secure multiparty computation, tolerating a dishonest minority of passively corrupt parties. Using the MPyC framework, which is based on threshold secret sharing over finite fields, we show how to handle large datasets efficiently, achieving practically the same root-mean-square errors as Scikit-learn. Moreover, we do not assume that any (part) of the datasets is held privately by any of the parties, which makes our protocol much more versatile than existing solutions.
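For context, the linear system solved securely in such a protocol is the standard ridge regression normal-equations system; this formulation is textbook material and not specific to the paper:

    \hat{w} = \arg\min_{w} \;\|Xw - y\|_2^2 + \lambda \|w\|_2^2
    \qquad\Longleftrightarrow\qquad
    (X^{\top} X + \lambda I)\,\hat{w} = X^{\top} y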
16:50 A verification framework for secure machine learning (contributed talk)   
Prasad Naldurg and Karthikeyan Bhargavan
We propose a programming and verification framework to help developers build distributed software applications that use composite homomorphic encryption (and secure multi-party computation protocols) to implement secure machine learning and classification over private data. With our framework, a developer can prove that the application code is functionally correct, that it correctly composes the various cryptographic schemes it uses, and that it does not accidentally leak any secrets (via side channels, for example). Our end-to-end solution results in verified and efficient implementations of state-of-the-art privacy-preserving learning and classification techniques.
17:10 Wrap-up

Accepted Papers

Marshall Ball, Brent Carmer, Tal Malkin, Mike Rosulek and Nichole Schimanski.
Garbled Neural Networks are Practical   
Frank Blom, Niek J. Bouman, Berry Schoenmakers and Niels de Vreede.
Efficient Secure Ridge Regression from Randomized Gaussian Elimination   
Nitin Agrawal, Ali Shahin Shamsabadi, Matthew Kusner and Adrià Gascón.
QUOTIENT: Secure Two-Party Neural Network Training and Prediction via Oblivious Transfer   
Yuantian Miao, Ben Zi Hao Zhao, Minhui Xue, Chen Chao, Lei Pan, Jun Zhang, Dali Kaafar and Yang Xiang.
The Audio Auditor: Participant-Level Membership Inference in Internet of Things Voice Services   
Benjamin Zi Hao Zhao, Hassan Jameel Asghar, Raghav Bhaskar and Mohamed Ali Kaafar.
On Inferring Training Data Attributes in Machine Learning Models   
Harsh Chaudhari, Ashish Choudhury, Arpita Patra and Ajith Suresh.
ASTRA: High Throughput 3PC over Rings with Application to Secure Prediction   
Brendan Avent, Javier Gonzalez, Tom Diethe, Andrei Paleyes and Borja Balle.
Automatic Discovery of Privacy-Utility Pareto Fronts   
Marco Romanelli, Konstantinos Chatzikokolakis and Catuscia Palamidessi.
Optimal Obfuscation Mechanisms via Machine Learning and Applications to Location Privacy   
Sahar Mazloom, Le Phi Hung, Samuel Ranellucci and S. Dov Gordon.
Secure parallel computation on national scale volumes of data   
James Bell, Aurélien Bellet, Adria Gascon and Tejas Kulkarni.
Private Protocols for U-statistics in the Local Model and Beyond   
Mark Abspoel, Niek J. Bouman, Berry Schoenmakers and Niels de Vreede.
Fast Secure Comparison for Medium-Sized Integers and Its Application in Binarized Neural Networks   
Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock and Anderson Nascimento.
Secret Sharing based Private Text Classification   
Qingrong Chen, Chong Xiang, Minhui Xue, Bo Li, Nikita Borisov, Dali Kaafar and Haojin Zhu.
Differentially Private Data Sharing: Sharing Models versus Sharing Data   
Antti Koskela, Joonas Jälkö and Antti Honkela.
Computing Exact Guarantees for Differential Privacy   
Prasad Naldurg and Karthikeyan Bhargavan.
A verification framework for secure machine learning   
Thijs Veugen, Bart Kamphorst, Marie Beth van Egmond and Natasja van de L'Isle.
Privacy-Preserving Coupling of Vertically-Partitioned Databases and Subsequent Training with Gradient Descent   
Sebastian P. Bayerl, Ferdinand Brasser, Christoph Busch, Tommaso Frassetto, Patrick Jauernig, Jascha Kolberg, Andreas Nautsch, Korbinian Riedhammer, Ahmad-Reza Sadeghi, Thomas Schneider, Emmanuel Stapf, Amos Treiber and Christian Weinert.
Privacy-Preserving Speech Processing via STPC and TEEs   
Ranya Aloufi, Hamed Haddadi and David Boyle.
Emotionless: Privacy-Preserving Speech Analysis for Voice Assistants   
Lukas Burkhalter, Alexander Viand, Anwar Hithnawi and Hossein Shafagh.
Robust Secure Aggregation for Privacy-Preserving Federated Learning with Adversaries   
Anders Dalskov, Daniel Escudero and Marcel Keller.
Secure Evaluation of Quantized Neural Networks   
Mohammad Yaghini, Bogdan Kulynych and Carmela Troncoso.
Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning   

Organization


Workshop organizers

  • Borja Balle (DeepMind)
  • Adrià Gascón (Google)
  • Olya Ohrimenko (Microsoft Research)
  • Mariana Raykova (Google)
  • Phillipp Schoppmann (HU Berlin)
  • Carmela Troncoso (EPFL)

Program Committee

  • Pauline Anthonysamy (Google)
  • Brendan Avent (USC)
  • Carsten Baum (BIU)
  • Aurélien Bellet (Inria)
  • Elette Boyle (IDC Herzliya)
  • Kamalika Chaudhuri (UCSD)
  • Giovanni Cherubin (EPFL)
  • Graham Cormode (University of Warwick)
  • Morten Dahl (Dropout Labs)
  • Christos Dimitrakakis (Chalmers University of Technology)
  • Jack Doerner (Northeastern)
  • Jamie Hayes (UCL)
  • Dali Kaafar (Macquarie University and Data61-CSIRO)
  • Peter Kairouz (Google)
  • Shiva Kasiviswanathan (Amazon)
  • Marcel Keller (Data61)
  • Niki Kilbertus (Cambridge University)
  • Ágnes Kiss (TU Darmstadt)
  • Nadin Kokciyan (King's College London)
  • Boris Köpf (Microsoft Research)
  • Aleksandra Korolova (USC)
  • Eleftheria Makri (KU Leuven)
  • Sebastian Meiser (Visa)
  • Luca Melis (Amazon)
  • Kartik Nayak (Duke University)
  • Catuscia Palamidessi (École Polytechnique & INRIA)
  • Peter Rindal (Visa Research)
  • Benjamin Rubinstein (University of Melbourne)
  • Anand Sarwate (Rutgers University)
  • Thomas Schneider (TU Darmstadt)
  • Peter Scholl (Aarhus University)
  • Or Sheffet (University of Alberta)
  • Nigel Smart (KU Leuven)
  • Adam Smith (Boston University)
  • Florian Tramer (Stanford)
  • Muthuramakrishnan Venkitasubramaniam (Rochester)
  • Xiao Wang (Northwestern University)
  • Kevin Yeo (Google)
  • Pinar Yolum (Utrecht University)
  • Yang Zhang (CISPA Helmholtz Center)

Sponsors


Previous Editions