NeurIPS 2020 Workshop
Virtual workshop, December 11, 2020
This one-day workshop focuses on privacy-preserving techniques for machine learning and disclosure in large-scale data analysis, both in the distributed and centralized settings, and on scenarios that highlight the importance and need for these techniques (e.g., via privacy attacks). There is growing interest from the Machine Learning (ML) community in leveraging cryptographic techniques such as Multi-Party Computation (MPC) and Homomorphic Encryption (HE) for privacy-preserving training and inference, as well as Differential Privacy (DP) for disclosure. Simultaneously, the systems security and cryptography community has proposed various secure frameworks for ML. We encourage both theory- and application-oriented submissions exploring a range of approaches listed below. Additionally, given the tension between the adoption of machine learning technologies and the ethical, technical and regulatory issues around privacy, as highlighted during the COVID-19 pandemic, we invite submissions to a special track on this topic.
Submission deadline: Oct 02, 2020, 23:59 (Anywhere on Earth)
Notification of acceptance: Oct 23, 2020
Submissions in the form of extended abstracts must be at most 4 pages long (not including references; additional supplementary material may be submitted but may be ignored by reviewers), non-anonymized, and adhere to the NeurIPS format. We encourage submission of work that is new to the privacy-preserving machine learning community. Submissions based solely on work that has been previously published in conferences on machine learning and related fields are not suitable for the workshop. On the other hand, we allow submission of work currently under review as well as relevant work recently published in privacy and security venues. The workshop will not have formal proceedings, but authors of accepted abstracts can choose to have a link to arXiv or a PDF added to the workshop webpage.
View the recordings on SlidesLive!
The workshop will be hosted in two blocks: BLOCK I accommodates Asia and Europe (morning) time zones, BLOCK II accommodates U.S. and Europe (evening) time zones. Unless otherwise noted, all listed times are CET (UTC+1).
Each block will contain three main components: an hour to view two recorded talks by invited speakers, followed by a 30-minute live joint Q&A with both; a poster session/social via Gather.Town (a short tutorial for our venue is provided); and several contributed talks highlighting submissions to this workshop, with corresponding live Q&A sessions. Due to time zone constraints, most contributed talks will be in BLOCK II, but all talks will be recorded on SlidesLive for viewing afterwards.
To join the workshop you will need a NeurIPS 2020 workshop registration ticket (see neurips.cc for more information). Instructions on how to join the workshop will be provided by NeurIPS.
BLOCK I, Asia/Europe: (17:20-21:00 Beijing) (14:50-18:30 Delhi) (10:20-14:00 Paris)
10:20-10:30 | Welcome & Introduction |
10:30-11:00 | Invited talk (1): Reza Shokri — Data privacy at the intersection of trustworthy machine learning |
Machine learning models leak a significant amount of information about their training data through their predictions and parameters. In this talk, we discuss the impact of trustworthy machine learning, notably interpretability and fairness, on data privacy. We present the privacy risks of model explanations and the effects of differential privacy on interpretability. We will also discuss the trade-off between privacy and (group) fairness, and how training fair models can make underrepresented groups more vulnerable to inference attacks.
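For readers less familiar with the inference attacks the talk refers to, below is a minimal sketch of a loss-threshold membership inference attack. It is a toy illustration under assumed data, model, and threshold, not the specific attacks discussed in the talk.

```python
# Toy loss-threshold membership inference attack (illustrative only).
# Idea: records the model fits well (low loss) are guessed to be training members.
import numpy as np

rng = np.random.default_rng(0)

def predict_proba(x, w):
    """Toy logistic model standing in for the attacked classifier."""
    return 1.0 / (1.0 + np.exp(-x @ w))

def loss(x, y, w):
    p = np.clip(predict_proba(x, w), 1e-12, 1 - 1e-12)
    return -(y * np.log(p) + (1 - y) * np.log(1 - p))

# Fake data: "members" carry labels the model agrees with (low loss), while
# "non-members" carry independent labels -- a stand-in for the train/test
# loss gap that real membership inference attacks exploit.
d = 20
w = rng.normal(size=d)
members = rng.normal(size=(100, d)) + 0.3 * w
non_members = rng.normal(size=(100, d))
y_members = (members @ w > 0).astype(float)
y_non = rng.integers(0, 2, size=100).astype(float)

threshold = 0.5  # assumed; in practice calibrated via shadow models or held-out data
def is_member(x, y):
    return loss(x, y, w) < threshold

tpr = np.mean([is_member(x, y) for x, y in zip(members, y_members)])
fpr = np.mean([is_member(x, y) for x, y in zip(non_members, y_non)])
print(f"attack TPR={tpr:.2f}, FPR={fpr:.2f}")  # TPR well above FPR => leakage
```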
11:00-11:30 | Invited talk (2): Katrina Ligett — The Elephant in the Room: The Problems that Privacy-Preserving ML Can’t Solve |
In this talk, I attempt to lay out the problems of the data ecosystem, and to explore which of them can potentially be addressed by the toolkit of privacy-preserving machine learning. What we see is that while privacy-preserving machine learning has made amazing advances over the past decade and a half, there are enormous and troubling problems with the data ecosystem that seem to require an entirely different set of solutions.
11:30-12:00 | Invited Talk Q&A with Reza and Katrina |
12:00-12:10 | Break 10min |
12:10-12:30 | Contributed talk: POSEIDON: Privacy-Preserving Federated Neural Network Learning (15min presentation + 5min Q&A) |
Sinem Sav, Apostolos Pyrgelis, Juan Ramón Troncoso-Pastoriza, David Froelicher, Jean-Philippe Bossuat, João Sá Sousa and Jean-Pierre Hubaux
We address the problem of privacy-preserving training and evaluation of neural networks in an N-party, federated learning setting. We propose a novel system, POSEIDON, that employs multiparty lattice-based cryptography and preserves the confidentiality of the training data, the model, and the evaluation data, under a passive-adversary model and collusions between up to N−1 parties. Our experimental results show that POSEIDON achieves accuracy similar to centralized or decentralized non-private approaches and that its computation and communication overhead scales linearly with the number of parties. POSEIDON trains a 3-layer neural network on the MNIST dataset with 784 features and 60K samples distributed among 10 parties in less than 2 hours.
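As a point of reference for the N-party confidentiality goal, here is a minimal sketch of federated aggregation with pairwise additive masks that cancel in the sum. This is not POSEIDON's protocol (which uses multiparty lattice-based cryptography and also protects the model and evaluation data); it only illustrates how individual updates can stay hidden while their aggregate is revealed.

```python
# Toy secure aggregation with pairwise cancelling masks (illustration only;
# not POSEIDON's multiparty lattice-based protocol).
import numpy as np

rng = np.random.default_rng(1)
N, dim = 10, 5                          # 10 parties, 5-dimensional model update
updates = rng.normal(size=(N, dim))     # each party's private local update

# Each pair (i < j) agrees on a random mask m_ij; party i adds it and
# party j subtracts it, so every mask cancels in the global sum.
masks = {(i, j): rng.normal(size=dim) for i in range(N) for j in range(i + 1, N)}

def masked_update(i):
    out = updates[i].copy()
    for (a, b), m in masks.items():
        if a == i:
            out += m
        elif b == i:
            out -= m
    return out

masked = np.array([masked_update(i) for i in range(N)])
# The aggregator only ever sees `masked`: each row looks random on its own,
# yet the column-wise sum equals the true aggregate of all updates.
assert np.allclose(masked.sum(axis=0), updates.sum(axis=0))
print("aggregate recovered:", masked.sum(axis=0))
```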
12:30-14:00 | Gather.Town Poster Session and Social (log in with NeurIPS registration credentials) |
BLOCK II, U.S./Europe: (8:30-13:25 LA) (11:30-16:25 NYC) (17:30-22:25 Paris)
17:30-17:40 | Welcome & Introduction |
17:40-18:05 | Invited talk (1): Carmela Troncoso — Is Synthetic Data Private? |
Synthetic datasets produced by generative models have been advertised as a silver-bullet solution to privacy-preserving data publishing. In this talk, we show that such claims are unfounded. We show how synthetic data does not stop linkability or attribute inference attacks; and that differentially-private training does not increase the privacy gain of these datasets. We also show that some target records receive substantially less protection than others and that the more complex the generative model, the more difficult it is to predict which targets will remain vulnerable to inference attacks. We finally challenge the claim that synthetic data is an appropriate solution to the problem of privacy-preserving microdata publishing.
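To make the attack setting concrete, a toy attribute-inference attack against a synthetic release might look as follows. This is a hypothetical nearest-neighbour attack for illustration only; the attacks and generative models studied in the talk are more involved.

```python
# Toy attribute inference from synthetic data (illustrative only).
# The adversary knows a target's quasi-identifiers and guesses a sensitive
# attribute from the most similar synthetic records.
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical synthetic release: two quasi-identifiers plus a sensitive
# binary attribute that correlates with them.
n = 500
quasi = rng.normal(size=(n, 2))
sensitive = (quasi[:, 0] + 0.5 * quasi[:, 1] + 0.3 * rng.normal(size=n) > 0).astype(int)
synthetic = np.column_stack([quasi, sensitive])

def infer_sensitive(target_quasi, synthetic, k=15):
    """Majority vote over the k synthetic records closest in quasi-identifier space."""
    dists = np.linalg.norm(synthetic[:, :2] - target_quasi, axis=1)
    nearest = np.argsort(dists)[:k]
    return int(synthetic[nearest, 2].mean() > 0.5)

# A target whose quasi-identifiers the adversary already knows.
target_quasi = np.array([1.2, 0.4])
print("inferred sensitive attribute:", infer_sensitive(target_quasi, synthetic))
```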
18:05-18:30 | Invited talk (2): Dan Boneh — Proofs on secret shared data: an overview |
Many consumer devices these days are Internet-enabled and locally record information about how the device is used by its owner. Manufacturers have a strong interest in mining this data in order to improve their products, but concerns over data privacy often prevent this from taking place. A recent collection of techniques enables companies to process this distributed data without ever seeing the data in the clear. One obstacle is that a malfunctioning device might send invalid data and throw off the analysis; no one will ever know, because no one can see the data in the clear. To prevent this, there is a need for ultra-lightweight zero-knowledge techniques to prove properties about the hidden collected data. This talk will survey some recent progress in this area.
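The core mechanism behind the systems the talk surveys can be sketched in a few lines: each device splits its value into additive shares modulo a prime so that no single server sees the data, and the servers combine only aggregated shares. The toy below assumes a particular prime and two servers; the lightweight zero-knowledge validity proof that keeps a malfunctioning device from poisoning the sum (the focus of the talk) is deliberately omitted.

```python
# Toy two-server aggregation over additive secret shares mod a prime.
# Omitted: the zero-knowledge proof that each shared value is well formed
# (e.g., a 0/1 report), which is what the talk is about.
import secrets

P = 2**61 - 1  # prime modulus, assumed for illustration

def share(value):
    """Split `value` into two shares that individually look uniformly random."""
    s0 = secrets.randbelow(P)
    s1 = (value - s0) % P
    return s0, s1

# Each device reports a private 0/1 value (e.g., "feature X was used today").
device_values = [1, 0, 1, 1, 0, 1]
shares = [share(v) for v in device_values]

# Server 0 and server 1 each aggregate only their own shares.
agg0 = sum(s0 for s0, _ in shares) % P
agg1 = sum(s1 for _, s1 in shares) % P

# Only the combined aggregates are published; individual values stay hidden.
total = (agg0 + agg1) % P
assert total == sum(device_values)
print("number of devices reporting 1:", total)
```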
18:30-19:00 | Invited Talk Q&A with Carmela and Dan |
19:00-19:10 | Break 10min |
19:10-20:10 | Gather.Town Poster Session and Social (log in with NeurIPS registration credentials) |
20:10-20:20 | Break 10min |
20:20-20:35 | Contributed talk: On the (Im)Possibility of Private Machine Learning through Instance Encoding (15min presentation) |
Nicholas Carlini, Samuel Deng, Sanjam Garg, Somesh Jha, Saeed Mahloujifar, Mohammad Mahmoody, Shuang Song, Abhradeep Thakurta, Florian Tramèr
A learning algorithm is private if the produced model does not reveal (too much) about its training set. In this work, we study whether a non-private learning algorithm can be made private by relying on an instance-encoding mechanism that modifies the inputs before they are fed to the normal learner. We formalize the notion of instance encoding and its privacy by providing two attack models. We first prove impossibility results for achieving the first (stronger) model. We further demonstrate practical attacks in the second (weaker) attack model on recent proposals that aim to use instance encoding for privacy.
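For context, a toy instance encoder in the spirit of the schemes studied (and attacked) in this line of work might mix each private input with other inputs and apply random sign flips before handing it to an ordinary learner. The sketch below is a hypothetical encoder for illustration only; it is not the paper's formalization, attack models, or impossibility results.

```python
# Toy mixup-style instance encoding (illustrative; the paper demonstrates
# attacks against proposals in this spirit).
import numpy as np

rng = np.random.default_rng(3)

def encode(x_private, x_public, k=2, sign_flips=True):
    """Mix each private example with k randomly chosen public examples."""
    encoded = []
    for x in x_private:
        idx = rng.choice(len(x_public), size=k, replace=False)
        lam = rng.dirichlet(np.ones(k + 1))          # random convex weights
        mix = lam[0] * x + sum(l * x_public[i] for l, i in zip(lam[1:], idx))
        if sign_flips:
            mix = mix * rng.choice([-1.0, 1.0], size=x.shape)
        encoded.append(mix)
    return np.array(encoded)

x_private = rng.normal(size=(8, 32 * 32))   # pretend flattened images
x_public = rng.normal(size=(100, 32 * 32))
x_encoded = encode(x_private, x_public)
# A "normal" learner would now train on x_encoded instead of x_private;
# the paper studies how much x_encoded still reveals about x_private.
print(x_encoded.shape)
```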
20:35-20:50 | Contributed talk: Poirot: Private Contact Summary Aggregation (15min presentation) |
Chenghong Wang, David Pujol, Yaping Zhang, Johes Bater, Matthew Lentz, Ashwin Machanavajjhala, Kartik Nayak, Lavanya Vasudevan and Jun Yang
Physical distancing between individuals is key to preventing the spread of a disease such as COVID-19. On the one hand, having access to information about physical interactions is critical for decision makers; on the other, this information is sensitive and can be used to track individuals. In this work, we design Poirot, a system to collect aggregate statistics about physical interactions in a privacy-preserving manner. We show a preliminary evaluation of our system that demonstrates the scalability of our approach even while maintaining strong privacy guarantees.
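A minimal sketch of the kind of privacy-preserving aggregate the abstract describes, releasing per-location contact counts with differential privacy, is shown below. The privacy budget and sensitivity are assumptions for illustration; this is not Poirot's actual architecture.

```python
# Toy differentially private release of aggregate contact counts
# (illustration only; not Poirot's actual design).
import numpy as np

rng = np.random.default_rng(4)

# Per-location contact counts for one day (true, sensitive values).
true_counts = {"library": 120, "gym": 45, "cafeteria": 230}

epsilon = 1.0        # assumed privacy budget per release
sensitivity = 1.0    # assumed: one person changes each count by at most 1

def laplace_release(counts, epsilon, sensitivity):
    """Add Laplace(sensitivity/epsilon) noise to each count before publishing."""
    scale = sensitivity / epsilon
    return {loc: c + rng.laplace(0.0, scale) for loc, c in counts.items()}

noisy = laplace_release(true_counts, epsilon, sensitivity)
for loc, c in noisy.items():
    print(f"{loc}: ~{c:.1f} contacts (noisy)")
```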
20:50-21:05 | Contributed talk: Greenwoods: A Practical Random Forest Framework for Privacy Preserving Training and Prediction (15min presentation) |
Harsh Chaudhari and Peter Rindal
In this work we propose two prediction protocols for a random forest model. The first takes a traditional approach and requires the trees in the forest to be complete in order to hide sensitive information. Our second protocol takes a novel approach which allows the servers to obliviously evaluate only the “active path” of the trees. This approach can easily support trees with large depth while revealing no sensitive information to the servers. We then present a distributed framework for privacy-preserving training which circumvents the expensive procedure of privately training the random forest on a combined dataset, and propose an alternative, efficient collaborative approach with the help of users participating in the training phase.
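To make the “active path” idea concrete, plain (non-private) evaluation of a decision tree only visits the nodes on the path selected by the input, as in the sketch below. The paper's contribution is performing exactly this traversal obliviously between servers, so that neither the path taken nor the feature values are revealed; the cryptographic machinery is omitted here.

```python
# Plain "active path" evaluation of one decision tree (no privacy here);
# the protocol in the paper performs this traversal obliviously via MPC.
class Node:
    def __init__(self, feature=None, threshold=None, left=None, right=None, label=None):
        self.feature, self.threshold = feature, threshold
        self.left, self.right, self.label = left, right, label

def evaluate(node, x):
    """Follow only the path that x activates, regardless of tree depth."""
    while node.label is None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.label

tree = Node(feature=0, threshold=0.5,
            left=Node(label="low risk"),
            right=Node(feature=1, threshold=2.0,
                       left=Node(label="medium risk"),
                       right=Node(label="high risk")))
print(evaluate(tree, x={0: 0.9, 1: 3.1}))   # -> "high risk"
```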
21:05-21:20 | Joint Q&A with the three speakers above |
21:20-21:25 | Break 5min |
21:25-21:40 | Contributed talk: Shuffled Model of Federated Learning: Privacy, Accuracy, and Communication Trade-offs (15min presentation) |
Antonious Girgis, Deepesh Data, Suhas Diggavi, Peter Kairouz and Ananda Theertha Suresh
We study empirical risk minimization (ERM) optimization with communication efficiency and privacy under the shuffled model. We use our communication-efficient schemes for private mean estimation in the optimization solution of the ERM. By combining this with privacy amplification by client sampling and data sampling at each client, as well as the shuffled privacy model, we demonstrate that one can reach the same privacy/optimization-performance operating point as recent methods that use full-precision communication, but at a lower communication cost, i.e., effectively getting communication efficiency for “free”.
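The shuffled model the paper works in can be pictured as follows: each client applies a local randomizer to its value, an intermediary shuffles the anonymous reports, and the analyzer sees only the shuffled multiset, which amplifies the local privacy guarantee. The sketch below is a toy 1-bit randomized-response mean estimator under assumed parameters, not the paper's communication-efficient ERM schemes.

```python
# Toy shuffled-model mean estimation with 1-bit randomized response
# (illustration; not the paper's communication-efficient schemes).
import numpy as np

rng = np.random.default_rng(5)

def local_randomizer(x, eps):
    """Encode x in [0,1] as one biased bit, then apply randomized response."""
    bit = rng.random() < x                       # unbiased 1-bit encoding of x
    p_keep = np.exp(eps) / (np.exp(eps) + 1.0)   # eps-LDP randomized response
    return bit if rng.random() < p_keep else not bit

def analyzer(shuffled_bits, eps):
    """Debias the mean of the received (shuffled, anonymous) bits."""
    p_keep = np.exp(eps) / (np.exp(eps) + 1.0)
    y = np.mean(shuffled_bits)
    return (y - (1.0 - p_keep)) / (2.0 * p_keep - 1.0)

n, eps = 10_000, 1.0
data = rng.beta(2.0, 5.0, size=n)                # clients' private values in [0,1]
# The shuffler strips order and identity from the clients' reports.
reports = rng.permutation([local_randomizer(x, eps) for x in data])
print("true mean:", data.mean(), "estimate:", analyzer(reports, eps))
```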
21:40-21:55 | Contributed talk: Sample-efficient proper PAC learning with approximate differential privacy (15min presentation) |
Badih Ghazi, Noah Golowich, Ravi Kumar and Pasin Manurangsi
In this paper we prove that the sample complexity of properly learning a class of Littlestone dimension d with approximate differential privacy is at most Õ(d^6) (ignoring privacy and accuracy parameters). This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of 2^O(d) on the sample complexity. Prior to our work, finiteness of the sample complexity for privately learning a class of finite Littlestone dimension was only known for improper private learners, and the fact that our learner is proper answers another question of Bun et al. which was also asked by Bousquet et al. (2019). Using machinery developed by Bousquet et al., we show that the sample complexity of sanitizing a binary hypothesis class is at most polynomial in its Littlestone dimension and dual Littlestone dimension. This implies that a class is sanitizable if and only if it has finite Littlestone dimension. An important ingredient of our proofs is a new property of binary hypothesis classes that we call irreducibility, which may be of independent interest.
21:55-22:10 | Contributed talk: Training Production Language Models without Memorizing User Data (15min presentation) |
Swaroop Ramaswamy, Om Dipakbhai Thakkar, Rajiv Mathews, Galen Andrew, Brendan McMahan and Françoise Beaufays
This paper presents the first consumer-scale next-word prediction (NWP) model trained with Federated Learning (FL) while leveraging the Differentially Private Federated Averaging (DP-FedAvg) technique. There has been prior work on building practical FL infrastructure, including work demonstrating the feasibility of training language models on mobile devices using such infrastructure. It has also been shown (in simulations on a public corpus) that it is possible to train NWP models with user-level differential privacy (DP) using DP-FedAvg. Nevertheless, training production-quality NWP models with DP-FedAvg in a real-world production environment on a heterogeneous fleet of mobile phones requires addressing numerous challenges. For instance, the coordinating central server has to keep track of the devices available at the start of each round and sample devices uniformly at random from them, while ensuring "secrecy of the sample", etc. Unlike all prior privacy-focused FL work of which we are aware, for the first time we demonstrate the deployment of a DP mechanism for the training of a production neural network in FL, as well as the instrumentation of the production training infrastructure to perform an end-to-end empirical measurement of unintended memorization.
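The core of DP-FedAvg, clipping each sampled client's model delta and adding Gaussian noise to the average before the server applies it, can be sketched in a few lines. The sketch below uses assumed hyperparameters and a plain numpy "model" vector; it is an illustration of the technique, not the production infrastructure the paper describes.

```python
# Toy DP-FedAvg round: clip per-client deltas, average, add Gaussian noise.
# Hyperparameters and the "model" are assumptions for illustration only.
import numpy as np

rng = np.random.default_rng(6)

dim = 100
global_model = np.zeros(dim)

clip_norm = 1.0         # L2 clip bound S
noise_multiplier = 1.1  # z; noise stddev is z * S / clients_per_round
clients_per_round = 50
num_clients = 1_000

def local_update(model, client_id):
    """Stand-in for local SGD on the client's private data (data omitted here)."""
    return model + rng.normal(scale=0.1, size=model.shape)

for round_num in range(5):
    sampled = rng.choice(num_clients, size=clients_per_round, replace=False)
    clipped_deltas = []
    for cid in sampled:
        delta = local_update(global_model, cid) - global_model
        norm = np.linalg.norm(delta)
        delta = delta * min(1.0, clip_norm / max(norm, 1e-12))   # clip to L2 <= S
        clipped_deltas.append(delta)
    avg = np.mean(clipped_deltas, axis=0)
    noise = rng.normal(scale=noise_multiplier * clip_norm / clients_per_round, size=dim)
    global_model = global_model + avg + noise   # server applies the noisy average

print("trained model norm:", np.linalg.norm(global_model))
```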
22:10-22:25 | Joint Q&A with the three speakers above |