CCS 2019 Workshop
London, November 15
This one-day workshop focuses on privacy-preserving techniques for training, inference, and disclosure in large-scale data analysis, in both distributed and centralized settings. We have observed increasing interest from the ML community in leveraging cryptographic techniques such as Multi-Party Computation (MPC) and Homomorphic Encryption (HE) for privacy-preserving training and inference, as well as Differential Privacy (DP) for disclosure. Simultaneously, the systems security and cryptography community has proposed various secure frameworks for ML. We encourage both theory- and application-oriented submissions exploring a range of approaches to privacy-preserving machine learning.
We think it will be very valuable to have a forum to unify different perspectives and start a discussion about the relative merits of each approach. The workshop will also serve as a venue for networking people from different communities interested in this problem, and hopefully foster fruitful long-term collaboration.
Submission deadline: July 1, 2019 (11:59pm AoE) [Extended]
Notification of acceptance: August 7, 2019
CCS early registration deadline: October 1, 2019 (11:59pm BST)
Workshop: November 15, 2019
Submissions in the form of extended abstracts must be at most 4 pages long (not including references), using the CCS format. We accept submissions of work that was recently published or is currently under review. Submissions should be anonymized. The workshop will not have formal proceedings, but authors of accepted abstracts can choose to have a link to arXiv or a PDF published on the workshop webpage.
Submit Your Abstract
All accepted papers have a slot at the poster session. Please print your poster at a size up to A0 (841 × 1189 mm) and bring it to the conference.
Thanks to our generous sponsors, we are able to provide a limited number of travel grants of up to $800 to help partially cover the expenses of PPML attendees who have not received other travel support from CCS this year. To apply, please send an email to ppml19@easychair.org with the subject “PPML19 Travel Grant Application”, including a half-page statement of purpose and a summary of anticipated travel expenses. If you are an undergraduate or graduate student, we also ask that a half-page recommendation letter supporting your application be sent to the same email address by the deadline. The deadline for applications is September 23, 2019 (11:59pm AoE). Notifications will be sent by September 30. Please feel free to send us an email if you have any questions.
8:50 | Welcome |
9:00 | Invited talk: Kobbi Nissim — Legal Theorems of Privacy |
Significant gaps between legal and technical thinking around data privacy make it
hard to understand exactly how legal standards apply to economic mechanisms that use
personal information. Such mechanisms may apply technical solutions constructed
under frameworks such as differential privacy and k-anonymity. However, there is a
lot of uncertainty regarding how these mathematical concepts meet legal standards.
As a result, arguments that mechanisms apply sufficient technical privacy measures
for satisfying legal privacy often lack rigor, and their conclusions are uncertain.
The uncertainty is exacerbated by a litany of successful privacy attacks on privacy
measures thought to meet legal expectations but then shown to fall short of doing
so.
We examine strategies for introducing mathematical rigor into the analysis, so as to
make formal claims and prove “legal theorems” that technical privacy measures meet
legal expectations. For that, we explore some of the gaps between these two very
different approaches, and present initial strategies towards bridging these gaps
considering examples from US and EU law.
Based on joint works with Aaron Bembenek, Mark Bun, Aloni Cohen, Marco Gaboardi, Urs
Gasser, David R. O’Brien, Thomas Steinke, Alexandra Wood, and Salil Vadhan.
9:45 | Secure parallel computation on national scale volumes of data (contributed talk), by Sahar Mazloom, Le Phi Hung, Samuel Ranellucci and S. Dov Gordon |
In this work, we revisit secure computation of graph parallel algorithms,
simultaneously leveraging all three of the following advances: we assume four
computation servers (with an honest majority, and one malicious corruption), allow
differentially private leakage during computation, and, exploiting the parallelism
that this affords, we construct an MPC protocol that can perform at national scales.
Concretely, we compute histograms on 300 million inputs in 4.18 minutes, and we
perform sparse matrix factorization, which is used in recommendation systems, on 20
million inputs in under 6 minutes. These problems have broad, real-world
applications, and, at this scale, we could imagine supporting the Census Bureau, or a
large company such as Amazon. For comparison, the largest experiments in GraphSC [6]
and OblivGraph [18] had 1M inputs, and required 13 hours and 2 hours of runtime,
respectively, while using 4 times the number of processors that we employ.
End-to-end, our construction is 320X faster than OblivGraph.
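As a point of reference only (this is not the authors' four-party protocol), the basic building block of a secure histogram can be illustrated with plain additive secret sharing: each client splits a one-hot contribution into random shares, and the servers aggregate shares locally, so no single server ever sees an individual input. The sizes and inputs below are illustrative assumptions.

    import secrets

    Q = 2**61 - 1          # prime modulus for additive sharing
    SERVERS, BINS = 3, 4   # toy sizes; the paper uses 4 servers and hundreds of millions of inputs

    def share(value):
        """Split `value` into SERVERS additive shares modulo Q."""
        shares = [secrets.randbelow(Q) for _ in range(SERVERS - 1)]
        shares.append((value - sum(shares)) % Q)
        return shares

    # Each client contributes a one-hot vector for its bin; servers only ever see
    # uniformly random shares, yet the sum of per-server tallies is the histogram.
    tallies = [[0] * BINS for _ in range(SERVERS)]
    for client_bin in [0, 2, 2, 3, 1, 2]:          # toy inputs
        for b in range(BINS):
            for s, sh in enumerate(share(1 if b == client_bin else 0)):
                tallies[s][b] = (tallies[s][b] + sh) % Q

    histogram = [sum(t[b] for t in tallies) % Q for b in range(BINS)]
    print(histogram)   # [1, 1, 3, 1]

This sketch is only passively secure and has no differentially private leakage control; it is meant solely to make the secret-sharing idea concrete.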
10:05 | Poster Session and Coffee Break |
10:45 | Invited talk: Rachel Cummings — Differential Privacy for Dynamic Databases |
Privacy concerns are becoming a major obstacle to using data in the ways we want.
How can data scientists make use of potentially sensitive data, while providing
rigorous privacy guarantees to the individuals who provided data? Over the last
decade, differential privacy has emerged as the de facto gold standard of
privacy-preserving data analysis. Differential privacy ensures that an algorithm does not
overfit to the individuals in the database by guaranteeing that if any single entry
in the database were to be changed, then the algorithm would still have
approximately the same distribution over outputs.
In this talk, we will focus on recent advances in differential privacy for dynamic databases, where the content of the database evolves over time as new data are acquired. First, we will see how to extend differentially private algorithms for static databases to the dynamic setting, with relatively small loss in the privacy-accuracy tradeoff. Next, we will see algorithms for privately detecting changes in data composition. We will conclude with a discussion of open problems in this space, including the use of differential privacy for other types of data dynamism. (Based on joint works with Sara Krehbiel, Kevin Lai, Yuliia Lut, Yajun Mei, Uthaipon (Tao) Tantipongpipat, Rui Tuo, and Wanrong Zhang.)
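For readers unfamiliar with the guarantee paraphrased above, the standard (ε, δ)-differential privacy definition (stated here for reference, not specific to this talk) is:

    \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon}\,\Pr[\mathcal{M}(D') \in S] + \delta

for every pair of databases D and D' differing in a single entry and every set S of possible outputs of the mechanism M.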
11:30 | Garbled Neural Networks are Practical (contributed talk), by Marshall Ball, Brent Carmer, Tal Malkin, Mike Rosulek and Nichole Schimanski |
We show that garbled circuits offer a practical choice for secure evaluation of
neural network classifiers, comparable with complex, specialized protocols using
less robust assumptions, many rounds of interaction, and/or tailor-made neural
networks. In particular, we develop a scheme for garbling “off-the-shelf”
pre-trained neural networks, where the only model preprocessing required is a mild
discretization step as opposed to requiring a specialized SFE-friendly model to be
independently trained. Moreover, as our solution is a garbling scheme, it inherits a
much more diverse range of applications than non-garbling-based solutions, perhaps
most notably, efficient compilers for the malicious setting. At the protocol level,
we start with the garbling scheme of Ball, Malkin, and Rosulek (ACM CCS 2016) for
arithmetic circuits and introduce new optimizations for modern neural network
activation functions. We develop fancygarbling, the first implementation of the
BMR16 garbling scheme along with our new optimizations, as part of a heavily optimized
garbled-circuits tool that is driven by a TensorFlow classifier description. We
evaluate our constructions on a wide range of neural networks. We find that our
approach is up to 100x more efficient than straightforward boolean garbling. It is
also roughly 40% more efficient than DeepSecure (Rouhani et al., DAC 2018), a recent
garbled-circuit-based approach for secure neural network evaluation, which
incorporates significant optimization techniques for boolean circuits. Furthermore,
our approach provides competitive performance tradeoffs (efficiency and latency vs.
communication) also when compared with non-garbled-circuit approaches.
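A minimal sketch of the kind of discretization step such an approach relies on: pre-trained floating-point weights are mapped to bounded integers so the network can be evaluated as an arithmetic circuit. The scale and clipping range below are illustrative assumptions, not the parameters used in the paper.

    import numpy as np

    def discretize(weights, scale=2**8, bound=2**15):
        # Fixed-point quantization: scale, round, and clip float weights to integers
        # suitable for evaluation inside an arithmetic (mod-p) garbled circuit.
        q = np.rint(np.asarray(weights) * scale).astype(np.int64)
        return np.clip(q, -bound, bound - 1)

    w = np.array([[0.031, -1.2], [0.5, 0.0078]])
    print(discretize(w))   # integer weight matrix ready for garbled evaluation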
11:50 | Learning Rate Adaptation for Federated and Differentially Private Learning (contributed talk), by Antti Koskela and Antti Honkela |
We propose an algorithm for adapting the learning rate of stochastic
gradient descent (SGD) that avoids the need for a validation set. The idea for the
adaptiveness comes from the technique of extrapolation: to get an estimate for the
error against the gradient flow which underlies SGD, we compare the result obtained
by one full step and two half-steps. The algorithm is applied in two separate
frameworks: federated and differentially private learning. Using deep neural network
examples, we empirically show that, unlike commonly used optimisation methods, the
method works robustly in the federated learning setting. We also show that the
adaptive algorithm is competitive with state-of-the-art hyperparameter search
methods and with commonly used optimisation methods for differentially private
training.
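A rough plaintext sketch of the extrapolation idea (the function name and tolerance rule are hypothetical; the paper's exact update may differ): take one full SGD step and two half steps from the same point, use their difference as a local error estimate against the underlying gradient flow, and rescale the learning rate to keep that error near a target tolerance.

    import numpy as np

    def sgd_step_with_lr_adaptation(w, grad, lr, tol=1e-2):
        """One adaptive SGD step: compare a full step with two half steps."""
        full = w - lr * grad(w)
        half = w - 0.5 * lr * grad(w)
        two_halves = half - 0.5 * lr * grad(half)
        # The gap between the two estimates approximates the local error against
        # the gradient flow; steer the learning rate so this error stays near `tol`.
        err = np.linalg.norm(full - two_halves) / (np.linalg.norm(two_halves) + 1e-12)
        new_lr = lr * min(2.0, max(0.5, float(np.sqrt(tol / (err + 1e-12)))))
        return two_halves, new_lr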
12:10 | Lightning Talks |
1. Nitin Agrawal, Ali Shahin Shamsabadi, Matthew Kusner and Adrià Gascón. QUOTIENT: Secure Two-Party Neural Network Training and Prediction via Oblivious Transfer
2. Yuantian Miao, Ben Zi Hao Zhao, Minhui Xue, Chen Chao, Lei Pan, Jun Zhang, Dali Kaafar and Yang Xiang. The Audio Auditor: Participant-Level Membership Inference in Internet of Things Voice Services
3. Harsh Chaudhari, Ashish Choudhury, Arpita Patra and Ajith Suresh. ASTRA: High Throughput 3PC over Rings with Application to Secure Prediction
4. Brendan Avent, Javier Gonzalez, Tom Diethe, Andrei Paleyes and Borja Balle. Automatic Discovery of Privacy-Utility Pareto Fronts
5. Marco Romanelli, Konstantinos Chatzikokolakis and Catuscia Palamidessi. Optimal Obfuscation Mechanisms via Machine Learning and Applications to Location Privacy
6. Niek J. Bouman and Niels de Vreede. A Practical Approach to the Secure Computation of the Moore-Penrose Pseudoinverse over the Rationals
7. James Bell, Aurélien Bellet, Adria Gascon and Tejas Kulkarni. Private Protocols for U-statistics in the Local Model and Beyond
8. Mark Abspoel, Niek J. Bouman, Berry Schoenmakers and Niels de Vreede. Fast Secure Comparison for Medium-Sized Integers and Its Application in Binarized Neural Networks
9. Devin Reich, Ariel Todoki, Rafael Dowsley, Martine De Cock and Anderson Nascimento. Secret Sharing based Private Text Classification
10. Qingrong Chen, Chong Xiang, Minhui Xue, Bo Li, Nikita Borisov, Dali Kaafar and Haojin Zhu. Differentially Private Data Sharing: Sharing Models versus Sharing Data
11. Antti Koskela, Joonas Jälkö and Antti Honkela. Computing Exact Guarantees for Differential Privacy
12. Thijs Veugen, Bart Kamphorst, Marie Beth van Egmond and Natasja van de L'Isle. Privacy-Preserving Coupling of Vertically-Partitioned Databases and Subsequent Training with Gradient Descent
13. Sebastian P. Bayerl, Ferdinand Brasser, Christoph Busch, Tommaso Frassetto, Patrick Jauernig, Jascha Kolberg, Andreas Nautsch, Korbinian Riedhammer, Ahmad-Reza Sadeghi, Thomas Schneider, Emmanuel Stapf, Amos Treiber and Christian Weinert. Privacy-Preserving Speech Processing via STPC and TEEs
14. Ranya Aloufi, Hamed Haddadi and David Boyle. Emotionless: Privacy-Preserving Speech Analysis for Voice Assistants
15. Lukas Burkhalter, Alexander Viand, Anwar Hithnawi and Hossein Shafagh. Robust Secure Aggregation for Privacy-Preserving Federated Learning with Adversaries
16. Anders Dalskov, Daniel Escudero and Marcel Keller. Secure Evaluation of Quantized Neural Networks
17. Mohammad Yaghini, Bogdan Kulynych and Carmela Troncoso. Disparate Vulnerability: on the Unfairness of Privacy Attacks Against Machine Learning
12:40 | Poster Session and Lunch Break |
14:00 | Invited talk: Vitaly Shmatikov — Unwanted Machine Learning |
Modern machine learning models exhibit amazing accuracy on tasks from image
classification to natural-language processing, but accuracy does not tell the entire
story of what these models have learned. Does a model memorize and leak its training
data? Does it “accidentally” learn privacy-violating tasks uncorrelated with its
training objective? Can it hide a backdoor introduced by an adversary? All of these
are examples of unwanted learning, which we need to understand and mitigate in order
to solve security and privacy problems in today's AI.
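One deliberately simple example of measuring such leakage, included here only as an illustration and not taken from the talk, is loss-threshold membership inference: records on which a model's loss is unusually low are guessed to be training members. The loss_fn interface below is a hypothetical placeholder.

    import numpy as np

    def loss_threshold_mia(loss_fn, records, labels, threshold):
        # Guess "member" for every record whose loss falls below the threshold,
        # e.g. the model owner's reported average training loss.
        losses = np.array([loss_fn(x, y) for x, y in zip(records, labels)])
        return losses < threshold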
14:45 | On Inferring Training Data Attributes in Machine Learning Models (contributed talk), by Benjamin Zi Hao Zhao, Hassan Jameel Asghar, Raghav Bhaskar and Mohamed Ali Kaafar |
A number of recent works have demonstrated that API access to machine learning
models leaks information about the dataset records used to train the models.
Further, the work of [9] shows that such membership inference attacks (MIAs) may be
sufficient to construct a stronger breed of attribute inference attacks (AIAs),
which given a partial view of a record can guess the missing attributes. In this
work, we show (to the contrary) that MIA may not be sufficient to build a successful
AIA. This is because the latter requires the ability to distinguish between similar
records (differing only in a few attributes), and, as we demonstrate, the current
breed of MIAs is unsuccessful in distinguishing member records from similar
non-member records. We thus propose a relaxed notion of AIA, whose goal is only to
approximately guess the missing attributes, and argue that such an attack is more
likely to succeed if MIA is used as a subroutine for inferring training
record attributes.
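The following sketch illustrates, with hypothetical helper names rather than the authors' code, how an MIA score could serve as a subroutine for such an approximate attribute inference: score every candidate completion of the partial record and return all candidates whose scores are close to the best one, rather than committing to a single guess.

    def approximate_aia(mia_score, partial_record, attribute, candidates, slack=0.05):
        # Complete the partial record with each candidate value, score it with a
        # membership-inference oracle, and keep every value within `slack` of the best.
        scored = [(v, mia_score({**partial_record, attribute: v})) for v in candidates]
        best = max(score for _, score in scored)
        return [v for v, score in scored if best - score <= slack]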
15:05 | Poster Session and Coffee Break |
15:45 | Invited talk: Kim Laine — From Homomorphic Encryption to Private AI: Successes, Challenges, and Opportunities |
In this talk the audience will learn about modern homomorphic encryption and how it
can be used to bring strong privacy guarantees to specific machine learning
applications. We will also discuss limitations of homomorphic encryption in general,
and specifically in the context of private machine learning, and get a glimpse of
how some exciting new research directions may help resolve these challenges in the
future.
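For readers new to the area, the "compute on encrypted data" property can be illustrated with a toy additively homomorphic (Paillier) scheme. This is not the lattice-based homomorphic encryption the talk covers (for example, the schemes implemented in Microsoft SEAL), and the parameters below are far too small for real use.

    import secrets
    from math import gcd

    p, q = 104729, 1299709                          # tiny demo primes (insecure)
    n, n2 = p * q, (p * q) ** 2
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)    # lcm(p-1, q-1)
    mu = pow((pow(n + 1, lam, n2) - 1) // n, -1, n)

    def enc(m):
        r = secrets.randbelow(n - 2) + 1
        return (1 + m * n) * pow(r, n, n2) % n2     # with g = n+1, g^m = 1 + m*n (mod n^2)

    def dec(c):
        return (pow(c, lam, n2) - 1) // n * mu % n

    a, b = enc(20), enc(22)
    assert dec(a * b % n2) == 42                    # multiplying ciphertexts adds plaintexts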
16:30 | Efficient Secure Ridge Regression from Randomized Gaussian Elimination (contributed talk), by Frank Blom, Niek J. Bouman, Berry Schoenmakers and Niels de Vreede |
In this paper we present a practical protocol for secure ridge regression. We
develop the necessary secure linear algebra tools, using only basic arithmetic over
prime fields. In particular, we will show how to solve linear systems of equations
and compute matrix inverses efficiently, using appropriate secure random
self-reductions of these problems. The distinguishing feature of our approach is
that the use of secure fixed-point arithmetic is avoided entirely, while
circumventing the need for rational reconstruction at any stage as well. We
demonstrate the potential of our protocol in a standard setting for
information-theoretically secure multiparty computation, tolerating a dishonest
minority of passively corrupt parties. Using the MPyC framework, which is based on
threshold secret sharing over finite fields, we show how to handle large datasets
efficiently, achieving practically the same root-mean-square errors as Scikit-learn.
Moreover, we do not assume that any part of the datasets is held privately by any
of the parties, which makes our protocol much more versatile than existing
solutions.
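To give a flavour of the kind of randomized self-reduction involved, here is a plaintext numpy sketch over the reals (the protocol itself works with secret-shared values over a prime field): to solve the ridge normal equations (X^T X + lambda*I) w = X^T y without revealing the system, the parties can blind the matrix with a jointly generated random R, open only the blinded product, invert it publicly, and recombine.

    import numpy as np

    rng = np.random.default_rng(0)
    d = 5
    A = rng.standard_normal((d, d)) + d * np.eye(d)   # stands in for X.T @ X + lam * I
    b = rng.standard_normal(d)                        # stands in for X.T @ y

    R = rng.standard_normal((d, d))   # random blinding matrix (kept secret-shared in MPC)
    M = A @ R                         # only this blinded product would be revealed
    w = R @ np.linalg.inv(M) @ b      # valid because A^{-1} = R @ (A @ R)^{-1}

    assert np.allclose(A @ w, b)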
16:50 | A verification framework for secure machine learning (contributed talk), by Prasad Naldurg and Karthikeyan Bhargavan |
We propose a programming and verification framework to help developers build
distributed software applications using composite homomorphic encryption (and secure
multi-party computation protocols), to implement secure machine learning and
classification over private data. With our framework, a developer can prove that the
application code is functionally correct, that it correctly composes the various
cryptographic schemes it uses, and that it does not accidentally leak any secrets
(via side channels, for example). Our end-to-end solution results in verified and
efficient implementations of state-of-the-art secure privacy-preserving learning and
classification techniques.
17:10 | Wrap-up |