NeurIPS 2018 Workshop
Montréal, December 8
Palais des Congrès de Montréal
Room: 512CDGH
This one day workshop focuses on privacy preserving techniques for training, inference, and disclosure in large scale data analysis, both in the distributed and centralized settings. We have observed increasing interest of the ML community in leveraging cryptographic techniques such as MultiParty Computation (MPC) and Homomorphic Encryption (HE) for privacy preserving training and inference, as well as Differential Privacy (DP) for disclosure. Simultaneously, the systems security and cryptography community has proposed various secure frameworks for ML. We encourage both theory and applicationoriented submissions exploring a range of approaches, including:
We think it will be very valuable to have a forum to unify different perspectives and start a discussion about the relative merits of each approach. The workshop will also serve as a venue for networking people from different communities interested in this problem, and hopefully foster fruitful longterm collaboration.
Submission deadline: October 8 October 16, 2018 (11:59pm AoE)
Notification of acceptance: November 1, 2018
Workshop: December 8, 2018
Submissions in the form of extended abstracts must be at most 4 pages long (not including references and an unlimited number of pages for supplemental material, which reviewers are not required to take into account) and adhere to the NeurIPS format. We do accept submissions of work recently published or currently under review. Submissions should be anonymized. The workshop will not have formal proceedings, but authors of accepted abstracts can choose to have a link to arxiv or a pdf published on the workshop webpage.
If the new notification date causes issues with a potential visa application that depends specifically on the acceptance at this workshop, please contact us directly at ppml18@easychair.org.
We can offer the opportunity to purchase a NeurIPS registration to one author of each accepted paper.
From the Workshop FAQ: the reserve tickets guarantee attendance to the workshops, and depending on availability, also to the main conference and tutorials. We expect most of the reserve tickets to allow registration for tutorials, conference and workshops, but again, only the workshops part is for certain.
8:30  Welcome and Introduction 
8:50  Invited talk: Ian Goodfellow — Scalable PATE and the Secret Sharer [slides] [keynote] 
Coming soon


9:40  Invited talk: Shafi Goldwasser — Machine Learning and Cryptography: Challenges and Opportunities 
At the mid eighties researchers in computational learning theory pointed the way to examples of hard learning tasks such as learning parity with noise (LPN) and learning with errors (LWE) which have been extremely useful for building sophisticated cryptographic primitives such as homomorphic encryption which are unbreakable if LPN and LWE are hard to learn.
Today, with the rise of machine learning algorithms that use large amounts of data to come up with procedures which have the potential to replace human decision processes, cryptography, in turn, stands to provide machine learning, tools for keeping data private during both training and inference phases of ML and to provide methods to verify adherence of models with data. These promise to be important steps in ensuring the safe transition of power from human to algorithmic decision making. 

10:30  Coffee break 
11:00 
Privacy Amplification by Iteration
(contributed talk)
Vitaly Feldman, Ilya Mironov, Kunal Talwar and Abhradeep Thakurta 
Many commonly used learning algorithms work by iteratively updating an intermediate solution using one or a few data points in each iteration. Analysis of differential privacy for such algorithms often involves ensuring privacy of each step and then reasoning about the cumulative privacy cost of the algorithm. This is enabled by composition theorems for differential privacy that allow releasing of all the intermediate results. In this work, we demonstrate that for contractive iterations, not releasing the intermediate results strongly amplifies the privacy guarantees.
We describe several applications of this new analysis technique to solving convex optimization problems via noisy stochastic gradient descent. For example, we demonstrate that a relatively small number of nonprivate data points from the same distribution can be used to close the gap between private and nonprivate convex optimization. In addition, we demonstrate that we can achieve guarantees similar to those obtainable using the privacyamplificationbysampling technique in several natural settings where that technique cannot be applied. 

11:15 
Subsampled Renyi Differential Privacy and Analytical Moments Accountant
(contributed talk)
[slides]
YuXiang Wang, Borja Balle and Shiva Kasiviswanathan 
We study the problem of subsampling in differential privacy (DP), a question that is the centerpiece behind many successful differentially private machine learning algorithms. Specifically, we provide a tight upper bound on the Renyi Differential Privacy (RDP) parameters for algorithms that: (1) subsample the dataset, and then (2) applies a randomized mechanism M to the subsample, in terms of the RDP parameters of M and the subsampling probability parameter. Our results generalize the moments accounting technique, developed by Abadi et al. [CCS'16] for the Gaussian mechanism, to any subsampled RDP mechanism.


11:30 
The Power of The Hybrid Model for Mean Estimation
(contributed talk)
Yatharth Dubey and Aleksandra Korolova 
In this work we explore the power of the hybrid model of differential privacy (DP) proposed in~\cite{blender}, where some users desire the guarantees of the local model of DP and others are content with receiving the trusted curator model guarantees. In particular, we study the accuracy of mean estimation algorithms for arbitrary distributions in bounded support. We show that a hybrid mechanism which combines the sample mean estimates obtained from the two groups in an optimally weighted convex combination performs a constant factor better for a wide range of sample sizes than natural benchmarks. We analyze how this improvement factor is parameterized by the problem setting and how it varies with sample size.


11:45 
Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity
(contributed talk)
[slides]
Ulfar Erlingsson, Vitaly Feldman, Ilya Mironov, Ananth Raghunathan, Kunal Talwar and Abhradeep Thakurta 
Sensitive statistics are often collected across sets of users, with repeated collection of reports done over time. For example, trends in users' private preferences or software usage may be monitored via such reports. We study the collection of such statistics in the local differential privacy (LDP) model, when each user's value may change only a small number of times. We describe an algorithm whose privacy cost is polylogarithmic in the number of times the statistic is collected.
More fundamentallyby building on anonymity of the users' reportswe also demonstrate how the privacy cost of our LDP algorithm can actually be much lower when viewed in the central model of differential privacy. We show, via a new and general technique, that any permutationinvariant algorithm satisfying $\varepsilon$local differential privacy will satisfy $(O(\varepsilon \sqrt{\log \frac 1 \delta} / \sqrt{n}), \delta)$central differential privacy. In the process, we clarify how the high noise and $\sqrt{n}$ overhead of LDP protocols is a consequence of them being significantly more private in the central model. As a final, practical corollary, our results also imply that industrial deployments of LDP mechanism may have much lower privacy cost than their advertised $\varepsilon$ would indicateat least if reports are anonymized. 

12:00  Lunch break 
13:30  Invited talk: Kamalika Chaudhuri — Challenges in the PrivacyPreserving Analysis of Structured Data [slides] 
Much data analysis today is done on sensitive data, and particular privacy challenges arise when this data is sensitive and structured. In this talk I will describe two such challenges in the privacypreserving analysis of complex, structured data that we have been working on in my group.
The first is continual release of graph statistics in an online manner from an expanding graph, which is motivated by a problem in HIV epidemiology. Even though node differentially private release of graph statistics is highly challenging, here we will describe how we can get a differentially private solution for this problem that performs better than the natural sequential composition baseline. Next, I will talk about analysis of sensitive structured, correlated data, while still preserving the privacy of events in the data. Differential privacy does not adequately address privacy issues in this kind of data, and hence will look at a form of inferential privacy, called Pufferfish, that is more appropriate. We will provide mechanisms, establish their composition properties, and finally evaluate them on real data on physical activity measurements across time. 

14:20  Invited talk: Adam Smith — Models for private data analysis of distributed data [slides] 
This talk will present a partial survey of the proposed (and implemented) models for private analysis of distributed data, together with some new results.


15:10  Coffee break 
15:30 
DPMAC: The Differentially Private Method of Auxiliary Coordinates for Deep Learning
(contributed talk)
Frederik Harder, Jonas Köhler, Max Welling and Mijung Park 
Developing a differentially private deep learning algorithm is challenging, due to the difficulty in analyzing the sensitivity of objective functions that are typically used to train deep neural networks. Many existing methods resort to the stochastic gradient descent algorithm and apply a predefined sensitivity to the gradients for privatizing weights. However, their slow convergence typically yields a high cumulative privacy loss. Here, we take a different route by employing the method of auxiliary coordinates, which allows us to independently update the weights per layer by optimizing a perlayer objective function. This objective function can be well approximated by a loworder Taylor's expansion, in which sensitivity analysis becomes tractable. We perturb the coefficients of the expansion for privacy, which we optimize using more advanced optimization routines than SGD for faster convergence. We empirically show that our algorithm provides a decent trained model quality under a modest privacy budget.


15:45 
Slalom: Fast, Verifiable and Private Execution of Neural Networks in Trusted Hardware
(contributed talk)
Florian Tramer and Dan Boneh 
As Machine Learning (ML) gets applied to securitycritical or sensitive domains, there is a growing need for integrity and privacy for outsourced ML computations. A pragmatic solution comes from Trusted Execution Environments (TEEs), which use hardware and software protections to isolate sensitive computations from the untrusted software stack. However, these isolation guarantees come at a price in performance, compared to untrusted alternatives. This paper initiates the study of high performance execution of Deep Neural Networks (DNNs) in TEEs by efficiently partitioning DNN computations between trusted and untrusted devices. Building upon an efficient outsourcing scheme for matrix multiplication, we propose Slalom, a framework that securely delegates execution of all linear layers in a DNN from a TEE (e.g., Intel SGX or Sanctum) to a faster, yet untrusted, colocated processor. We evaluate Slalom by executing DNNs in an Intel SGX enclave, which selectively delegates work to an untrusted GPU. For two canonical DNNs, VGG16 and MobileNet, we obtain 20× and 6× increases in throughput for verifiable inference, and 11× and 4× for verifiable and private inference.


16:00 
Secure Two Party Distribution Testing
(contributed talk)
[slides]
Alexandr Andoni, Tal Malkin and Negev Shekel Nosatzki 
We study the problem of discrete distribution testing in the twoparty setting. For example, in the standard closeness testing problem, Alice and Bob each have t samples from, respectively, distributions a and b over [n], and they need to test whether a=b or a,b are εfar (in the ℓ1 distance) for some fixed ε>0. This is in contrast to the wellstudied oneparty case, where the tester has unrestricted access to samples of both distributions, for which optimal bounds are known for a number of variations. Despite being a natural constraint in applications, the twoparty setting has evaded attention so far.
We address two fundamental aspects of the twoparty setting: 1) what is the communication complexity, and 2) can it be accomplished securely, without Alice and Bob learning extra information about each other's input. Besides closeness testing, we also study the independence testing problem, where Alice and Bob have t samples from distributions a and b respectively, which may be correlated; the question is whether a,b are independent of εfar from being independent. Our contribution is threefold: ∙ Communication: we show how to gain communication efficiency as we have more samples, beyond the informationtheoretic bound on t. Furthermore, the gain is polynomially better than what one may obtain by adapting oneparty algorithms. For the closeness testing, our protocol has communication s=Θ_ε(n^2/t^2) as long as t is at least the informationtheoretic minimum number of samples. For the independence testing over domain [n]×[m], where n≥m, we obtain s=O_ε(n^2m/t^2+nm/t+m^1/2). ∙ Security: we define the concept of secure distribution testing and argue that it must leak at least some minimal information when the promise is not satisfied. We then provide secure versions of the above protocols with an overhead that is only polynomial in the security parameter. ∙ Lower bounds: we prove tightness of our tradeoff for the closeness testing, as well as that the independence testing requires tight Ω(m^1/2) communication for unbounded number of samples. These lower bounds are of independent interest as, to the best of our knowledge, these are the first 2party communication lower bounds for testing problems, where the inputs represent a set of i.i.d. samples. 

16:15 
Private Machine Learning in TensorFlow using Secure Computation
(contributed talk)
[slides]
Morten Dahl, Jason Mancuso, Yann Dupis, Ben DeCoste, Morgan Giraud, Ian Livingstone, Justin Patriquin and Gavin Uhma 
We present a framework for experimenting with secure multiparty computation directly in TensorFlow. By doing so we benefit from several properties valuable to both researchers and practitioners, including tight integration with ordinary machine learning processes, existing optimizations for distributed computation in TensorFlow, highlevel abstractions for expressing complex algorithms and protocols, and an expanded set of familiar tooling. We give an open source implementation of a stateoftheart protocol and report on concrete benchmarks using typical models from private machine learning.


16:30  Spotlight talks 


17:15  Poster session 
18:15  Wrap up 
Thanks to our generous sponsors, we are able to provide a limited number of travel grants of up to $800 to help partially cover the expenses of authors of accepted papers who have not received other travel support from NeurIPS this year. To apply, please send an email to ppml18@easychair.org with the subject “PPML18 Travel Grant Application” including your resume and a halfpage statement of purpose mentioning the title and the authors of your accepted paper and a summary of anticipated travel expenses. If you are an undergraduate or graduate student, we ask for a halfpage recommendation letter supporting your application to be sent to us by the deadline. The deadline for applications is November 11, 2018 (11:59pm AoE). The notifications will be sent by November 16. Please feel free to send us an email if you have any questions.