Towards Rigorous Foundations for Database Privacy
Adam Smith
Pennsylvania State University
Abstract
Collections of personal and sensitive data, previously the purview of
governments and statistical agencies, have become ubiquitous. The
social benefits of analyzing these databases are significant: better
informed policy decisions, more efficient markets, and more accurate
public health data, to name a few. At the same time, releasing
information from repositories of sensitive data can cause devastating
damage to the privacy of individuals or organizations whose
information is stored there. The challenge is to discover and release
global characteristics of these databases, while protecting the
privacy of individuals' records.
I will discuss a recent line of work exploring the tradeoff between these
conflicting goals: first, how the goals can be formulated precisely, and
second, to what extent they can both be satisfied.
I will explain why many popular approaches to data privacy fail to
protect privacy in the presence of even very simple auxiliary
information. In contrast, I will explain how a large class of
computations can be performed while providing meaningful privacy
guarantees, in the presence of *arbitrary* auxiliary information.
This talk is based on several works, joint with (subsets of) Cynthia Dwork,
Ranjit Ganta, Shiva Kasiviswanathan, Homin Lee, Frank McSherry, Kobbi
Nissim, and Sofya Raskhodnikova.