Operations Research
OPERATIONS RESEARCH
Vol. 55, No. 5, September-October 2007, pp. 890-908
DOI: 10.1287/opre.1070.0407

Stochastic Protection of Confidential Information in Databases: A Hybrid of Data Perturbation and Query Restriction

Manuel A. Nunez, Robert S. Garfinkel, Ram D. Gopal

School of Business, University of Connecticut, Storrs, Connecticut 06269

mnunez{at}business.uconn.edu
rgarfinkel{at}business.uconn.edu
ram{at}business.uconn.edu

Data perturbation and query restriction are two methods developed to protect confidential data in statistical databases. In the former, the data is systematically changed to yield answers to queries that are statistically similar to those that would have resulted from the original data. The latter provides exact answers to queries as long as the risk of exact disclosure of confidential data does not become too great. We present a new methodology to combine these techniques so that the advantages of both are captured. The model is appropriate and computationally viable for large databases whether the queries are linear or nonlinear. The query restriction phase consists of finding an optimal subset of queries to answer exactly without compromising the database. This is an NP-hard problem with a matroid intersection structure that lends itself to an efficient greedy heuristic. Then, given the queries that are answered exactly, we implement a data perturbation phase that provides stochastic protection and consistency. We present computational results on a large database with both linear and nonlinear queries. The results indicate that many queries can be answered exactly and the proposed perturbation approach provides more accurate answers than the standard perturbation method.
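
As an illustration of the query restriction idea for linear queries only, the following minimal Python sketch greedily selects queries to answer exactly. It is not the authors' heuristic and does not model the matroid intersection structure mentioned in the abstract. It rests on one assumed notion of compromise: a confidential value is exactly disclosed when its unit vector lies in the row space of the answered queries' coefficient vectors. The function names `compromises` and `greedy_select`, the weights, and the tolerance are hypothetical choices made for this example.

```python
import numpy as np


def compromises(rows, tol=1e-9):
    """Return True if some unit vector e_i lies in the row space of `rows`.

    Assumption for this sketch: each row holds the coefficients of one
    linear query over n confidential values, and disclosure of value i
    means e_i can be written as a linear combination of answered queries.
    """
    A = np.asarray(rows, dtype=float)
    if A.size == 0:
        return False
    base_rank = np.linalg.matrix_rank(A, tol=tol)
    n = A.shape[1]
    for i in range(n):
        e = np.zeros((1, n))
        e[0, i] = 1.0
        # e_i is in the row space exactly when appending it leaves the rank unchanged.
        if np.linalg.matrix_rank(np.vstack([A, e]), tol=tol) == base_rank:
            return True
    return False


def greedy_select(queries, weights):
    """Consider queries in order of decreasing weight; keep each one
    whose addition does not compromise any confidential value."""
    order = sorted(range(len(queries)), key=lambda j: -weights[j])
    chosen = []
    for j in order:
        trial = [queries[k] for k in chosen] + [queries[j]]
        if not compromises(trial):
            chosen.append(j)
    return sorted(chosen)


# Hypothetical example: four sum queries over three confidential values.
queries = [
    [1, 1, 0],  # x0 + x1
    [0, 1, 1],  # x1 + x2
    [1, 1, 1],  # x0 + x1 + x2
    [1, 0, 0],  # x0 alone (discloses x0 directly)
]
weights = [3, 2, 2, 5]
selected = greedy_select(queries, weights)
```

In this example the single-value query is rejected immediately, and the three-value sum is rejected because subtracting the first query from it would reveal the third value. Queries that are not selected would be handled by the perturbation phase rather than answered exactly.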

Subject classifications: statistical databases; database security; query restriction; data perturbation; matroid intersection; Hilbert spaces; nonlinear least-squares estimation.
History: Received February 2004; revision received July 2006; accepted September 2006.







Copyright © 2007 by INFORMS.