Generalized random rotation perturbation for vertically partitioned data sets

Zhenmin Lin, Jie Wang, Lian Liu, Jun Zhang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

7 Scopus citations

Abstract

Random rotation is a common perturbation approach for privacy-preserving data classification, in which the data matrix is multiplied by a random rotation matrix before publishing in order to preserve data privacy. One distinct advantage of this approach is that it maintains the geometric properties of the data matrix, so several categories of classifiers that rely on those properties can achieve accuracy on the transformed data similar to that on the original data. In this paper, we generalize this idea to the situation where the data matrix is vertically partitioned into several sub-matrices held by different owners. Each data holder chooses a rotation matrix randomly and independently to perturb its own data. The holders then send the transformed data to a third party, who collects all of it and forms a whole data set for data mining or other analysis purposes. We show that under such a scheme the geometric properties of the data set are also preserved, so many classifiers and clustering techniques achieve the same accuracy on the transformed data as on the original data. This method makes it possible to preserve privacy with efficient centralized data mining algorithms instead of distributed ones. Experiments on real data sets show that this generalization is effective for vertically partitioned data sets.
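The scheme described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes that records are rows, that each holder owns a contiguous group of columns, and that a holder perturbs its block by right-multiplying with a random orthogonal matrix drawn through a QR decomposition. The helper names `random_rotation` and `pairwise_distances` and the toy data are hypothetical. The sketch checks one geometric property, pairwise Euclidean distances between records, before and after perturbation.

```python
import numpy as np

def random_rotation(d, rng):
    """Draw a d x d rotation matrix from the QR decomposition of a Gaussian matrix."""
    q, r = np.linalg.qr(rng.standard_normal((d, d)))
    # Fix column signs so the draw does not depend on the QR sign convention.
    q = q * np.sign(np.diag(r))
    # Flip one column if needed so the determinant is +1 (a proper rotation).
    if np.linalg.det(q) < 0:
        q[:, 0] = -q[:, 0]
    return q

def pairwise_distances(x):
    """Euclidean distance between every pair of rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

rng = np.random.default_rng(0)

# Toy data (assumption): 6 records with 5 attributes.
data = rng.standard_normal((6, 5))

# Vertical partition (assumption): holder A owns columns 0-1, holder B owns columns 2-4.
blocks = np.split(data, [2], axis=1)

# Each holder independently rotates its own block before sending it out.
perturbed_blocks = [b @ random_rotation(b.shape[1], rng) for b in blocks]

# The third party concatenates the received blocks column-wise.
perturbed = np.hstack(perturbed_blocks)

distances_preserved = np.allclose(
    pairwise_distances(data), pairwise_distances(perturbed)
)
print(distances_preserved)
```

Under these assumptions the combined transformation is a block-diagonal orthogonal matrix, so the squared distance between two records is the sum of the per-block squared distances, each of which is unchanged by its holder's rotation. The check is therefore expected to hold up to floating-point tolerance.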

Original language: English
Title of host publication: 2009 IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2009 - Proceedings
Pages: 159-162
Number of pages: 4
State: Published - 2009
Event: 2009 IEEE Symposium on Computational Intelligence and Data Mining, CIDM 2009 - Nashville, TN, United States
Duration: Mar 30 2009 to Apr 2 2009

Keywords

  • Data mining
  • Data perturbation
  • Matrix rotation
  • Privacy preserving

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computational Theory and Mathematics
  • Software
