TY - GEN
T1 - Simultaneous pattern and data hiding in supervised learning
AU - Lin, Pengpeng
AU - Zhang, Jun
AU - Wang, Xiwei
AU - Shindhelm, Art
PY - 2012
Y1 - 2012
N2 - The ability to hide private data and confidential patterns from potential adversaries while still maintaining data mining value is an important aspect of privacy-preserving data mining. In this paper, we study a nonnegative matrix factorization technique and show how to define objective functions and derive the corresponding multiplicative update functions. We then use that knowledge to propose a data value perturbation scheme that hides data values while largely preserving the data pattern. Based on this perturbation scheme, we develop a dual data hiding scheme that hides not only the data but also each individual sample's class membership. The essential idea is to use an indicator matrix as a guide for the update process. The performance of the proposed schemes is examined on benchmark datasets for both utility value and degree of data perturbation. The empirical results show that the data values are well perturbed and that our schemes are capable of hiding a data sample's class membership without side effects. Finally, we draw some interesting conclusions and lay out potential future work.
AB - The ability to hide private data and confidential patterns from potential adversaries while still maintaining data mining value is an important aspect of privacy-preserving data mining. In this paper, we study a nonnegative matrix factorization technique and show how to define objective functions and derive the corresponding multiplicative update functions. We then use that knowledge to propose a data value perturbation scheme that hides data values while largely preserving the data pattern. Based on this perturbation scheme, we develop a dual data hiding scheme that hides not only the data but also each individual sample's class membership. The essential idea is to use an indicator matrix as a guide for the update process. The performance of the proposed schemes is examined on benchmark datasets for both utility value and degree of data perturbation. The empirical results show that the data values are well perturbed and that our schemes are capable of hiding a data sample's class membership without side effects. Finally, we draw some interesting conclusions and lay out potential future work.
KW - Classification
KW - Indicator Matrix
KW - NMF
KW - PPDM
UR - http://www.scopus.com/inward/record.url?scp=84868331493&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84868331493&partnerID=8YFLogxK
U2 - 10.1109/IRI.2012.6303035
DO - 10.1109/IRI.2012.6303035
M3 - Conference contribution
AN - SCOPUS:84868331493
SN - 9781467322843
T3 - Proceedings of the 2012 IEEE 13th International Conference on Information Reuse and Integration, IRI 2012
SP - 385
EP - 392
BT - Proceedings of the 2012 IEEE 13th International Conference on Information Reuse and Integration, IRI 2012
T2 - 2012 IEEE 13th International Conference on Information Reuse and Integration, IRI 2012
Y2 - 8 August 2012 through 10 August 2012
ER -