TY - GEN
T1 - Simultaneous pattern and data hiding in unsupervised learning
AU - Wang, Jie
AU - Zhang, Jun
AU - Liu, Lian
AU - Han, Dianwei
PY - 2007
Y1 - 2007
N2 - Controlling the level of knowledge disclosure and securing certain confidential patterns is a subtask of privacy-preserving data mining comparable to confidential data hiding. We propose a technique that simultaneously hides data values and confidential patterns without the undesirable side effect of distorting nonconfidential patterns. We use nonnegative matrix factorization to distort the original dataset while preserving its overall characteristics. A factor swapping method is designed to hide particular confidential patterns for k-means clustering. The effectiveness of this novel hiding technique is examined on a benchmark dataset. Experimental results indicate that our technique can produce a single modified dataset that achieves both pattern and data value hiding. Under certain constraints on the nonnegative matrix factorization iterations, an optimal solution can be computed in which the user-specified confidential memberships or relationships are hidden without undesirable alterations to nonconfidential patterns.
AB - Controlling the level of knowledge disclosure and securing certain confidential patterns is a subtask of privacy-preserving data mining comparable to confidential data hiding. We propose a technique that simultaneously hides data values and confidential patterns without the undesirable side effect of distorting nonconfidential patterns. We use nonnegative matrix factorization to distort the original dataset while preserving its overall characteristics. A factor swapping method is designed to hide particular confidential patterns for k-means clustering. The effectiveness of this novel hiding technique is examined on a benchmark dataset. Experimental results indicate that our technique can produce a single modified dataset that achieves both pattern and data value hiding. Under certain constraints on the nonnegative matrix factorization iterations, an optimal solution can be computed in which the user-specified confidential memberships or relationships are hidden without undesirable alterations to nonconfidential patterns.
UR - http://www.scopus.com/inward/record.url?scp=49549115709&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=49549115709&partnerID=8YFLogxK
U2 - 10.1109/ICDMW.2007.83
DO - 10.1109/ICDMW.2007.83
M3 - Conference contribution
AN - SCOPUS:49549115709
SN - 0769530192
SN - 9780769530192
T3 - Proceedings - IEEE International Conference on Data Mining, ICDM
SP - 729
EP - 734
BT - ICDM Workshops 2007 - Proceedings of the 17th IEEE International Conference on Data Mining Workshops
T2 - 17th IEEE International Conference on Data Mining Workshops, ICDM Workshops 2007
Y2 - 28 October 2007 through 31 October 2007
ER -