Publishing Service

Polishing & Checking

Journal of Zhejiang University SCIENCE A

ISSN 1673-565X(Print), 1862-1775(Online), Monthly

Distributed anonymous data perturbation method for privacy-preserving data mining

Abstract: Privacy is a critical requirement in distributed data mining. Cryptography-based secure multiparty computation is a main approach for privacy preserving. However, it shows poor performance in large scale distributed systems. Meanwhile, data perturbation techniques are comparatively efficient but are mainly used in centralized privacy-preserving data mining (PPDM). In this paper, we propose a light-weight anonymous data perturbation method for efficient privacy preserving in distributed data mining. We first define the privacy constraints for data perturbation based PPDM in a semi-honest distributed environment. Two protocols are proposed to address these constraints and protect data statistics and the randomization process against collusion attacks: the adaptive privacy-preserving summary protocol and the anonymous exchange protocol. Finally, a distributed data perturbation framework based on these protocols is proposed to realize distributed PPDM. Experiment results show that our approach achieves a high security level and is very efficient in a large scale distributed environment.

Key words: Privacy-preserving data mining (PPDM), Distributed data mining, Data perturbation


Share this article to: More

Go to Contents

References:

<HIDE>

[1] Agrawal, D., Aggarwal, C.C., 2001. On the Design and Quantification of Privacy Preserving Data Mining Algorithms. Proc. 20th ACM SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems, p.247-255.

[2] Agrawal, R., Srikant, R., 2000. Privacy-preserving data mining. ACM SIGMOD Record, 29(2):439-450.

[3] Ashley, P., Hada, S., Karjoth, G., 2003. The Enterprise Privacy Authorization Language (EPAL 1.1), IBM. Available from: http://www.zurich.ibm.com/security/enterprise-privacy/epal/

[4] Ashrafi, M.Z., Taniar, D., Smith, K., 2003. Towards Privacy Preserving Distributed Association Rule Mining. Distributed Computing-IWDC, p.279-289.

[5] Beaver, D., 1991. Foundations of secure interactive computing. CRYPTO, 1991:377-391.

[6] Bertino, E., Fovino, I.N., Provenza, L.P., 2005. A framework for evaluating privacy preserving data mining algorithms. Data Min. Knowl. Discov., 11(2):121-154.

[7] Chaum, D., Crepeau, C., Damgard, I., 1988. Multiparty Unconditionally Secure Protocols. Proc. 20th Annual ACM Symp. on Theory of Computing, p.11-19.

[8] Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H., 2005. Toward Privacy in Public Databases. Theory of Cryptography Conf., p.363-385.

[9] Cramer, R., Damgard, I., Nielsen, J.B., 2001. Multiparty Computation from Threshold Homomorphic Encryption. Proc. EUROCRYPT, p.280-300.

[10] Cranor, L., Langheinrich, M., Marchiori, M., Presler-Marshall, M., Reagle, J. (Eds.), 2002. The Platform for Privacy Preferences 1.0 (P3P1.0) Specification. W3C. Available from: http://www.w3.org/TR/P3P/

[11] CSA (Canadian Standards Association), 2004. Privacy Code. Available from: http://www.csa.ca/standards/privacy/Default.asp?laguage=English

[12] Evfimievski, A., Srikant, R., Agrawal, R., Gehrke, J., 2004. Privacy preserving mining of association rules. Inf. Syst., 29(4):343-364.

[13] Fienberg, S.E., McIntyre, J., 2004. Data swapping: variations on a theme by dalenius and reiss. Priv. Statist. Datab., 3050:14-29.

[14] Fukasawa, T., Wang, J., Takata, T., Miyazaki, M., 2004. An Effective Distributed Privacy-preserving Data Mining Algorithm. Intelligent Data Engineering and Automated Learning (IDEAL), p.320-325.

[15] Goldreich, O., Micali, S., Wigderson, A., 1987. How to Play Any Mental Game or a Completeness Theorem for Protocols with Honest Majority. 19th ACM Symp. on the Theory of Computing, p.218-229.

[16] Kargupta, H., Das, K., Liu, K., 2007. Multi-party, privacy-preserving distributed data mining using a game theoretic framework. LNCS, 4702:523.

[17] Liew, C.K., Choi, U.J., Liew, C.J., 1985. A data distortion by probability distribution. ACM Trans. Datab. Syst., 10(3):395-411.

[18] Paillier, P., 1999. Public-key cryptosystems based on composite degree residuosity classes. Advances in Cryptology EUROCRYPT, 99:223-238. Available from: http://www.springerlink.com/content/kwjvf0k8fqyy2h3d/

[19] Rizvi, S.J., Haritsa, J.R., 2002. Maintaining Data Privacy in Association Rule Mining. Proc. 28th Int. Conf. on Very Large Data Bases, 28:682-693.

[20] Sweeney, L., 2002. Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncert., Fuzz. and Knowl.-based Syst., 10(5):571-588.

[21] Yao, A.C., 1986. How to Generate and Exchange Secrets. Proc. 27th IEEE Symp. on Foundations of Computer Science, p.162-167.

[22] Zhang, P., Tong, Y.H., Tang, S.W., Yang, D.Q., 2005. Privacy Preserving Naive Bayes Classification Advanced Data Mining and Applications. 3584:744-752.

Open peer comments: Debate/Discuss/Question/Opinion

<1>

Please provide your name, email address and a comment





DOI:

10.1631/jzus.A0820320

CLC number:

TP391.7

Download Full Text:

Click Here

Downloaded:

4827

Clicked:

5704

Cited:

3

On-line Access:

Received:

2008-04-08

Revision Accepted:

2008-08-10

Crosschecked:

2009-04-29

Journal of Zhejiang University-SCIENCE, 38 Zheda Road, Hangzhou 310027, China
Tel: +86-571-87952276; Fax: +86-571-87952331; E-mail: jzus@zju.edu.cn
Copyright © 2000~ Journal of Zhejiang University-SCIENCE