CLC number: TP391.7
On-line Access: 2024-08-27
Received: 2023-10-17
Revision Accepted: 2024-05-08
Crosschecked: 2009-04-29
Cited: 3
Clicked: 6381
Feng LI, Jin MA, Jian-hua LI. Distributed anonymous data perturbation method for privacy-preserving data mining[J]. Journal of Zhejiang University Science A, 2009, 10(7): 952-963.
@article{title="Distributed anonymous data perturbation method for privacy-preserving data mining",
author="Feng LI, Jin MA, Jian-hua LI",
journal="Journal of Zhejiang University Science A",
volume="10",
number="7",
pages="952-963",
year="2009",
publisher="Zhejiang University Press & Springer",
doi="10.1631/jzus.A0820320"
}
%0 Journal Article
%T Distributed anonymous data perturbation method for privacy-preserving data mining
%A Feng LI
%A Jin MA
%A Jian-hua LI
%J Journal of Zhejiang University SCIENCE A
%V 10
%N 7
%P 952-963
%@ 1673-565X
%D 2009
%I Zhejiang University Press & Springer
%DOI 10.1631/jzus.A0820320
TY - JOUR
T1 - Distributed anonymous data perturbation method for privacy-preserving data mining
A1 - Feng LI
A1 - Jin MA
A1 - Jian-hua LI
J0 - Journal of Zhejiang University Science A
VL - 10
IS - 7
SP - 952
EP - 963
%@ 1673-565X
Y1 - 2009
PB - Zhejiang University Press & Springer
ER -
DOI - 10.1631/jzus.A0820320
Abstract: Privacy is a critical requirement in distributed data mining. Cryptography-based secure multiparty computation is a main approach for privacy preserving. However, it shows poor performance in large scale distributed systems. Meanwhile, data perturbation techniques are comparatively efficient but are mainly used in centralized privacy-preserving data mining (PPDM). In this paper, we propose a light-weight anonymous data perturbation method for efficient privacy preserving in distributed data mining. We first define the privacy constraints for data perturbation based PPDM in a semi-honest distributed environment. Two protocols are proposed to address these constraints and protect data statistics and the randomization process against collusion attacks: the adaptive privacy-preserving summary protocol and the anonymous exchange protocol. Finally, a distributed data perturbation framework based on these protocols is proposed to realize distributed PPDM. Experiment results show that our approach achieves a high security level and is very efficient in a large scale distributed environment.
[1] Agrawal, D., Aggarwal, C.C., 2001. On the Design and Quantification of Privacy Preserving Data Mining Algorithms. Proc. 20th ACM SIGMOD-SIGACT-SIGART Symp. on Principles of Database Systems, p.247-255.
[2] Agrawal, R., Srikant, R., 2000. Privacy-preserving data mining. ACM SIGMOD Record, 29(2):439-450.
[3] Ashley, P., Hada, S., Karjoth, G., 2003. The Enterprise Privacy Authorization Language (EPAL 1.1), IBM. Available from: http://www.zurich.ibm.com/security/enterprise-privacy/epal/
[4] Ashrafi, M.Z., Taniar, D., Smith, K., 2003. Towards Privacy Preserving Distributed Association Rule Mining. Distributed Computing-IWDC, p.279-289.
[5] Beaver, D., 1991. Foundations of secure interactive computing. CRYPTO, 1991:377-391.
[6] Bertino, E., Fovino, I.N., Provenza, L.P., 2005. A framework for evaluating privacy preserving data mining algorithms. Data Min. Knowl. Discov., 11(2):121-154.
[7] Chaum, D., Crepeau, C., Damgard, I., 1988. Multiparty Unconditionally Secure Protocols. Proc. 20th Annual ACM Symp. on Theory of Computing, p.11-19.
[8] Chawla, S., Dwork, C., McSherry, F., Smith, A., Wee, H., 2005. Toward Privacy in Public Databases. Theory of Cryptography Conf., p.363-385.
[9] Cramer, R., Damgard, I., Nielsen, J.B., 2001. Multiparty Computation from Threshold Homomorphic Encryption. Proc. EUROCRYPT, p.280-300.
[10] Cranor, L., Langheinrich, M., Marchiori, M., Presler-Marshall, M., Reagle, J. (Eds.), 2002. The Platform for Privacy Preferences 1.0 (P3P1.0) Specification. W3C. Available from: http://www.w3.org/TR/P3P/
[11] CSA (Canadian Standards Association), 2004. Privacy Code. Available from: http://www.csa.ca/standards/privacy/Default.asp?laguage=English
[12] Evfimievski, A., Srikant, R., Agrawal, R., Gehrke, J., 2004. Privacy preserving mining of association rules. Inf. Syst., 29(4):343-364.
[13] Fienberg, S.E., McIntyre, J., 2004. Data swapping: variations on a theme by dalenius and reiss. Priv. Statist. Datab., 3050:14-29.
[14] Fukasawa, T., Wang, J., Takata, T., Miyazaki, M., 2004. An Effective Distributed Privacy-preserving Data Mining Algorithm. Intelligent Data Engineering and Automated Learning (IDEAL), p.320-325.
[15] Goldreich, O., Micali, S., Wigderson, A., 1987. How to Play Any Mental Game or a Completeness Theorem for Protocols with Honest Majority. 19th ACM Symp. on the Theory of Computing, p.218-229.
[16] Kargupta, H., Das, K., Liu, K., 2007. Multi-party, privacy-preserving distributed data mining using a game theoretic framework. LNCS, 4702:523.
[17] Liew, C.K., Choi, U.J., Liew, C.J., 1985. A data distortion by probability distribution. ACM Trans. Datab. Syst., 10(3):395-411.
[18] Paillier, P., 1999. Public-key cryptosystems based on composite degree residuosity classes. Advances in Cryptology EUROCRYPT, 99:223-238. Available from: http://www.springerlink.com/content/kwjvf0k8fqyy2h3d/
[19] Rizvi, S.J., Haritsa, J.R., 2002. Maintaining Data Privacy in Association Rule Mining. Proc. 28th Int. Conf. on Very Large Data Bases, 28:682-693.
[20] Sweeney, L., 2002. Achieving k-anonymity privacy protection using generalization and suppression. Int. J. Uncert., Fuzz. and Knowl.-based Syst., 10(5):571-588.
[21] Yao, A.C., 1986. How to Generate and Exchange Secrets. Proc. 27th IEEE Symp. on Foundations of Computer Science, p.162-167.
[22] Zhang, P., Tong, Y.H., Tang, S.W., Yang, D.Q., 2005. Privacy Preserving Naive Bayes Classification Advanced Data Mining and Applications. 3584:744-752.
Open peer comments: Debate/Discuss/Question/Opinion
<1>