Towards Optimization of Privacy-Utility Trade-Off Using Similarity and Diversity Based Clustering

Majeed, Abdul; Khan, Safiullah; Hwang, Seong Oun

Authors: Majeed, Abdul; Khan, Safiullah; Hwang, Seong Oun

Issue Date: Jan-2024

Publisher: Institute of Electrical and Electronics Engineers (IEEE)

Keywords: Clustering; diversity; generalization; personal data; privacy; privacy preserving data publishing; similarity; utility

Citation: IEEE Transactions on Emerging Topics in Computing, vol. 12, no. 1, pp. 368-385

Pages: 18

Journal Title: IEEE Transactions on Emerging Topics in Computing

Volume: 12

Number: 1

Start Page: 368

End Page: 385

URI: https://scholarworks.bwise.kr/gachon/handle/2020.sw.gachon/91028

DOI: 10.1109/tetc.2023.3258528

ISSN: 2168-6750

Abstract: Most data owners publish personal data for information consumers, who use it to discover hidden knowledge. Publishing data in its original form, however, can expose subjects' identities and their associated sensitive information, so data is usually anonymized before publication. Many anonymization techniques have been proposed, but most of them sacrifice utility for privacy, or vice versa, and explicitly disclose sensitive information when the original data have skewed distributions. To address these technical problems, we propose a novel anonymization method using similarity and diversity-based clustering that effectively preserves both the subjects' privacy and anonymous-data utility. Using a machine learning algorithm, we identify influential attributes in the original data, which helps preserve a subject's privacy in imbalanced clusters, an aspect that remained unexplored in previous research. The objective function of the clustering process considers both similarity and diversity in the attributes when assigning records to clusters, whereas most existing clustering-based anonymization techniques consider only one of the two, thereby sacrificing either privacy or utility. Attribute values in each cluster are minimally generalized to achieve both competing goals. Extensive experiments were conducted on four real-world benchmark datasets to demonstrate the feasibility of the proposed method. The experimental results showed that common and AI-based privacy risks were reduced by 13.01% and 24.3%, respectively, compared with existing methods. Data utility improved by 11.25% and 20.21% on two distinct metrics compared with its counterparts. The complexity of the clustering process (e.g., number of iterations) was 2.25× lower than that of state-of-the-art methods.
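
Illustration: the abstract does not state the objective function, so the following is only a minimal sketch of the general idea of scoring a record against a cluster by both similarity and diversity, followed by generalization. It is not the authors' method. The function names (`score`, `cluster_records`, `generalize`), the weight `alpha`, the greedy assignment strategy, and the attributes `age` and `disease` are all assumptions made for illustration. The machine-learning step that identifies influential attributes is not covered.

```python
def score(record, cluster, alpha):
    """Hypothetical objective: weighted sum of similarity and diversity gain."""
    ages = [r["age"] for r in cluster]
    centroid = sum(ages) / len(ages)
    # Similarity: closer to the cluster's mean age gives a higher score.
    similarity = 1.0 / (1.0 + abs(record["age"] - centroid))
    # Diversity gain: reward a sensitive value the cluster does not hold yet.
    seen = {r["disease"] for r in cluster}
    diversity_gain = 0.0 if record["disease"] in seen else 1.0
    return alpha * similarity + (1.0 - alpha) * diversity_gain


def cluster_records(records, k, alpha=0.8):
    """Greedily build clusters of size k using the combined score (assumed strategy)."""
    remaining = sorted(records, key=lambda r: r["age"])
    clusters = []
    while len(remaining) >= k:
        cluster = [remaining.pop(0)]  # seed with the youngest unassigned record
        while len(cluster) < k:
            best = max(remaining, key=lambda r: score(r, cluster, alpha))
            remaining.remove(best)
            cluster.append(best)
        clusters.append(cluster)
    if remaining and not clusters:
        # Fewer than k records in total: keep them together in one cluster.
        clusters.append(remaining)
    else:
        # Leftover records join whichever existing cluster scores best.
        for r in remaining:
            max(clusters, key=lambda c: score(r, c, alpha)).append(r)
    return clusters


def generalize(cluster):
    """Replace each exact age with the cluster's age interval."""
    ages = [r["age"] for r in cluster]
    interval = f"{min(ages)}-{max(ages)}"
    return [{"age": interval, "disease": r["disease"]} for r in cluster]


# Toy data (fabricated for illustration only, not from the paper's datasets).
records = [
    {"age": 25, "disease": "flu"},
    {"age": 26, "disease": "flu"},
    {"age": 27, "disease": "cancer"},
    {"age": 28, "disease": "cancer"},
]

mixed = cluster_records(records, k=2, alpha=0.8)     # similarity and diversity
sim_only = cluster_records(records, k=2, alpha=1.0)  # similarity alone
anonymized = [generalize(c) for c in mixed]
```

On this toy data, setting `alpha=1.0` groups records purely by age, so each cluster holds a single sensitive value. Lowering `alpha` trades slightly wider age intervals for clusters that mix sensitive values, which is the kind of trade-off the abstract describes.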

Related Researcher

Hwang, Seong Oun: College of IT Convergence (School of Computer Engineering, Computer Engineering Major)
