Hybrid Approach for Data De-Identification during the Storage Phase in Big Data
Abstract
With the exponential growth of big data, protecting sensitive information during storage has become a significant challenge. Privacy models used for data de-identification, such as k-anonymity, l-diversity, and t-closeness, protect privacy by generalizing or suppressing identifying attributes. However, these methods involve a trade-off between privacy and data utility. This paper proposes a hybrid approach that combines Genetic Algorithms (GA) and Simulated Annealing (SA) to optimize data de-identification during the storage phase. The proposed framework preserves privacy while keeping information loss low, making it suitable for the secure storage of large-scale datasets. Experimental results demonstrate the hybrid approach’s effectiveness in enhancing privacy while maintaining high data utility for subsequent analytics.
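The abstract does not specify how the GA and SA components interact, so the following is only a minimal illustrative sketch of one plausible arrangement, not the authors' implementation. It assumes a toy table with two quasi-identifiers (age and ZIP code), a hypothetical generalization hierarchy (AGE_WIDTHS, MAX_LEVELS), a simple normalized information-loss measure, and a design in which the GA evolves vectors of generalization levels while SA refines the best individual in each generation. All names, data, and parameters are assumptions made for illustration.

```python
import math
import random
from collections import Counter

# Toy records (age, zip_code); both are treated as quasi-identifiers (assumed).
RECORDS = [(23, "47677"), (27, "47602"), (29, "47678"), (35, "47905"),
           (38, "47909"), (41, "47906"), (52, "47605"), (56, "47673")]
MAX_LEVELS = (3, 5)            # highest generalization level per attribute
AGE_WIDTHS = (1, 10, 20, 100)  # age bucket width at levels 0..3


def generalize(record, levels):
    """Coarsen one record: bucket the age, mask trailing ZIP digits."""
    age, zip_code = record
    width = AGE_WIDTHS[levels[0]]
    keep = len(zip_code) - levels[1]
    return (age // width * width, zip_code[:keep] + "*" * levels[1])


def is_k_anonymous(levels, k):
    """True if every generalized quasi-identifier tuple occurs at least k times."""
    counts = Counter(generalize(r, levels) for r in RECORDS)
    return min(counts.values()) >= k


def cost(levels, k):
    """Information loss in [0, 1], plus a large penalty if k-anonymity fails."""
    loss = sum(l / m for l, m in zip(levels, MAX_LEVELS)) / len(levels)
    return loss + (0.0 if is_k_anonymous(levels, k) else 10.0)


def mutate(levels, rng):
    """Move one attribute's level up or down by one step, within bounds."""
    i = rng.randrange(len(levels))
    new = list(levels)
    new[i] = min(MAX_LEVELS[i], max(0, new[i] + rng.choice((-1, 1))))
    return tuple(new)


def anneal(levels, k, rng, temp=1.0, cooling=0.9, steps=30):
    """SA refinement: accept a worse neighbour with probability exp(-delta/T)."""
    current = best = levels
    for _ in range(steps):
        cand = mutate(current, rng)
        delta = cost(cand, k) - cost(current, k)
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = cand
        if cost(current, k) < cost(best, k):
            best = current
        temp *= cooling
    return best


def hybrid_search(k, generations=20, pop_size=8, seed=0):
    """GA over level vectors; SA polishes the best individual each generation."""
    rng = random.Random(seed)
    # Seed with full generalization, which is k-anonymous whenever
    # the table holds at least k records.
    pop = [MAX_LEVELS] + [tuple(rng.randint(0, m) for m in MAX_LEVELS)
                          for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda lv: cost(lv, k))
        elite = anneal(pop[0], k, rng)      # SA step on the current best
        parents = pop[: pop_size // 2]      # truncation selection
        children = []
        while len(children) < pop_size - 1:
            a, b = rng.sample(parents, 2)
            child = (a[0], b[1])            # one-point crossover
            children.append(mutate(child, rng))
        pop = [elite] + children            # elitism keeps the best found so far
    return min(pop, key=lambda lv: cost(lv, k))


best_levels = hybrid_search(k=2)
print(best_levels, is_k_anonymous(best_levels, 2), round(cost(best_levels, 2), 3))
```

Because the population is seeded with the fully generalized vector and the elite individual is always carried forward, the returned vector in this sketch is k-anonymous by construction; the search then only tries to reduce the assumed information-loss measure. A real storage-phase pipeline would need a domain-specific hierarchy and utility metric.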