An Extended Mondrian Algorithm – XMondrian to Protect Identity Disclosure
In recent days, Privacy Preserving Data Publishing (PPDP) is considered as vital research area due to rapid increasing rate of data being published in the Internet day by day. Many Organizations often need to publish their data in internet for research and analysis purpose, but there is no guarantee that those data would be used only for ethical purposes. Hence data anonymization comes into picture and play a vital role in preventing identity disclosure, also it restricts the amount of data that can be seen or used by the external users. It is an extensively used PPDP technique among data encryption, data anonymization and data perturbation methods. Mondrian is considered as one such data anonymization technique that has outperformed compare to many anonymization algorithms, because of its fast and scalable nature. However, the algorithm insists to encode the categorical values into numerical values and decode it, to generalize the data. To overcome this problem, a new extended version of Mondrian algorithm is proposed, and it is called XMondrian algorithm. The proposed algorithm can handle both numerical and categorical attributes without encoding or decoding the categorical values.The effectiveness of the proposed algorithm has been analysed through experimental study and observed that the proposed XMondrian algorithm outshine the existing Mondrian algorithm in terms of anonymization time and Cavg. Cavg is one of the metric used to quantify the utility of data.