
Data De-Identification

Personally identifiable data is abundant, and companies must take deliberate steps to protect it. Data de-identification is the process of transforming data so that an individual’s identity cannot be discerned from it. Nokia tasked our team with creating a testing framework that evaluates different de-identification algorithms in order to identify the one best suited to telecommunication data. Given the complexity of existing de-identification algorithms and of Nokia’s data, we chose to test two existing algorithms, ARX and DP-Fields, and created a custom algorithm of our own. The framework is automated: it runs each algorithm and outputs a PDF report covering four evaluations: runtime, distribution, k-anonymity, and boxplot. These metrics give Nokia quantifiable grounds for comparing the algorithms. Nokia plans to build upon our scalable testing framework to assess which de-identification algorithm best suits its needs.
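To make two of the metrics concrete, here is a minimal Python sketch of how runtime and k-anonymity could be measured for one algorithm. It is an illustration under stated assumptions, not the team's framework: the function names (`k_anonymity`, `generalize_age`, `evaluate`), the toy records, and the choice of `age` and `zip` as quasi-identifiers are all hypothetical, and `generalize_age` is a simple stand-in rather than ARX, DP-Fields, or the custom algorithm. A dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k records, so the sketch reports the size of the smallest such group.

```python
import time
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records that share the
    same quasi-identifier values (0 for an empty dataset)."""
    if not records:
        return 0
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values())


def generalize_age(record, width=10):
    """Hypothetical stand-in algorithm: replace an exact age with a range."""
    low = (record["age"] // width) * width
    return {**record, "age": f"{low}-{low + width - 1}"}


def evaluate(algorithm, records, quasi_identifiers):
    """Run one algorithm over the records and collect two metrics."""
    start = time.perf_counter()
    output = [algorithm(r) for r in records]
    runtime = time.perf_counter() - start
    return {"runtime_s": runtime, "k": k_anonymity(output, quasi_identifiers)}


# Toy data, invented for illustration only.
records = [
    {"age": 23, "zip": "07974"},
    {"age": 27, "zip": "07974"},
    {"age": 31, "zip": "07974"},
    {"age": 38, "zip": "07974"},
]

result = evaluate(generalize_age, records, ["age", "zip"])
```

On the toy data every raw record is unique, so k is 1 before de-identification; after ages are generalized into ten-year ranges, each range holds two records and k rises to 2. A fuller harness would loop `evaluate` over every algorithm and add the distribution and boxplot comparisons to the report.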

Team Members: 

Jared Campbell

Justin Chen

Jack Hammett

Nikhil Kanzarkar

Abhishek Mandal

Ryan McCray

Sponsors
Nokia Bell Labs
Semester