Estimating the success of re-identifications in incomplete datasets using generative models
99.98% of Americans would be correctly re-identified in any dataset using 15 demographic attributes, suggesting that even heavily sampled anonymized datasets are unlikely to satisfy the modern standards for anonymization set forth by GDPR.
This is a serious privacy concern and a problem for data engineering, especially for anyone working with anonymized personal information. The paper provides a way to estimate how likely a person is to be correctly re-identified from an anonymized dataset, even an incomplete one, which can be useful for people who work in government or at security companies.
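To make the idea concrete, here is a minimal sketch of how adding attributes makes records unique. It is not the paper's generative-model method: it only counts which records are unique within a sample, and the records, attribute names, and the `fraction_unique` helper are all made up for illustration.

```python
from collections import Counter

def fraction_unique(records, attributes):
    """Share of records whose combination of `attributes` appears exactly once."""
    keys = [tuple(r[a] for a in attributes) for r in records]
    counts = Counter(keys)
    return sum(1 for k in keys if counts[k] == 1) / len(keys)

# Toy, made-up records (hypothetical, not data from the paper)
records = [
    {"zip": "10001", "birth_year": 1980, "gender": "F"},
    {"zip": "10001", "birth_year": 1980, "gender": "M"},
    {"zip": "10001", "birth_year": 1975, "gender": "F"},
    {"zip": "10002", "birth_year": 1980, "gender": "F"},
    {"zip": "10002", "birth_year": 1980, "gender": "F"},
]

# Uniqueness grows as more attributes are combined
for attrs in (["zip"], ["zip", "birth_year"], ["zip", "birth_year", "gender"]):
    print(attrs, fraction_unique(records, attrs))
```

Being unique in a sample does not by itself prove someone is unique in the whole population; estimating that is the gap the paper's approach is meant to address.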
https://www.reddit.com/r/science/comments/chko43/9998_of_americans_would_be_correctly_reidentified/
#privacy #gdpr #federatedlearning #ml