An interesting piece on how the GDPR will affect data science.
GDPR impacts data science across several areas. Firstly, it limits the ways businesses may profile customers and process personal data. Depending on how you define it, that covers a huge part of a data scientist's job. Under GDPR, profiling is defined as any kind of automated processing of personal data that analyzes or predicts aspects of an individual's behavior, socioeconomic situation, movements, preferences, health and so forth.
If profiling occurs, the organization must notify the person involved, list the potential consequences and provide an opportunity to opt out. This applies where there is a legitimate business purpose for the profiling (one that doesn't infringe the individual's rights), such as a credit card processor using personal data to determine someone's credit limit.
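The notify-then-allow-opt-out requirement can be sketched as a simple gate in code. This is a minimal illustration, not a real compliance API; the names `ConsentRecord` and `may_profile` are assumptions for the example.

```python
from dataclasses import dataclass

@dataclass
class ConsentRecord:
    """Hypothetical record of what a data subject has been told and chosen."""
    user_id: str
    notified_of_profiling: bool  # the person was informed that profiling occurs
    opted_out: bool              # the person exercised their right to opt out

def may_profile(record: ConsentRecord) -> bool:
    """Profiling proceeds only if the person was notified and did not opt out."""
    return record.notified_of_profiling and not record.opted_out

# Usage: run this check before any automated profiling pipeline starts.
print(may_profile(ConsentRecord("u1", notified_of_profiling=True, opted_out=False)))   # True
print(may_profile(ConsentRecord("u2", notified_of_profiling=True, opted_out=True)))    # False
print(may_profile(ConsentRecord("u3", notified_of_profiling=False, opted_out=False)))  # False
```

The point of the sketch is that the opt-out check belongs at the entry to the pipeline, not buried inside it.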
When profiling takes place, and automated decision making is carried out on the basis of it, a business must prevent discriminatory factors such as race, politics or religious beliefs from having an effect. Bias can be a huge issue in machine learning systems (as seen with COMPAS, a recidivism-scoring tool used to assist criminal sentencing that was found to be biased against minorities). There are many underlying causes, including small biases baked into an algorithm that go unrecognized by the teams (or data scientists) behind it. The repercussions of these biases only grow through the algorithm's positive feedback loop.
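One concrete way teams check for this kind of effect is a fairness metric such as the demographic parity difference: the gap in positive-outcome rates between groups. A minimal sketch, assuming the group labels and the 0.5 example data are illustrative inventions:

```python
def demographic_parity_difference(predictions, groups, positive=1):
    """Largest gap in positive-outcome rates between any two groups.

    A value near 0 means the model grants the positive outcome at a
    similar rate across groups; a large value flags potential bias.
    """
    counts = {}  # group -> (total, positives)
    for pred, group in zip(predictions, groups):
        n, pos = counts.get(group, (0, 0))
        counts[group] = (n + 1, pos + (1 if pred == positive else 0))
    rates = [pos / n for n, pos in counts.values()]
    return max(rates) - min(rates)

# Example: approvals skew toward group "A" (3/4 vs 1/4 approved)
preds  = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
print(demographic_parity_difference(preds, groups))  # 0.5
```

This single number is no proof of compliance, but running such checks per protected attribute is one way to surface the biases the regulation targets before they enter a feedback loop.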
Data scientists therefore have a huge task ahead of them, since any demonstrable bias within an algorithm is likely to breach GDPR. And the stakes are high: a breach of GDPR can result in a fine of up to €20 million or 4% of global annual turnover, whichever is greater.