When we think about big data, we rarely think about the so-called "big data lakes" and their potential impact on the digital economy.
A data lake is a centralized repository that allows structured and unstructured data to be stored at any scale and different types of analytics to be run on it, from dashboards and visualizations to big data processing, real-time analytics and process automation.
The practice of developing data lakes is growing rapidly as interest in predictive analytics, automation and Artificial Intelligence (AI) rises. In fact, it is likely that AI will boost interest in data lakes like never before and affect many industries and sectors previously unaware of the value of their data.
This process, however, raises many questions and, most importantly, some key legal issues to consider before setting up a data lake, especially one connected to AI-powered technologies.
Below are some of the main legal issues and their implications from various points of view:
1. Privacy – Even when a database does not include personal data, AI systems may progressively infer (or re-identify) personal information. Once "personalized", such data will have to be processed in accordance with all applicable data protection regulations, including the GDPR.
2. Antitrust – The big data economy is already a widespread antitrust concern. To avoid regulatory risks arising from vast or combined data usage, the setup of a data lake should be assessed carefully from an antitrust perspective, especially when it comes to dynamic pricing.
3. Product safety – Data sets may include, among other things, production and performance data, with potential implications for product safety risks. When processing such data, product safety laws will have to be considered carefully.
4. Insider law – In some countries, investors must be protected from the misuse of inside information. It should therefore be carefully assessed when data should not be placed in a data lake, so as to comply with the applicable corporate disclosure obligations.
5. Intellectual Property – A data lake can be subject to the same level of protection granted to databases under copyright laws, it being understood that such protection does not extend to the actual datasets. If data does not fall under copyright or sui generis protection, as is often the case for AI-powered dynamic databases, contractual provisions may help.
For further questions please send an email to email@example.com and do not forget to sign up to Dentons Italy's TMT Bites Newsletter.