On June 19, 2025, the French Data Protection Authority (“CNIL”) published two recommendations for AI developers.  The first recommendation covers reliance on the GDPR’s legitimate interest legal basis for developing an AI model.  It provides examples of legitimate interests that can justify the use of personal data for AI development.  The second recommendation discusses measures to implement when collecting personal data through “web scraping.”  It provides a list of measures that, if followed, will ensure compliance with the GDPR’s accountability principle.

Legitimate Interest as AI Training Legal Basis

The CNIL acknowledges that legitimate interest is the most likely legal basis for AI developers to rely upon, given the challenges in obtaining data subjects’ consent.  In accordance with existing European case law and guidance (described in our blog post here), the CNIL specifies that data controllers can rely on legitimate interest when:

  • The interest pursued is “legitimate” (e.g., scientific research, facilitating public access to information, offering a chatbot service to assist users, improving a product or service to increase its performance, and developing an AI system for fraud prevention purposes).  Commercial interests may also constitute a legitimate interest.
  • The processing is necessary to achieve the legitimate interest pursued.
  • The processing does not disproportionately affect the rights and interests of data subjects.  The CNIL outlines some of the benefits and impacts of processing personal data for AI development.  It also outlines mitigating measures that controllers can implement to minimize the impact of processing on data subjects.  These measures include, among others, anonymization, the use of synthetic data, and the provision of a prior opt-out mechanism.  Implementing such measures may enable AI developers to rely on legitimate interest for AI training even when the activity poses some risks to data subjects.

The CNIL provides practical examples that demonstrate how the three limbs of the legitimate interest test could apply in the context of AI development.

Web Scraping

The CNIL does not impose a complete ban on web scraping.  Instead, it provides a list of measures and conditions AI developers need to consider when conducting web scraping.  Notably, the CNIL outlines mandatory measures that AI developers must comply with, including precise collection criteria, the exclusion of certain data categories from collection, and the timely deletion of irrelevant data collected.  Finally, the CNIL recommends additional considerations for performing the legitimate interest test when collecting personal data through web scraping, complementing the guidance discussed above.  Such considerations include drawing up a list of websites from which data collection is excluded, excluding data collection from websites that object to web scraping, and limiting data collection to freely accessible data, among others.
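For technical teams, a minimal sketch of how some of these considerations might translate into a pre-collection check is set out below.  It is an illustration only, not a method prescribed by the CNIL: the exclusion list, the crawler name, the `requires_login` flag, and the use of robots.txt as the signal that a website objects to scraping are all assumptions made for the example.

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

# Hypothetical exclusion list: sites the developer has decided never to scrape.
EXCLUDED_DOMAINS = {"health-forum.example", "social.example"}


def is_excluded(url, excluded=EXCLUDED_DOMAINS):
    """Return True if the URL's host is an excluded domain or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in excluded)


def site_objects(robots_txt, url, user_agent="example-ai-crawler"):
    """Return True if the site's robots.txt disallows this URL for our crawler.

    Treating robots.txt as the objection signal is an assumption of this sketch.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


def may_collect(url, robots_txt, requires_login=False):
    """Combine the three checks: exclusion list, free accessibility, site objection."""
    if is_excluded(url):
        return False
    if requires_login:  # only freely accessible pages are collected
        return False
    return not site_objects(robots_txt, url)


if __name__ == "__main__":
    robots = "User-agent: *\nDisallow: /private/\n"
    for candidate in (
        "https://news.example/article",
        "https://news.example/private/profile",
        "https://sub.social.example/post/1",
    ):
        print(candidate, may_collect(candidate, robots))
```

A check of this kind addresses only the collection step.  The mandatory measures described above, such as precise collection criteria and the timely deletion of irrelevant data, would need to be handled separately.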

*    *    *

The Covington team will continue to monitor developments on AI, and we regularly advise the world’s top technology companies on their most challenging regulatory and compliance issues in the EU and other major markets.  If you have questions about AI regulation, or other tech regulatory matters, we are happy to assist with any queries.

(This blog post was written with the contribution of Alberto Vogel.)

Kristof Van Quathem

Kristof Van Quathem advises clients on information technology matters and policy, with a focus on data protection, cybercrime and various EU data-related initiatives, such as the Data Act, the AI Act and EHDS.

Kristof has been specializing in this area for over twenty years and has developed particular experience in the life sciences and information technology sectors. He counsels clients on government affairs strategies concerning EU lawmaking and their compliance with applicable regulatory frameworks, and has represented clients in non-contentious and contentious matters before data protection authorities, national courts and the Court of Justice of the EU.

Kristof is admitted to practice in Belgium.