AI Guide

GDPR: Data protection requirements for enterprise AI systems

The General Data Protection Regulation (GDPR) is the EU's binding legal framework governing how organisations collect, process, and store personal data. When enterprises deploy AI systems - from document automation and chatbots to large language models - GDPR applies to every stage of the data lifecycle. Understanding what GDPR requires in an AI context is a prerequisite for any enterprise operating in or serving European markets.

Key Facts
  • Cumulative GDPR fines have exceeded €7.1 billion, with €2.3 billion in penalties in 2025 alone - a 38% year-on-year increase.
  • European data protection authorities now receive 443 breach notifications per day, 22% more than the prior year.
  • 90% of enterprises use AI in daily operations, but only 18% have fully implemented AI governance frameworks (EDPB, 2025).
  • GDPR and the EU AI Act are parallel obligations - compliance with one does not satisfy the other.
  • Deploying a third-party LLM requires a valid lawful basis under Article 6, a mandatory DPIA for high-risk processing, and verifiable training data provenance; the LLM vendor typically acts as a data processor under Article 28.

Definition: GDPR

The General Data Protection Regulation (GDPR, EU 2016/679) is the EU’s binding legal framework that governs the collection, processing, storage, and transfer of personal data about individuals in the European Economic Area, regardless of where the processing organisation is based.

Core characteristics of GDPR

GDPR establishes mandatory principles that apply to any organisation processing personal data about individuals in the EEA. These principles are not optional and cannot be contractually waived.

  • Lawfulness, fairness, and transparency - every data processing activity requires a valid legal basis under Article 6
  • Purpose limitation - data collected for one purpose cannot be repurposed without a new legal basis
  • Data minimisation - only data necessary for the stated purpose may be collected and retained
  • Storage limitation - personal data must be deleted once it is no longer needed for its original purpose

GDPR vs. EU AI Act

GDPR and the EU AI Act are often treated as interchangeable compliance tasks, but they govern different things. GDPR protects individuals’ rights over their personal data - who processes it, how, and on what basis. The EU AI Act governs the risk classification, design safety, and transparency requirements of AI systems as products. An AI system can satisfy the EU AI Act’s transparency requirements and still violate GDPR if it processes personal data without a valid lawful basis. Both regulations apply simultaneously and independently. Enterprises must satisfy both, and the documentation for each is separate.

Importance of GDPR in enterprise AI

Almost every enterprise AI deployment - customer-facing or internal - processes personal data and therefore falls under GDPR. Cumulative GDPR fines have exceeded €7.1 billion since enforcement began, with €2.3 billion in penalties in 2025 alone - a 38% year-on-year increase (Kiteworks, 2026). For enterprises building or procuring AI agents, GDPR compliance is an operational obligation that must be built into every deployment, not added after the fact.

Methods and procedures for GDPR compliance in AI

Three core procedures apply when deploying AI systems that process personal data.

Data Protection Impact Assessment (DPIA)

A DPIA is required under GDPR Article 35 whenever an AI system is likely to result in high risk to individuals’ rights - including automated profiling, large-scale processing of sensitive categories of data, and systematic monitoring of employees or customers. The DPIA must be completed before deployment begins.

  • Describe the processing operation, its purposes, and the legitimate interests being pursued
  • Assess the necessity and proportionality of the processing relative to those purposes
  • Identify and evaluate risks to individuals’ rights and freedoms
  • Document the measures put in place to address identified risks
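The four DPIA elements above can be captured as a simple record so that completeness is checked before go-live. This is a hypothetical sketch, not a regulatory schema; the field names are illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical DPIA record mirroring the four Article 35 elements listed
# above. Field names are illustrative, not an official schema.
@dataclass
class DPIARecord:
    processing_description: str = ""   # operation, purposes, legitimate interests
    necessity_assessment: str = ""     # necessity and proportionality
    risk_evaluation: str = ""          # risks to individuals' rights and freedoms
    mitigation_measures: str = ""      # measures addressing identified risks

    def missing_elements(self) -> list[str]:
        """Return the names of any DPIA elements still left blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def ready_for_signoff(self) -> bool:
        return not self.missing_elements()

dpia = DPIARecord(processing_description="Invoice extraction from supplier documents")
print(dpia.ready_for_signoff())   # False: three elements still missing
print(dpia.missing_elements())
```

A gating check like `ready_for_signoff()` makes "DPIA completed before deployment" enforceable in a release pipeline rather than a manual sign-off.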

Lawful basis assessment

Before any AI system processes personal data, the organisation must identify and document a valid lawful basis under GDPR Article 6. For most enterprise AI deployments - internal automation, intelligent document processing, operational analytics - the basis is either contractual necessity (Article 6(1)(b)) or legitimate interests (Article 6(1)(f)). The latter requires a documented Legitimate Interests Assessment (LIA) demonstrating that the organisation’s interests are not overridden by individuals’ rights. Consent is rarely the appropriate basis for internal AI deployments.

Vendor and processor due diligence

When an enterprise deploys a third-party large language model, the AI vendor typically acts as a data processor under GDPR Article 28, requiring a signed Data Processing Agreement (DPA). Many LLM provider agreements are ambiguous about whether prompts are used for model training. If they are, the vendor may qualify as a joint controller - which carries broader obligations and cannot be managed through a standard DPA alone.

Important KPIs for GDPR compliance in AI

Tracking compliance posture across AI deployments requires three categories of metrics.

Compliance posture metrics

  • DPIAs completed: 100% of high-risk AI use cases before go-live
  • Data Processing Agreements signed: 100% of AI vendors handling personal data
  • Lawful basis documented: 100% of AI processing activities in Article 30 records
  • Data subject rights response time: under 30 days (regulatory maximum)
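The posture metrics above can be computed from a register of AI use cases. A minimal sketch, assuming a simple record layout; the keys and example entries are assumptions for illustration.

```python
# Illustrative register of AI use cases; keys are assumptions for the example.
use_cases = [
    {"name": "invoice-ocr", "high_risk": True,  "dpia_done": True,  "dpa_signed": True,  "lawful_basis": True},
    {"name": "hr-chatbot",  "high_risk": True,  "dpia_done": False, "dpa_signed": True,  "lawful_basis": True},
    {"name": "log-summary", "high_risk": False, "dpia_done": False, "dpa_signed": True,  "lawful_basis": False},
]

def pct(done: int, total: int) -> float:
    """Percentage of completed items; an empty category counts as compliant."""
    return 100.0 * done / total if total else 100.0

# DPIA coverage is measured against high-risk use cases only (Article 35).
high_risk = [u for u in use_cases if u["high_risk"]]
print(f"DPIA coverage (high-risk): {pct(sum(u['dpia_done'] for u in high_risk), len(high_risk)):.0f}%")
print(f"DPAs signed:               {pct(sum(u['dpa_signed'] for u in use_cases), len(use_cases)):.0f}%")
print(f"Lawful basis documented:   {pct(sum(u['lawful_basis'] for u in use_cases), len(use_cases)):.0f}%")
```

Anything below 100% on the first three metrics identifies a specific use case that should not be in production.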

Incident and breach metrics

European data protection authorities receive 443 breach notifications per day - a 22% year-on-year increase (Kiteworks, 2026). The mandatory GDPR breach notification window is 72 hours from the moment an organisation becomes aware of a personal data breach. Enterprises should track mean time to breach detection and mean time to supervisory authority notification as live operational metrics.
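The two live metrics suggested above can be derived directly from incident timestamps. A sketch under assumed data: the incident records are illustrative, and note that the 72-hour clock runs from awareness (detection), not from the breach itself.

```python
from datetime import datetime, timedelta

# Illustrative incident log; timestamps are assumptions for the example.
incidents = [
    {"occurred": datetime(2026, 1, 5, 9, 0), "detected": datetime(2026, 1, 5, 15, 0),
     "notified": datetime(2026, 1, 7, 10, 0)},
    {"occurred": datetime(2026, 2, 1, 8, 0), "detected": datetime(2026, 2, 2, 8, 0),
     "notified": datetime(2026, 2, 6, 8, 0)},
]

# Mean time to breach detection.
detect_times = [i["detected"] - i["occurred"] for i in incidents]
mean_detect = sum(detect_times, timedelta()) / len(detect_times)
print(f"Mean time to detection: {mean_detect}")

# Article 33 window: 72 hours from the moment of awareness (detection).
for i in incidents:
    within = i["notified"] - i["detected"] <= timedelta(hours=72)
    print(f"{i['detected'].date()} breach notified within 72h: {within}")
```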

Governance maturity metrics

Only 18% of enterprises have fully implemented AI governance frameworks despite 90% using AI in daily operations (EDPB, 2025). Maturity can be tracked by the percentage of AI use cases with documented data provenance, completeness of Article 30 processing records, and frequency of data protection training for teams operating AI systems.

Risk factors and controls for GDPR compliance in AI

Three risk areas are specific to AI systems and not adequately addressed by standard GDPR compliance programmes.

Training data lawfulness

AI models trained on personal data without a valid lawful basis expose the deploying enterprise to enforcement action - even if the runtime deployment is compliant. The EDPB’s April 2025 report on large language models found that anonymisation of training data is rarely achievable in practice; data used in training is likely still personal data subject to GDPR.

  • Require AI vendors to provide documented lawful basis for training data sources
  • Confirm anonymisation methodology meets EDPB standards, not just vendor claims
  • Assess whether the model can reconstruct personal data from outputs (memorisation risk)

Cross-border data transfers

Sending personal data to AI APIs hosted outside the EEA requires a transfer mechanism under GDPR Chapter V. Standard Contractual Clauses (SCCs) are the most common mechanism, but they require a Transfer Impact Assessment (TIA) evaluating whether the destination country’s legal system undermines the SCC protections. For US-hosted AI services, the CLOUD Act risk - US government access to data on US providers’ infrastructure - must be specifically assessed. This is where data governance policy and AI procurement intersect most directly.

Shadow AI and ungoverned deployments

Employees using personal AI accounts for work tasks create GDPR risk because no DPA exists between the employer and the AI vendor. Controls: establish and enforce an approved AI tool list, configure access controls that prevent personal-account usage on work data, and train staff on which tools are permissible for which data categories.
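The "approved AI tool list" control above can be sketched as a simple policy lookup: requests to AI endpoints with no signed DPA are blocked. The domains and policy table are hypothetical assumptions for illustration.

```python
# Hypothetical policy table; domain names and categories are illustrative.
APPROVED_AI_TOOLS = {
    "api.approved-llm.example": {"dpa_signed": True, "allowed_data": {"public", "internal"}},
    "chat.personal-ai.example": {"dpa_signed": False, "allowed_data": set()},
}

def request_allowed(domain: str, data_category: str) -> bool:
    """Allow a request only if the vendor is approved, has a signed DPA,
    and the data category is permitted for that tool."""
    policy = APPROVED_AI_TOOLS.get(domain)
    if policy is None or not policy["dpa_signed"]:
        return False  # unknown vendor or no DPA on file: shadow AI, block it
    return data_category in policy["allowed_data"]

print(request_allowed("api.approved-llm.example", "internal"))   # True
print(request_allowed("chat.personal-ai.example", "internal"))   # False: no DPA
print(request_allowed("unknown-tool.example", "public"))          # False: not approved
```

In practice this logic would sit in a secure web gateway or proxy, but the deny-by-default decision is the same.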

Practical example

A German logistics company with 800 employees deployed an intelligent document processing system to extract invoice and customs data from incoming supplier documents. Before deployment, the compliance team completed a DPIA identifying that supplier contact names constitute personal data under GDPR. They documented legitimate interests as the lawful basis, signed a Data Processing Agreement with the AI vendor explicitly prohibiting prompt use for model training, and configured automated deletion of extracted personal data after 90 days. The team was able to demonstrate full audit readiness within eight weeks of project start.

  • DPIA completed and signed off before any processing began
  • Lawful basis documented in the Article 30 records of processing activities
  • Data Processing Agreement signed with explicit prohibition on training data use
  • Automated 90-day deletion schedule configured and tested
  • Staff trained on which document types may enter the system and which may not
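The automated 90-day deletion step in the example above amounts to selecting extracted records older than the retention period for erasure. A minimal sketch, assuming an illustrative record layout:

```python
from datetime import date, timedelta

# Retention period from the example above; record layout is an assumption.
RETENTION = timedelta(days=90)

records = [
    {"id": "inv-001", "extracted_on": date(2026, 1, 2)},
    {"id": "inv-002", "extracted_on": date(2026, 5, 1)},
]

def due_for_deletion(records, today):
    """Return IDs of records whose retention period has elapsed."""
    return [r["id"] for r in records if today - r["extracted_on"] > RETENTION]

print(due_for_deletion(records, date(2026, 6, 1)))  # ['inv-001']
```

The deletion job itself should be scheduled, logged, and covered by the same testing regime as any other production process, since the storage limitation principle is only satisfied if deletion actually runs.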

Current developments and effects

Three regulatory developments are reshaping GDPR compliance for AI in 2026.

EDPB LLM opinion (April 2025)

The EDPB’s April 2025 opinion on LLM privacy risks established that anonymisation of training data is rarely sufficient to exclude GDPR applicability, and that enterprises deploying third-party LLMs must conduct Legitimate Interests Assessments before processing personal data through those models.

  • LLM anonymisation claims require documented evidence, not vendor assurances
  • LIA documentation must be reviewed each time the underlying model version changes
  • Personal data inadvertently included in prompts creates a separate lawful basis obligation

Dual compliance burden with EU AI Act

The EU AI Act’s high-risk provisions take effect in August 2026. High-risk AI systems - covering employment decisions, credit scoring, and certain monitoring applications - must satisfy both GDPR and EU AI Act requirements. The AI Act requires a conformity assessment; GDPR requires a DPIA. Both must be completed, and neither substitutes for the other. Workflow automation projects that touch HR or financial data are most likely to fall into this dual-compliance zone.

Enforcement shift to AI-specific cases

GDPR enforcement through 2024 focused primarily on Big Tech and adtech. From 2025, supervisory authorities in Germany, France, Italy, and Ireland have opened investigations specifically targeting enterprise AI deployments - including generative AI tools and automated decision-making in operational contexts. The enforcement focus has moved from cookies and websites to AI systems and data pipelines.

Conclusion

GDPR applies to every enterprise AI deployment that processes personal data - which in practice means almost all of them. The regulation’s requirements around lawful basis, DPIAs, data minimisation, and vendor contracts are well-established, but applying them to AI introduces specific complexity: training data provenance, LLM vendor accountability, and the interaction with the EU AI Act. Enterprises that treat GDPR as a one-off project rather than an ongoing operational discipline will find regulators and auditors increasingly unwilling to accept that approach. Compliant AI systems are built with data protection by design, not retrofitted after a penalty notice arrives.

Frequently Asked Questions

Does GDPR apply to AI systems that only process internal employee data?

Yes. Employee personal data - names, work patterns, communications, and performance records - is personal data under GDPR regardless of whether it is customer data. Internal AI systems such as HR automation, productivity monitoring, or internal chatbots processing employee data require a valid lawful basis and, in many cases, a DPIA. The legal basis for employee data processing is typically contractual necessity or legitimate interests, not consent.

What is the difference between a DPIA and an EU AI Act conformity assessment?

A DPIA is a GDPR requirement under Article 35 that assesses risks to individuals’ personal data rights from a specific processing activity. A conformity assessment under the EU AI Act evaluates whether a high-risk AI system meets the EU AI Act’s technical and governance requirements as a product. They cover different risks, produce different documentation, and cannot substitute for each other - though they can share inputs and the AI Act documentation can inform the DPIA process.

When is an LLM vendor a data processor vs. a joint controller?

The vendor is a data processor when it processes personal data only on the enterprise’s documented instructions and for no independent purpose of its own. The vendor becomes a joint controller if it uses input data - such as prompts - for its own purposes, including model training or product improvement. Many commercial LLM agreements are ambiguous on this point. Enterprises should require vendors to confirm in the DPA that prompts are not used for model training, or treat the vendor as a joint controller and document the arrangement accordingly.

Do we need a new DPIA every time we update our AI system?

A DPIA should be reviewed and updated whenever there is a significant change to the system’s purpose, the categories of data processed, the processing volume, or the risk profile. Updating to a new model version, adding a new data source, or extending the system to a new use case all typically require a DPIA review. A DPIA completed at initial launch does not remain valid indefinitely and should be treated as a living document.

What transfer mechanism is required when using a US-hosted AI API?

Data transfers to US-based AI APIs require a transfer mechanism under GDPR Chapter V. The most common mechanism is Standard Contractual Clauses (SCCs), which must be accompanied by a Transfer Impact Assessment (TIA) evaluating whether the US legal framework - including the CLOUD Act - undermines the SCC protections in practice. The TIA must be documented and kept current; it is not a one-time exercise.

What should an AI-specific Data Processing Agreement contain?

An AI-specific DPA should specify the categories of personal data the vendor processes, the purposes for which the vendor may use that data (with an explicit prohibition on using prompts for model training), the sub-processors the vendor relies on, data deletion schedules, security measures, breach notification obligations (within 24-48 hours to allow the controller to meet the 72-hour GDPR window), and the vendor’s process for handling data subject rights requests such as erasure or access.
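The clauses listed above can serve as a review checklist, so gaps in a vendor agreement are flagged systematically. A sketch with illustrative clause identifiers:

```python
# Checklist derived from the DPA contents listed above; identifiers are
# illustrative shorthand, not legal clause names.
REQUIRED_DPA_CLAUSES = [
    "data_categories",
    "purpose_limitation_no_training",
    "sub_processors",
    "deletion_schedule",
    "security_measures",
    "breach_notification_24_48h",
    "data_subject_rights_process",
]

def missing_clauses(reviewed: set[str]) -> list[str]:
    """Return required clauses not yet confirmed in the vendor agreement."""
    return [c for c in REQUIRED_DPA_CLAUSES if c not in reviewed]

reviewed = {"data_categories", "sub_processors", "deletion_schedule"}
print(missing_clauses(reviewed))
```

Any non-empty result means the DPA is not ready for signature.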
