The EU AI Act & Data Vault
The regulation that turns your data warehouse into compliance infrastructure.
A bank has 3 teams pulling the same customer data from 3 different sources, training 3 different AI models — and nobody documented which data was used, which version, or whether anyone checked for bias. ScaleFree calls this AI Spaghetti. The EU AI Act makes it a compliance violation.
Team A pulls customer transaction data from the core banking system, trains a fraud detection model. Team B pulls the same customer data from a different extract, trains a credit scoring model. Team C pulls customer data from a CRM export, trains a churn predictor.
Team A’s fraud model uses a customer table from January. Team B’s credit model uses the same table from March. The January version had 50,000 rows with a labeling error in the “high risk” column. Nobody caught it because nobody tracks which version went where.
An auditor asks: “What data trained this fraud model?” The answer is a shrug and a Slack thread from 6 months ago.
A data scientist pulls data directly from Salesforce, trains a model on their laptop, deploys it to production. Bypasses the entire data warehouse. Now you have GDPR exposure (was there a legal basis for this processing?) AND AI Act exposure (where’s the documentation for Article 10?). Nobody in the data team even knows this model exists.
Route all data — including AI training data — through the Data Vault layers. Source → Staging → Raw Vault → Business Vault → AI-Mart. Every row has record_source (where it came from) and load_date (when it arrived). Every transformation is documented by the layer it passes through.
The spaghetti becomes a pipeline with a paper trail.
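To make “every row has record_source and load_date” concrete, here is a minimal sketch (Python/SQLite; the table and values are illustrative assumptions, not ScaleFree’s actual DDL). The point is that origin and arrival time are stamped in the staging layer, before any transformation touches the data.

```python
import sqlite3
from datetime import datetime, timezone

# Illustrative staging table: the audit columns are mandatory, not optional.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stg_customer (
        customer_id   TEXT NOT NULL,
        risk_label    TEXT,
        record_source TEXT NOT NULL,  -- where the row came from
        load_date     TEXT NOT NULL   -- when it arrived in the warehouse
    )
""")

def load_to_staging(rows, record_source):
    """Stamp every incoming row with its origin and arrival time."""
    load_date = datetime.now(timezone.utc).isoformat()
    conn.executemany(
        "INSERT INTO stg_customer VALUES (?, ?, ?, ?)",
        [(r["customer_id"], r["risk_label"], record_source, load_date) for r in rows],
    )

# Team A's extract and Team C's extract stay distinguishable forever.
load_to_staging([{"customer_id": "C1", "risk_label": "high"}], "core_banking.customers")
load_to_staging([{"customer_id": "C1", "risk_label": "low"}], "crm_export.customers")

for row in conn.execute("SELECT * FROM stg_customer"):
    print(row)
```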
Regulation (EU) 2024/1689 — entered into force 1 August 2024. The world’s first comprehensive legal framework for artificial intelligence.
Despite being called an “Act,” it’s technically a Regulation — directly binding in all 27 member states, no national transposition needed. Same legal form as GDPR. Extraterritorial scope — applies to any organization whose AI system’s output is used within the EU, regardless of where that organization is based.
The AI Act doesn’t regulate AI itself. It regulates the people who build and use AI systems — proportional to the risk those systems pose.
“Provider” — the entity that develops the AI system or places it on the market. The one who designed the thing. Think: the factory.
“Deployer” — the entity that uses the AI system professionally in their business context. Think: the driver.
Example: HEC Paris buys an AI admissions tool (deployer). The company that built and sold the tool is the provider. If HEC then fine-tunes the model significantly, they might become a provider too. A company can be both simultaneously.
Providers carry the heavier compliance burden — conformity assessments, documentation, post-market monitoring. Deployers must implement human oversight, inform users, and keep logs.
“Regulation” vs “Directive” — a Directive (like the old Data Protection Directive 95/46/EC that GDPR replaced) requires each member state to write its own implementing law; a Regulation applies as written, with no transposition.
Because the AI Act, like GDPR, is a Regulation, the same text applies in France, Germany, Ireland, everywhere. No room for 27 different interpretations.
“The AI Act doesn’t regulate AI itself — it regulates the people who build and use AI systems, proportional to the risk those systems pose.”
Think of it like fire safety codes. A candle on your desk — no rules. A restaurant kitchen — must label the fire exit. A hospital — fire alarms, sprinkler systems, evacuation plans, annual inspections. A building full of explosives — you can’t build that at all.
The AI Act works the same way: four levels based on how much damage the AI can do to people’s lives. The higher the risk, the more rules you follow.
Unacceptable Risk — Social scoring by a government (rating citizens’ behavior to restrict services) — banned outright, Article 5. Subliminal manipulation also banned. No compliance path — these are prohibited.
High Risk — A bank’s AI deciding who gets a loan, or an HR tool screening CVs. These touch people’s livelihoods: conformity assessments, documented data governance (Article 10), human oversight, and a human who can override the AI.
Limited Risk — A chatbot on a retail website. Just has to tell the user “you’re talking to an AI.” Deepfakes and AI-generated content also fall here — transparency labels required.
Minimal Risk — A spam filter, a video game NPC. No obligations at all.
Article 5 prohibitions have been in force since Feb 2025 — social scoring and subliminal manipulation are banned now.
Two routes to high-risk classification:
Annex I (product safety) — AI embedded in already-regulated products: medical devices, vehicles, machinery, toys. Enforcement: August 2027.
Annex III (standalone) — AI systems in 8 specific sensitive domains (see table below). These trigger Article 10 data governance. Enforcement: August 2026.
The staggered timeline matters: financial services clients (Annex III Domain 5) must comply a full year before manufacturing clients with product-embedded AI (Annex I).
Annex III Domain 5 (credit scoring, insurance risk) is where most ScaleFree financial services clients land — the domain triggering Article 10 obligations. Manufacturing clients may hit Domain 2 (critical infrastructure).
Digital Omnibus proposal (Nov 2025): The Commission proposed delaying Annex III enforcement by up to 16 months because harmonised standards aren’t ready. Not yet adopted. Germany’s Federal Cabinet approved the KI-MIG in February 2026, designating the Bundesnetzagentur as primary AI market surveillance authority — though the law is still completing its parliamentary process.
ScaleFree’s clients are in limbo: the law exists, the enforcement timeline is shifting, and the companies that built compliant infrastructure early get competitive advantage regardless of when the deadline lands.
| # | Domain | Examples |
|---|---|---|
| 1 | Biometrics | Remote biometric identification, emotion recognition |
| 2 | Critical infrastructure | Safety components in digital infrastructure, road traffic, utilities |
| 3 | Education | School admissions, learning evaluation, test proctoring |
| 4 | Employment | Recruitment, CV screening, promotion/termination, performance monitoring |
| 5 | Essential services | Credit scoring, insurance risk, public benefit eligibility |
| 6 | Law enforcement | Risk of victimization, evidence reliability, reoffending risk |
| 7 | Migration & border | Visa/asylum examination, security/health risk assessment |
| 8 | Justice & democracy | Legal research AI, election influence |
| Date | What Happens | Status |
|---|---|---|
| 1 Aug 2024 | AI Act enters into force | Done |
| 2 Feb 2025 | Prohibited practices banned (Art. 5) + AI literacy (Art. 4) | In effect |
| 2 Aug 2025 | GPAI obligations + national authorities operational | In effect |
| 2 Aug 2026 | High-risk system obligations enforceable (Annex III) | Upcoming |
| 2 Aug 2027 | Product-embedded AI (Annex I) + legacy GPAI compliance | Future |
| Violation | Max Fine | Turnover % |
|---|---|---|
| Prohibited practices (Art. 5) | EUR 35,000,000 | 7% |
| High-risk non-compliance (Arts. 9–15) | EUR 15,000,000 | 3% |
| Incorrect information | EUR 7,500,000 | 1% |
In each tier, the fine is the higher of the fixed amount and the turnover percentage. Compare: GDPR’s maximum is EUR 20M / 4%. The AI Act’s top tier is nearly double. The EU is signaling that AI non-compliance is treated more seriously than data protection non-compliance.
Article 10 is the article that turns a data warehouse into compliance infrastructure. Every requirement it lists maps to something Data Vault already does.
Say a bank trains an AI to score loan applications. Article 10 says: you must know where that training data came from (which source system, when it was extracted). You must document every transformation (how raw data became the features the model consumed). You must check whether the data is biased. And you must keep immutable records of all of this for the auditor.
Data Vault does every one of these things as part of its base design — not as an add-on.
Origin tracking — DV puts record_source and load_date on every single row. Article 10 requires knowing where data came from — DV provides it by default.
Transformation documentation — Data passes through staging, raw vault, business vault, and mart — each layer is a documented step.
Bias examination — Profiling queries on the Business Vault (e.g., “what percentage of this training set is female vs male?”); see the sketch after this list.
Immutable records — Satellites are append-only. Yesterday’s data is still there next to today’s data. You never overwrite history.
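A hedged sketch of such a profiling query, run here against a hypothetical demographics Satellite in SQLite (table and column names are assumptions for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical Business Vault satellite holding customer demographics.
conn.execute(
    "CREATE TABLE sat_customer_demographics (customer_hk TEXT, gender TEXT, load_date TEXT)"
)
conn.executemany(
    "INSERT INTO sat_customer_demographics VALUES (?, ?, ?)",
    [("h1", "female", "2026-01-01"), ("h2", "male", "2026-01-01"),
     ("h3", "male", "2026-01-01"), ("h4", "female", "2026-01-01")],
)

# Representativeness check: share of each gender in the candidate training set.
query = """
    SELECT gender,
           COUNT(*) AS n,
           ROUND(100.0 * COUNT(*) /
                 (SELECT COUNT(*) FROM sat_customer_demographics), 1) AS pct
    FROM sat_customer_demographics
    GROUP BY gender
"""
for gender, n, pct in conn.execute(query):
    print(f"{gender}: {n} rows ({pct}%)")
```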
The full Article 10 → DV mapping:

| Article 10 requirement | Data Vault feature |
|---|---|
| Data provenance | record_source + load_date on every row |
| Transformation documentation | Layered flow: Staging → Raw Vault → Business Vault → AI-Mart |
| Bias examination | Profiling queries on the Business Vault |
| Quality & representativeness | Validation gates at the AI-Mart |
| Immutable audit records | Append-only Satellites; history is never overwritten |
Article 10(5) opens a narrow door: you can process special category data — race, health, religion — for bias detection purposes, even though GDPR Article 9 normally prohibits it. But it’s a dependent derogation — it activates GDPR Art. 9(2)(g), not a standalone override. Only when anonymized or synthetic data won’t do the job, and strict safeguards apply.
This matters for ScaleFree clients: checking whether a credit scoring model discriminates by ethnicity requires looking at ethnicity data — Article 10(5) is the legal basis.
Article 10(6): Even non-training systems (rule-based expert systems that don’t learn from data) must still govern their testing datasets. The compliance chain is universal.
The AI-Mart: a specialized Information Mart at the very top of the DV stack, right before data reaches the AI model. This is where quality checks happen, bias audits run, and representativeness is validated. Data gets “cleaned, integrated, and approved by data experts” before the model ever sees it.
The AI-Mart is the compliance enforcement point — not the Raw Vault (which stores everything as-is), not the Business Vault (which integrates but doesn’t gatekeep for AI purposes).
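A minimal sketch of what such a gate might look like; the checks and thresholds are illustrative assumptions, not a prescribed standard. The shape is the point: data is released to the model only if every documented check passes.

```python
# Sketch of an AI-Mart validation gate. Thresholds are illustrative
# assumptions; a real gate would encode the client's documented criteria.
def validate_training_set(rows):
    errors = []

    # Quality: "to the best extent possible, free of errors" (Art. 10(3)).
    null_rate = sum(1 for r in rows if r["income"] is None) / len(rows)
    if null_rate > 0.01:
        errors.append(f"null rate {null_rate:.1%} exceeds 1% threshold")

    # Representativeness: "sufficiently representative" (Art. 10(3)).
    share_female = sum(1 for r in rows if r["gender"] == "female") / len(rows)
    if not 0.4 <= share_female <= 0.6:
        errors.append(f"female share {share_female:.1%} outside 40-60% band")

    return errors

rows = [
    {"income": 42_000, "gender": "female"},
    {"income": 55_000, "gender": "male"},
    {"income": None,   "gender": "male"},
]
problems = validate_training_set(rows)
print("RELEASE TO MODEL" if not problems else f"BLOCKED: {problems}")
```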
After the AI model runs, ScaleFree recommends loading its logs back into the warehouse: what data went in, what features were derived, what the model parameters were, what decisions it made, with what confidence scores. This creates the Article 12 audit trail.
The warehouse becomes a closed loop — data flows out to the AI, and the AI’s behavior flows back in.
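A sketch of what one record in that loop might capture; the field names are assumptions for illustration, but the shape follows the text: inputs, decision, confidence, and model version, timestamped and append-only.

```python
import json
from datetime import datetime, timezone

# Illustrative append-only decision log, destined for a warehouse satellite.
decision_log = []

def log_model_decision(model_version, features, decision, confidence):
    """Record one AI decision so the warehouse holds the Article 12 trail."""
    decision_log.append({
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "features": features,      # what data went in
        "decision": decision,      # what the model decided
        "confidence": confidence,  # with what confidence
    })

log_model_decision(
    "fraud-model-2026.02",
    features={"amount": 950.0, "country": "DE"},
    decision="flag",
    confidence=0.87,
)
print(json.dumps(decision_log, indent=2))
```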
“Article 10 requires data provenance, transformation documentation, bias examination, and immutable audit trails. Data Vault delivers all of these by design — record_source, load_date, layered architecture, append-only Satellites. The compliance capability is the base architecture, not a bolt-on.”
GDPR says delete the data. The AI Act says keep the records. Both are law. Both apply to the same system. Here’s what that actually looks like.
1. Bias detection needs sensitive data
A German insurance company’s credit scoring AI rejects 40% more applicants from certain postal codes. To check whether the model discriminates by ethnicity, someone needs to look at ethnicity data. But GDPR Article 9 says: no processing special category data without explicit consent or another legal basis.
Article 10(5) of the AI Act creates a narrow exception: process sensitive data for bias detection — but only when anonymized or synthetic data won’t work, and only with strict safeguards (access controls, time limits, deletion after use).
In practice: the bias audit team gets temporary, logged access to ethnicity data in the Business Vault, runs profiling queries, documents the results, and access is revoked. The PII Satellite holds the sensitive data; access is controlled at the mart level.
2. Automated decisions need human oversight
A bank’s AI auto-rejects a loan application. Under GDPR Article 22: the applicant has the right not to be subject to a solely automated decision with legal effects. Under AI Act Article 14: the deployer must implement human oversight — a person who can understand, override, and stop the AI.
Both laws push the same direction. ScaleFree’s architecture supports this: the AI-Mart logs every decision with its inputs and confidence score, so the human reviewer has the data to actually override meaningfully — not just rubber-stamp.
3. Transparency has two different audiences
A company deploys a chatbot that recommends financial products. GDPR says: tell the individual that their data is being processed, by whom, for what purpose, and their rights. AI Act says: tell the deployer how the system works, what its limitations are, and what data it was trained on.
Both transparency obligations must be satisfied, but they point in different directions — one toward the end user, one toward the organization. A company that only does GDPR transparency (privacy notice) hasn’t touched AI Act transparency (technical documentation).
4. Documentation builds on GDPR foundations
A company already maintains GDPR Article 30 records of processing activities. AI Act Articles 11-12 require technical documentation and logging for high-risk AI. The underlying data documentation overlaps significantly — both need to know what data exists, where it comes from, who accesses it.
A company with solid GDPR Article 30 records has maybe 60% of the AI Act documentation foundation already built. The DV architecture helps here: record_source, load_date, and the layered transformation trail serve both.
5. Delete vs. keep — the hard one
A customer whose data was used to train the insurance company’s credit scoring model submits a GDPR Article 17 right-to-erasure request. GDPR says: delete their personal data. The AI Act says: maintain audit trails of your training data for regulatory review.
You must prove what data trained the model AND delete this person’s data. Nobody has fully resolved this in court yet.
The emerging approach: delete from the PII Satellite (satisfies GDPR), keep pseudonymized metadata in non-PII Satellites — “a 34-year-old male from postal code 10115 was in the training set” without the name or identifiers (satisfies AI Act audit trail). If technically feasible, retrain the model without that individual’s data.
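A sketch of that split, with hypothetical table names: identifying attributes live in a PII Satellite that can be physically deleted, while a separate Satellite keeps the pseudonymized training-set membership for the auditor.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# PII satellite: deletable. Non-PII satellite: the pseudonymized audit trail.
conn.execute("CREATE TABLE sat_customer_pii (customer_hk TEXT, name TEXT, email TEXT)")
conn.execute(
    "CREATE TABLE sat_training_membership "
    "(customer_hk TEXT, age INTEGER, postal_code TEXT, training_run TEXT)"
)
conn.execute("INSERT INTO sat_customer_pii VALUES ('h1', 'Max Mustermann', 'max@example.com')")
conn.execute("INSERT INTO sat_training_membership VALUES ('h1', 34, '10115', 'credit-model-2026-01')")

def handle_erasure_request(customer_hk):
    """GDPR Art. 17: delete identifying data; the AI Act trail survives."""
    conn.execute("DELETE FROM sat_customer_pii WHERE customer_hk = ?", (customer_hk,))

handle_erasure_request("h1")
print(conn.execute("SELECT COUNT(*) FROM sat_customer_pii").fetchone())   # (0,)
print(conn.execute("SELECT * FROM sat_training_membership").fetchall())   # trail kept
```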
Two clocks: GDPR data breach notification = 72 hours to the supervisory authority. AI Act serious incident = 15 days to the market surveillance authority (2 days if critical infrastructure).
A single event — say, a data breach that exposes AI training data — can trigger both clocks simultaneously. ~90% of Annex III high-risk AI involves personal data. A single system might need both a DPIA (GDPR) and a conformity assessment (AI Act).
“GDPR and the AI Act compound each other — most high-risk AI processes personal data, so you need both. Data Vault handles this through PII Satellite isolation — delete for GDPR, retain pseudonymized audit trail for AI Act.”
ScaleFree’s position is simple: if your company uses AI, all the data feeding that AI must flow through the governed data warehouse. No data scientist pulls data from Salesforce on their laptop. No team builds a side pipeline from a CSV export. Everything goes through the same layers, with the same documentation, the same audit trail.
1. AI-Mart as the last governed layer before data reaches any AI model — where quality, bias, and representativeness get checked.
2. AI Log Loading — after the model runs, feed its logs (inputs, outputs, confidence scores, parameters) back into the warehouse.
3. Data lineage as compliance infrastructure — not a nice-to-have. Article 10 makes it a legal requirement.
4. Access controls at the mart level — not everyone gets to pull data for AI training; access is logged and governed.
ScaleFree’s framing to clients: “If you cannot explain why your AI gave a specific answer or which data it used, you could face fines up to EUR 15 million or 3% of global turnover.” That line converts architectural decisions into budget approvals.
Christof Wenzeritt (co-CEO) ran a Feb 2026 webinar on the “AI-Enabling Data Platform” — DV 2.0 as the foundation for trustworthy AI. A senior consultant also speaks at conferences about trustworthy AI.
The argument isn’t that DV was designed for AI — it wasn’t. The argument is that DV’s existing properties (lineage via record_source, historization via append-only Satellites, separation of concerns via Hub/Link/Sat) happen to be exactly what the AI Act requires. Compliance as a side effect of good architecture.
Lina Sibbel (ScaleFree team) presented in Oct 2025 on agentic AI: treating AI agents as identity-bearing entities with role-based access control — the agent gets a record in the Hub just like any other business entity.
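A toy sketch of that idea, with illustrative names only: the agent is registered as an entity, its reads are checked against a role grant, and every attempt is logged.

```python
# Sketch: an AI agent as an identity-bearing entity with role-based access.
# All names are illustrative assumptions, not a ScaleFree implementation.
AGENT_HUB = {"agent-churn-01": {"role": "ai_training_reader"}}
ROLE_GRANTS = {"ai_training_reader": {"ai_mart.churn_features"}}
ACCESS_LOG = []

def agent_can_read(agent_id, table):
    """Check the agent's role grant and log the attempt either way."""
    role = AGENT_HUB.get(agent_id, {}).get("role")
    allowed = table in ROLE_GRANTS.get(role, set())
    ACCESS_LOG.append((agent_id, table, "granted" if allowed else "denied"))
    return allowed

print(agent_can_read("agent-churn-01", "ai_mart.churn_features"))   # True
print(agent_can_read("agent-churn-01", "business_vault.sat_pii"))   # False
```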
Article 4 (AI literacy) applies to ALL AI systems, not just high-risk, and has been in effect since Feb 2025. No standalone fine — but non-compliance is an aggravating factor if you violate other provisions.
For BI consultants working with European clients, understanding the AI Act isn’t optional professional development — it’s a legal obligation. When advising clients on AI-enabling data platforms, consultants are expected to know the regulatory requirements.
“ScaleFree positions Data Vault 2.0 as the foundation for AI Act compliance. The AI-Mart is the compliance enforcement point — the last governed layer before data reaches the AI model. Quality checks, bias auditing, and representativeness validation all happen there. And AI logs get loaded back into the warehouse for Article 12 audit trails.”
The ability to translate between legal obligation and architectural implementation is rare. When Article 10 requires bias examination, that means profiling queries on the Business Vault and validation gates on the AI-Mart. That translation layer — from regulation to architecture — is what European clients increasingly need from their data consultants.
“A hospital’s AI system analyses radiology scans and flags potential cancer for a radiologist to review.”
High Risk (Annex I product-safety route: AI as a safety component of a regulated medical device; not Annex III Domain 1, since analysing radiology scans for disease is not biometric identification)
The AI output directly influences a clinical decision that can affect a patient’s life. Even though a human reviews it, the AI’s flag shapes what the radiologist looks for. The stakes (life/health) place it firmly in high-risk. It is not Unacceptable (no social scoring or manipulation), not Limited (not merely informational), not Minimal (consequences are significant).
Exercise 1 — Risk Tier Classification
Match each scenario to the correct risk tier: Unacceptable / High / Limited / Minimal.
Article 10(3) requirement: “The high-risk AI system shall use data that is relevant, sufficiently representative, and to the best extent possible, free of errors.”
Validation gates at AI-Mart / Feature Mart
“Relevant, representative, free of errors” means you CHECK before the AI sees the data. That check cannot happen in the Raw Vault (stores everything as-is) or the Business Vault (integrates but does not gatekeep for AI). The AI-Mart is the last governed layer — where quality, representativeness, and bias audits run.
Exercise 2 — Article 10 → DV Feature Matching
Match each Article 10 requirement to the Data Vault feature that addresses it.
A bank’s AI approves or rejects mortgage applications. A customer complains their application was rejected without explanation.
GDPR Art. 22 + AI Act Art. 14 — Alignment
GDPR: Article 22 — the customer has the right not to be subject to solely automated decisions with legal effect. They can request human review.
AI Act: Article 14 — the deployer must implement human oversight. A person must be able to understand and override the AI.
Tension? None — both push the same direction. The bank needs a human reviewer who can actually understand and override the AI, AND the customer has a right to request that review.
Architecture: AI-Mart logs each decision with inputs and confidence score — the human reviewer has the data to override meaningfully.
Exercise 3 — Dual Compliance Scenario
A German insurance company uses an AI to assess health insurance risk (Annex III high-risk). The AI is trained on data from their Data Vault. A customer submits a GDPR Article 17 right-to-erasure request.
Walk through: (a) which AI Act obligations apply, (b) which GDPR obligations apply, (c) what the conflict is, and (d) how you’d advise the client.