AI Guide

Foundation Model: The pretrained AI base behind enterprise systems

A foundation model is a large neural network pretrained on massive amounts of data that can be adapted to many downstream tasks - from chat assistants to document extraction to autonomous agents. GPT-4o, Claude Opus 4.7, Gemini 2.5 Pro, Llama 4, and Mistral Large 2 are the dominant foundation models in 2026. Learn below how foundation models work, how they differ from purpose-built models, and how enterprises choose and govern them.

Key Facts
  • Stanford HAI introduced the term foundation model in 2021 to describe broadly capable, pretrained models adapted to many downstream tasks
  • Training a frontier foundation model in 2025-2026 costs between USD 50 million and USD 200 million in compute alone
  • The EU AI Act classifies models trained with more than 10^25 floating point operations as general-purpose AI with systemic risk
  • Fewer than ten organisations worldwide can afford to train a frontier foundation model end-to-end
  • Most enterprises consume foundation models through APIs (OpenAI, Anthropic, Google) or cloud platforms (Azure OpenAI, AWS Bedrock, Google Vertex)

Definition: Foundation Model

A foundation model is a large neural network pretrained on broad, general-purpose data that serves as a reusable base for many downstream AI applications - chat, document analysis, code generation, image generation, agentic workflows - rather than being built for one specific task.

Core characteristics of foundation models

Foundation models are general-purpose by design and reach scale that single-task models could not justify, which lets them transfer learnt patterns to tasks they were never explicitly trained on.

  • Trained on massive datasets covering text, code, images, or multimodal content
  • Trained once at enormous cost, then adapted many times through prompting, fine-tuning, or retrieval
  • Capable of zero-shot and few-shot performance on tasks the model was not specifically built for
  • Distributed primarily through APIs and cloud platforms rather than packaged software

Foundation Model vs. purpose-built model

A purpose-built model is trained for one task on task-specific data - a spam classifier, a credit-scoring model, a defect-detection model. A foundation model is trained for general capability and then specialised for many tasks downstream. The economic logic flipped around 2021: instead of every team training their own machine learning model, a small number of foundation models now power thousands of downstream applications via prompt engineering and retrieval.

Importance of foundation models in enterprise AI

Foundation models are the substrate of almost every enterprise generative-AI deployment in 2026. According to Stanford HAI’s 2025 AI Index, more than 90 percent of new enterprise AI applications are built on top of a foundation model rather than trained from scratch. The strategic question for enterprises is no longer whether to use a foundation model, but which one, under which contractual terms, and through which deployment route.

Methods and procedures for foundation models

Enterprises rarely train foundation models themselves; they consume and adapt them through four established patterns.

API consumption through frontier providers

The most common pattern connects enterprise applications to a foundation model via the provider’s API - OpenAI for GPT-4o and o3, Anthropic for Claude, Google for Gemini, Mistral for European-resident inference.

  • Sign a Data Processing Agreement covering GDPR and any sector-specific obligations
  • Configure region routing to keep data within EU boundaries where required
  • Set system prompts, tool definitions, and output formats per use case

Cloud platform deployment

Enterprises in regulated industries deploy foundation models through AWS Bedrock, Azure OpenAI, or Google Vertex - the model runs inside the cloud account the company already governs, with existing IAM policies, audit logs, and compliance attestations.

Fine-tuning and adaptation

For workloads where a generic foundation model is not specific enough, enterprises fine-tune the base model on company data or use parameter-efficient methods (LoRA, instruction tuning). Adaptation is far cheaper than pretraining - thousands of EUR rather than millions - and produces a model specialised to the domain without losing the base capabilities.

Important KPIs for foundation models

Evaluating a foundation model requires metrics across capability, cost, and operational risk - not just headline benchmarks.

Capability metrics

  • Task accuracy on domain-specific evaluations - the only number that matters in production
  • Context window size - 200,000 to 2,000,000 tokens are now common, enabling full-document workflows
  • Multimodal support - text, image, audio, video inputs and outputs depending on the model
  • Tool-use reliability - how consistently the model calls APIs correctly in agent workflows

Cost and throughput metrics

The pricing model varies sharply across providers and tiers. McKinsey’s 2025 State of AI survey found that enterprises typically spend 60-80 percent of their generative-AI infrastructure budget on foundation-model inference costs. Tracking cost per task - not per token - is the only way to compare providers honestly.

Compliance and risk metrics

  • EU AI Act classification - models above 10^25 FLOPs carry General-Purpose AI obligations on the provider
  • Data residency guarantees - where inference runs and where training data was sourced
  • Training-data transparency - some providers publish provenance, others do not
  • Hallucination rate on grounded vs. open-ended queries

Risk factors and controls for foundation models

Vendor concentration risk

The frontier foundation-model market is concentrated in fewer than ten organisations. Enterprises that build deeply on one foundation model take on commercial exposure - pricing changes, API deprecations, capability shifts between model versions. Multi-model abstraction layers and prompt portability are increasingly standard controls.

  • Build an abstraction layer that allows model swaps without code rewrites
  • Re-test critical workflows against alternative models quarterly
  • Negotiate enterprise contracts with version-lock commitments

Data exposure through training and inference

When a foundation model is hosted by an external provider, prompts and outputs may, depending on the contract, be retained or used for model training. The default consumer terms of most providers are not enterprise-safe. An explicit no-training clause and EU-resident processing are the minimum controls for enterprise data.

Hallucination and reasoning failure

Foundation models produce plausible but incorrect outputs in tasks where they lack grounding. Enterprise deployments mitigate this through retrieval-augmented generation, structured outputs, and validation against business rules before any action is taken.

Practical example

A 600-person German Mittelstand manufacturer deployed a multilingual large language model from Anthropic through AWS Bedrock EU to power a customer-support assistant covering DE, EN, FR, and IT. The foundation model handled understanding and drafting across all four languages without separate per-language models or retraining.

  • Single foundation model serving four customer-support languages with consistent quality
  • AWS Bedrock EU keeps prompts and outputs within Frankfurt and Ireland regions
  • Prompt templates and RAG ground the assistant in the company’s product documentation
  • Quarterly evaluation against an alternative provider keeps the architecture portable

Current developments and effects

Reasoning models and extended thinking

The 2025-2026 generation of foundation models added explicit reasoning capabilities - OpenAI o1 and o3, Claude Extended Thinking, Gemini Deep Think. These models trade higher latency and cost for substantially better performance on complex, multi-step problems where earlier models would shortcut to a wrong answer.

  • Reasoning models cost 3-5x more per token than chat-tier models
  • Suited to legal analysis, financial modelling, complex code review, scientific tasks
  • Should not be used for simple lookup or summarisation - the cost premium is unjustified

EU sovereign foundation models

European-trained foundation models gained meaningful share in 2025-2026 as data-residency and AI-sovereignty concerns rose. Mistral Large 2, Aleph Alpha Luminous Supreme, and the EuroLLM consortium target enterprises that need EU-trained and EU-hosted models for regulatory or commercial reasons.

EU AI Act obligations for foundation-model providers

The EU AI Act, fully applicable from August 2026, places specific obligations on providers of general-purpose AI models trained above the 10^25 FLOPs threshold - documentation, copyright compliance, training-data summaries, and additional obligations for systemic-risk models. Enterprises consuming these models inherit transparency obligations downstream.

Conclusion

Foundation models are the substrate of modern enterprise AI - the shared, pretrained capability that turns AI from a per-team project into a per-task configuration. For the Mittelstand, foundation models open access to capabilities that no individual company could build, while concentrating commercial and compliance risk in a handful of providers. The competitive question is no longer whether to use foundation models, but how to consume them safely - multi-model architectures, EU-resident inference, no-training contracts, and disciplined evaluation against the actual business task. As reasoning models, sovereign European alternatives, and the EU AI Act all mature in parallel, the foundation-model layer becomes a strategic procurement decision, not a technical detail.

Frequently Asked Questions

What is the difference between a foundation model and a large language model?

A foundation model is the broader category - any pretrained, general-purpose model that can be adapted to many tasks, including text, image, audio, video, and multimodal models. A large language model is one type of foundation model that specifically handles text. All current LLMs (GPT, Claude, Gemini, Llama, Mistral) are foundation models, but not all foundation models are LLMs - image generators like Stable Diffusion and Flux are foundation models as well.

Which foundation models do enterprises actually use in 2026?

The dominant frontier foundation models in 2026 are OpenAI GPT-4o and o3, Anthropic Claude Opus 4.7 and Sonnet 4.6, Google Gemini 2.5 Pro, Meta Llama 4, and Mistral Large 2. European Mittelstand companies frequently use Claude (via AWS Bedrock EU) or Mistral for data-residency reasons, GPT-4o for the broadest ecosystem support, and Llama for self-hosted deployments where compliance demands it.

How much does it cost to train a foundation model?

Training a frontier foundation model in 2025-2026 costs between USD 50 million and USD 200 million in compute, plus comparable amounts in data acquisition, talent, and evaluation. Stanford HAI’s 2025 AI Index reports the most expensive training runs exceeded USD 200 million. This is why fewer than ten organisations worldwide train at the frontier, and why enterprises consume rather than train.

Can the Mittelstand train its own foundation model?

Practically no. The compute, data, and engineering required for a frontier foundation model are out of reach for any Mittelstand company. The realistic path is fine-tuning an existing foundation model on company-specific data, which costs thousands rather than millions of EUR. For most enterprise use cases, retrieval-augmented generation on top of an unmodified foundation model is enough.

How does the EU AI Act treat foundation models?

The EU AI Act creates a specific category for general-purpose AI models, with stricter obligations for models trained above 10^25 FLOPs. Provider obligations include training-data summaries, copyright compliance, technical documentation, and additional duties for models classified as having systemic risk. Enterprises that consume these models inherit transparency obligations - users must be informed when interacting with AI-generated content in specific contexts.

What is fine-tuning and when do we need it?

Fine-tuning is the process of taking a pretrained foundation model and continuing training on company-specific data to specialise it for a domain - legal drafting in a specific jurisdiction, customer support in your product terminology, document extraction on your forms. You typically need fine-tuning when prompt engineering and retrieval are not enough to reach the required accuracy, when latency or cost demands a smaller specialised model, or when the domain language is sufficiently different from the foundation model’s training distribution.

Building better software Contact us together