Article Issue #5164

Foundation Model

What to know

  • A foundation model is an AI model trained at large scale on diverse, general-purpose data and designed as a reusable base for a wide range of applications.
  • Foundation models are pretrained with self-supervised objectives, such as next-token prediction for language or contrastive image-text matching for vision-language models.
  • For most product teams, foundation models remove the need to collect large training datasets or run GPU training clusters.

A foundation model is an AI model trained at large scale on diverse, general-purpose data and designed as a reusable base for a wide range of applications. Rather than building a separate model for each task, teams fine-tune or prompt a single foundation model to handle translation, summarization, code generation, image understanding, and more.
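As a minimal sketch of how one model can serve several tasks through prompting alone, the snippet below routes different tasks through the same text-in, text-out callable. The template wording, the task names, and the `generate` callable are all hypothetical placeholders; no specific provider API is implied.

```python
# Hypothetical prompt templates; real wording would be tuned per model and task.
TASK_TEMPLATES = {
    "translate": "Translate the following text into French:\n\n{text}",
    "summarize": "Summarize the following text in one sentence:\n\n{text}",
    "code": "Write a Python function that does the following:\n\n{text}",
}


def build_prompt(task, text):
    """Fill the template for `task` with the user's text."""
    try:
        template = TASK_TEMPLATES[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    return template.format(text=text)


def run_task(generate, task, text):
    """Send the task prompt to one shared model.

    `generate` is assumed to be any callable that maps a prompt string
    to a completion string, for example a thin wrapper around a hosted API.
    """
    return generate(build_prompt(task, text))
```

The point of the design is that only the prompt changes between tasks; the model behind `generate` stays the same.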

How it works

Foundation models are pretrained with self-supervised objectives, such as next-token prediction for language or contrastive image-text matching for vision-language models. Pretraining requires enormous compute and data; the resulting model encodes general knowledge that downstream tasks exploit through fine-tuning, few-shot prompting, or retrieval augmentation.
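To make the next-token objective concrete, here is a small sketch of the loss it minimizes: the average cross-entropy between the model's predicted distribution at each position and the token that actually comes next. The function names and the list-of-lists representation of logits are assumptions for illustration; real training code would use a tensor library and batching.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_loss(logits_per_position, token_ids):
    """Average cross-entropy of predicting token t+1 from position t.

    logits_per_position[t] holds one score per vocabulary entry for the
    token that follows token_ids[t]. The labels are just the input
    sequence shifted by one, which is what makes the objective
    self-supervised: no human annotation is needed.
    """
    targets = token_ids[1:]
    if not targets:
        raise ValueError("need at least two tokens to form a prediction target")
    losses = []
    for logits, target in zip(logits_per_position, targets):
        probs = softmax(logits)
        losses.append(-math.log(probs[target]))
    return sum(losses) / len(losses)
```

A model that spreads probability uniformly over a vocabulary of size V scores a loss of ln(V); pretraining pushes the loss below that baseline by learning regularities in the data.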

Key facts

  • Origin: The term was introduced in the 2021 Stanford CRFM paper "On the Opportunities and Risks of Foundation Models", which highlighted the emergent capabilities and risks of training at scale.
  • Modalities: Foundation models now cover text, images, audio, video, and code.
  • Adaptation methods: Teams adapt them via full fine-tuning, low-rank adaptation (LoRA), prompt engineering, or retrieval-augmented generation (RAG) rather than training from scratch.
  • Providers: OpenAI, Anthropic, Google, Meta, Mistral, and Cohere are among the main providers of publicly accessible foundation models.
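Of the adaptation methods listed, LoRA is the easiest to show in a few lines. The sketch below assumes the standard formulation, in which a frozen weight matrix W is adjusted by a trainable low-rank product scaled by alpha / r. Plain nested lists stand in for tensors, and the function names are illustrative rather than taken from any library.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * (B @ A).

    w: frozen pretrained weight, shape d_out x d_in
    a: trainable matrix A, shape r x d_in
    b: trainable matrix B, shape d_out x r (commonly initialised to zeros,
       so the adapted model starts out identical to the base model)
    """
    r = len(a)
    delta = matmul(b, a)
    scale = alpha / r
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Number of parameters in A and B combined."""
    return r * (d_in + d_out)
```

For a 4096 x 4096 layer, full fine-tuning updates 16,777,216 weights, while a rank-8 adapter trains `trainable_params(4096, 4096, 8)` = 65,536, which is why LoRA is attractive when GPU memory is limited.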

For builders

For most product teams, foundation models remove the need to collect large pretraining datasets or operate GPU training clusters; fine-tuning, where it is used, still needs a smaller task-specific dataset. The engineering work shifts from model training to prompt design, retrieval pipelines, and evaluation. Selecting a foundation model involves assessing context length, modality support, licensing, and hosted inference cost.
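The selection criteria above can be treated as hard constraints plus a cost ranking. The sketch below is one possible way to encode that; the `ModelOption` fields and the `shortlist` function are hypothetical, and any catalog entries fed to it would need to come from each provider's current documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                      # placeholder label, not a real product name
    context_tokens: int            # maximum context length
    modalities: frozenset          # e.g. frozenset({"text", "image"})
    license: str                   # e.g. "proprietary-api" or "apache-2.0"
    usd_per_million_tokens: float  # hosted inference price (illustrative)


def shortlist(options, min_context, required_modalities, allowed_licenses):
    """Keep options that meet every hard requirement, cheapest first."""
    required = frozenset(required_modalities)
    fits = [
        o for o in options
        if o.context_tokens >= min_context
        and required <= o.modalities       # all required modalities supported
        and o.license in allowed_licenses
    ]
    return sorted(fits, key=lambda o: o.usd_per_million_tokens)
```

Price is only a tie-breaker here; in practice the shortlisted models would still be compared on a task-specific evaluation set before committing to one.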

Administrator · 41 published guides · Joined 2016
