Large Language Model (LLM)
A large language model (LLM) is a neural network trained on billions to trillions of tokens of text to predict and generate coherent language. These models learn statistical patterns in grammar, facts, reasoning chains, and code, which lets them answer questions, write software, and synthesize information without task-specific retraining.
How it works
LLMs are trained with next-token prediction: given a sequence of tokens, the model learns to assign a probability to every token in its vocabulary as the possible next token. Transformer-based self-attention lets the model capture long-range dependencies in text. Scale, in both parameters and training data, is widely regarded as a major driver of emergent capabilities.
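The next-token objective can be illustrated with a toy calculation. The sketch below is not a real model: the four-token vocabulary and the raw scores (logits) are invented for illustration, and the function names are hypothetical. It only shows how scores become a probability distribution via softmax and how the training loss (cross-entropy) penalizes low probability on the token that actually came next.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_token_loss(logits, target_index):
    """Cross-entropy loss for one next-token prediction."""
    probs = softmax(logits)
    return -math.log(probs[target_index])


# Assumption: a made-up vocabulary and made-up scores that a model
# might produce for the context "the cat sat on the".
vocab = ["mat", "dog", "moon", "sat"]
logits = [3.0, 1.0, 0.5, -1.0]

probs = softmax(logits)
predicted = vocab[probs.index(max(probs))]

# Loss is small when the true next token ("mat") received high probability.
loss = next_token_loss(logits, vocab.index("mat"))

for token, p in zip(vocab, probs):
    print(f"{token}: {p:.3f}")
print("predicted:", predicted, "loss:", round(loss, 3))
```

During training, this loss would be averaged over many positions and sequences, and the model's parameters adjusted to reduce it.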
Key facts
- Parameter count: Modern LLMs span from a few billion parameters in small open models to, reportedly, over 1 trillion in the largest systems.
- Training objective: Most are pretrained with causal language modeling, then aligned with reinforcement learning from human feedback (RLHF) or similar techniques.
- Pricing unit: API access is billed per token, typically split into input and output pricing.
- Context limit: Each model has a fixed context window that caps how much text it can process in one call.
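The pricing and context-limit facts above lend themselves to a quick back-of-the-envelope check. The following is a minimal sketch under stated assumptions: the prices and the 8,192-token window are placeholders, not any provider's actual figures, and the helper names are hypothetical. It also assumes that input tokens and requested output tokens share the same context window, which varies by provider.

```python
def estimate_cost(input_tokens, output_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate the cost of one call when input and output are priced separately."""
    return (input_tokens * input_price_per_million
            + output_tokens * output_price_per_million) / 1_000_000


def fits_context(input_tokens, max_output_tokens, context_window):
    """Check whether prompt plus requested output fits in the context window.

    Assumption: input and output count against one shared window.
    """
    return input_tokens + max_output_tokens <= context_window


# Placeholder prices in USD per million tokens (illustrative only).
INPUT_PRICE = 3.00
OUTPUT_PRICE = 15.00
CONTEXT_WINDOW = 8192  # placeholder window size in tokens

cost = estimate_cost(2000, 500, INPUT_PRICE, OUTPUT_PRICE)
print(f"estimated cost: ${cost:.4f}")
print("fits:", fits_context(2000, 500, CONTEXT_WINDOW))
print("fits:", fits_context(8000, 500, CONTEXT_WINDOW))
```

Real token counts depend on the model's tokenizer, so the counts passed in here would normally come from a tokenizer or from the usage figures an API returns.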
For builders
Builders interact with LLMs through REST APIs that accept a prompt and return a completion. Choosing the right model involves balancing capability, latency, and cost per token. Most production applications wrap the raw API with prompt templates, retrieval layers, or tool-calling scaffolding.
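As a rough illustration of wrapping a raw API with a prompt template, the sketch below builds an HTTP request without sending it. Everything provider-specific is an assumption: the endpoint URL, the model name, the payload fields (`model`, `prompt`, `max_tokens`), and the bearer-token header are hypothetical stand-ins, and real providers define their own request schemas.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute the URL documented by your provider.
API_URL = "https://api.example.com/v1/completions"

# A simple prompt template that slots retrieved context and a question
# into fixed instructions.
PROMPT_TEMPLATE = (
    "Answer the question using only the context.\n\n"
    "Context: {context}\n\n"
    "Question: {question}\n"
    "Answer:"
)


def build_request(question, context, api_key,
                  model="example-model", max_tokens=256):
    """Build (but do not send) a POST request for a completion.

    The payload field names are assumptions, not a real provider's schema.
    """
    prompt = PROMPT_TEMPLATE.format(context=context, question=question)
    payload = {"model": model, "prompt": prompt, "max_tokens": max_tokens}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_request(
    question="What is a token?",
    context="A token is a unit of text, such as a word piece.",
    api_key="PLACEHOLDER_KEY",
)
body = json.loads(request.data)
print(request.get_method(), request.full_url)
print(body["prompt"])
```

Sending the request would be a call to `urllib.request.urlopen(request)` against a real endpoint. Keeping request construction separate from sending makes the template easy to test and to swap between models when comparing capability, latency, and cost.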
Sources
- Vaswani, A., et al. (2017). Attention Is All You Need. arXiv:1706.03762. arxiv.org
- Brown, T., et al. (2020). Language Models are Few-Shot Learners (GPT-3). arXiv:2005.14165. arxiv.org
- Bommasani, R., et al. (2021). On the Opportunities and Risks of Foundation Models. Stanford CRFM. arxiv.org
- NIST. (2023). AI Risk Management Framework (AI RMF 1.0). nist.gov
- Stanford HAI. Foundation Models research portal. hai.stanford.edu