§ AI Wiki / Glossary
One-line definitions, the AI dictionary.
Context window: The maximum number of tokens a language model can process in a single forward pass.
Token count: The total number of tokens consumed in a single model call, counted against the model's context-window limit.
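For instance, you can estimate this count before making a call. The sketch below assumes OpenAI's tiktoken library and its built-in cl100k_base encoding; other models ship their own tokenizers.

```python
import tiktoken

# Count the tokens a prompt will consume before sending it to a model.
enc = tiktoken.get_encoding("cl100k_base")

prompt = "Explain the difference between a context window and token count."
n_tokens = len(enc.encode(prompt))
print(f"{n_tokens} tokens used against the context-window limit")
```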
Byte-pair encoding (BPE): A tokenization algorithm that builds a sub-word vocabulary by iteratively merging the most frequent character pairs.
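A toy illustration of that merge loop (the function name and tiny corpus are invented for this sketch; production BPE trains over a large corpus and typically operates on bytes):

```python
from collections import Counter

def bpe_merges(words, num_merges):
    """Toy BPE: repeatedly merge the most frequent adjacent symbol pair."""
    corpus = [list(w) for w in words]   # start with each word as characters
    merges = []
    for _ in range(num_merges):
        # Count every adjacent symbol pair across the corpus.
        pairs = Counter()
        for symbols in corpus:
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += 1
        if not pairs:
            break
        best = max(pairs, key=pairs.get)
        merges.append(best)
        # Replace each occurrence of the best pair with the merged symbol.
        new_corpus = []
        for symbols in corpus:
            out, i = [], 0
            while i < len(symbols):
                if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == best:
                    out.append(symbols[i] + symbols[i + 1])
                    i += 2
                else:
                    out.append(symbols[i])
                    i += 1
            new_corpus.append(out)
        corpus = new_corpus
    return merges, corpus

merges, corpus = bpe_merges(["lower", "lowest", "newer", "wider"], num_merges=4)
print(merges)   # learned merge rules, e.g. ('w', 'e'), ('e', 'r'), ...
print(corpus)   # words now segmented into learned sub-word units
```

Each learned merge becomes a vocabulary entry, which is why frequent fragments like "er" end up as single tokens.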
Cross-attention: An attention mechanism where one sequence attends to a different sequence, typically connecting encoder and decoder.
Multi-head attention: A version of attention where multiple parallel 'heads' learn different relationships at the same time.
Decoder: The Transformer component that generates the next token conditioned on what came before.
Attention: The mechanism that lets a model decide how much weight to give different parts of its input.
Encoder: The Transformer component that turns input into a meaningful internal representation.
Confabulation: A more clinically accurate term for LLM 'hallucination' — confidently filling gaps with plausible-sounding fiction.
Cosine similarity: A similarity measure based on the angle between two vectors, returning a value between -1 and 1.
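A minimal implementation straight from the definition (pure Python; the function name is ours):

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # 1.0: same direction
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))            # 0.0: orthogonal
```

Because it depends only on the angle, two embeddings of very different magnitudes can still score as highly similar.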
Self-attention: A mechanism where each element in a sequence attends to every other element in the same sequence.
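The sketch below shows scaled dot-product self-attention in NumPy, with Q, K and V all projected from the same sequence; the shapes and random weights are invented for illustration. Taking K and V from a different sequence instead gives cross-attention, and running several such heads in parallel and concatenating their outputs gives multi-head attention.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one sequence.
    x: (seq_len, d_model); w_q/w_k/w_v: (d_model, d_head) projections.
    In self-attention Q, K and V all come from the same sequence x."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # how much each token attends to each other token
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v                       # weighted mix of value vectors

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 16))                 # 5 tokens, d_model = 16
w = [rng.normal(size=(16, 8)) for _ in range(3)]
print(self_attention(x, *w).shape)           # (5, 8)
```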
SentencePiece: Google's language-agnostic tokenizer library that treats whitespace as just another character.
Temperature: The sampling parameter that controls how 'creative' or 'deterministic' a model's output is by rescaling the logits before sampling.
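Concretely, temperature divides the logits before the softmax. A minimal NumPy sketch (the function name and toy logits are illustrative):

```python
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    # T < 1 sharpens the distribution (more deterministic);
    # T > 1 flattens it (more varied); T -> 0 approaches plain argmax.
    scaled = np.asarray(logits, dtype=float) / temperature
    probs = np.exp(scaled - scaled.max())    # numerically stable softmax
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

rng = np.random.default_rng(0)
logits = [2.0, 1.0, 0.1]
print(sample_with_temperature(logits, 0.2, rng))  # almost always token 0
print(sample_with_temperature(logits, 2.0, rng))  # noticeably more varied
```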
Tokenization: The process of converting raw text into a sequence of model-readable tokens.
Top-K sampling: A sampling strategy that picks the next token from only the K most likely candidates.
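A NumPy sketch of the idea (function name and probabilities are illustrative):

```python
import numpy as np

def top_k_sample(probs, k, rng):
    """Keep only the k most likely tokens, renormalize, then sample."""
    probs = np.asarray(probs, dtype=float)
    top = np.argsort(probs)[-k:]          # indices of the k largest probabilities
    masked = np.zeros_like(probs)
    masked[top] = probs[top]
    masked /= masked.sum()                # renormalize over the survivors
    return rng.choice(len(probs), p=masked)

rng = np.random.default_rng(0)
probs = [0.5, 0.3, 0.1, 0.07, 0.03]
print(top_k_sample(probs, k=2, rng=rng))  # only ever returns token 0 or 1
```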
Top-P (nucleus) sampling: A sampling method that draws from the smallest set of candidates whose cumulative probability exceeds P.
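The same style of sketch for the nucleus variant; unlike Top-K, the candidate set grows or shrinks with how concentrated the distribution is:

```python
import numpy as np

def top_p_sample(probs, p, rng):
    """Nucleus sampling: keep the smallest prefix of the sorted distribution
    whose cumulative probability reaches p, renormalize, then sample."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(probs)[::-1]               # most likely first
    cumulative = np.cumsum(probs[order])
    cutoff = np.searchsorted(cumulative, p) + 1   # size of the nucleus
    nucleus = order[:cutoff]
    masked = np.zeros_like(probs)
    masked[nucleus] = probs[nucleus]
    masked /= masked.sum()
    return rng.choice(len(probs), p=masked)

rng = np.random.default_rng(0)
probs = [0.5, 0.3, 0.1, 0.07, 0.03]
print(top_p_sample(probs, p=0.8, rng=rng))        # nucleus = tokens 0 and 1
```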
Long-context models: Next-generation LLMs that can process hundreds of thousands — sometimes millions — of tokens in a single context.
Vector: A list of numbers representing a point in high-dimensional space — direction and magnitude in one bundle.
WordPiece: Google's likelihood-driven sub-word algorithm, similar in spirit to BPE and used by BERT.