§ AI Wiki / Glossary
One-line definitions: the AI dictionary.
When a model performs extensive internal reasoning across many tokens before producing its final answer.
An architecture where only a subset of expert sub-networks activates per token, combining huge capacity with cheaper inference.
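The routing idea behind that definition can be sketched in a few lines: a router scores every expert for each token, but only the top-k experts actually run. The function and variable names below (`moe_layer`, `router_weights`) are illustrative, not from any particular library; this is a minimal sketch assuming experts are plain callables over a vector.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_layer(token, experts, router_weights, k=2):
    """Route one token through only its top-k experts.

    `experts` is a list of callables (the sub-networks); `router_weights`
    holds one score row per expert. Only k experts execute, so compute
    scales with k rather than with the total expert count, while
    parameter capacity scales with all of them.
    """
    # Router: one score per expert for this token.
    scores = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    gates = softmax(scores)
    # Keep only the k highest-gated experts.
    top_k = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in top_k)
    # Weighted sum of the selected experts' outputs; inactive experts never run.
    out = [0.0] * len(token)
    for i in top_k:
        y = experts[i](token)
        for d in range(len(token)):
            out[d] += (gates[i] / norm) * y[d]
    return out
```

With, say, 64 experts and k=2, roughly 1/32 of the expert parameters are exercised per token, which is the "huge capacity, cheaper inference" trade-off in the definition.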
Memory that persists beyond a single session, available to an agent across future runs.
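The defining property here is survival across process restarts, which a toy sketch can show with a file-backed store. The class and file name (`PersistentMemory`, `agent_memory.json`) are hypothetical, chosen for illustration; real agent frameworks typically use databases or vector stores rather than a flat JSON file.

```python
import json
from pathlib import Path

class PersistentMemory:
    """Toy long-term memory: facts written in one session are
    readable in any later session, because they live on disk."""

    def __init__(self, path="agent_memory.json"):
        self.path = Path(path)
        # Reload whatever a previous run stored, if anything.
        self.facts = json.loads(self.path.read_text()) if self.path.exists() else {}

    def remember(self, key, value):
        self.facts[key] = value
        self.path.write_text(json.dumps(self.facts))

    def recall(self, key, default=None):
        return self.facts.get(key, default)
```

Constructing a second `PersistentMemory` on the same path, even in a new process, recalls what the first one remembered, which is exactly what distinguishes long-term memory from a single session's context window.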
Next-generation LLMs that can process hundreds of thousands — sometimes millions — of tokens in a single context.