Streaming is the technique of sending an LLM's tokens to the client as they are produced, rather than waiting for the full response to finish — exactly what gives ChatGPT its familiar 'typing' feel. The user sees the first words after only a time-to-first-token (TTFT) delay, which largely dissolves the 'slow model' perception: the same total response time feels dramatically faster when streamed. Technically it is delivered over Server-Sent Events (SSE) or WebSockets, and OpenAI, Anthropic, and the other major APIs support it natively. It is now a foundational ingredient of modern LLM UX — a non-streaming chat interface is essentially unshippable in 2026.
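To make the SSE delivery concrete, here is a minimal, self-contained Python sketch of how a client consumes a token stream. The `data:` line format is the SSE standard; the JSON payload shape and the `[DONE]` sentinel are illustrative assumptions modeled loosely on common LLM APIs, not a specific provider's exact schema.

```python
import json

# Simulated SSE stream as an LLM API might emit it: each event is a
# "data:" line carrying a JSON payload with one token (illustrative format).
sse_stream = [
    'data: {"token": "Hello"}',
    'data: {"token": ","}',
    'data: {"token": " world"}',
    'data: [DONE]',  # sentinel several APIs use to mark end of stream
]

def stream_tokens(lines):
    """Yield tokens from SSE 'data:' lines as they arrive."""
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip comments and keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break
        yield json.loads(payload)["token"]

# A real client would render each token the moment it arrives,
# instead of accumulating them as we do here for demonstration.
text = "".join(stream_tokens(sse_stream))
print(text)  # -> Hello, world
```

The key point is that `stream_tokens` is a generator: each token is available to the UI immediately, so perceived latency is the TTFT rather than the full generation time.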
MEVZU N°124 · ISTANBUL · YEAR I — VOL. III
Glossary · Beginner · 2022
Streaming Output
Sending the model's response token-by-token in real time rather than waiting for the complete answer.
- EN — Streaming Output
- TR — Akış Çıktısı