Cold start describes the noticeably slow first response that follows when a model or service has been idle and has to initialise on demand. In serverless architecture it is the container that must spin up; in the LLM world it is typically loading model weights from disk or remote storage into GPU memory, which can take anywhere from seconds to tens of seconds. Modern serving stacks fight this with warm pools, model caching, preloading and traffic-shaping; for large, rarely-used models it is a major UX obstacle. The 'cold versus warm' distinction sits at the heart of generative-AI cost-saving strategies.
MEVZU N°124ISTANBULYEAR I — VOL. III
Glossary · Beginner · 2018
Cold Start
The slow first response when a model or service has been idle and must initialise on demand.
- EN — English term
- Cold Start
- TR — Turkish term
- Soğuk Başlatma