Speech-to-text (STT), also known as automatic speech recognition (ASR), is the technology that converts spoken audio into text. Google's speech services in the 2010s, Apple Siri, and Microsoft Cortana brought the field to consumers; OpenAI's Whisper (2022) then democratized multilingual ASR by releasing open weights. STT is a foundational component of meeting transcription, captioning, voice search, and voice agent lines. STT and text-to-speech (TTS) form the two ends of a voice AI assistant: STT handles the spoken input, and TTS produces the spoken output.
MEVZU · N°124 · ISTANBUL · YEAR I, VOL. III
Glossary · Beginner · 2015
STT — Speech-to-Text
Technology that converts spoken audio into text.
- EN (English term): STT (Speech-to-Text)
- TR (Turkish term): STT (Sesten Metne)