STT — Speech-to-Text

Technology that converts spoken audio into text.

EN — English term: STT (Speech-to-Text)
TR — Turkish term: STT — Sesten Metne

STT, also known as automatic speech recognition (ASR), converts spoken audio into text. In the 2010s, Google's speech services, Apple's Siri, and Microsoft's Cortana brought the technology to consumers; OpenAI's Whisper (2022) then made multilingual ASR widely accessible by releasing open weights. STT is a foundational component of meeting transcription, captioning, voice search, and phone-based voice agents. Together with TTS (text-to-speech), it forms the two ends of a voice AI assistant: STT handles the spoken input, TTS the spoken output.
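
As a rough illustration of how STT and TTS sit at the two ends of a voice assistant, the sketch below wires three pluggable stages together. Every name in it (VoiceAssistant, fake_stt, fake_tts, echo_responder) is a hypothetical placeholder rather than any real library's API, and the "audio" is simply UTF-8 bytes so the example runs without a speech model.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures, invented for this sketch.
Transcriber = Callable[[bytes], str]   # STT: audio in, text out
Responder = Callable[[str], str]       # assistant logic: text in, text out
Synthesizer = Callable[[str], bytes]   # TTS: text in, audio out


@dataclass
class VoiceAssistant:
    """One conversational turn: speech -> text -> reply text -> speech."""
    stt: Transcriber
    respond: Responder
    tts: Synthesizer

    def handle_turn(self, audio_in: bytes) -> bytes:
        text_in = self.stt(audio_in)       # input end: speech to text
        text_out = self.respond(text_in)   # middle: decide what to say
        return self.tts(text_out)          # output end: text to speech


# Stand-in engines (assumptions): real systems would call an ASR model
# and a speech synthesizer here instead of encoding and decoding bytes.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


def echo_responder(text: str) -> str:
    return f"You said: {text}"


assistant = VoiceAssistant(stt=fake_stt, respond=echo_responder, tts=fake_tts)
reply_audio = assistant.handle_turn(b"hello")
print(reply_audio.decode("utf-8"))
```

Keeping each stage behind a plain function signature means a real STT engine could be swapped in for fake_stt without touching the rest of the pipeline.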