Transcribe

aws/ml aws/ai aws/service

💡 Definition

Amazon Transcribe is a fully managed automatic speech recognition (ASR) service that lets developers add speech-to-text capabilities to their applications. It uses deep learning models to convert speech into text.

🔑 Key Concepts

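Speech-to-Text: Converts recorded audio files (e.g., MP3, WAV, FLAC) or live audio streams into written text.
Speaker Identification: Distinguishes the different speakers in a conversation and labels each segment (e.g., "spk_0", "spk_1"). AWS calls this speaker diarization.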
Custom Vocabulary: You can add specific terms (e.g., medical jargon, brand names) to improve accuracy.
Transcribe Medical: A specialized version for medical dictation and clinical conversations.
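
As a rough illustration of how speaker identification and a custom vocabulary are switched on, the sketch below builds the request dictionary for boto3's `start_transcription_job`. The helper name `build_job_request`, the job name, the S3 URI, and the vocabulary name are made up for the example; the vocabulary would need to exist in the account already.

```python
def build_job_request(job_name, media_uri, language_code="en-US",
                      max_speakers=None, vocabulary_name=None):
    """Build keyword arguments for transcribe.start_transcription_job (hypothetical helper)."""
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": language_code,
    }
    settings = {}
    if max_speakers is not None:
        # Speaker identification: label who spoke each segment.
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    if vocabulary_name:
        # Custom vocabulary: a previously created list of domain terms.
        settings["VocabularyName"] = vocabulary_name
    if settings:
        request["Settings"] = settings
    return request


# Placeholder values, for illustration only.
request = build_job_request(
    "support-call-001",
    "s3://example-bucket/calls/call-001.mp3",
    max_speakers=2,
    vocabulary_name="brand-terms",
)
```

With credentials configured, the dictionary could then be passed as `boto3.client("transcribe").start_transcription_job(**request)`.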

⚙️ How it Works

Upload an audio file to S3, then start a transcription job that points at it. The service processes the audio asynchronously and outputs a JSON file containing the transcript text, per-word timestamps, and confidence scores.
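
The flow above can be sketched with boto3 as follows. This is a minimal sketch, not a production pattern: `transcribe_file` and `summarize_output` are hypothetical helper names, the polling interval is arbitrary, and `SAMPLE_OUTPUT` is a hand-written, trimmed stand-in for a real output file. The AWS calls are only made if `transcribe_file` is invoked with valid credentials.

```python
import json
import time


def transcribe_file(local_path, bucket, key, job_name, language_code="en-US"):
    """Upload audio to S3, start a transcription job, and poll until it ends."""
    import boto3  # imported lazily; needs AWS credentials and permissions

    boto3.client("s3").upload_file(local_path, bucket, key)
    transcribe = boto3.client("transcribe")
    transcribe.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": f"s3://{bucket}/{key}"},
        LanguageCode=language_code,
    )
    while True:
        job = transcribe.get_transcription_job(
            TranscriptionJobName=job_name
        )["TranscriptionJob"]
        if job["TranscriptionJobStatus"] in ("COMPLETED", "FAILED"):
            return job
        time.sleep(5)  # arbitrary polling interval


def summarize_output(output):
    """Return the full transcript plus per-word timing and confidence."""
    results = output["results"]
    words = [
        {
            "word": item["alternatives"][0]["content"],
            "start": float(item["start_time"]),
            "end": float(item["end_time"]),
            "confidence": float(item["alternatives"][0]["confidence"]),
        }
        for item in results["items"]
        if item["type"] == "pronunciation"  # punctuation items carry no timestamps
    ]
    return results["transcripts"][0]["transcript"], words


# Hand-written sample in the shape of a Transcribe output file (trimmed).
SAMPLE_OUTPUT = json.loads("""
{
  "jobName": "demo-job",
  "status": "COMPLETED",
  "results": {
    "transcripts": [{"transcript": "Hello world."}],
    "items": [
      {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
       "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
      {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
       "alternatives": [{"confidence": "0.95", "content": "world"}]},
      {"type": "punctuation",
       "alternatives": [{"confidence": "0.0", "content": "."}]}
    ]
  }
}
""")

text, words = summarize_output(SAMPLE_OUTPUT)
```

Note that numeric fields such as timestamps and confidence arrive as strings in the JSON, which is why the sketch converts them with `float`.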

🎯 Use Cases

Subtitling: Automatically generating captions for videos.
Call Center Analytics: Transcribing customer support calls so the text can be analyzed (often paired with Amazon Comprehend for sentiment and key phrases).
Meeting Notes: Creating transcripts of meetings.
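
To make the subtitling use case concrete, here is a minimal sketch that turns word-level items (in the shape of the Transcribe output JSON) into SRT caption text. The function names and the fixed words-per-caption rule are assumptions made for illustration; Transcribe can also produce SRT or WebVTT files itself through the `Subtitles` parameter of a transcription job, which is usually the simpler route.

```python
def to_srt_time(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def items_to_srt(items, max_words=8):
    """Group word items into numbered captions of at most max_words words."""
    cues = []  # each cue is [start_seconds, end_seconds, [words]]
    for item in items:
        text = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            if cues:
                cues[-1][2][-1] += text  # attach punctuation to the previous word
            continue
        start, end = float(item["start_time"]), float(item["end_time"])
        if not cues or len(cues[-1][2]) >= max_words:
            cues.append([start, end, [text]])
        else:
            cues[-1][1] = end
            cues[-1][2].append(text)
    blocks = [
        f"{i}\n{to_srt_time(start)} --> {to_srt_time(end)}\n{' '.join(words)}"
        for i, (start, end, words) in enumerate(cues, 1)
    ]
    return "\n\n".join(blocks) + "\n" if blocks else ""


# Hand-written sample items, for illustration only.
sample_items = [
    {"type": "pronunciation", "start_time": "0.0", "end_time": "0.4",
     "alternatives": [{"confidence": "0.99", "content": "Hello"}]},
    {"type": "pronunciation", "start_time": "0.5", "end_time": "0.9",
     "alternatives": [{"confidence": "0.95", "content": "world"}]},
    {"type": "punctuation",
     "alternatives": [{"confidence": "0.0", "content": "."}]},
    {"type": "pronunciation", "start_time": "1.0", "end_time": "1.3",
     "alternatives": [{"confidence": "0.97", "content": "Bye"}]},
]
srt_text = items_to_srt(sample_items, max_words=2)
```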

💰 Pricing Model

Audio Seconds: Charged per second of audio transcribed.
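
A back-of-the-envelope sketch of per-second billing follows. The per-minute rate and the 15-second minimum per request are assumptions used for illustration, not quoted prices; check the current Amazon Transcribe pricing page for real figures, which vary by region, feature, and volume tier.

```python
# Both values below are placeholders for illustration, not authoritative prices.
ASSUMED_RATE_PER_MINUTE = 0.024   # USD per minute of audio (assumed)
ASSUMED_MIN_BILLED_SECONDS = 15   # minimum billed seconds per request (assumed)


def estimate_cost(audio_seconds, rate_per_minute=ASSUMED_RATE_PER_MINUTE,
                  min_seconds=ASSUMED_MIN_BILLED_SECONDS):
    """Estimate the charge for one request billed per second of audio."""
    billable_seconds = max(audio_seconds, min_seconds)
    return billable_seconds / 60 * rate_per_minute


ten_minute_call = estimate_cost(600)  # 600 s of audio at the assumed rate
short_clip = estimate_cost(5)         # billed as 15 s under the assumed minimum
```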

📝 Exam Tips (CLF-C02)

Keywords: "Speech to Text", "ASR", "Subtitles".
Converts Audio -> Text.

See Also:
* Translate (often used after Transcribe)
* Polly (Text to Speech - the opposite)