Transcribe
💡 Definition
Amazon Transcribe is a managed service that lets developers add speech-to-text capabilities to their applications. It uses deep learning models for automatic speech recognition (ASR).
🔑 Key Concepts
- Speech-to-Text: Converts audio (for example, MP3 or WAV files) into written text.
- Speaker Identification (diarization): Labels which speaker said what in a conversation (e.g., "Speaker 1", "Speaker 2").
- Custom Vocabulary: You can add specific terms (e.g., medical jargon, brand names) to improve accuracy.
- Transcribe Medical: Specialized version for medical dictation.
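A minimal sketch of how speaker identification and a custom vocabulary might be switched on for a job. The helper `build_job_request`, the job name, the bucket, and the vocabulary name `product-names` are hypothetical; the dictionary keys follow the parameters of the boto3 `start_transcription_job` call. The helper only builds the request, so it runs without AWS credentials.

```python
def build_job_request(job_name, media_uri, language_code="en-US",
                      max_speakers=None, vocabulary_name=None):
    """Build keyword arguments for boto3's transcribe.start_transcription_job."""
    request = {
        "TranscriptionJobName": job_name,
        "Media": {"MediaFileUri": media_uri},
        "LanguageCode": language_code,
    }
    settings = {}
    if max_speakers is not None:
        # Speaker identification: both settings are sent together.
        settings["ShowSpeakerLabels"] = True
        settings["MaxSpeakerLabels"] = max_speakers
    if vocabulary_name is not None:
        # Custom vocabulary: must already exist in the account (assumed here).
        settings["VocabularyName"] = vocabulary_name
    if settings:
        request["Settings"] = settings
    return request


# Hypothetical names, for illustration only.
request = build_job_request(
    "support-call-001",
    "s3://example-bucket/call.mp3",
    max_speakers=2,
    vocabulary_name="product-names",
)
print(request["Settings"])
```

The resulting dictionary could be passed as `client.start_transcription_job(**request)`. Transcribe Medical uses a separate call, `start_medical_transcription_job`, which is not shown here.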
⚙️ How it Works
Upload an audio file to S3, then start a transcription job that points to it. The service processes the audio asynchronously and outputs a JSON file with the text, timestamps, and confidence scores.
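The flow above can be sketched roughly as follows, assuming boto3 and valid AWS credentials for the first function. The function names, the sample job name, and the shortened sample JSON are illustrative; the JSON fields (`results.transcripts`, `results.items`, `alternatives`, `confidence`) mirror the shape of a standard transcription output file.

```python
import json


def start_job(job_name, media_uri, language_code="en-US"):
    """Start a batch job for audio already uploaded to S3 (needs boto3 + credentials)."""
    import boto3  # imported lazily so the parser below runs without the AWS SDK

    client = boto3.client("transcribe")
    return client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": media_uri},
        LanguageCode=language_code,
    )


def summarize_transcript(raw_json):
    """Return the full text plus (word, start, end, confidence) tuples."""
    results = json.loads(raw_json)["results"]
    text = results["transcripts"][0]["transcript"]
    words = []
    for item in results["items"]:
        if item["type"] != "pronunciation":
            continue  # punctuation items carry no timestamps
        best = item["alternatives"][0]
        words.append((
            best["content"],
            float(item["start_time"]),
            float(item["end_time"]),
            float(best["confidence"]),
        ))
    return text, words


# Shortened, hand-written sample in the shape of a Transcribe output file.
SAMPLE = json.dumps({
    "jobName": "demo-job",
    "status": "COMPLETED",
    "results": {
        "transcripts": [{"transcript": "Hello world."}],
        "items": [
            {"type": "pronunciation", "start_time": "0.0", "end_time": "0.5",
             "alternatives": [{"content": "Hello", "confidence": "0.99"}]},
            {"type": "pronunciation", "start_time": "0.5", "end_time": "1.0",
             "alternatives": [{"content": "world", "confidence": "0.97"}]},
            {"type": "punctuation",
             "alternatives": [{"content": ".", "confidence": "0.0"}]},
        ],
    },
})

text, words = summarize_transcript(SAMPLE)
print(text)
for word, start, end, confidence in words:
    print(f"{start:.2f}-{end:.2f}  {word}  (confidence {confidence:.2f})")
```

In a real run, the job status would be polled with `get_transcription_job` until it reaches `COMPLETED`, and the JSON would be downloaded from the transcript location before parsing.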
🎯 Use Cases
- Subtitling: Automatically generating captions for videos.
- Call Center Analytics: Transcribing customer support calls for analysis (often combined with Amazon Comprehend).
- Meeting Notes: Creating transcripts of meetings.
💰 Pricing Model
- Audio Seconds: Charged per second of audio transcribed.
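A rough sketch of the per-second arithmetic. The rate passed in is a placeholder, not a real price, and the 15-second minimum per request and the rounding up to whole seconds are assumptions to verify against the current AWS pricing page.

```python
import math


def billable_seconds(audio_seconds, minimum_seconds=15):
    """Round up to whole seconds and apply a per-request minimum (assumed: 15 s)."""
    return max(math.ceil(audio_seconds), minimum_seconds)


def estimate_cost(audio_seconds, rate_per_second):
    """Estimated charge for one request; rate_per_second is supplied by the caller."""
    return billable_seconds(audio_seconds) * rate_per_second


PLACEHOLDER_RATE = 0.001  # hypothetical rate per second, for illustration only

print(billable_seconds(10))    # short clip: the assumed minimum applies
print(billable_seconds(90.2))  # partial seconds are rounded up
print(round(estimate_cost(90.2, PLACEHOLDER_RATE), 4))
```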
📝 Exam Tips (CLF-C02)
- Keywords: "Speech to Text", "ASR", "Subtitles".
- Converts Audio -> Text.
See Also:
- Translate (often used after Transcribe)
- Polly (Text to Speech - the opposite)