Last Updated on March 28, 2024 by Ashish
Introduction
Microsoft has announced the general availability of OpenAI’s Whisper speech-to-text model on Azure. The model transcribes audio in 57 languages and can translate speech from those languages into English.
Key Features of Whisper on Azure
- Fast processing time for audio transcription and translation
- Support for 57 languages
- Flexible options via Azure OpenAI Service and Azure AI Speech
Use Cases
Since the public preview, customers in healthcare, education, finance, manufacturing, media, agriculture, and other industries have been using Whisper for:
Use Case | Description |
---|---|
Call Center Conversations | Transcribing customer calls for analysis |
Accessibility | Adding captions to audio/video content |
Data Mining | Extracting insights from audio/video data |
Deployment Options
Whisper is available via two Azure services, catering to different workload needs:
Azure OpenAI Service
Feature | Description |
---|---|
Model | Mirrors OpenAI Whisper model |
Best For | Time-sensitive workloads with smaller files (up to 25 MB per file) |
API | Whisper REST API, or interactively in Azure OpenAI Studio |
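As a rough illustration of the REST option, the sketch below builds (but does not send) a transcription or translation request using only the Python standard library. The resource endpoint, the deployment name, the `api-version` value, and the helper name `build_whisper_request` are assumptions made for this example; check them against your own Azure OpenAI resource before relying on them.

```python
import urllib.request
import uuid


def build_whisper_request(endpoint, deployment, api_key, filename, audio_bytes,
                          task="transcriptions", api_version="2024-02-01"):
    """Build a multipart POST request for a Whisper deployment on Azure OpenAI.

    Hypothetical helper: the endpoint, deployment name, and api_version are
    placeholders supplied by the caller, not values taken from the article.
    """
    if task not in ("transcriptions", "translations"):
        raise ValueError("task must be 'transcriptions' or 'translations'")

    url = (f"{endpoint.rstrip('/')}/openai/deployments/{deployment}"
           f"/audio/{task}?api-version={api_version}")

    # Encode the audio as a single multipart/form-data field named "file".
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        (f'Content-Disposition: form-data; name="file"; '
         f'filename="{filename}"\r\n').encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        audio_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    headers = {
        "api-key": api_key,
        "Content-Type": f"multipart/form-data; boundary={boundary}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

To actually send the request, pass the returned object to `urllib.request.urlopen` and parse the JSON response, which is expected to carry the transcript in a `text` field. Switching `task` to `"translations"` targets the translation route instead.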
Azure AI Speech
Feature | Description |
---|---|
File Size | Up to 1GB per file |
Batch Processing | Up to 1000 files per request |
Speaker Diarization | Distinguish between speakers |
Model Customization | Fine-tune Whisper with custom data |
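For the batch route, a request might be assembled along the lines of the sketch below. It only builds the request and does not send it. The URL shape, the `v3.2` API version, the JSON field names, and the helper name `build_batch_transcription_request` are assumptions based on the Speech to text batch transcription REST API, and the Whisper model URI is a placeholder that you would look up for your own region and resource.

```python
import json
import urllib.request

MAX_FILES_PER_REQUEST = 1000  # per-request limit listed in the table above


def build_batch_transcription_request(region, speech_key, content_urls,
                                      whisper_model_uri, locale="en-US",
                                      diarization=True, api_version="v3.2"):
    """Build a batch transcription request for Azure AI Speech.

    Hypothetical helper: region, key, audio URLs, and the Whisper base-model
    URI are caller-supplied placeholders, not values from the article.
    """
    if not content_urls:
        raise ValueError("content_urls must contain at least one audio URL")
    if len(content_urls) > MAX_FILES_PER_REQUEST:
        raise ValueError(f"at most {MAX_FILES_PER_REQUEST} files per request")

    url = (f"https://{region}.api.cognitive.microsoft.com"
           f"/speechtotext/{api_version}/transcriptions")
    payload = {
        "displayName": "Whisper batch transcription",
        "locale": locale,
        "contentUrls": list(content_urls),
        # Points the job at a Whisper base model instead of the default model.
        "model": {"self": whisper_model_uri},
        # Asks the service to distinguish between speakers.
        "properties": {"diarizationEnabled": diarization},
    }
    headers = {
        "Ocp-Apim-Subscription-Key": speech_key,
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=json.dumps(payload).encode("utf-8"),
                                  headers=headers, method="POST")
```

Batch transcription is asynchronous: the response to this request describes a job, which you then poll until the transcription files are ready to download.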
With Whisper generally available on Azure, enterprises now have a powerful speech AI solution for understanding voice data at scale while benefiting from Azure’s enterprise-grade capabilities.
Getting Started
To start using Whisper:
- Azure OpenAI Service: Apply for access, create a resource, then try Whisper in Azure OpenAI Studio
- Azure AI Speech: Use Whisper through Batch Speech-to-Text in Speech Studio
The release of Whisper reinforces Microsoft’s commitment to offering cutting-edge generative AI models to accelerate innovation across industries.
For more information, refer to this Microsoft Azure blog.