
Precision-2

Our state-of-the-art Premium model, delivering the highest accuracy for teams and enterprises, with support for voiceprints and speaker identification. 28% more accurate than Community-1.

Community-1

Our latest open-source model. Useful for local development and research. Also available as a hosted option for teams who want to avoid infrastructure management.

Choosing the right model

Precision-2

Best for: Startups, SMEs, and enterprises that need state-of-the-art speaker diarization accuracy and advanced features such as voiceprints and speaker identification.
Self-hosted options for Precision-2 are available on Enterprise plans.
Typical use cases: Phone call analytics, meeting transcription with speaker attribution, video dubbing and timestamp-critical workflows, building training data for voice assistants, and more.
Advanced features:
  • Speaker identification with voiceprints: Identify known speakers in your audio using pre-enrolled voiceprints
  • Exclusive diarization mode: Returns a diarization where only one speaker (the one most likely to be transcribed) is active at a time, making reconciliation with STT output easier
  • Flexible speaker count control: Use the minSpeakers, maxSpeakers, and numSpeakers parameters to constrain or fix the number of detected speakers
  • Human-in-the-loop correction: Use confidence scores to help streamline manual correction processes
Learn more about Precision-2
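
The speaker count controls are regular request parameters. Below is a minimal Python sketch of a Precision-2 request that constrains the number of speakers; it assumes minSpeakers, maxSpeakers, and numSpeakers are sent as top-level fields of the JSON body and that your API key is stored in a PYANNOTEAI_API_KEY environment variable (both are assumptions; see the API reference for the exact request schema).

import os
import requests

# Assumption: your API key is stored in this environment variable.
API_KEY = os.environ["PYANNOTEAI_API_KEY"]

# Assumption: the speaker count controls are top-level fields of the request
# body, next to "url" and "model"; check the API reference for the exact schema.
response = requests.post(
    "https://api.pyannote.ai/v1/diarize",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "url": "https://files.pyannote.ai/marklex1min.wav",
        "model": "precision-2",
        "minSpeakers": 2,    # expect at least two speakers
        "maxSpeakers": 4,    # expect at most four speakers
        # "numSpeakers": 3,  # or fix the speaker count exactly
    },
)
response.raise_for_status()
print(response.json())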

Community-1 (hosted)

Best for: Teams who want the open-source model without managing infrastructure.
Typical use cases: Prototyping, low-volume production workloads, testing and validation.
Key Benefits:
  • Cost efficiency: Hosted at cost, ideal for experimentation and low-volume workloads
  • No infrastructure management: Focus on your application while we handle the deployment
  • Easy migration: Start with hosted Community-1 and upgrade to Precision-2 when needed
  • Same powerful model: Access the same Community-1 model through our API without setup complexity
Learn more about Community-1

Community-1 (self-hosted with pyannote.audio 4.0)

Best for: Researchers, developers, and hobbyists who want full control over their diarization models and workflows.
Typical use cases: Academic work, product iteration, prototyping, and custom diarization deployments (e.g., dataset-specific fine-tuning or custom reconciliation with STT).
Key Benefits:
  • Best open-source speaker diarization model available: outperforms pyannote.audio 3.1 across all key metrics
  • Open-source flexibility: Full transparency into model weights and code, enabling local and offline training and inference
Trade-offs:
  • Lower accuracy compared to Precision-2
  • No support for advanced features like speaker identification and voiceprints
  • Requires deploying the model on your own infrastructure
Learn more about pyannote.audio 4.0
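
For a quick start with self-hosting, the sketch below loads the Community-1 pipeline through pyannote.audio and writes the result to an RTTM file. It is a minimal sketch, not an official recipe: the checkpoint name pyannote/speaker-diarization-community-1 and the Hugging Face token setup are assumptions, and the iteration shown follows the long-standing pyannote.audio output convention, which may differ in 4.0; check the pyannote.audio 4.0 documentation for the exact usage.

from pyannote.audio import Pipeline

# Assumed checkpoint name for Community-1; loading it typically requires
# accepting the model terms on Hugging Face and an access token (e.g. HF_TOKEN).
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization-community-1")

# Run diarization on a local audio file.
diarization = pipeline("meeting.wav")

# Print speaker turns (pyannote convention: Annotation.itertracks).
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"{turn.start:6.1f}s  {turn.end:6.1f}s  {speaker}")

# Save to RTTM for later comparison or evaluation.
with open("meeting.community-1.rttm", "w") as rttm:
    diarization.write_rttm(rttm)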

How to specify a model in diarization requests

When making a diarization request, you can specify which model to use with the model parameter:
curl -X POST "https://api.pyannote.ai/v1/diarize" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://files.pyannote.ai/marklex1min.wav",
    "model": "precision-2"
  }'
If you do not specify a model, the API uses Precision-2 by default.

Switch between models

You can easily switch between models by changing the model parameter:
  • "model": "community-1" for Community-1
  • "model": "precision-2" for Precision-2
Note: Speaker identification and voiceprints are not available with Community-1 (hosted or self-hosted). These advanced features are exclusive to the Precision models (Precision-1 and Precision-2).
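
Because the model is just a request parameter, switching is a one-line change, and running both models on the same file (step 1 of the comparison below) is a short loop. A minimal Python sketch, assuming your API key is stored in a PYANNOTEAI_API_KEY environment variable:

import os
import requests

API_KEY = os.environ["PYANNOTEAI_API_KEY"]  # assumed environment variable name
AUDIO_URL = "https://files.pyannote.ai/marklex1min.wav"

# Submit the same audio once per model; only the "model" field changes.
for model in ("community-1", "precision-2"):
    response = requests.post(
        "https://api.pyannote.ai/v1/diarize",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"url": AUDIO_URL, "model": model},
    )
    response.raise_for_status()
    # The response describes the submitted job; results are delivered
    # asynchronously (see the API reference for how to retrieve them).
    print(model, response.json())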

Compare results between models

To compare performance between models on your specific data:
  1. Process the same audio file with both models
  2. Compare the diarization results
  3. Evaluate which model provides better accuracy for your use case
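
If you have a hand-labelled reference for a few representative files, step 3 can be made quantitative with pyannote.metrics. The sketch below assumes each model's output has already been exported to RTTM (the file names reference.rttm, precision-2.rttm, and community-1.rttm are placeholders) and computes the diarization error rate (DER), which combines missed speech, false alarm, and speaker confusion.

# Requires: pip install pyannote.metrics pyannote.database
from pyannote.database.util import load_rttm
from pyannote.metrics.diarization import DiarizationErrorRate

# Assumed file names; each RTTM describes the same audio file.
reference = next(iter(load_rttm("reference.rttm").values()))

for name in ("precision-2", "community-1"):
    hypothesis = next(iter(load_rttm(f"{name}.rttm").values()))
    der = DiarizationErrorRate()(reference, hypothesis)
    print(f"{name}: DER = {der:.1%}")

Lower DER is better. It is still worth spot-checking a few files by ear, since different error types (missed speech, false alarm, speaker confusion) matter differently depending on the use case.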

Pricing

For detailed pricing information, visit our pricing page.