Number of speakers
By default, pyannoteAI automatically detects the number of speakers in your audio with no upper limit. However, you can improve accuracy and performance by providing speaker count constraints when you have this information.
Exact speaker count
When you know the exact number of speakers, use numSpeakers for better results. This is common for:
- Phone conversations (2 speakers)
- Interviews (2 speakers)
- Panel discussions with known participants
- Meeting recordings with known attendees
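As a minimal sketch of the exact-count case, the helper below builds a request body that pins the speaker count. Only the numSpeakers parameter comes from this guide; the helper name `exact_count_request`, the `url` field, and the example audio URL are assumptions made for illustration, so check the API reference for the actual request schema.

```python
import json


def exact_count_request(audio_url: str, num_speakers: int) -> dict:
    """Build a request body that fixes the number of speakers.

    `url` is an assumed field name for the audio location;
    `numSpeakers` is the parameter described in this guide.
    """
    if num_speakers < 1:
        raise ValueError("num_speakers must be at least 1")
    return {"url": audio_url, "numSpeakers": num_speakers}


# A two-party phone call: exactly 2 speakers.
body = exact_count_request("https://example.com/call.wav", 2)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent with whichever HTTP client you already use.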
Setting numSpeakers typically results in better overall diarization performance since the model can optimize for a specific speaker count.
Speaker count ranges
When the exact number is unknown but you have reasonable bounds, use minSpeakers and maxSpeakers:
- minSpeakers: Minimum number of speakers to detect
- maxSpeakers: Maximum number of speakers to detect
Ranges are useful for:
- Conference calls with variable attendance
- Classroom recordings where some students may be absent
- Broadcast content with variable guest counts
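The range case can be sketched in the same way. The helper below adds minSpeakers and maxSpeakers only when a bound is supplied, so either one can be left out. As before, `range_request` and the `url` field are hypothetical names used for illustration and are not taken from the API reference.

```python
from typing import Optional


def range_request(
    audio_url: str,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
) -> dict:
    """Build a request body with optional lower and upper speaker bounds.

    `url` is an assumed field name; `minSpeakers` and `maxSpeakers`
    are the parameters described in this guide.
    """
    body = {"url": audio_url}
    if min_speakers is not None:
        body["minSpeakers"] = min_speakers
    if max_speakers is not None:
        body["maxSpeakers"] = max_speakers
    return body


# A conference call where between 3 and 8 people may have joined.
conference = range_request("https://example.com/conf.wav", 3, 8)

# Only an upper bound is known.
capped = range_request("https://example.com/class.wav", max_speakers=30)
print(conference, capped)
```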
Parameter rules and constraints
- numSpeakers cannot be used together with minSpeakers or maxSpeakers
- If both minSpeakers and maxSpeakers are set, minSpeakers must be ≤ maxSpeakers
- Setting numSpeakers=2 is equivalent to minSpeakers=2 and maxSpeakers=2
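These rules are easy to check on the client before submitting a job. The following is a sketch of such a check, not the service's own validation: `validate_speaker_params` and `as_range` are hypothetical helpers, and the error messages are illustrative.

```python
from typing import Optional, Tuple


def validate_speaker_params(params: dict) -> None:
    """Raise ValueError if the speaker-count parameters break the rules above."""
    num = params.get("numSpeakers")
    low = params.get("minSpeakers")
    high = params.get("maxSpeakers")
    # Rule 1: an exact count excludes any range bound.
    if num is not None and (low is not None or high is not None):
        raise ValueError("numSpeakers cannot be combined with minSpeakers or maxSpeakers")
    # Rule 2: when both bounds are set, the range must not be inverted.
    if low is not None and high is not None and low > high:
        raise ValueError("minSpeakers must be <= maxSpeakers")


def as_range(params: dict) -> Tuple[Optional[int], Optional[int]]:
    """Normalize parameters to a (min, max) pair; None means unbounded."""
    validate_speaker_params(params)
    num = params.get("numSpeakers")
    if num is not None:
        # Rule 3: an exact count is the same as min == max.
        return (num, num)
    return (params.get("minSpeakers"), params.get("maxSpeakers"))


# The exact count and the degenerate range describe the same constraint.
print(as_range({"numSpeakers": 2}) == as_range({"minSpeakers": 2, "maxSpeakers": 2}))
```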
Exclusive diarization
By default, diarization results may include overlapping speech segments where multiple speakers are talking simultaneously. While this reflects the audio most accurately, some applications require non-overlapping speaker turns. Enable exclusive diarization by setting "exclusive": true. This provides:
- Non-overlapping segments: Each time period is assigned to exactly one speaker
- Easier integration: Simpler to combine with speech-to-text or other processing
Exclusive diarization results are provided in the exclusiveDiarization field of the job output, alongside the regular diarization results.
When to use exclusive diarization
Exclusive diarization is particularly useful for:
- Transcription workflows: Easier to align with ASR output
- Meeting minutes: Cleaner, more readable summaries
- Content analysis: Simpler speaker turn analysis
- Legal proceedings: Clear attribution of speech segments
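To make the difference concrete, the sketch below checks a job output for overlapping segments. The exclusiveDiarization field and the "exclusive" flag come from this guide; the `diarization` field name, the segment shape (`speaker`, `start`, `end` in seconds), the `url` field, and the sample values are all assumptions invented for illustration.

```python
def has_overlap(segments: list) -> bool:
    """Return True if any two segments overlap in time.

    After sorting by start time, it is enough to compare each segment
    with the one that follows it.
    """
    ordered = sorted(segments, key=lambda seg: seg["start"])
    return any(nxt["start"] < cur["end"] for cur, nxt in zip(ordered, ordered[1:]))


# Request body with exclusive diarization enabled ("url" is an assumed field).
request_body = {"url": "https://example.com/meeting.wav", "exclusive": True}

# Made-up job output in an assumed shape, for illustration only.
output = {
    "diarization": [
        {"speaker": "SPEAKER_00", "start": 0.0, "end": 4.2},
        {"speaker": "SPEAKER_01", "start": 3.5, "end": 7.0},
    ],
    "exclusiveDiarization": [
        {"speaker": "SPEAKER_00", "start": 0.0, "end": 3.5},
        {"speaker": "SPEAKER_01", "start": 3.5, "end": 7.0},
    ],
}

print(has_overlap(output["diarization"]))
print(has_overlap(output["exclusiveDiarization"]))
```

In this sample the regular segments overlap between 3.5 s and 4.2 s, while the exclusive segments meet at a single boundary, which is what makes them simpler to align with word-level ASR timestamps.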
Best practices
Start with automatic detection
When unsure about speaker count, begin with automatic detection (no parameters) to understand your audio content, then refine with constraints in subsequent processing.
Use exact counts when possible
If you have reliable information about speaker count, always use numSpeakers for optimal performance.
Consider your use case
- Analysis and research: Use regular diarization to capture natural speech patterns
- Transcription and documentation: Consider exclusive diarization for cleaner output
- Unknown number of speakers: Use speaker count ranges to handle variability
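One way to act on the "start with automatic detection" advice is to summarize a first, unconstrained pass before choosing constraints. The sketch below totals speaking time per detected speaker, which can help you judge whether a label is a real participant or a brief spurious detection. The segment shape (`speaker`, `start`, `end` in seconds) and the sample values are assumptions for illustration, and `speaking_time` is a hypothetical helper.

```python
from collections import defaultdict


def speaking_time(segments: list) -> dict:
    """Sum the duration of all segments for each speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


# Made-up result of a first pass run without speaker-count parameters.
first_pass = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 4.0},
    {"speaker": "SPEAKER_01", "start": 4.0, "end": 6.0},
    {"speaker": "SPEAKER_00", "start": 6.0, "end": 9.0},
]

totals = speaking_time(first_pass)
print(len(totals), totals)
```

If the summary matches what you know about the recording, a follow-up job can pass the confirmed count as numSpeakers; otherwise a minSpeakers and maxSpeakers range is the safer refinement.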