Configuring the number of speakers

When working with audio diarization, you often have prior knowledge about the expected number of speakers or specific requirements for how overlapping speech should be handled. This tutorial covers the key configuration options available in pyannoteAI for speaker detection and exclusive diarization.

Number of speakers

By default, pyannoteAI automatically detects the number of speakers in your audio with no upper limit. However, you can improve accuracy and performance by providing speaker count constraints when you have this information.

Exact speaker count

When you know the exact number of speakers, set numSpeakers. Typical cases include:

Phone conversations (2 speakers)
Interviews (2 speakers)
Panel discussions with known participants
Meeting recordings with known attendees

Setting numSpeakers typically results in better overall diarization performance since the model can optimize for a specific speaker count.
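As a minimal sketch, a request body for a two-person phone call could be built as shown below. Only numSpeakers comes from this tutorial; the "url" field name, the example audio URL, and the helper function name are assumptions made for illustration, so check the API reference for the exact request shape.

```python
import json


def build_exact_count_payload(audio_url: str, num_speakers: int) -> dict:
    """Build a diarization request body with a fixed speaker count.

    Hypothetical helper: "url" is an assumed field name for the audio location.
    """
    if num_speakers < 1:
        raise ValueError("num_speakers must be a positive integer")
    return {"url": audio_url, "numSpeakers": num_speakers}


# A phone conversation is expected to have exactly two speakers.
payload = build_exact_count_payload("https://example.com/call.wav", 2)
print(json.dumps(payload, indent=2))
```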

Speaker count ranges

When the exact number is unknown but you have reasonable bounds, use minSpeakers and maxSpeakers:

minSpeakers: Minimum number of speakers to detect
maxSpeakers: Maximum number of speakers to detect

This is useful when there are optional participants in your recordings, such as:

Conference calls with variable attendance
Classroom recordings where some students may be absent
Broadcast content with variable guest counts
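The following sketch shows one way to attach bounds to a request body while leaving out whichever bound is unknown. The "url" field name and the helper function are assumptions for illustration; minSpeakers and maxSpeakers are the parameters described above.

```python
from typing import Optional


def build_range_payload(
    audio_url: str,
    min_speakers: Optional[int] = None,
    max_speakers: Optional[int] = None,
) -> dict:
    """Build a request body with optional lower and upper speaker bounds.

    Hypothetical helper: "url" is an assumed field name for the audio location.
    """
    payload = {"url": audio_url}
    # Only include a bound when the caller actually knows it.
    if min_speakers is not None:
        payload["minSpeakers"] = min_speakers
    if max_speakers is not None:
        payload["maxSpeakers"] = max_speakers
    return payload


# Conference call: the host and one guest always join, up to six people may attend.
conference = build_range_payload(
    "https://example.com/conference.wav", min_speakers=2, max_speakers=6
)

# Broadcast: only an upper bound is known.
broadcast = build_range_payload("https://example.com/show.wav", max_speakers=4)
```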

Parameter rules and constraints

numSpeakers cannot be used together with minSpeakers or maxSpeakers
If both minSpeakers and maxSpeakers are set, minSpeakers must be ≤ maxSpeakers
Setting numSpeakers=2 is equivalent to minSpeakers=2 and maxSpeakers=2
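These rules can be checked on the client before a job is submitted. The function below is a minimal sketch of such a check; its name and error messages are hypothetical and it is not part of any official SDK.

```python
def validate_speaker_options(options: dict) -> None:
    """Raise ValueError if the speaker count options break the rules above."""
    has_exact = "numSpeakers" in options
    has_range = "minSpeakers" in options or "maxSpeakers" in options

    # Rule 1: the exact count and the range parameters are mutually exclusive.
    if has_exact and has_range:
        raise ValueError(
            "numSpeakers cannot be combined with minSpeakers or maxSpeakers"
        )

    # Rule 2: when both bounds are present, the lower must not exceed the upper.
    lower = options.get("minSpeakers")
    upper = options.get("maxSpeakers")
    if lower is not None and upper is not None and lower > upper:
        raise ValueError("minSpeakers must be less than or equal to maxSpeakers")


validate_speaker_options({"numSpeakers": 2})                    # valid
validate_speaker_options({"minSpeakers": 2, "maxSpeakers": 2})  # valid, same meaning

try:
    validate_speaker_options({"numSpeakers": 2, "maxSpeakers": 4})
except ValueError as error:
    print(f"Rejected: {error}")
```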

Exclusive diarization

By default, diarization results may include overlapping speech segments where multiple speakers talk simultaneously. This reflects what actually happens in the audio, but some applications require non-overlapping speaker turns. Enable exclusive diarization by setting "exclusive": true. This provides:

Non-overlapping segments: Each time period is assigned to exactly one speaker
Easier integration: Simpler to combine with speech-to-text or other processing

Exclusive diarization results are provided in the exclusiveDiarization field of the job output, alongside the regular diarization results.
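The sketch below shows how the two fields might be compared once a job has finished. The "exclusive" option and the diarization and exclusiveDiarization field names come from this tutorial; the "url" field, the segment shape (speaker, start, end in seconds), and the sample values are assumptions invented for illustration.

```python
import json


def has_overlap(segments: list) -> bool:
    """Return True if any two segments overlap in time."""
    ordered = sorted(segments, key=lambda seg: seg["start"])
    # After sorting by start time, any overlap shows up between neighbours.
    return any(nxt["start"] < cur["end"] for cur, nxt in zip(ordered, ordered[1:]))


# Request body: Python's True serializes to the JSON literal true.
request_body = {"url": "https://example.com/meeting.wav", "exclusive": True}
print(json.dumps(request_body))

# Hypothetical job output, made up to illustrate the two fields.
job_output = {
    "diarization": [
        {"speaker": "SPEAKER_00", "start": 0.0, "end": 4.2},
        {"speaker": "SPEAKER_01", "start": 3.8, "end": 7.5},
    ],
    "exclusiveDiarization": [
        {"speaker": "SPEAKER_00", "start": 0.0, "end": 3.8},
        {"speaker": "SPEAKER_01", "start": 3.8, "end": 7.5},
    ],
}

regular_overlaps = has_overlap(job_output["diarization"])
exclusive_overlaps = has_overlap(job_output["exclusiveDiarization"])
```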

When to use exclusive diarization

Exclusive diarization is particularly useful for:

Transcription workflows: Easier to align with ASR output
Meeting minutes: Cleaner, more readable summaries
Content analysis: Simpler speaker turn analysis
Legal proceedings: Clear attribution of speech segments
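For the transcription case, non-overlapping segments make speaker attribution a simple lookup. The sketch below labels each ASR word with the speaker whose exclusive segment contains the word's midpoint. The word and segment shapes, the sample values, and the midpoint rule are all assumptions chosen for illustration, not a prescribed alignment method.

```python
def assign_speakers(words: list, segments: list) -> list:
    """Label each word with the speaker of the segment containing its midpoint.

    Assumes segments do not overlap, so at most one segment can match.
    Words that fall outside every segment get speaker None.
    """
    labeled = []
    for word in words:
        midpoint = (word["start"] + word["end"]) / 2
        speaker = next(
            (
                seg["speaker"]
                for seg in segments
                if seg["start"] <= midpoint < seg["end"]
            ),
            None,
        )
        labeled.append({**word, "speaker": speaker})
    return labeled


# Hypothetical exclusive diarization segments and ASR word timings.
exclusive_segments = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 3.8},
    {"speaker": "SPEAKER_01", "start": 3.8, "end": 7.5},
]
asr_words = [
    {"text": "hello", "start": 0.5, "end": 0.9},
    {"text": "hi", "start": 4.0, "end": 4.3},
]

labeled_words = assign_speakers(asr_words, exclusive_segments)
for word in labeled_words:
    print(word["speaker"], word["text"])
```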

Best practices

Start with automatic detection

When unsure about speaker count, begin with automatic detection (no parameters) to understand your audio content, then refine with constraints in subsequent processing.
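One way to inspect a first, unconstrained pass is to count the distinct speakers and total their speaking time, which can help you pick sensible constraints for later runs. The segment shape and the sample values below are assumptions made up for illustration.

```python
from collections import defaultdict


def speaking_time_by_speaker(segments: list) -> dict:
    """Sum the duration in seconds of all segments for each speaker label."""
    totals = defaultdict(float)
    for seg in segments:
        totals[seg["speaker"]] += seg["end"] - seg["start"]
    return dict(totals)


# Hypothetical result of an automatic-detection run.
first_pass = [
    {"speaker": "SPEAKER_00", "start": 0.0, "end": 4.0},
    {"speaker": "SPEAKER_01", "start": 4.0, "end": 7.5},
    {"speaker": "SPEAKER_00", "start": 7.5, "end": 9.0},
    {"speaker": "SPEAKER_02", "start": 9.0, "end": 9.25},
]

totals = speaking_time_by_speaker(first_pass)
detected_speakers = len(totals)

# List speakers from most to least talkative.
for speaker, seconds in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{speaker}: {seconds:.2f} s")
```

A speaker with very little total speaking time may be worth a manual check before you decide on numSpeakers or on a minSpeakers and maxSpeakers range.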

Use exact counts when possible

If you have reliable information about the speaker count, set numSpeakers; it typically gives the best results.

Consider your use case

Analysis and research: Use regular diarization to capture natural speech patterns
Transcription and documentation: Consider exclusive diarization for cleaner output
Unknown number of speakers: Use speaker count ranges to handle variability

Test with your data

Audio quality and recording conditions can affect how well the constraints work. Test with representative samples from your specific use case.
