# How to identify speakers with voiceprints

> This tutorial shows how to create voiceprints and identify speakers using the pyannoteAI API

Speaker identification is the process of determining who is speaking in an audio file by comparing their voice characteristics against known voiceprints. Unlike diarization, which only separates speakers into generic labels (`SPEAKER_00`, `SPEAKER_01`, etc.), identification assigns specific identities to speakers.

### What are voiceprints?

A voiceprint is a unique digital representation of a person's voice characteristics, similar to a fingerprint but for voice. It captures the distinctive features of how someone speaks, allowing the system to recognize that person in future audio recordings.

<Note>
  **Voiceprints are for identification only**: they do not improve the accuracy of diarization. Diarization separates speakers, while identification assigns names/labels to those speakers.
</Note>

### Voiceprint requirements

* **One voiceprint per speaker**: Create only one voiceprint for each person.
* **Single speaker only**: The recording must contain only the target speaker's voice with no overlapping speakers.
* **Maximum duration**: Audio samples used to create voiceprints must be no longer than 30 seconds.
* **Consistent speaking style**: The voiceprint should capture the person's normal speaking voice.
* **Language**: Our models are language agnostic, so voiceprints can be created in any spoken language.
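
The duration limit can be checked locally before uploading a sample. The following is a minimal sketch, assuming the sample is a WAV file readable by Python's standard `wave` module; the helper names and the `MAX_VOICEPRINT_SECONDS` constant are illustrative and not part of the pyannoteAI API.

```python
import wave

MAX_VOICEPRINT_SECONDS = 30.0  # limit stated in the requirements above


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def is_valid_voiceprint_length(path: str) -> bool:
    """Check that a sample does not exceed the maximum voiceprint duration."""
    return wav_duration_seconds(path) <= MAX_VOICEPRINT_SECONDS
```

This check only covers duration. Whether the recording contains a single speaker still has to be verified by listening to it.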

### Prerequisites

Before you start, you'll need:

* A pyannoteAI account with credits or an active subscription
* An API key
* An audio recording of a single speaker for voiceprint creation
* An audio recording with multiple speakers for diarization + identification

For help creating an account and getting your API key, see the [quickstart guide](/quickstart). For pricing and charging details, see [Billing](/administration/billing).

## 1. Create a voiceprint

First, create a voiceprint for each speaker you want to identify. This is a one-time process for each person.

Send a POST request to the [voiceprint endpoint](/api-reference/voiceprint) with an audio file containing the speaker's voice.

<CodeGroup dropdown>
  ```python create_voiceprint.py theme={null}
  import requests

  url = "https://api.pyannote.ai/v1/voiceprint"
  api_key = "YOUR_API_KEY"  # In production, use environment variables: os.getenv("PYANNOTE_API_KEY")

  headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
  data = {"url": "https://example.com/speaker-voice-sample.wav"}

  response = requests.post(url, headers=headers, json=data)

  if response.status_code != 200:
      print(f"Error: {response.status_code} - {response.text}")
  else:
      print(response.json())
  ```

  ```bash theme={null}
  # Set your API key as an environment variable in production
  # export PYANNOTE_API_KEY="your_api_key_here"
  curl -X POST "https://api.pyannote.ai/v1/voiceprint" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{"url": "https://example.com/speaker-voice-sample.wav"}'
  ```

  ```typescript create_voiceprint.ts theme={null}
  const url = "https://api.pyannote.ai/v1/voiceprint";
  const apiKey = "YOUR_API_KEY"; // In production, use environment variables: process.env.PYANNOTE_API_KEY
  const headers = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  const data = {
    url: "https://example.com/speaker-voice-sample.wav",
  };

  const response = await fetch(url, {
    method: "POST",
    headers,
    body: JSON.stringify(data),
  });

  if (!response.ok) {
    console.error(`Error: ${response.status} - ${await response.text()}`);
  } else {
    console.log(await response.json());
  }
  ```
</CodeGroup>

The response will include a `jobId` to track the voiceprint creation:

```json Example response theme={null}
{
  "jobId": "3c8a89a5-dcc6-4edb-a75d-ffd64739674d",
  "status": "created"
}
```

### Get voiceprint results

To retrieve the voiceprint results, use the same polling or webhook approach described in the [How to diarize an audio file](/tutorials/how-to-diarize-audio) tutorial. The process works identically for voiceprint jobs.
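
For reference, a polling loop might look like the sketch below. It rests on assumptions to confirm against the API reference: that job state is available from `GET https://api.pyannote.ai/v1/jobs/{jobId}`, and that `failed` and `canceled` are terminal statuses alongside `succeeded`. The function names are illustrative, and the request uses the standard library only so the sketch has no dependencies.

```python
import json
import time
import urllib.request
from typing import Callable

# Assumed endpoint; confirm the exact path in the API reference.
JOBS_URL = "https://api.pyannote.ai/v1/jobs/{job_id}"
# "succeeded" appears in this tutorial; "failed" and "canceled" are assumed names.
TERMINAL_STATUSES = {"succeeded", "failed", "canceled"}


def fetch_job(job_id: str, api_key: str) -> dict:
    """Fetch the current state of a job as a dictionary."""
    request = urllib.request.Request(
        JOBS_URL.format(job_id=job_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_job(fetch: Callable[[], dict], interval: float = 5.0, max_attempts: int = 60) -> dict:
    """Call `fetch` until the job reaches a terminal status or attempts run out."""
    for _ in range(max_attempts):
        job = fetch()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        time.sleep(interval)  # wait before asking again
    raise TimeoutError("Job did not finish within the allotted attempts")
```

A call such as `wait_for_job(lambda: fetch_job(job_id, api_key))` would then block until the job finishes. Passing the fetch step as a callable keeps the loop easy to test without network access.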

<Note>
  **Save voiceprints to your own data storage**

  * Job outputs (including voiceprints) are automatically **deleted after 24 hours**.
  * Voiceprints are reusable, so store them securely for future identification requests.
</Note>

```json Example voiceprint job output theme={null}
{
  "jobId": "3c8a89a5-dcc6-4edb-a75d-ffd64739674d",
  "status": "succeeded",
  "createdAt": "2024-02-20T12:00:00Z",
  "updatedAt": "2024-02-20T12:00:00Z",
  "output": {
    "voiceprint": "U29tZVZvaWNlUHJpbnREYXRhMQ=="
  }
}
```
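
Because job outputs expire, it helps to persist the voiceprint as soon as the job succeeds. The sketch below assumes the job payload has the shape shown above and writes to a local JSON file keyed by speaker label. The `save_voiceprint` helper and the `voiceprints.json` file name are hypothetical, and a plain file is used for illustration only; a production system would use access-controlled storage.

```python
import json
from pathlib import Path


def save_voiceprint(job: dict, label: str, store_path: str = "voiceprints.json") -> dict:
    """Add the voiceprint from a succeeded job to a local JSON store keyed by label."""
    if job.get("status") != "succeeded":
        raise ValueError(f"Job {job.get('jobId')} has status {job.get('status')!r}")

    path = Path(store_path)
    # Load the existing store if there is one, otherwise start empty.
    store = json.loads(path.read_text()) if path.exists() else {}
    store[label] = job["output"]["voiceprint"]
    path.write_text(json.dumps(store, indent=2))
    return store
```

Keying the store by label means that saving a second voiceprint for the same person overwrites the first, which matches the one-voiceprint-per-speaker requirement.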

***

## 2. Identify speakers in audio

Now that you have a voiceprint, you can identify a speaker in new audio recordings.

Send a POST request to the [identify endpoint](/api-reference/identify) with the audio file URL and the voiceprints you want to match against.

<CodeGroup dropdown>
  ```python identify_speakers.py theme={null}
  import requests

  url = "https://api.pyannote.ai/v1/identify"
  api_key = "YOUR_API_KEY"

  headers = {"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"}
  data = {
      "url": "https://example.com/meeting-audio.wav",
      "voiceprints": [
          {
              "label": "John Doe", # The speaker label you want to assign
              "voiceprint": "U29tZVZvaWNlUHJpbnREYXRhMQ=="  # Replace with actual voiceprint
          },
          # Add more voiceprints as needed
      ],
      # Optional matching parameters
      "matching": {
          "threshold": 50,  # Only match if the confidence score is 50 or higher
          "exclusive": True  # Prevent multiple speakers matching same voiceprint
      }
  }

  response = requests.post(url, headers=headers, json=data)

  if response.status_code != 200:
      print(f"Error: {response.status_code} - {response.text}")
  else:
      print(response.json())
  ```

  ```bash theme={null}
  curl -X POST "https://api.pyannote.ai/v1/identify" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d '{
      "url": "https://example.com/meeting-audio.wav",
      "voiceprints": [
      {
        "label": "John Doe",
        "voiceprint": "U29tZVZvaWNlUHJpbnREYXRhMQ=="
      }
      ],
      "matching": {
        "threshold": 50,
        "exclusive": true
      }
    }'
  ```

  ```typescript identify_speakers.ts theme={null}
  const url = "https://api.pyannote.ai/v1/identify";
  const apiKey = "YOUR_API_KEY";
  const headers = {
    Authorization: `Bearer ${apiKey}`,
    "Content-Type": "application/json",
  };
  const data = {
    url: "https://example.com/meeting-audio.wav",
    voiceprints: [
      {
        label: "John Doe", // The speaker label you want to assign
        voiceprint: "U29tZVZvaWNlUHJpbnREYXRhMQ==", // Replace with actual voiceprint
      },
      // Add more voiceprints as needed
    ],
    // Optional matching parameters
    matching: {
      threshold: 50, // Only match if the confidence score is 50 or higher
      exclusive: true, // Prevent multiple speakers matching same voiceprint
    },
  };

  const response = await fetch(url, {
    method: "POST",
    headers,
    body: JSON.stringify(data),
  });

  if (!response.ok) {
    console.error(`Error: ${response.status} - ${await response.text()}`);
  } else {
    console.log(await response.json());
  }
  ```
</CodeGroup>

The response will include a `jobId` for tracking the identification job:

```json Example response theme={null}
{
  "jobId": "4d9b9ab6-edd7-5feca-b86e-gee75840775e",
  "status": "created"
}
```

<Note>
  **Multiple voiceprints**: You can add multiple voiceprints for different people in the same request. Each voiceprint must have a unique label. The system will attempt to match all provided voiceprints against the audio.
</Note>

<Warning>
  **Voiceprint selection**: A voiceprint may match a speaker even when that person isn't actually in the audio. Be cautious about including voiceprints of people who may not be present. Review confidence scores carefully and set an appropriate threshold when you are unsure whether a speaker is present.
</Warning>
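
When several voiceprints are involved, the request body can be assembled from a mapping of labels to voiceprints. This is a minimal sketch: `build_identify_payload` is a hypothetical helper, and the default `threshold` and `exclusive` values mirror the defaults documented on this page. Using a dictionary as input guarantees that each label is unique.

```python
def build_identify_payload(
    audio_url: str,
    voiceprints: dict[str, str],
    threshold: int = 0,
    exclusive: bool = True,
) -> dict:
    """Build the JSON body for an identify request from a label -> voiceprint mapping."""
    if not voiceprints:
        raise ValueError("At least one voiceprint is required")
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")

    return {
        "url": audio_url,
        # Dictionary keys are unique, so every label appears exactly once.
        "voiceprints": [
            {"label": label, "voiceprint": voiceprint}
            for label, voiceprint in voiceprints.items()
        ],
        "matching": {"threshold": threshold, "exclusive": exclusive},
    }
```

The returned dictionary can be passed directly as the `json` argument of `requests.post`, as in the request example above.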

***

## 3. Get identification results

To retrieve the identification results, use the same polling or webhook approach described in the [How to diarize an audio file](/tutorials/how-to-diarize-audio) tutorial. The process works identically for identification jobs.

```json Example identification output theme={null}
{
  "jobId": "4d9b9ab6-edd7-5feca-b86e-gee75840775e",
  "status": "succeeded",
  "createdAt": "2025-11-06T09:07:49.932Z",
  "updatedAt": "2025-11-06T09:07:53.229Z",
  "output": {
    "diarization": [
      {
        "speaker": "SPEAKER_00",
        "start": 3.005,
        "end": 5.945
      },
      {
        "speaker": "SPEAKER_01",
        "start": 6.345,
        "end": 9.565
      },
      ...
    ],
    "identification": [
      {
        "speaker": "John Doe",
        "start": 3.005,
        "end": 5.945,
        "diarizationSpeaker": "SPEAKER_00",
        "match": "John Doe"
      },
      {
        "speaker": "SPEAKER_01",
        "start": 6.345,
        "end": 9.565,
        "diarizationSpeaker": "SPEAKER_01",
        "match": null
      },
      ...
    ],
    "voiceprints": [
      {
        "speaker": "SPEAKER_00",
        "match": "John Doe",
        "confidence": {
          "John Doe": 86
        }
      },
      {
        "speaker": "SPEAKER_01",
        "match": null,
        "confidence": {
          "John Doe": 16
        }
      }
    ]
  }
}
```

For details on each field of the identification output, see the [identification schema reference](/api-reference/schemas/identifyschema).
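
As an illustration of how the `identification` list can be consumed, the sketch below totals speaking time per speaker. It assumes the output shape shown in the example above; `speaking_time` is a hypothetical helper, not part of the API.

```python
from collections import defaultdict


def speaking_time(output: dict) -> dict[str, float]:
    """Sum segment durations (in seconds) per speaker from the identification list."""
    totals: dict[str, float] = defaultdict(float)
    for segment in output["identification"]:
        # "speaker" holds the matched label, or the generic diarization
        # label when no voiceprint matched.
        totals[segment["speaker"]] += segment["end"] - segment["start"]
    return dict(totals)
```

Unmatched speakers stay in the result under their generic diarization label, so no audio is silently dropped from the totals.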

***

## Understanding the results

### Diarization vs. identification

* **Diarization**: Separates audio into speaker segments with generic labels (`SPEAKER_00`, `SPEAKER_01`, etc.)
* **Identification**: Matches those segments to known voiceprints with specific labels (John Doe, Jane Smith, etc.)

### Confidence scores

The confidence scores show how well each voiceprint matches each diarized speaker:

* Higher scores indicate better matches
* Use the `threshold` parameter to filter out low-confidence matches
* Consider the context when interpreting confidence scores
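
One way to act on these scores is to flag accepted matches whose confidence falls below a review cutoff. The sketch below assumes the `voiceprints` list has the shape shown in the example output. The `matches_needing_review` helper and its default cutoff of 70 are illustrative choices, not values recommended by the API.

```python
def matches_needing_review(output: dict, review_below: int = 70) -> list[tuple[str, str, int]]:
    """Return (diarization speaker, matched label, score) for low-confidence matches."""
    flagged = []
    for entry in output["voiceprints"]:
        label = entry["match"]
        if label is None:
            continue  # no voiceprint was matched to this speaker
        score = entry["confidence"][label]
        if score < review_below:
            flagged.append((entry["speaker"], label, score))
    return flagged
```

This runs client-side after the job completes, so it complements the `threshold` parameter instead of replacing it: the threshold decides which matches the service returns, while this check marks returned matches for a closer look.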

### Matching options

* **`matching.threshold`**: Minimum confidence score required for a match (0-100, default: `0`). Set higher values (50-70) for stricter matching, lower values for more lenient matching.
* **`matching.exclusive`**: Prevents multiple speakers from matching the same voiceprint (default: `true`). Set to `false` to allow several speakers to match the same voiceprint.
