Calame: Transcription pipeline for all

Tools for research and applied workflows, from raw audio to clean, structured transcripts.

Calame preview

Why Calame

A state-of-the-art transcription pipeline

Calame turns your interviews and field recordings into clean, structured transcripts with diarization, anonymisation, and full privacy.

Transcription

Turn interviews, focus groups, and field recordings into accurate text transcripts in minutes.

Whisper

Diarization

Automatically identify and label each speaker so you always know who said what.

Pyannote

Anonymisation

Calame detects and redacts names, places, and identifying details to keep your participants protected.

Stanza

All tools are open-source and available on Hugging Face

Performance

Fast transcription on your hardware

Processing time in minutes for transcription (t_TRS), diarization (t_DIA), and in total (t), for audio files of different durations.

Hardware    File     t_TRS    t_DIA    t
i7-1260P    5 min    5.51     3.41     8.92
i7-1260P    30 min   26.34    22.72    49.06
i7-1260P    60 min   51.31    39.45    90.76
RTX 2070    5 min    3.24     0.39     3.63
RTX 2070    30 min   24.99    4.30     29.29
RTX 2070    60 min   38.32    13.24    51.56
RTX 4060    5 min    1.02     0.24     1.26
RTX 4060    30 min   3.72     1.36     5.08
RTX 4060    60 min   6.99     2.83     9.82
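
One way to read these figures is as a real-time factor: total processing time divided by audio duration, where a value below 1 means the file is processed faster than it plays. The metric and the names below (TIMES, real_time_factor) are illustrative assumptions and not part of Calame itself; the numbers are copied from the table above. A minimal sketch:

```python
# Processing times in minutes, copied from the performance table.
# Key: (hardware, audio duration in minutes) -> (t_TRS, t_DIA)
# TIMES and real_time_factor are hypothetical names used for illustration.
TIMES = {
    ("i7-1260P", 5): (5.51, 3.41),
    ("i7-1260P", 30): (26.34, 22.72),
    ("i7-1260P", 60): (51.31, 39.45),
    ("RTX 2070", 5): (3.24, 0.39),
    ("RTX 2070", 30): (24.99, 4.30),
    ("RTX 2070", 60): (38.32, 13.24),
    ("RTX 4060", 5): (1.02, 0.24),
    ("RTX 4060", 30): (3.72, 1.36),
    ("RTX 4060", 60): (6.99, 2.83),
}


def real_time_factor(hardware: str, audio_min: int) -> float:
    """Total processing time (transcription + diarization) per minute of audio."""
    t_trs, t_dia = TIMES[(hardware, audio_min)]
    return (t_trs + t_dia) / audio_min


for (hardware, audio_min) in TIMES:
    rtf = real_time_factor(hardware, audio_min)
    print(f"{hardware:<10} {audio_min:>3} min  RTF = {rtf:.2f}")
```

On these figures, the RTX 4060 stays well below 1 for every file length, while the i7-1260P is above 1 throughout, so CPU-only processing takes longer than the recording itself.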

System Requirements

Built to run on almost any machine

RAM: 12 GB

GPU / CPU: Supported

Environment: Docker

Future directions

Current research and development focus

Under-resourced languages

Expanding first-class support beyond Québécois French to more dialects and low-resource languages.

Targeted diarization

Focus speaker identification on a single participant rather than the full group, for interviews and one-on-one recordings.

Multi-model usage

Choose from multiple transcription or speaker separation models to improve accuracy across diverse audio conditions.

User collaboration

Share projects, review transcripts together, and manage team access for collaborative research workflows.