Hebrew TTS Snapshot
A snapshot of Hebrew text-to-speech capabilities as of 22 March 2025, comparing voice quality across multiple TTS providers — including voice cloning experiments via Replicate.
Key Findings
MiniMax — Voice cloning (Replicate) · Best · Most impressive results — cloned voices sounded natural in Hebrew
Edge TTS — Stock voices (Avri, Hila) · Good · Microsoft's free TTS; tested at 100% and 70% speed
Gemini — Stock voices (Puck, Zephyr) · Good · Via Google AI Studio, Gemini 2.5 Flash Preview TTS
ElevenLabs — Stock voices (v3 model) · Good · Requires language_code: "he"; multilingual v2 is unintelligible
Chatterbox — Voice cloning (Replicate) · Poor · Output was generic; cloning didn't carry through to Hebrew
Resemble AI — Stock + voice cloning · Poor · Needs nekudot (diacritics) for intelligible output
Voice Cloning Method
Voice clones for MiniMax and Chatterbox were generated using Replicate:
~1 minute of English source audio per voice (see voice-sources/english/)
Voice clone IDs created on Replicate from those samples
Hebrew speech generated from the cloned voice IDs with the Hebrew language parameter set
Hebrew reference audio also tested (see voice-sources/hebrew/)
MiniMax used the T2A v2.6 Turbo model with voice clones and Hebrew boost enabled.
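The generation step above can be sketched as a request against Replicate's HTTP predictions endpoint. This is a minimal illustration, not the script used for these samples: the model slug is left to the caller, and the input field names (text, voice_id, language_boost) are assumptions, since each Replicate model defines its own input schema. The function only builds the request and sends nothing.

```python
import json

REPLICATE_API = "https://api.replicate.com/v1/models"


def build_prediction_request(model: str, api_token: str, text: str, voice_id: str) -> dict:
    """Build (but do not send) a Replicate prediction request for Hebrew speech.

    `model` is an "owner/name" slug supplied by the caller.
    """
    return {
        "url": f"{REPLICATE_API}/{model}/predictions",
        "headers": {
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        # Input field names are model-specific; these three are assumptions
        # and should be checked against the chosen model's schema.
        "body": json.dumps(
            {"input": {"text": text, "voice_id": voice_id, "language_boost": "Hebrew"}},
            ensure_ascii=False,  # keep Hebrew text readable in the payload
        ),
    }
```

The returned dictionary can be passed to any HTTP client as a POST request. Check the field names against the model's page on Replicate before relying on them.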
Repository Structure
├── samples/ # Generated TTS audio output
│ ├── chatterbox/ # Chatterbox multilingual voice cloning (via Replicate)
│ │ ├── run1/ # English reference audio, unvowelised input
│ │ └── run2/ # Hebrew reference audio
│ ├── edge-tts/ # Microsoft Edge TTS
│ │ ├── avri-100pc/ # Avri voice, normal speed
│ │ ├── avri-70pc/ # Avri voice, 70% speed
│ │ ├── hila-100pc/ # Hila voice, normal speed
│ │ └── hila-70pc/ # Hila voice, 70% speed
│ ├── elevenlabs/ # ElevenLabs v3 model (Rachel, Adam, Bella)
│ ├── gemini/ # Google Gemini 2.5 Flash Preview TTS (Puck, Zephyr)
│ ├── minimax/ # MiniMax T2A v2.6 Turbo voice clones (Corn, Herman)
│ └── resemble/ # Resemble AI
│ ├── stock/ # Avigail (Hebrew preset voice)
│ └── voice-clone/ # Chatterbox multilingual clone (Herman)
├── voice-sources/ # Input audio used for voice cloning
│ ├── english/ # ~1 min English samples (corn, daniel, herman)
│ └── hebrew/ # Hebrew reference samples
│ ├── corn/
│ ├── daniel/
│ └── herman/
├── texts/ # Hebrew text prompts
│ ├── source/ # Texts used for TTS generation (+ PDF versions)
│ └── target/ # Additional test texts (cooking, music, weather)
└── resources.md # Links to related tools and services
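To get a quick overview of the layout above, a small helper can count audio files per provider directory. This is a sketch under two assumptions: the repository root is passed in, and the samples are stored as .mp3 or .wav files, which this README does not state.

```python
from pathlib import Path

# Assumed audio formats; adjust to whatever the samples actually use.
AUDIO_SUFFIXES = {".mp3", ".wav"}


def count_samples(repo_root: Path) -> dict:
    """Map each provider directory under samples/ to its number of audio files."""
    counts = {}
    samples = repo_root / "samples"
    for provider in sorted(p for p in samples.iterdir() if p.is_dir()):
        # rglob descends into run1/, avri-70pc/, stock/ and similar subfolders.
        counts[provider.name] = sum(
            1
            for f in provider.rglob("*")
            if f.is_file() and f.suffix.lower() in AUDIO_SUFFIXES
        )
    return counts
```

Calling count_samples(Path(".")) from the repository root returns a dictionary keyed by provider folder name.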
Provider Details
MiniMax (Best Results)
Model: T2A v2.6 Turbo
Method: Voice clones created on Replicate, Hebrew boost enabled
Voices tested: Corn, Herman
Edge TTS
Voices: Avri (male), Hila (female)
Speed variants: 100% (normal) and 70% (slowed)
Texts: Family, Jerusalem, sample text, travel, and more
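The two speed variants can be reproduced with the edge-tts Python package, which takes a relative rate string rather than an absolute percentage. The sketch below assumes the 70% samples correspond to a rate of -30%, which this README does not confirm. The helper names are illustrative.

```python
# Hebrew voice short names as listed by `edge-tts --list-voices`.
VOICES = {"avri": "he-IL-AvriNeural", "hila": "he-IL-HilaNeural"}


def rate_for_speed(speed_percent: int) -> str:
    """Convert an absolute speed (100 = normal) to a relative rate string."""
    return f"{speed_percent - 100:+d}%"


def sample_dir(voice: str, speed_percent: int) -> str:
    """Return the output folder used in this repository's samples/ layout."""
    return f"samples/edge-tts/{voice}-{speed_percent}pc"


async def synthesize(text: str, voice: str, speed_percent: int, out_path: str) -> None:
    """Generate one audio file; requires network access and `pip install edge-tts`."""
    import edge_tts  # imported lazily so the helpers above work without it

    communicate = edge_tts.Communicate(
        text, VOICES[voice], rate=rate_for_speed(speed_percent)
    )
    await communicate.save(out_path)
```

To generate a file, run the coroutine with asyncio.run(synthesize(text, "avri", 70, "out.mp3")).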
ElevenLabs
Model: eleven_v3 with language_code: "he"
Voices: Rachel (F), Adam (M), Bella (F)
Note: Multilingual v2 (eleven_multilingual_v2) produces unintelligible Hebrew. v3 with explicit Hebrew language code is required.
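As an illustration of the setting described in the note, the sketch below builds a request body for the ElevenLabs text-to-speech REST endpoint with the model and language code set explicitly. It only constructs the request and sends nothing. The voice ID and API key are supplied by the caller, and the helper name is illustrative.

```python
import json

API_BASE = "https://api.elevenlabs.io/v1/text-to-speech"


def build_tts_request(voice_id: str, text: str, api_key: str) -> dict:
    """Build (but do not send) a POST request for Hebrew speech with eleven_v3."""
    return {
        "url": f"{API_BASE}/{voice_id}",
        "headers": {"xi-api-key": api_key, "Content-Type": "application/json"},
        "body": json.dumps(
            {
                "text": text,
                "model_id": "eleven_v3",  # eleven_multilingual_v2 was unintelligible
                "language_code": "he",  # explicit Hebrew, as the note requires
            },
            ensure_ascii=False,  # keep Hebrew text readable in the payload
        ),
    }
```

The response to such a request is audio bytes, which a client would write to a file under samples/elevenlabs/.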
Google Gemini
Model: Gemini 2.5 Flash Preview TTS
Interface: Google AI Studio
Voices: Puck, Zephyr
Chatterbox (via Resemble/Replicate)
Engine: Chatterbox Multilingual
Method: Voice cloning via Replicate API with language: "he"
Result: Output sounded generic; voice cloning did not carry through to Hebrew
Issue in Run 2: Dicta ONNX model not loaded; output was still gibberish
Resemble AI
Stock voice: Avigail (Hebrew preset) — usable but limited
Custom clones: Unintelligible without nekudot (diacritics) in input text
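Because these clones depend on vowelised input, it helps to check whether a text already carries nekudot before sending it. The sketch below detects and strips Hebrew vowel points using Unicode ranges. It does not add diacritics; that needs a diacritizer such as the tools listed under Resources. The function names are illustrative.

```python
import unicodedata


def is_niqqud(ch: str) -> bool:
    """True for Hebrew points (U+05B0..U+05C7), excluding punctuation in that range."""
    # Maqaf, paseq, sof pasuq and nun hafukha fall in the same range but are
    # punctuation, so only nonspacing marks (category Mn) count.
    return "\u05b0" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn"


def has_niqqud(text: str) -> bool:
    """Report whether the text contains at least one vowel point."""
    return any(is_niqqud(ch) for ch in text)


def strip_niqqud(text: str) -> str:
    """Return the text with all vowel points removed (unvowelised form)."""
    return "".join(ch for ch in text if not is_niqqud(ch))
```

A pre-flight check such as has_niqqud(prompt) can flag unvowelised prompts before they are sent to a provider that needs diacritics.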
Resources
Phonikud TTS — Hebrew TTS with automatic diacritization
Add Diacritics in Hebrew — Diacritics preprocessor
LightBlue TTS Hebrew — Open source Hebrew TTS
Ivrit.ai — Israeli speech technology
Replicate TTS Collection — TTS models on Replicate