Voice Apps Index
danielrosehill/Index View on GitHub
Index of voice typing, dictation, and speech-to-text applications and utilities.
In Development
VoiceType
A fork of Deepgram's Linux starter with CLI-to-GUI conversion, hotkey support, API key editing, and cost tracking. Uses Deepgram streaming ASR.
danielrosehill/VoiceType View on GitHub

Parakeet Type Ubuntu
On-device voice typing for Linux using Parakeet and NeMo ASR models via sherpa-onnx. Built-in punctuation, multiple model profiles, system tray app with configurable hotkeys. No cloud, no GPU required.
danielrosehill/Parakeet-Type-Ubuntu View on GitHub

AI Typer V2
Voice dictation with multimodal AI cleanup — speak naturally, get polished text. Uses Gemini multimodal audio processing.
danielrosehill/AI-Typer-V2 View on GitHub

Wayland Voice Typer
Simple GUI around whisper.cpp for voice-to-text on Linux.
danielrosehill/Wayland-Voice-Typer View on GitHub

Quick STT
Optimised always-on STT for Ubuntu with ROCm support.
danielrosehill/Quick-STT View on GitHub

hyprvoice
Voice-powered typing for Wayland/Hyprland desktops.
danielrosehill/hyprvoice View on GitHub

Mooshine Dictation App
Moonshine-based dictation application.
danielrosehill/Mooshine-Dictation-App-0326 View on GitHub

Local STT App
Local speech-to-text application.
danielrosehill/Local-STT-App View on GitHub

Voice Typing 1125
Voice typing application iteration.
danielrosehill/Voice-Typing-1125 View on GitHub

Old Iterations
AI Transcription Notepad
Voice note-taking utility that uses cloud audio multimodal models for single-pass transcription and text cleanup.
danielrosehill/AI-Transcription-Notepad View on GitHub

Thought Pad
Linux desktop application providing a two-stage process for creating notes from dictated speech — transcription via Whisper API followed by light text formatting. Exports to markdown docs.
danielrosehill/Thought-Pad View on GitHub

Whisper Typer 0911
Earlier Whisper-based voice typing iteration.
danielrosehill/Whisper-Typer-0911 View on GitHub

Deepgram Voice Keyboard Ubuntu
WIP STT utility using cloud STT APIs on Ubuntu.
danielrosehill/Deepgram-Voice-Keyboard-Ubuntu View on GitHub

Voiceflow V1
Early voice flow implementation.
danielrosehill/Voiceflow-V1 View on GitHub

Voiceflow Dev
Voice flow development iteration.
danielrosehill/Voiceflow-Dev View on GitHub

Voice Flow Idea Dev
Voice flow idea development workspace.
danielrosehill/Voice-Flow-Idea-Dev View on GitHub

Whisper Typing Linux 1125
Whisper-based typing tool for Linux.
danielrosehill/Whisper-Typing-Linux-1125 View on GitHub

Voice Keyboard
Voice keyboard application.
danielrosehill/Voice-Keyboard View on GitHub

Android Voice Keyboard
Voice keyboard for Android.
danielrosehill/Android-Voice-Keyboard View on GitHub

Voice Notepad Android
Android fork of transcription UI.
danielrosehill/Voice-Notepad-Android View on GitHub

Transcription Tools
Gemini Audio Transcriber
File-upload-based multimodal transcription tool using Gemini via OpenRouter.
danielrosehill/Gemini-Audio-Transcriber View on GitHub

Gemini Transcription Notepad
Gemini-powered transcription notepad with cleanup.
danielrosehill/Gemini-Transcription-Notepad View on GitHub

Gemini ASR Transcriber
Transcription notepad for Gemini ASR.
danielrosehill/Gemini-ASR-Transcriber View on GitHub

DVR Transcriber
Workflow workspace for importing recordings from a DVR and using AI for transcription.
danielrosehill/DVR-Transcriber View on GitHub

Transcript Creator
Audio cleanup and transcription tool.
danielrosehill/Transcript-Creator View on GitHub

Local Multimodal Transcriber
Local transcription app with audio multimodal design.
danielrosehill/Local-Multimodal-Transcriber View on GitHub

ASR Transcription Pipeline
ASR transcription pipeline.
danielrosehill/ASR-Transcription-Pipeline View on GitHub

Transcription MCPs
Gemini Transcription MCP
MCP server for Gemini multimodal audio transcription with built-in post-processing.
danielrosehill/Gemini-Transcription-MCP View on GitHub

Cloud ASR MCP
MCP for using various cloud ASR models for speech-to-text and transcription.
danielrosehill/Cloud-ASR-MCP View on GitHub

Local AI Transcription MCP
MCP for local AI transcription.
danielrosehill/Local-AI-Transcription-MCP View on GitHub

Local Transcription MCP
WIP MCP for local STT with cleanup on AMD GPU machines.
danielrosehill/Local-Transcription-MCP View on GitHub

OR Audio Transcription MCP
OpenRouter-based audio transcription MCP server.
danielrosehill/OR-Audio-Transcription-MCP View on GitHub

Evaluations & Benchmarks
Whisper Fine Tune Accuracy Eval
Comparing Whisper fine-tunes versus stock Whisper on local inference.
danielrosehill/Whisper-Fine-Tune-Accuracy-Eval View on GitHub

Whisper WPM Background Noise Eval
Quick eval to answer: how much does speaking pace affect WER/accuracy in ASR?
danielrosehill/Whisper-WPM-Background-Noise-Eval View on GitHub

Transcription Cleanup Eval
Evaluating various cloud audio understanding models on the transcribe-and-cleanup workflow.
danielrosehill/Transcription-Cleanup-Eval-1225 View on GitHub

One Shot Transcription Microphone Eval
Test samples for various microphones with an STT accuracy evaluation.
danielrosehill/One-Shot-Transcription-Microphone-Eval View on GitHub

Local ASR STT Benchmark
Quick evaluation to find the best STT model in Speech Note (Ubuntu) for local hardware.
danielrosehill/Local-ASR-STT-Benchmark View on GitHub

Whisper WPM Test
Whisper words-per-minute testing.
danielrosehill/Whisper-WPM-Test View on GitHub

Gemini 3.1 Lite Audio Understanding Eval
Evaluation of Gemini 3.1 Lite on audio understanding tasks.
danielrosehill/Gemini-31-Lite-Audio-Understanding-Eval View on GitHub

Voice Cleanup Prompt Experiment
Testing various permutations in system prompting for raw audio transcript cleanup and comparing multimodal ASR vs. the STT + LLM approach.
danielrosehill/Voice-Cleanup-Prompt-Experiment View on GitHub

Whisper Fine-Tuning & Setup
Whisper Finetune V2
Whisper fine-tuning iteration.
danielrosehill/Whisper-Finetune-V2 View on GitHub

Modal Whisper Finetune Script
Validated script for fine-tuning Whisper on a Modal GPU with a preformatted audio dataset.
danielrosehill/Modal-Whisper-Finetune-Script View on GitHub

Whisper Fine Tuning Data
Whisper fine-tuning dataset.
danielrosehill/Whisper-Fine-Tuning-Data View on GitHub

Whisper Fine Tune 171125
Whisper fine-tuning iteration.
danielrosehill/Whisper-Fine-Tune-171125 View on GitHub

Whisper Base FUTO
Whisper base model via FUTO.
danielrosehill/Whisper-Base-FUTO View on GitHub

Local STT Fine Tune Tests
Local STT fine-tuning tests.
danielrosehill/Local-STT-Fine-Tune-Tests View on GitHub

Fine Tuned STT Formats
Fine-tuned STT data formats.
danielrosehill/Fine-Tuned-STT-Formats View on GitHub

whisper-wayland-rocm
Whisper-Wayland with ROCm GPU acceleration — Docker setup for AMD GPUs.
danielrosehill/whisper-wayland-rocm View on GitHub

whisper-cpp-rocm-setup
whisper.cpp ROCm setup scripts.
danielrosehill/whisper-cpp-rocm-setup View on GitHub

Whisper Local Notes
Notes on local Whisper usage.
danielrosehill/Whisper-Local-Notes View on GitHub

ASR Training Data
ASR Training Data Collector
GUI for gathering training data for ASR/STT apps into organised datasets, with audio capture, text capture, and JSONL metadata construction. Supports both LLM-generated and user-provided text.
danielrosehill/ASR-Training-Data-Collector View on GitHub

ASR Training Data Collector GUI Template
GUI template for ASR training data collection.
danielrosehill/ASR-Training-Data-Collector-GUI-Template View on GitHub

ASR Training Data Chunker
Breaks up texts by approximate reading duration for ASR training.
danielrosehill/ASR-Training-Data-Chunker View on GitHub

Other Utilities
Voice Note Recorder Ubuntu
GUI for recording voice notes on Ubuntu/Linux.
danielrosehill/Voice-Note-Recorder-Ubuntu View on GitHub

Readiness Voice Agent
Voice agent implementation for readiness checklists.
danielrosehill/Readiness-Voice-Agent View on GitHub

Voice Note Classification Model
Model for classifying voice notes.
danielrosehill/Voice-Note-Classification-Model View on GitHub

Voice Note Classifier Model
Voice note classifier model.
danielrosehill/Voice-Note-Classifier-Model View on GitHub

Voice Note Dataset
Frontend for an open-source voice note dataset, built for an annotation/classification project.
danielrosehill/Voice-Note-Dataset View on GitHub

Voice Note Ragie Pipeline
Test pipeline: voice context data to Ragie.
danielrosehill/Voice-Note-Ragie-Pipeline View on GitHub

Voice Prompt Cleanup Script
Audio processing cleanup script.
danielrosehill/Voice-Prompt-Cleanup-Script View on GitHub

Transcription Macropad
Macropad configuration for transcription workflows.
danielrosehill/Transcription-Macropad View on GitHub

Dictation Macropad
Plan/key allocation for a macropad optimised for heavy daily dictation workflows.
danielrosehill/Dictation-Macropad View on GitHub

Voicepad
Planning notes for a macropad for STT users.
danielrosehill/Voicepad View on GitHub

Voice Typer HW
Voice typer hardware notes.
danielrosehill/Voice-Typer-HW View on GitHub

Voice Headset Design
Voice headset design notes.
danielrosehill/Voice-Headset-Design View on GitHub

Dictation Microphones
Dictation microphone notes and comparisons.
danielrosehill/Dictation-Microphones View on GitHub

speech-notes-with-text-fixes
Speech Note Linux app with text fixes — note taking, reading and translating with offline STT, TTS, and machine translation.
danielrosehill/speech-notes-with-text-fixes View on GitHub

Hebrish Whisper Tester
Testing Whisper with Hebrew-English mixed speech.
danielrosehill/Hebrish-Whisper-Tester View on GitHub

Notes & Ideas
VoiceBox
Concept for a speech tech solution — specced out by Claude.
danielrosehill/VoiceBox View on GitHub

Linux Realtime Voice Typing
Planning and research for real-time voice typing on Linux (Deepgram, Gemini, Parakeet).
danielrosehill/linux-realtime-voice-typing View on GitHub

Linux Voice Typing App Notes
Planning notes for a Linux voice typing tool.
danielrosehill/Linux-Voice-Typing-App-Notes View on GitHub

Speech To Text Chain Notes
Notes on STT processing chain for future voice projects.
danielrosehill/Speech-To-Text-Chain-Notes View on GitHub

Cloud STT Price Points
Point-in-time pricing snapshots for ASR services.
danielrosehill/Cloud-STT-Price-Points-1225 View on GitHub

ASR And STT AI Notebook
Prompts and outputs on STT, ASR, and fine-tuning with Claude.
danielrosehill/ASR-And-STT-AI-Notebook View on GitHub

Linux Friendly Voice Tech
List of resources for voice technology with Linux support, encompassing STT, ASR, and dev frameworks.
danielrosehill/Linux-Friendly-Voice-Tech View on GitHub

Voice Control Linux
Claude-enhanced research for voice control platforms with Linux support.
danielrosehill/Voice-Control-Linux View on GitHub

voice-typing-collection
Collection of voice typing / STT GitHub repos for testing on Linux.
danielrosehill/voice-typing-collection View on GitHub

Awesome Whisper Apps
Useful speech-to-text tools that use Whisper under the hood (API/local).
danielrosehill/Awesome-Whisper-Apps View on GitHub

Voiceflow Planner
Voiceflow planning notes.
danielrosehill/Voiceflow-Planner View on GitHub

STT TTS Train 1125
STT and TTS training notes.
danielrosehill/STT-TTS-Train-1125 View on GitHub