Voice Apps Index

Index of voice typing, dictation, and speech-to-text applications and utilities.

Last updated: 06/04/2026

danielrosehill/Index View on GitHub

In Development

VoiceType

A fork of Deepgram's Linux starter with CLI-to-GUI conversion, hotkey support, API key editing, and cost tracking. Uses Deepgram streaming ASR.

danielrosehill/VoiceType View on GitHub

Parakeet Type Ubuntu

On-device voice typing for Linux using Parakeet and NeMo ASR models via sherpa-onnx. Built-in punctuation, multiple model profiles, system tray app with configurable hotkeys. No cloud, no GPU required.

danielrosehill/Parakeet-Type-Ubuntu View on GitHub

AI Typer V2

Voice dictation with multimodal AI cleanup — speak naturally, get polished text. Uses Gemini multimodal audio processing.

danielrosehill/AI-Typer-V2 View on GitHub

Wayland Voice Typer

Simple GUI around whisper.cpp for voice-to-text on Linux.

danielrosehill/Wayland-Voice-Typer View on GitHub

Quick STT

Optimised always-on STT for Ubuntu with ROCm support.

danielrosehill/Quick-STT View on GitHub

hyprvoice

Voice-powered typing for Wayland/Hyprland desktops.

danielrosehill/hyprvoice View on GitHub

Moonshine Dictation App

Moonshine-based dictation application.

danielrosehill/Mooshine-Dictation-App-0326 View on GitHub

Local STT App

Local speech-to-text application.

danielrosehill/Local-STT-App View on GitHub

Voice Typing 1125

Voice typing application iteration.

danielrosehill/Voice-Typing-1125 View on GitHub

Old Iterations

AI Transcription Notepad

Voice note-taking utility that uses cloud audio multimodal models for single-pass transcription and text cleanup.

danielrosehill/AI-Transcription-Notepad View on GitHub

Thought Pad

Linux desktop application providing a two-stage process for creating notes from dictated speech — transcription via Whisper API followed by light text formatting. Exports to markdown docs.

danielrosehill/Thought-Pad View on GitHub

Whisper Typer 0911

Earlier Whisper-based voice typing iteration.

danielrosehill/Whisper-Typer-0911 View on GitHub

Deepgram Voice Keyboard Ubuntu

WIP STT utility using cloud STT APIs on Ubuntu.

danielrosehill/Deepgram-Voice-Keyboard-Ubuntu View on GitHub

Voiceflow V1

Early voice flow implementation.

danielrosehill/Voiceflow-V1 View on GitHub

Voiceflow Dev

Voice flow development iteration.

danielrosehill/Voiceflow-Dev View on GitHub

Voice Flow Idea Dev

Voice flow idea development workspace.

danielrosehill/Voice-Flow-Idea-Dev View on GitHub

Whisper Typing Linux 1125

Whisper-based typing tool for Linux.

danielrosehill/Whisper-Typing-Linux-1125 View on GitHub

Voice Keyboard

Voice keyboard application.

danielrosehill/Voice-Keyboard View on GitHub

Android Voice Keyboard

Voice keyboard for Android.

danielrosehill/Android-Voice-Keyboard View on GitHub

Voice Notepad Android

Android fork of transcription UI.

danielrosehill/Voice-Notepad-Android View on GitHub

Transcription Tools

Gemini Audio Transcriber

File-upload-based multimodal transcription tool using Gemini via OpenRouter.

danielrosehill/Gemini-Audio-Transcriber View on GitHub

Gemini Transcription Notepad

Gemini-powered transcription notepad with cleanup.

danielrosehill/Gemini-Transcription-Notepad View on GitHub

Gemini ASR Transcriber

Transcription notepad for Gemini ASR.

danielrosehill/Gemini-ASR-Transcriber View on GitHub

DVR Transcriber

Workflow workspace for importing recordings from a DVR and using AI for transcription.

danielrosehill/DVR-Transcriber View on GitHub

Transcript Creator

Audio cleanup and transcription tool.

danielrosehill/Transcript-Creator View on GitHub

Local Multimodal Transcriber

Local transcription app with audio multimodal design.

danielrosehill/Local-Multimodal-Transcriber View on GitHub

ASR Transcription Pipeline

ASR transcription pipeline.

danielrosehill/ASR-Transcription-Pipeline View on GitHub

Transcription MCPs

Gemini Transcription MCP

MCP server for Gemini multimodal audio transcription with built-in post-processing.

danielrosehill/Gemini-Transcription-MCP View on GitHub

Cloud ASR MCP

MCP for using various cloud ASR models for speech-to-text and transcription.

danielrosehill/Cloud-ASR-MCP View on GitHub

Local AI Transcription MCP

MCP for local AI transcription.

danielrosehill/Local-AI-Transcription-MCP View on GitHub

Local Transcription MCP

WIP MCP for local STT with cleanup on AMD GPU machines.

danielrosehill/Local-Transcription-MCP View on GitHub

OR Audio Transcription MCP

OpenRouter-based audio transcription MCP server.

danielrosehill/OR-Audio-Transcription-MCP View on GitHub

Evaluations & Benchmarks

Whisper Fine Tune Accuracy Eval

Comparing Whisper fine-tunes versus stock Whisper on local inference.

danielrosehill/Whisper-Fine-Tune-Accuracy-Eval View on GitHub

Whisper WPM Background Noise Eval

Quick eval to answer: how much does speaking pace affect WER/accuracy in ASR?

danielrosehill/Whisper-WPM-Background-Noise-Eval View on GitHub

Transcription Cleanup Eval

Evaluating various cloud audio understanding models on the transcribe-and-cleanup workflow.

danielrosehill/Transcription-Cleanup-Eval-1225 View on GitHub

One Shot Transcription Microphone Eval

Test samples for various microphones with an STT accuracy evaluation.

danielrosehill/One-Shot-Transcription-Microphone-Eval View on GitHub

Local ASR STT Benchmark

Quick evaluation to find the best STT model in Speech Note (Ubuntu) for local hardware.

danielrosehill/Local-ASR-STT-Benchmark View on GitHub

Whisper WPM Test

Whisper words-per-minute testing.

danielrosehill/Whisper-WPM-Test View on GitHub

Gemini 3.1 Lite Audio Understanding Eval

Evaluation of Gemini 3.1 Lite on audio understanding tasks.

danielrosehill/Gemini-31-Lite-Audio-Understanding-Eval View on GitHub

Voice Cleanup Prompt Experiment

Testing various permutations in system prompting for raw audio transcript cleanup and comparing multimodal ASR vs. the STT + LLM approach.

danielrosehill/Voice-Cleanup-Prompt-Experiment View on GitHub
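
Several of the evaluations above compare transcripts by WER (word error rate). As a rough illustration of the metric only, and not the scoring code used in any of these repos, WER can be sketched as the word-level edit distance divided by the number of reference words. The function name and the lowercase-and-split normalisation are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    # Assumed normalisation: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from hypothesis
                curr[j - 1] + 1,     # extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference. Real evaluations usually also strip punctuation before scoring, which this sketch leaves out.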

Whisper Fine-Tuning & Setup

Whisper Finetune V2

Whisper fine-tuning iteration.

danielrosehill/Whisper-Finetune-V2 View on GitHub

Modal Whisper Finetune Script

Validated script for fine-tuning Whisper on Modal GPUs with a preformatted audio dataset.

danielrosehill/Modal-Whisper-Finetune-Script View on GitHub

Whisper Fine Tuning Data

Whisper fine-tuning dataset.

danielrosehill/Whisper-Fine-Tuning-Data View on GitHub

Whisper Fine Tune 171125

Whisper fine-tuning iteration.

danielrosehill/Whisper-Fine-Tune-171125 View on GitHub

Whisper Base FUTO

Whisper base model via FUTO.

danielrosehill/Whisper-Base-FUTO View on GitHub

Local STT Fine Tune Tests

Local STT fine-tuning tests.

danielrosehill/Local-STT-Fine-Tune-Tests View on GitHub

Fine Tuned STT Formats

Fine-tuned STT data formats.

danielrosehill/Fine-Tuned-STT-Formats View on GitHub

whisper-wayland-rocm

Whisper-Wayland with ROCm GPU acceleration — Docker setup for AMD GPUs.

danielrosehill/whisper-wayland-rocm View on GitHub

whisper-cpp-rocm-setup

whisper.cpp ROCm setup scripts.

danielrosehill/whisper-cpp-rocm-setup View on GitHub

Whisper Local Notes

Notes on local Whisper usage.

danielrosehill/Whisper-Local-Notes View on GitHub

ASR Training Data

ASR Training Data Collector

GUI for gathering ASR/STT training data into organised datasets, with audio capture, text capture, and JSONL metadata construction. Supports both LLM-generated and user-provided text.

danielrosehill/ASR-Training-Data-Collector View on GitHub

ASR Training Data Collector GUI Template

GUI template for ASR training data collection.

danielrosehill/ASR-Training-Data-Collector-GUI-Template View on GitHub

ASR Training Data Chunker

Breaks up texts by approximate reading duration for ASR training.

danielrosehill/ASR-Training-Data-Chunker View on GitHub
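
To illustrate the idea behind chunking text by approximate reading duration, here is a minimal sketch. It is not the ASR Training Data Chunker's actual code: the function names, the 150 words-per-minute default, the 20-second target, and the simple sentence splitter are all assumptions made for the example.

```python
import re


def chunk_by_reading_time(text, target_seconds=20.0, words_per_minute=150.0):
    """Group whole sentences into chunks of roughly target_seconds of speech."""
    # Convert the time budget into a word budget at the assumed speaking pace.
    words_per_chunk = target_seconds * words_per_minute / 60.0

    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Close the current chunk if adding this sentence would exceed the budget.
        if current and count + n_words > words_per_chunk:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


def estimated_seconds(chunk, words_per_minute=150.0):
    """Estimate how long a chunk takes to read aloud at the assumed pace."""
    return len(chunk.split()) * 60.0 / words_per_minute


if __name__ == "__main__":
    sample = "One two three. Four five six. Seven eight nine."
    for chunk in chunk_by_reading_time(sample, target_seconds=4, words_per_minute=105):
        print(round(estimated_seconds(chunk, 105), 1), chunk)
```

Sentences are never split, so a single sentence longer than the budget becomes its own oversized chunk. That keeps each recording prompt readable at the cost of uneven clip lengths.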

Other Utilities

Voice Note Recorder Ubuntu

GUI for recording voice notes on Ubuntu/Linux.

danielrosehill/Voice-Note-Recorder-Ubuntu View on GitHub

Readiness Voice Agent

Voice agent implementation for readiness checklists.

danielrosehill/Readiness-Voice-Agent View on GitHub

Voice Note Classification Model

Model for classifying voice notes.

danielrosehill/Voice-Note-Classification-Model View on GitHub

Voice Note Classifier Model

Voice note classifier model.

danielrosehill/Voice-Note-Classifier-Model View on GitHub

Voice Note Dataset

Frontend for an open-source voice note dataset used in an annotation/classification project.

danielrosehill/Voice-Note-Dataset View on GitHub

Voice Note Ragie Pipeline

Test pipeline: voice context data to Ragie.

danielrosehill/Voice-Note-Ragie-Pipeline View on GitHub

Voice Prompt Cleanup Script

Audio processing cleanup script.

danielrosehill/Voice-Prompt-Cleanup-Script View on GitHub

Transcription Macropad

Macropad configuration for transcription workflows.

danielrosehill/Transcription-Macropad View on GitHub

Dictation Macropad

Plan/key allocation for a macropad optimised for heavy daily dictation workflows.

danielrosehill/Dictation-Macropad View on GitHub

Voicepad

Planning notes for a macropad for STT users.

danielrosehill/Voicepad View on GitHub

Voice Typer HW

Voice typer hardware notes.

danielrosehill/Voice-Typer-HW View on GitHub

Voice Headset Design

Voice headset design notes.

danielrosehill/Voice-Headset-Design View on GitHub

Dictation Microphones

Dictation microphone notes and comparisons.

danielrosehill/Dictation-Microphones View on GitHub

speech-notes-with-text-fixes

Speech Note Linux app with text fixes — note taking, reading and translating with offline STT, TTS, and machine translation.

danielrosehill/speech-notes-with-text-fixes View on GitHub

Hebrish Whisper Tester

Testing Whisper with Hebrew-English mixed speech.

danielrosehill/Hebrish-Whisper-Tester View on GitHub

Notes & Ideas

VoiceBox

Concept for a speech tech solution — specced out by Claude.

danielrosehill/VoiceBox View on GitHub

Linux Realtime Voice Typing

Planning and research for real-time voice typing on Linux (Deepgram, Gemini, Parakeet).

danielrosehill/linux-realtime-voice-typing View on GitHub

Linux Voice Typing App Notes

Planning notes for a Linux voice typing tool.

danielrosehill/Linux-Voice-Typing-App-Notes View on GitHub

Speech To Text Chain Notes

Notes on the STT processing chain for future voice projects.

danielrosehill/Speech-To-Text-Chain-Notes View on GitHub

Cloud STT Price Points

Point-in-time pricing snapshots for ASR services.

danielrosehill/Cloud-STT-Price-Points-1225 View on GitHub

ASR And STT AI Notebook

Prompts and outputs on STT, ASR, and fine-tuning with Claude.

danielrosehill/ASR-And-STT-AI-Notebook View on GitHub

Linux Friendly Voice Tech

List of resources for voice technology with Linux support, encompassing STT, ASR, and dev frameworks.

danielrosehill/Linux-Friendly-Voice-Tech View on GitHub

Voice Control Linux

Claude-enhanced research for voice control platforms with Linux support.

danielrosehill/Voice-Control-Linux View on GitHub

voice-typing-collection

Collection of voice typing / STT GitHub repos for testing on Linux.

danielrosehill/voice-typing-collection View on GitHub

Awesome Whisper Apps

Useful speech-to-text tools that use Whisper under the hood (API/local).

danielrosehill/Awesome-Whisper-Apps View on GitHub

Voiceflow Planner

Voiceflow planning notes.

danielrosehill/Voiceflow-Planner View on GitHub

STT TTS Train 1125

STT and TTS training notes.

danielrosehill/STT-TTS-Train-1125 View on GitHub