Choosing the Right Model
Tscribe uses Whisper models of various sizes. Choosing the right one depends on your audio quality, language, and how much time you have.
Quick Recommendations
Fastest for English
Use tiny-q5_1 or base-q5_1. Both are optimized for English and are the smallest, fastest models in the list.
True Multilingual
Use medium-q5_0. With Whisper, reliable multilingual accuracy starts at the medium size. Best for Ukrainian, Spanish, and other non-English languages.
Maximum Accuracy
Use large-v3-turbo-q5_0. It matches the quality of large-v3 while running about 2x faster, which makes it the best overall choice for accuracy on most tasks.
Noisy Audio
Use large-v3. It is the slowest model (3.1 GB) but excels at deciphering complex, noisy, or multi-speaker environments.
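The recommendations above can be read as a small decision rule. The following Python sketch is only an illustration: `recommend_model` and its parameters are hypothetical names and are not part of Tscribe. It also assumes that noisy audio takes priority over every other consideration.

```python
def recommend_model(language: str = "en", noisy: bool = False,
                    max_accuracy: bool = False) -> str:
    """Return a model name following the quick recommendations.

    Hypothetical helper for illustration; not a Tscribe API.
    """
    if noisy:
        # Noisy or multi-speaker audio: the full-size model.
        return "large-v3"
    if max_accuracy:
        # Best overall accuracy at a reasonable speed.
        return "large-v3-turbo-q5_0"
    if language.lower().startswith("en"):
        # Fast English transcription; tiny-q5_1 is the even smaller option.
        return "base-q5_1"
    # Non-English audio: multilingual accuracy starts at medium.
    return "medium-q5_0"


if __name__ == "__main__":
    print(recommend_model("uk"))              # Ukrainian audio
    print(recommend_model("en", noisy=True))  # noisy English recording
```

The order of the checks is a design choice made for this sketch: a different priority, such as speed over robustness to noise, would simply reorder the branches.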
Model Comparison
tiny-q5_1
~32 MB. Fast and small. Good for English voice commands and simple dictation.
base-q5_1
~60 MB. Optimized for English. Very fast but may struggle with non-English languages.
small-q5_1
~190 MB. Good balance for English tasks. More accurate than base, but still not recommended for deep multilingual work.
medium-q5_0
Default model. 539 MB (~540 MB). Best for Ukrainian and non-English audio. Perfect for podcasts and YouTube content.
large-v3-turbo-q5_0
Fastest high-accuracy model. ~574 MB. Same quality as large-v3 but 2x faster. Best overall accuracy for most tasks.
large-v3
3095 MB (~3.1 GB). Maximum accuracy. Slowest, but best for noisy/complex audio with multiple speakers.
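When disk space is the main constraint, the sizes listed above can be checked programmatically. This is a minimal sketch: `MODEL_SIZES_MB` and `largest_model_within` are hypothetical names, not Tscribe functionality. It assumes that a larger model is generally the more accurate one, which follows the ordering in the comparison but is a simplification.

```python
from typing import Optional

# Approximate download sizes in MB, taken from the comparison above.
MODEL_SIZES_MB = {
    "tiny-q5_1": 32,
    "base-q5_1": 60,
    "small-q5_1": 190,
    "medium-q5_0": 539,
    "large-v3-turbo-q5_0": 574,
    "large-v3": 3095,
}


def largest_model_within(budget_mb: int) -> Optional[str]:
    """Return the largest listed model that fits in budget_mb, or None.

    Hypothetical helper for illustration; not a Tscribe API.
    """
    fitting = {name: size for name, size in MODEL_SIZES_MB.items()
               if size <= budget_mb}
    if not fitting:
        return None
    # Pick the model with the greatest size among those that fit.
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    print(largest_model_within(600))
    print(largest_model_within(100))
```

Size alone does not capture language support, so for non-English audio the result should still be checked against the recommendation to start at medium-q5_0.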