Smart Voice Dictation on Linux: Automatically Detect Keyboard Layout
Imagine you’re writing a document in both English and Arabic. You need to switch between dictation modes every time you change languages. It’s tedious. On Windows, you just press Win+H and start talking. But on Linux, you’re stuck manually selecting the right speech model.
That’s the problem I set out to solve.
The Solution
I created nerd-dictation-auto-switch-languages, a wrapper script that automatically detects your current keyboard layout and uses the appropriate speech-to-text model. No more manual switching. Just press one shortcut and start talking.
How It Works
- Press Super+H to start dictation
- The script detects your keyboard layout (English, Arabic, German, etc.)
- Launches the correct VOSK model automatically
- Speak, and your words appear on screen
- Press Super+Shift+H to stop
The Technical Details
Key Components
| Component | Purpose |
|---|---|
| nerd-dictation | Base speech-to-text tool using VOSK |
| VOSK Models | Neural network models for each language |
| xkblayout-state | Detects current keyboard layout |
The Magic: Layout Detection
The key step is reading the active keyboard layout at the moment dictation starts:
# Get current keyboard layout
CURRENT_LAYOUT=$(xkblayout-state print "%s")
# Result: "us", "ara", "de", etc.
Then map it to the appropriate model:
case "$CURRENT_LAYOUT" in
    us)
        MODEL_DIR="$HOME/.config/nerd-dictation/model"
        LANG_NAME="English"
        ;;
    ara)
        MODEL_DIR="$HOME/.config/nerd-dictation/model-ar"
        LANG_NAME="Arabic"
        ;;
esac
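The case statement above has no branch for layouts without a model, so MODEL_DIR stays unset when, say, a Spanish layout is active. One way to handle that is sketched below. The function name `model_dir_for_layout` and the fall-back-to-English policy are assumptions for illustration, not necessarily what the project's script does.

```shell
#!/usr/bin/env bash
# Map an XKB layout code to a VOSK model directory.
# Prints the directory on stdout; returns non-zero for layouts with no model.
# (Hypothetical helper name; the directory layout follows the article.)
model_dir_for_layout() {
    local base="$HOME/.config/nerd-dictation"
    case "$1" in
        us)  printf '%s\n' "$base/model" ;;
        ara) printf '%s\n' "$base/model-ar" ;;
        *)   return 1 ;;   # unknown layout: let the caller decide
    esac
}

# Example: fall back to the English model when the layout has none
# (an assumed policy). In the real script the layout would come from:
#   layout=$(xkblayout-state print "%s")
layout="ara"
model_dir=$(model_dir_for_layout "$layout") || model_dir=$(model_dir_for_layout us)
echo "$model_dir"
```

Returning a non-zero status instead of silently picking a default keeps the decision in one place: the caller can fall back, show a notification, or abort.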
The Scripts
dictate-start — Main script that:
- Detects keyboard layout
- Selects appropriate model
- Starts dictation with 30-second timeout
- Shows desktop notification
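The core of dictate-start could look roughly like the sketch below. It assumes `nerd-dictation` is on your PATH. `STATE_FILE` and the helper names `notify` and `start_dictation` are hypothetical, introduced only to show how the model choice can be handed to dictate-stop. The `--vosk-model-dir` and `--timeout` options are nerd-dictation's own.

```shell
#!/usr/bin/env bash
# Sketch of dictate-start's core. STATE_FILE is a hypothetical handoff file
# that lets dictate-stop see which model was used.
STATE_FILE="${XDG_RUNTIME_DIR:-/tmp}/nerd-dictation-model"

# Show a desktop notification, or print to stderr if notify-send is missing.
notify() {
    if command -v notify-send >/dev/null 2>&1; then
        notify-send "Dictation" "$1"
    else
        printf '%s\n' "$1" >&2
    fi
}

# Start dictation with the given model directory and language name.
start_dictation() {
    local model_dir="$1" lang_name="$2"
    if [ ! -d "$model_dir" ]; then
        notify "No $lang_name model found in $model_dir"
        return 1
    fi
    printf '%s\n' "$model_dir" > "$STATE_FILE"   # remember the model for dictate-stop
    notify "Listening ($lang_name)"
    # --timeout=30 ends recording after 30 seconds without speech.
    nerd-dictation begin --vosk-model-dir="$model_dir" --timeout=30 &
}
```

After the layout has been mapped to a model, the script would end with `start_dictation "$MODEL_DIR" "$LANG_NAME"`.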
dictate-stop — Cleanup script that:
- Reads which model was used
- Stops the dictation process
- Types the transcribed text
- Shows completion notification
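A matching sketch for dictate-stop follows, under the same assumptions: `STATE_FILE`, `read_last_model`, and `stop_dictation` are hypothetical names, and the state file is assumed to hold one line with the model directory written by dictate-start. `nerd-dictation end` is the tool's own command for stopping a running `begin` process.

```shell
#!/usr/bin/env bash
# Sketch of dictate-stop's core. STATE_FILE is the hypothetical handoff file
# assumed to be written by dictate-start.
STATE_FILE="${XDG_RUNTIME_DIR:-/tmp}/nerd-dictation-model"

# Show a desktop notification, or print to stderr if notify-send is missing.
notify() {
    if command -v notify-send >/dev/null 2>&1; then
        notify-send "Dictation" "$1"
    else
        printf '%s\n' "$1" >&2
    fi
}

# Print the model directory recorded by dictate-start and remove the record.
# Prints "unknown" when no record exists.
read_last_model() {
    if [ -r "$STATE_FILE" ]; then
        cat "$STATE_FILE"
        rm -f "$STATE_FILE"
    else
        printf '%s\n' "unknown"
    fi
}

# Stop the running dictation process and report which model was in use.
stop_dictation() {
    local model_dir
    model_dir=$(read_last_model)
    nerd-dictation end
    notify "Dictation stopped (model: $model_dir)"
}
```

Removing the state file on read means a second Super+Shift+H press reports "unknown" rather than a stale model.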
Installation
Prerequisites
# Install system dependencies
sudo apt install python3-pip xdotool zenity libnotify-bin wget unzip git
# Install xkblayout-state
git clone https://github.com/nonpop/xkblayout-state.git
cd xkblayout-state && make && sudo make install
Download Models
# English (required)
mkdir -p ~/.config/nerd-dictation
wget https://alphacephei.com/vosk/models/vosk-model-small-en-us-0.15.zip
unzip vosk-model-small-en-us-0.15.zip
mv vosk-model-small-en-us-0.15 ~/.config/nerd-dictation/model
# Arabic (optional)
wget https://alphacephei.com/vosk/models/vosk-model-ar-mgb2-0.4.zip
unzip vosk-model-ar-mgb2-0.4.zip
mv vosk-model-ar-mgb2-0.4 ~/.config/nerd-dictation/model-ar
Set Up Scripts
# Copy the scripts (run this from the directory that contains them)
mkdir -p ~/.nerd-dictation
cp dictate-start ~/.nerd-dictation/
cp dictate-stop ~/.nerd-dictation/
chmod +x ~/.nerd-dictation/dictate-*
Configure Shortcuts
In your desktop settings:
| Shortcut | Command |
|---|---|
| Super+H | ~/.nerd-dictation/dictate-start |
| Super+Shift+H | ~/.nerd-dictation/dictate-stop |
Results
Before
1. Switch keyboard to Arabic
2. Find and open Arabic dictation app
3. Start dictation
4. Speak
5. Switch back to English
6. Find and open English dictation app
7. Repeat...
After
1. Press Super+H
2. Speak
3. Press Super+Shift+H
4. Done!
Supported Languages
| Layout | Language | Model Size |
|---|---|---|
| us | English | 40 MB |
| ara | Arabic | 333 MB |
| de | German | 45 MB |
| fr | French | 45 MB |
Adding a new language is straightforward: download its VOSK model and add a matching case to the script.
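As a sketch of those two steps for German: the helpers `model_name_from_url` and `install_model` are hypothetical, and the code assumes each VOSK zip unpacks to a directory named after the archive, as the English and Arabic downloads above do. The German model URL is an assumption based on the naming of the other models, so check the VOSK models page for the current file name.

```shell
#!/usr/bin/env bash
# Derive the directory name a VOSK zip is assumed to unpack to
# (the archive file name without its .zip suffix).
model_name_from_url() {
    basename "$1" .zip
}

# Download a model and install it as ~/.config/nerd-dictation/<target>.
install_model() {
    local url="$1" target="$2" name
    name=$(model_name_from_url "$url")
    mkdir -p "$HOME/.config/nerd-dictation"
    wget "$url" &&
        unzip "$name.zip" &&
        mv "$name" "$HOME/.config/nerd-dictation/$target"
}

# Usage for German (URL assumed, verify it before running):
#   install_model https://alphacephei.com/vosk/models/vosk-model-small-de-0.15.zip model-de
#
# Then add a branch to the case statement in dictate-start:
#   de)
#       MODEL_DIR="$HOME/.config/nerd-dictation/model-de"
#       LANG_NAME="German"
#       ;;
```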
Challenges & Solutions
Challenge 1: Reliable Layout Detection
Problem: gsettings didn’t update in real-time when switching layouts.
Solution: Used xkblayout-state which reads directly from X11.
Challenge 2: Model Organization
Problem: Multiple language models needed management.
Solution: Standardized directory naming (model, model-ar, model-de).
Challenge 3: Error Handling
Problem: Users might not understand why dictation fails.
Solution: Desktop notifications + zenity popups with clear messages.
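To make the error-handling idea concrete, here is a minimal sketch of a dependency check that reports problems through zenity. The function names and the list of commands checked are assumptions for illustration, and the project's scripts may check different things.

```shell
#!/usr/bin/env bash
# Print each command from the argument list that is not found on PATH,
# one per line.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Show a popup (or a stderr message when zenity is unavailable) if any
# required tool is missing. Returns non-zero when something is missing.
check_dependencies() {
    local missing
    missing=$(missing_commands nerd-dictation xkblayout-state xdotool)
    if [ -z "$missing" ]; then
        return 0
    fi
    if command -v zenity >/dev/null 2>&1; then
        zenity --error --text="Dictation cannot start. Missing tools: $missing"
    else
        printf 'Dictation cannot start. Missing tools: %s\n' "$missing" >&2
    fi
    return 1
}
```

Calling `check_dependencies || exit 1` at the top of dictate-start would turn a silent failure into a message that names the missing tool.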
Future Improvements
- Add support for more languages
- Implement auto-language switching mid-dictation
- Create GUI for configuration
- Add Whisper as alternative backend
Conclusion
Linux speech recognition has come a long way. With nerd-dictation-auto-switch-languages and VOSK, you get offline, private voice typing that follows your keyboard layout automatically. No cloud services and no subscriptions, only open-source speech-to-text that works.
The project is fully open-source and available on GitHub. Contributions welcome!
Resources
Have questions or suggestions? Open an issue on GitHub!




