# voice-to-trilium

Send a Signal voice note to yourself → auto-transcribed by Whisper → saved as a child note under today's Trilium dateNote.
## Architecture

    Android Signal app
      └─ voice note (to yourself)
          └─ signal-cli-rest-api (linked device, Docker)
              └─ voice_to_trilium.py (polls every 10s)
                  ├─ faster-whisper (GPU transcription)
                  └─ Trilium ETAPI (creates child note)
                      └─ Signal reply (confirmation)
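The receive-and-filter step of the pipeline presumably looks something like the sketch below. The message shape (`envelope` → `dataMessage` → `attachments`) follows signal-cli-rest-api's `/v1/receive` output, but treat it as an assumption; the function name is hypothetical and not copied from `voice_to_trilium.py`:

```python
# Hypothetical sketch: pick audio attachments out of a batch of messages
# returned by signal-cli-rest-api's GET /v1/receive/<number>.
def extract_audio_attachments(messages):
    """Return (attachment_id, content_type) pairs for audio attachments."""
    found = []
    for msg in messages:
        data = (msg.get("envelope") or {}).get("dataMessage") or {}
        for att in data.get("attachments", []):
            ctype = att.get("contentType", "")
            # Skip non-audio attachments and ones without an id to download.
            if ctype.startswith("audio/") and att.get("id"):
                found.append((att["id"], ctype))
    return found
```

Each returned id can then be fetched from the Signal API and handed to Whisper.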
## Prerequisites
- Docker + Docker Compose with NVIDIA runtime (for GPU) on nova/media4
- Trilium running and accessible from the pipeline host
- Your Signal account (no second number needed — we link as a secondary device)
## Setup
### 1. Clone / copy files

    mkdir ~/voice-to-trilium && cd ~/voice-to-trilium
    # copy all files here
    cp .env.example .env
### 2. Configure .env

    nano .env

Fill in:

- `SIGNAL_NUMBER`: your mobile number in international format (e.g. `+447700000000`)
- `TRILIUM_URL`: e.g. `http://192.168.1.50:8080`
- `TRILIUM_TOKEN`: from Trilium → Options → ETAPI → Generate new token
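For reference, a filled-in `.env` might look like the following. The Signal and Trilium values are placeholders, and the `WHISPER_*` lines are an assumption inferred from the GPU/CPU sections later in this README (check `.env.example` for the authoritative set):

```
SIGNAL_NUMBER=+447700000000
TRILIUM_URL=http://192.168.1.50:8080
TRILIUM_TOKEN=your-etapi-token-here
# Assumed GPU defaults; see "Running on nova (no GPU)" for CPU overrides
WHISPER_DEVICE=cuda
WHISPER_MODEL=large-v3
WHISPER_COMPUTE=float16
```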
### 3. Link Signal (one-time)

Start only the Signal API container:

    docker compose up signal-api

Open this URL in your browser:

    http://<host-ip>:8888/v1/qrcodelink?device_name=voice-bot

On your phone: Signal → Settings → Linked Devices → + → scan the QR code.

"voice-bot" should appear in your linked-devices list. Once it does, press Ctrl+C to stop the container.
### 4. Start everything

    docker compose up -d
### 5. Test it

Open Signal on your phone and send a voice note to yourself (your own number in contacts, or the "Note to Self" conversation). Within ~10 seconds you should:

- See a new child note appear under today's date in Trilium
- Receive a Signal reply confirming the transcript
### 6. Logs

    docker compose logs -f voice-pipeline
## Trilium Note Structure

Each voice note creates:

    📅 2025-01-15 (dateNote)
    └── 🎙️ Voice Note 14:32  #voicenote
            <transcript text>
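In ETAPI terms, the flow is roughly: resolve today's dateNote via `GET /etapi/calendar/days/{date}`, create the child with `POST /etapi/create-note`, then attach the `#voicenote` label via `POST /etapi/attributes`. A sketch of what the request bodies might look like (an assumption based on the ETAPI spec, not a copy of `voice_to_trilium.py`):

```python
import datetime

def build_note_request(date_note_id, transcript, when):
    """Body for POST /etapi/create-note: a text child of today's dateNote."""
    return {
        "parentNoteId": date_note_id,
        "title": f"Voice Note {when:%H:%M}",
        "type": "text",
        "content": transcript,
    }

def build_label_request(note_id):
    """Body for POST /etapi/attributes: tags the new note with #voicenote."""
    return {"noteId": note_id, "type": "label", "name": "voicenote", "value": ""}
```

`date_note_id` would come from the calendar endpoint's response, and `note_id` from the `create-note` response.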
## Running on nova (no GPU)

If running on nova without a dedicated GPU, set in `.env`:

    WHISPER_DEVICE=cpu
    WHISPER_MODEL=medium   # large-v3 is slow on CPU
    WHISPER_COMPUTE=int8

Also remove the `deploy.resources` block from `docker-compose.yml`.
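These variables plausibly map onto a `faster_whisper.WhisperModel` call like the sketch below. The variable names come from `.env`; the helper name and the GPU defaults are assumptions, so check `voice_to_trilium.py` for the real values:

```python
import os

def whisper_settings(env=os.environ):
    """Translate WHISPER_* env vars into WhisperModel keyword arguments.

    Defaults (cuda / large-v3 / float16) are assumptions for the GPU host.
    """
    return {
        "model_size_or_path": env.get("WHISPER_MODEL", "large-v3"),
        "device": env.get("WHISPER_DEVICE", "cuda"),
        "compute_type": env.get("WHISPER_COMPUTE", "float16"),
    }

# Usage (requires faster-whisper installed):
# from faster_whisper import WhisperModel
# model = WhisperModel(**whisper_settings())
```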
## Running on media4 (RTX 4070)

This is the ideal host; the default config already targets CUDA. Make sure the NVIDIA Container Toolkit is installed:

    # Test GPU access in Docker
    docker run --rm --gpus all nvidia/cuda:12.0-base-ubuntu22.04 nvidia-smi
## Optional: NixOS module

If you would rather manage this via NixOS on nova/media4 than raw Docker Compose, the key pieces are:

- `virtualisation.oci-containers.containers.signal-api`
- `virtualisation.oci-containers.containers.voice-pipeline`
- A systemd service that runs `voice_to_trilium.py` with a venv

The Docker Compose approach is simpler to iterate on first.
## Whisper model sizes (reference)
| Model | VRAM | Speed (RTX 4070) | Quality |
|---|---|---|---|
| tiny | ~1 GB | very fast | basic |
| small | ~2 GB | fast | good |
| medium | ~5 GB | moderate | great |
| large-v3 | ~10 GB | moderate | best |
The RTX 4070 has 12 GB of VRAM, so large-v3 fits comfortably.
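The table can double as a quick sanity check for whether a model fits a given card. The VRAM figures below are copied from the table; the helper itself is hypothetical:

```python
# Approximate VRAM needs from the table above, in GB.
MODEL_VRAM_GB = {"tiny": 1, "small": 2, "medium": 5, "large-v3": 10}

def largest_model_fitting(vram_gb, headroom_gb=1):
    """Pick the biggest Whisper model that fits, keeping some headroom."""
    fits = [m for m, need in MODEL_VRAM_GB.items() if need + headroom_gb <= vram_gb]
    # The dict is ordered smallest to largest, so the last fit is the biggest.
    return fits[-1] if fits else None
```

On a 12 GB RTX 4070 this picks large-v3; on a 4 GB card it would fall back to small.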
## Troubleshooting

**"Signal receive error"**: check that the signal-api container is healthy:

    curl http://localhost:8888/v1/about

**"Could not get dateNote"**: verify your `TRILIUM_TOKEN` and `TRILIUM_URL`. Test:

    curl -H "Authorization: Bearer YOUR_TOKEN" http://YOUR_TRILIUM/etapi/calendar/days/2025-01-15

**Voice note not detected**: Signal voice notes arrive as `audio/ogg` or `audio/opus`. A third-party recorder app sending a different format should still work, as long as it uses an audio MIME type. Check the logs for "Attachment has no id" messages.