
voice-to-trilium

Send a Signal voice note to yourself → auto-transcribed by Whisper → saved as a child note under today's Trilium dateNote.

Architecture

Android Signal app
  └─ voice note (to yourself)
       └─ signal-cli-rest-api  (linked device, Docker)
            └─ voice_to_trilium.py  (polls every 10s)
                 ├─ faster-whisper  (GPU transcription)
                 └─ Trilium ETAPI   (creates child note)
                      └─ Signal reply (confirmation)
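The flow in the diagram can be sketched as one polling cycle with each stage passed in as a callable. This is a minimal illustration, not the code in voice_to_trilium.py: the names `process_once` and `run_forever`, the message wording, and the immediate acknowledgement reply are assumptions made for the example.

```python
import time
from typing import Callable, Iterable


def process_once(
    receive: Callable[[], Iterable[dict]],
    transcribe: Callable[[dict], str],
    save_note: Callable[[str], str],
    reply: Callable[[str], None],
) -> int:
    """Run one polling cycle and return the number of voice notes handled."""
    handled = 0
    for voice_note in receive():
        # Acknowledge first so the sender knows the note was picked up (assumed behaviour).
        reply("Voice note received, transcribing...")
        text = transcribe(voice_note)      # faster-whisper stage
        note_id = save_note(text)          # Trilium ETAPI stage
        reply(f"Saved note {note_id}: {text}")  # Signal confirmation stage
        handled += 1
    return handled


def run_forever(poll_seconds: float = 10.0, **steps: Callable) -> None:
    """Repeat the cycle, sleeping between polls (10 s matches the diagram)."""
    while True:
        process_once(**steps)
        time.sleep(poll_seconds)
```

Passing the stages in as functions keeps the loop testable without Signal, a GPU, or Trilium being available.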

Prerequisites

  • Docker + Docker Compose with NVIDIA runtime (for GPU) on nova/media4
  • Trilium running and accessible from the pipeline host
  • Your Signal account (no second number needed — we link as a secondary device)

Setup

1. Clone / copy files

mkdir ~/voice-to-trilium && cd ~/voice-to-trilium
# copy all files here
cp .env.example .env

2. Configure .env

nano .env

Fill in:

  • SIGNAL_NUMBER — your mobile number in international format (e.g. +447700000000)
  • TRILIUM_URL — e.g. http://192.168.1.50:8080
  • TRILIUM_TOKEN — from Trilium → Options → ETAPI → Generate new token
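A script could validate these three settings at startup along the following lines. This is a sketch under assumptions: the `Config` class, `load_config` function, and the error messages are hypothetical, and only the variable names come from the list above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

REQUIRED = ("SIGNAL_NUMBER", "TRILIUM_URL", "TRILIUM_TOKEN")


@dataclass(frozen=True)
class Config:
    signal_number: str
    trilium_url: str
    trilium_token: str


def load_config(env: Mapping[str, str] = os.environ) -> Config:
    """Read the required settings and fail early with a readable message."""
    missing = [name for name in REQUIRED if not env.get(name, "").strip()]
    if missing:
        raise ValueError("Missing settings in .env: " + ", ".join(missing))

    number = env["SIGNAL_NUMBER"].strip()
    # International format: a leading "+" followed by digits only.
    if not (number.startswith("+") and number[1:].isdigit()):
        raise ValueError("SIGNAL_NUMBER must look like +447700000000")

    return Config(
        signal_number=number,
        # Drop a trailing slash so URL paths can be appended safely.
        trilium_url=env["TRILIUM_URL"].strip().rstrip("/"),
        trilium_token=env["TRILIUM_TOKEN"].strip(),
    )
```

Failing at startup is usually easier to debug than a failed request several minutes later.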

3. Link Signal as a secondary device

Start only the Signal API container:

docker compose up signal-api

Open in your browser:

http://<host-ip>:8888/v1/qrcodelink?device_name=voice-bot

On your phone: Signal → Settings → Linked Devices → + → scan the QR code.

You should see "voice-bot" appear in your linked devices list. Once it does, press Ctrl+C to stop the container.

4. Start everything

docker compose up -d

5. Test it

Open Signal on your phone. Send a voice note to yourself (your own number in contacts, or the "Note to Self" conversation).

Within about 10 seconds (one polling interval), plus transcription time, you should:

  1. See a new child note appear under today's date in Trilium
  2. Receive a Signal reply confirming the transcript

6. Logs

docker compose logs -f voice-pipeline

Trilium Note Structure

Each voice note creates:

📅 2025-01-15 (dateNote)
  └── 🎙️ Voice Note 14:32 #voicenote
        <transcript text>
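One way to produce this structure through ETAPI is sketched below, using only the standard library. It assumes the `/etapi/calendar/days/{date}`, `/etapi/create-note`, and `/etapi/attributes` endpoints and the Bearer header style used in the Troubleshooting section. The function names and the exact title format are hypothetical, and the real script may do this differently.

```python
import html
import json
import urllib.request
from datetime import datetime
from typing import Optional


def etapi_request(base_url: str, token: str, method: str, path: str,
                  payload: Optional[dict] = None) -> dict:
    """Send one JSON request to Trilium's ETAPI and return the decoded reply."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        f"{base_url}/etapi{path}",
        data=data,
        method=method,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.load(resp)


def build_note_payload(parent_note_id: str, transcript: str, when: datetime) -> dict:
    """Body for create-note: a text note titled with the recording time."""
    return {
        "parentNoteId": parent_note_id,
        "title": f"Voice Note {when:%H:%M}",  # title format is an assumption
        "type": "text",
        "content": f"<p>{html.escape(transcript)}</p>",  # text notes hold HTML
    }


def save_voice_note(base_url: str, token: str, transcript: str,
                    when: Optional[datetime] = None) -> str:
    """Create the child note under the day's dateNote and label it #voicenote."""
    when = when or datetime.now()
    day = etapi_request(base_url, token, "GET", f"/calendar/days/{when:%Y-%m-%d}")
    created = etapi_request(base_url, token, "POST", "/create-note",
                            build_note_payload(day["noteId"], transcript, when))
    note_id = created["note"]["noteId"]
    etapi_request(base_url, token, "POST", "/attributes",
                  {"noteId": note_id, "type": "label", "name": "voicenote", "value": ""})
    return note_id
```

Escaping the transcript prevents stray angle brackets in the transcription from being interpreted as markup in the note.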

Running on nova (no GPU)

If you run on nova without a dedicated GPU, set the following in .env:

WHISPER_DEVICE=cpu
WHISPER_MODEL=medium   # large-v3 is slow on CPU
WHISPER_COMPUTE=int8

Then remove the deploy.resources block from docker-compose.yml so that Compose does not request a GPU.
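These three variables map directly onto the arguments of faster-whisper's `WhisperModel`. The sketch below shows one plausible way to wire them up. The defaults (`cuda`, `large-v3`, `float16`) and the function names are assumptions for illustration, not values confirmed from .env.example.

```python
import os
from typing import Mapping


def whisper_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Translate the WHISPER_* variables into WhisperModel keyword arguments."""
    device = env.get("WHISPER_DEVICE", "cuda")  # assumed default
    # Assumed fallback: float16 on GPU, int8 on CPU.
    default_compute = "float16" if device == "cuda" else "int8"
    return {
        "model_size_or_path": env.get("WHISPER_MODEL", "large-v3"),
        "device": device,
        "compute_type": env.get("WHISPER_COMPUTE", default_compute),
    }


def transcribe(audio_path: str, env: Mapping[str, str] = os.environ) -> str:
    """Load the model and return the transcript as a single string."""
    # Imported lazily so the settings helper works without the package installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(**whisper_settings(env))
    segments, _info = model.transcribe(audio_path)
    # `segments` is a generator; joining it is what actually runs the decoding.
    return " ".join(segment.text.strip() for segment in segments)
```

In a long-running service the model would normally be loaded once and reused, rather than once per call as in this short example.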

Running on media4 (RTX 4070)

This is the ideal host. The default config is already set for CUDA. Make sure the NVIDIA Container Toolkit is installed, then verify that containers can see the GPU:

# Test GPU access in Docker
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi

Optional: NixOS module

If you want to manage this via NixOS on nova/media4 rather than raw Docker Compose, the key pieces are:

  • virtualisation.oci-containers.containers.signal-api
  • virtualisation.oci-containers.containers.voice-pipeline
  • A systemd service that runs voice_to_trilium.py with a venv

The Docker Compose approach is simpler to iterate on first.

Whisper model sizes (reference)

Model      VRAM     Speed (RTX 4070)   Quality
tiny       ~1 GB    very fast          basic
small      ~2 GB    fast               good
medium     ~5 GB    moderate           great
large-v3   ~10 GB   moderate           best

The RTX 4070 has 12 GB of VRAM, so large-v3 (~10 GB) fits.
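The same reasoning applies to other cards: pick the largest model whose approximate VRAM need fits the card with some headroom. The helper below is a hypothetical illustration using the rough figures from the table. The 1 GB headroom is an arbitrary assumption.

```python
from typing import Optional

# Approximate VRAM needs in GB, smallest to largest, copied from the table above.
MODEL_VRAM_GB = [("tiny", 1), ("small", 2), ("medium", 5), ("large-v3", 10)]


def largest_model_that_fits(vram_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the largest model that fits in vram_gb minus headroom, or None."""
    budget = vram_gb - headroom_gb
    best = None
    for name, needed in MODEL_VRAM_GB:
        if needed <= budget:
            best = name  # list is ordered, so later matches are larger models
    return best


print(largest_model_that_fits(12))  # RTX 4070: large-v3
print(largest_model_that_fits(8))   # an 8 GB card: medium
```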

Troubleshooting

"Signal receive error" — check the signal-api container is healthy:

curl http://localhost:8888/v1/about

"Could not get dateNote" — verify your TRILIUM_TOKEN and TRILIUM_URL. Test:

curl -H "Authorization: Bearer YOUR_TOKEN" http://YOUR_TRILIUM/etapi/calendar/days/2025-01-15

Voice note not detected — Signal voice notes are audio/ogg or audio/opus. If you're using a third-party recorder app sending a different format, it should still work as long as it's an audio MIME type. Check logs for "Attachment has no id" messages.
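The detection rule described above (any attachment with an id and an audio MIME type) can be sketched as follows. The JSON field names (`envelope`, `dataMessage`, `syncMessage.sentMessage`, `attachments`, `contentType`, `id`) are assumptions based on typical signal-cli output, and the function names are hypothetical. Compare them against a real payload from your logs before relying on them.

```python
from typing import Iterator


def iter_attachments(message: dict) -> Iterator[dict]:
    """Yield attachments from a received message.

    Notes sent to yourself from the phone are assumed to arrive as a sync
    message (syncMessage.sentMessage) rather than a regular dataMessage,
    so both places are checked.
    """
    envelope = message.get("envelope") or {}
    data_msg = envelope.get("dataMessage") or {}
    sync_sent = (envelope.get("syncMessage") or {}).get("sentMessage") or {}
    yield from data_msg.get("attachments") or []
    yield from sync_sent.get("attachments") or []


def classify_attachment(att: dict) -> str:
    """Return "ok" for a usable audio attachment, otherwise the skip reason."""
    if not att.get("id"):
        return "skip: attachment has no id"
    content_type = (att.get("contentType") or "").lower()
    if not content_type.startswith("audio/"):
        return f"skip: not audio ({content_type or 'unknown type'})"
    return "ok"
```

Logging the returned reason for every skipped attachment makes it clear whether a missing note is a format problem or a payload problem.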