voice-to-trilium

Send a Signal voice note to yourself → auto-transcribed by Whisper → saved as a child note under today's Trilium dateNote.

Architecture

Android Signal app
  └─ voice note (to yourself)
       └─ signal-cli-rest-api  (linked device, Docker)
            └─ voice_to_trilium.py  (polls every 10s)
                 ├─ faster-whisper  (GPU transcription)
                 └─ Trilium ETAPI   (creates child note)
                      └─ Signal reply (confirmation)
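
The flow in the diagram can be sketched as a small polling loop. This is a minimal sketch and not the actual contents of voice_to_trilium.py: the names poll_once, run_forever, receive, transcribe, save_note and reply are hypothetical placeholders for the Signal, Whisper and Trilium steps, and the 10-second interval is taken from the diagram.

```python
import time
from typing import Callable, Iterable


def poll_once(
    receive: Callable[[], Iterable[bytes]],
    transcribe: Callable[[bytes], str],
    save_note: Callable[[str], None],
    reply: Callable[[str], None],
) -> int:
    """Process every pending voice note once; return how many were saved."""
    handled = 0
    for audio in receive():
        text = transcribe(audio).strip()
        if not text:
            continue  # nothing recognisable: no note, no confirmation
        save_note(text)                      # Trilium step (hypothetical callable)
        reply(f"Saved voice note: {text}")   # Signal confirmation (hypothetical)
        handled += 1
    return handled


def run_forever(step: Callable[[], int], interval_s: float = 10.0) -> None:
    """Call step() every interval_s seconds, logging failures instead of exiting."""
    while True:
        try:
            step()
        except Exception as exc:  # keep polling even if one cycle fails
            print(f"poll failed: {exc}")
        time.sleep(interval_s)
```

Passing the four steps in as callables keeps the loop testable without a Signal account, a GPU or a Trilium instance.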

Prerequisites

Docker + Docker Compose with NVIDIA runtime (for GPU) on nova/media4
Trilium running and accessible from the pipeline host
Your Signal account (no second number needed — we link as a secondary device)

Setup

1. Clone / copy files

mkdir ~/voice-to-trilium && cd ~/voice-to-trilium
# copy all files here
cp .env.example .env

2. Configure .env

nano .env

Fill in:

SIGNAL_NUMBER — your mobile number in international format (e.g. +447700000000)
TRILIUM_URL — e.g. http://192.168.1.50:8080
TRILIUM_TOKEN — from Trilium → Options → ETAPI → Generate new token
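
A minimal sketch of how these three values might be read and sanity-checked at startup. The Settings class and load_settings function are hypothetical names, not taken from voice_to_trilium.py, and the phone-number pattern is only a loose check for "+ followed by digits", not full validation.

```python
import os
import re
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    signal_number: str
    trilium_url: str
    trilium_token: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the three required values and fail early with a readable message."""
    required = ("SIGNAL_NUMBER", "TRILIUM_URL", "TRILIUM_TOKEN")
    missing = [key for key in required if not env.get(key, "").strip()]
    if missing:
        raise ValueError(f"missing settings: {', '.join(missing)}")

    number = env["SIGNAL_NUMBER"].strip()
    # Loose international-format check: '+' then 7 to 15 digits.
    if not re.fullmatch(r"\+[1-9]\d{6,14}", number):
        raise ValueError("SIGNAL_NUMBER must look like +447700000000")

    return Settings(
        signal_number=number,
        trilium_url=env["TRILIUM_URL"].strip().rstrip("/"),  # drop trailing slash
        trilium_token=env["TRILIUM_TOKEN"].strip(),
    )
```

Failing at startup on a missing or malformed value is usually easier to diagnose than an HTTP error later in the polling loop.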

3. Link Signal (one-time)

Start only the Signal API container:

docker compose up signal-api

Open in your browser:

http://<host-ip>:8888/v1/qrcodelink?device_name=voice-bot

On your phone: Signal → Settings → Linked Devices → + → scan the QR code.

You should see "voice-bot" appear in your linked devices list. Once it does, press Ctrl+C to stop the container.

4. Start everything

docker compose up -d

5. Test it

Open Signal on your phone. Send a voice note to yourself (your own number in contacts, or the "Note to Self" conversation).

Within roughly 10 seconds (the polling interval), plus however long transcription takes, you should:

See a new child note appear under today's date in Trilium
Receive a Signal reply confirming the transcript

6. Logs

docker compose logs -f voice-pipeline

Trilium Note Structure

Each voice note creates:

📅 2025-01-15 (dateNote)
  └── 🎙️ Voice Note 14:32 #voicenote
        <transcript text>
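
A hedged sketch of the two ETAPI requests that could produce this structure: one to create the child note and one to attach the #voicenote label. The function names are hypothetical, the title format is inferred from the example above, and the Bearer header follows the curl example in Troubleshooting. This is not necessarily how voice_to_trilium.py does it.

```python
import html
import json
import urllib.request
from datetime import datetime


def _post(base_url: str, token: str, path: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request against the ETAPI."""
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/etapi/{path}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def create_note_request(base_url: str, token: str, date_note_id: str,
                        transcript: str, when: datetime) -> urllib.request.Request:
    """Child note under the dateNote; text notes store HTML, so escape the transcript."""
    payload = {
        "parentNoteId": date_note_id,
        "title": f"Voice Note {when:%H:%M}",  # assumed title format
        "type": "text",
        "content": f"<p>{html.escape(transcript)}</p>",
    }
    return _post(base_url, token, "create-note", payload)


def label_request(base_url: str, token: str, note_id: str) -> urllib.request.Request:
    """Attach the #voicenote label to an existing note."""
    payload = {"noteId": note_id, "type": "label", "name": "voicenote", "value": ""}
    return _post(base_url, token, "attributes", payload)
```

Each request can be sent with urllib.request.urlopen(req). The dateNote id would come from GET /etapi/calendar/days/YYYY-MM-DD, and the id of the new note from the create-note response.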

Running on nova (no GPU)

To run on nova without a dedicated GPU, set the following in .env:

WHISPER_DEVICE=cpu
WHISPER_MODEL=medium   # large-v3 is slow on CPU
WHISPER_COMPUTE=int8

Then remove the deploy.resources block from docker-compose.yml.
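
As a sketch of how these three variables could map onto faster-whisper, the helper below turns them into WhisperModel arguments. The function names are hypothetical, and the fallback defaults (cuda, large-v3, float16) are assumptions based on this README rather than values read from the script.

```python
import os
from typing import Mapping


def whisper_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Translate WHISPER_* variables into keyword arguments for WhisperModel."""
    device = env.get("WHISPER_DEVICE", "cuda")  # assumed default
    default_compute = "float16" if device == "cuda" else "int8"  # assumed defaults
    return {
        "model_size_or_path": env.get("WHISPER_MODEL", "large-v3"),
        "device": device,
        "compute_type": env.get("WHISPER_COMPUTE", default_compute),
    }


def load_model(env: Mapping[str, str] = os.environ):
    """Create the model; imported lazily so the helper above works without the package."""
    from faster_whisper import WhisperModel

    return WhisperModel(**whisper_settings(env))
```

With the CPU values above, whisper_settings returns the medium model on the cpu device with int8 compute, which trades some accuracy for much lower memory use and faster CPU inference.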

Running on media4 (RTX 4070)

This is the ideal host. The default config is already set for CUDA. Make sure the NVIDIA Container Toolkit is installed:

# Test GPU access in Docker
docker run --rm --gpus all nvidia/cuda:12.0.0-base-ubuntu22.04 nvidia-smi

Optional: NixOS module

If you want to manage this via NixOS on nova/media4 rather than raw Docker Compose, the key pieces are:

virtualisation.oci-containers.containers.signal-api
virtualisation.oci-containers.containers.voice-pipeline
A systemd service that runs voice_to_trilium.py with a venv

The Docker Compose approach is simpler to iterate on first.

Whisper model sizes (reference)

Model	VRAM	Speed (RTX 4070)	Quality
tiny	~1 GB	very fast	basic
small	~2 GB	fast	good
medium	~5 GB	moderate	great
large-v3	~10 GB	moderate	best

The RTX 4070 has 12 GB of VRAM, so large-v3 (~10 GB) fits with roughly 2 GB to spare.
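
The table can also be used to pick a model for a given card. A minimal sketch, using the approximate VRAM figures above. The function name and the 1 GB headroom default are illustrative assumptions, and real usage varies with compute type and audio length.

```python
from typing import Optional

# Approximate VRAM needs in GB, copied from the table above.
MODEL_VRAM_GB = {"tiny": 1, "small": 2, "medium": 5, "large-v3": 10}


def largest_model_that_fits(vram_gb: float, headroom_gb: float = 1.0) -> Optional[str]:
    """Return the biggest model whose estimate plus headroom fits, or None."""
    fitting = [
        (need, name)
        for name, need in MODEL_VRAM_GB.items()
        if need + headroom_gb <= vram_gb
    ]
    # Tuples sort by VRAM need first, so max() picks the largest fitting model.
    return max(fitting)[1] if fitting else None
```

For a 12 GB card this selects large-v3, and for an 8 GB card it falls back to medium.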

Troubleshooting

"Signal receive error" — check the signal-api container is healthy:

curl http://localhost:8888/v1/about

"Could not get dateNote" — verify your TRILIUM_TOKEN and TRILIUM_URL. Test:

curl -H "Authorization: Bearer YOUR_TOKEN" http://YOUR_TRILIUM/etapi/calendar/days/2025-01-15

Voice note not detected — Signal voice notes are audio/ogg or audio/opus. If you're using a third-party recorder app sending a different format, it should still work as long as it's an audio MIME type. Check logs for "Attachment has no id" messages.
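
To illustrate the detection rule described above, here is a hedged sketch that filters a received message for audio attachments. The JSON layout (envelope, dataMessage, syncMessage.sentMessage, attachments with contentType and id) is an assumption about what signal-cli-rest-api returns, and audio_attachments is a hypothetical name. Compare against your own logs before relying on it.

```python
from typing import Any, Iterator


def audio_attachments(message: dict) -> Iterator[dict]:
    """Yield attachments whose MIME type starts with 'audio/' and that have an id."""
    envelope: dict = message.get("envelope") or {}
    # Assumed layout: direct messages use dataMessage; notes sent to yourself
    # arrive as a sync of your own sent message.
    data: dict = envelope.get("dataMessage") or {}
    sync: dict = (envelope.get("syncMessage") or {}).get("sentMessage") or {}

    for source in (data, sync):
        for att in source.get("attachments") or []:
            content_type: Any = att.get("contentType", "")
            if not str(content_type).startswith("audio/"):
                continue  # images, documents, etc. are ignored
            if not att.get("id"):
                print("Attachment has no id")  # nothing to download
                continue
            yield att
```

Matching on the audio/ prefix rather than an exact type is what lets recordings from other apps pass through.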