# GemmaNotes Push-to-talk voice notes for Obsidian, transcribed **entirely on-device** with Google's Gemma 4 (E2B / E4B) running in-process via transformers.js. No server, no API key β€” fully offline after a one-time model download. πŸš€ Install here: https://community.obsidian.md/plugins/gemmanotes Built with Love and Antigravity CLI, Gemini 3.5 Flash ``` β–„β–€β–€β–„ Antigravity CLI 1.0.2 β–€β–€β–€β–€β–€β–€ xxxxxxxxxxxxxxxxxxxxx β–€β–€β–€β–€β–€β–€β–€β–€ Gemini 3.5 Flash (High) β–„β–€β–€ β–€β–€β–„ ~/gemmanotes β–„β–€β–€ β–€β–€β–„ ``` ## How it works 1. Toggle recording with the ribbon mic button or the **Toggle voice recording** command (bind a hotkey to it). 2. On stop, a placeholder is pinned into the current note. 3. Audio is decoded to 16 kHz mono, split into ≀25 s chunks, and transcribed by Gemma 4. Transcriptions run through a FIFO queue, so you can start the next recording while one is still processing. 4. The placeholder is swapped for the transcribed text. 5. A status-bar hint offers a one-click **rewrite** into tidy prose. 6. Network use: Only needed for downloading and caching models from huggingface. ## Local Setup Git clone: ```bash gh clone https://github.com/sarath/gemmanotes ``` Development: ```bash npm install npm run build # production bundle -> main.js npm run dev # watch mode ``` Install into a vault for local testing: Mac ```bash git clone https://github.com/sarath/gemmanotes ./install.sh /path/to/your/vault ``` This fetches `origin`, checks out `origin/` cleanly, runs `npm run build`, and copies `main.js`, `manifest.json`, and `styles.css` into `/.obsidian/plugins/gemmanotes/`. Then enable the plugin and use **Settings β†’ GemmaNotes β†’ Download** to fetch the model (~3.2 GB for E2B, ~5 GB for E4B). ## Local Testing and Development If you are developing inside Google Cloud Shell and want to deploy, test, and debug the plugin directly in a local Obsidian instance: ### 1. Tunnel Remote Debugging Port Start your local Obsidian instance with remote debugging enabled (e.g., `obsidian --remote-debugging-port=9222`). Then, run the following command to tunnel port `9222` from your local machine to your Cloud Shell workspace: ```bash gcloud cloud-shell ssh --ssh-flag="-L 9222:localhost:9222" ``` This allows scripts running in the Cloud Shell environment to connect to the Chrome DevTools protocol on your local Obsidian instance. ### 2. Build, Deploy, and Test 1. Compile the production bundle: ```bash npm run build ``` 2. Deploy the files (`main.js`, `manifest.json`, `styles.css`) to the connected Obsidian instance and automatically reload the plugin: ```bash npm run local:deploy ``` 3. Test model loading and verify initialization: ```bash npm run local:test-load ``` ## Known v1 limitations - **Fixed-window chunking** β€” long recordings are cut on a 25 s boundary, which can split a word. Silence-aware chunking is the planned next step. - **transformers.js Gemma 4 audio API** β€” the processor/model call in `src/transcriber.ts` follows the documented image-text-to-text pattern; the exact audio surface should be confirmed against the model card sample code. - **Undo granularity** β€” placeholderβ†’text and rewrite swaps are written via the vault API; they are not always a single editor-undo step. - **Model cache** β€” stored in the browser Cache API. Persistent and offline-safe, but not yet relocated to the plugin data dir. - Desktop only β€” the model size and WebGPU requirement rule out mobile.