# 实时语音转写 · Realtime Transcription for Obsidian
中文 | English
基于 SenseVoice-Small + Silero VAD + sherpa-onnx 的本地实时语音转写 Obsidian 插件
A local, real-time speech-to-text Obsidian plugin powered by SenseVoice-Small + Silero VAD + sherpa-onnx
---
## 中文文档
### 功能特性
| 功能 | 说明 |
|------|------|
| **本地实时转写** | 完全本地运行,无需联网,支持边说边显示 |
| **多语言识别** | 中文 / 英文 / 日文 / 韩文 / 粤语 |
| **识别语言范围** | 可限定为纯中文、纯英文或中英混杂模式 |
| **实时预览模式** | 稳态档(更准)/ 极速档(更快)两档切换 |
| **自动翻译** | 检测到非中文内容时,自动调用 OpenAI 兼容 API 翻译成中文 |
| **AI 文本润色** | 手动触发,将口语化转写润色为规范书面语 |
| **AI 自动摘要** | 按字数阈值自动生成摘要(默认每 500 字触发一次) |
| **二次摘要(综合总结)** | 累积多个摘要后自动生成一份综合总结 |
| **导出为笔记** | 一键导出为 Obsidian Markdown 笔记,支持时间戳/AI/手动三种命名方式 |
| **历史记录持久化** | 关闭 Obsidian 后转写记录不丢失 |
| **跨平台支持** | macOS / Windows / Linux 全平台兼容 |
### 架构概览
```
Obsidian 插件 (TypeScript)
├── src/
│ ├── main.ts # 插件主入口,协调所有服务
│ ├── settings.ts # 设置面板 UI
│ ├── types.ts # 类型定义
│ ├── constants.ts # 常量
│ ├── services/
│ │ ├── BackendManager.ts # Python 后端进程管理(启动/停止)
│ │ ├── WebSocketClient.ts # 与后端的 WebSocket 通信
│ │ ├── AudioCapture.ts # 麦克风音频采集(Web Audio API)
│ │ ├── TranslationService.ts # 调用 LLM API 翻译
│ │ ├── SummaryService.ts # 调用 LLM API 摘要 + AI 命名
│ │ └── FormalizeService.ts # 调用 LLM API 文本润色
│ ├── views/
│ │ ├── TranscriptionView.ts # 右侧边栏主视图
│ │ └── TitleInputModal.ts # 手动命名导出弹窗
│ └── utils/
│ ├── pluginPaths.ts # 插件目录解析
│ └── entrySerializer.ts # 历史记录序列化
│
└── backend/ # Python 后端
├── server.py # WebSocket 服务端(sherpa-onnx 推理)
├── download_model.py # 模型自动下载脚本
└── requirements.txt # Python 依赖
数据流:
麦克风 → AudioCapture → WebSocket → server.py → sherpa-onnx
↓
partial/final 文本 → 聚合 → 翻译/摘要/润色 → 视图
```
---
### 第一步:安装 Obsidian 插件
#### 方式 A:从 Release 直接安装(推荐普通用户)
1. 前往 [Releases](https://github.com/garetneda-gif/obsidian-realtime-transcription/releases) 下载最新版本的 zip 文件
2. 解压后,将文件夹**重命名为** `realtime-transcription`
3. 将该文件夹整体复制到你 Vault 的插件目录:
- macOS / Linux:`<你的Vault>/.obsidian/plugins/realtime-transcription/`
- Windows:`<你的Vault>\.obsidian\plugins\realtime-transcription\`
> 不知道 Vault 在哪?打开 Obsidian → 左下角「管理库」→ 查看库的本地路径。
4. 打开 Obsidian → **设置** → **第三方插件** → 关闭安全模式 → 找到「实时语音转写」并启用
#### 方式 B:从源码构建(推荐开发者)
```bash
git clone https://github.com/garetneda-gif/obsidian-realtime-transcription.git
cd obsidian-realtime-transcription
npm install
npm run build
# 构建产物:根目录的 main.js
# 将 manifest.json、main.js、styles.css、backend/ 复制到 Vault 插件目录
```
> 若你使用 remotely-save,可能在同步结束时被旧版 `main.js` 覆盖。可在同步完成后执行:
>
> ```bash
> npm run post-sync-refresh -- --vault "/你的/Vault/路径" --vault-name "你的Vault名称"
> ```
>
> 该命令会再次复制插件文件,并通过 Obsidian CLI 执行 `plugin:reload` 强制重载。
---
### 第二步:安装 Python
> 如果你已有 Python 3.10 ~ 3.12,可跳过此步。
**检查是否已安装 Python:**
```bash
# macOS / Linux
python3 --version
# Windows(命令提示符或 PowerShell)
python --version
```
输出类似 `Python 3.11.x` 则已安装,可继续。否则按下方系统安装:
| 系统 | 安装方式 |
|------|---------|
| **macOS** | 推荐:`brew install python@3.12`(需先安装 [Homebrew](https://brew.sh))
或:从 [python.org](https://www.python.org/downloads/) 下载安装包 |
| **Windows** | 从 [python.org](https://www.python.org/downloads/) 下载安装包,**安装时务必勾选「Add Python to PATH」** |
| **Linux** | `sudo apt install python3.12 python3.12-pip`(Ubuntu/Debian) |
> **推荐版本:3.10 / 3.11 / 3.12**。3.13 和 3.14 兼容性尚未充分测试,不建议使用。
---
### 第三步:安装 Python 依赖
在插件目录的 `backend/` 文件夹中提供了一键安装脚本,**运行一次即可**,无需手动输入任何 pip 命令。
#### macOS / Linux
双击 `backend/setup.command`(macOS 可直接双击运行),或在终端中运行:
```bash
cd <你的Vault>/.obsidian/plugins/realtime-transcription/backend
bash setup.command
```
#### Windows
双击 `backend\setup.bat`,或在命令提示符中运行:
```bat
cd <你的Vault>\.obsidian\plugins\realtime-transcription\backend
setup.bat
```
脚本会自动完成:创建虚拟环境 → 安装所有依赖 → 验证安装 → **输出第五步需要填写的 Python 路径**。
> **遇到报错?** 确认已安装 Python 3.10~3.12,且终端/PowerShell 有网络访问权限。
---
### 第四步:准备模型文件
模型文件需要存放在一个**你自己创建的目录**中(插件不会自动创建目录)。
**先创建模型目录:**
```bash
# macOS / Linux
mkdir -p ~/obsidian-models
# Windows(命令提示符)
mkdir C:\Users\你的用户名\obsidian-models
```
**然后下载模型(二选一):**
#### 方法一:插件内一键下载(推荐)
1. 打开 Obsidian → 设置 → 实时语音转写 → **模型设置**
2. 在「模型目录」字段填入刚创建的目录路径:
- macOS/Linux:`/Users/你的用户名/obsidian-models`
- Windows:`C:\Users\你的用户名\obsidian-models`
3. 点击 **下载模型** 按钮(约 240 MB,需要网络,耐心等待)
4. 弹出「模型下载完成!」通知后即可
#### 方法二:手动下载
将以下三个文件**分别下载**到同一目录(每个都是独立文件,无需解压):
| 文件 | 下载链接(点击直接下载) | 大小 |
|------|------------------------|------|
| `model.int8.onnx` | [HuggingFace 下载](https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/model.int8.onnx) · [国内镜像](https://hf-mirror.com/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/model.int8.onnx) | ~229 MB |
| `tokens.txt` | [HuggingFace 下载](https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/tokens.txt) · [国内镜像](https://hf-mirror.com/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/tokens.txt) | <1 MB |
| `silero_vad.onnx` | [GitHub 下载](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx) | ~1.8 MB |
> **提示**:建议保持「使用 Int8 量化模型」为开启状态(默认已开启),可将模型体积从 895 MB 压缩至 229 MB,精度基本无损。
**确认三个文件都在目录中:**
```bash
ls ~/obsidian-models
# 应该看到:model.int8.onnx tokens.txt silero_vad.onnx
```
---
### 第五步:插件配置
打开 Obsidian → **设置** → **实时语音转写**,按以下顺序配置:
#### 后端设置
- `Python 路径`:填写 Python 的路径
- macOS / Linux:填 `python3`(大多数情况下直接可用)
- Windows:填 `python`(插件会自动设置此默认值)
> 如果默认值不工作,需要获取 Python 完整路径:
> - macOS/Linux:在终端运行 `which python3`
> - Windows:在命令提示符运行 `where python`,复制第一行结果
各平台路径示例:
| 系统 | Python 路径示例 |
|------|----------------|
| macOS(系统 Python) | `python3` 或 `/usr/local/bin/python3` |
| macOS(虚拟环境,推荐) | `/Users/你的用户名/.../backend/venv/bin/python` |
| Windows | `C:\Users\yourname\AppData\Local\Programs\Python\Python312\python.exe` |
| Linux | `python3` 或 `/usr/bin/python3` |
- `后端端口`:默认 18888,一般无需修改
- 点击 **检测环境** 按钮验证配置:
- **成功**:弹出通知「环境检测通过:Python + sherpa-onnx 可用」→ 可继续
- **失败**:见下方[环境检测失败排查](#环境检测失败排查)
#### 模型设置
- `模型目录`:填入第四步中创建的目录完整路径
- `识别语言范围`:`中英混杂`(默认)/ `纯中文` / `纯英文`
> 说中文时识别出日语或韩语?将此项改为「纯中文」。
#### 翻译 / 润色 / 摘要设置(均为可选)
这三项功能需要调用 AI 接口,支持任意 **OpenAI 兼容 API**(如 DeepSeek、通义千问、本地 Ollama 等)。
| 字段 | 填写说明 |
|------|---------|
| API 端点 | 完整 URL,例如 `https://api.deepseek.com/v1/chat/completions` |
| API Key | 对应服务的密钥,以 `sk-` 开头 |
| 模型名称 | 例如 `deepseek-chat`、`qwen-turbo`、`gpt-4o-mini` |
> 如果暂时不需要翻译/摘要功能,直接跳过这三项,保持关闭状态即可。
#### 高级设置(可选,默认值已够用)
| 参数 | 说明 | 推荐值 |
|------|------|--------|
| 实时模式预设 | 稳态档更准,极速档更快 | 稳态档 |
| 实时预览 | 边说边显示识别中的文字 | 开启 |
| VAD 静音阈值 | 越大分句越少 | 1.0 s |
| 聚合输出窗口 | 越大段落越长(延迟也越大) | 4 s |
| 单段最大字数 | 超过此长度自动换段 | 320 字 |
---
### 使用方法
1. 点击左侧 Ribbon 栏的**麦克风图标**,打开转写面板
2. 点击面板中的**开始录制**按钮
3. 对着麦克风说话,右侧面板实时显示转写文字
4. 说完后点击**停止录制**
5. 可选:点击任意条目上的**润色**按钮,用 AI 整理为书面语
6. 点击**导出笔记**,将转写内容保存为 Obsidian 笔记文件
---
### 常见问题排查
#### 环境检测失败排查
| 提示内容 | 可能原因 | 解决方案 |
|---------|---------|---------|
| 「环境检测失败,请执行 pip install...」| sherpa-onnx 依赖未安装 | 按第三步说明安装依赖后重试 |
| 「环境检测失败」但依赖已安装 | 使用了虚拟环境,但 Python 路径仍指向系统 Python | 将「Python 路径」改为虚拟环境路径,例如 `/path/to/backend/venv/bin/python` |
| 检测无反应,按钮灰色 | Python 路径字段为空 | macOS/Linux 填 `python3`;Windows 填 `python` |
| 「No such file or directory」| Python 路径不存在 | macOS/Linux 运行 `which python3`;Windows 运行 `where python` 获取正确路径 |
| Windows 上找不到 python | Python 未加入系统 PATH | 重新安装 Python,安装时勾选「Add Python to PATH」|
#### 后端启动失败:错误信息对照表
| 错误提示 | 原因 | 解决方案 |
|---------|------|---------|
| `模型文件缺失: model.int8.onnx` | 模型未下完或目录填错 | 检查模型目录路径,重新点击「下载模型」 |
| `模型文件缺失: tokens.txt` | 同上 | 同上 |
| `模型文件缺失: silero_vad.onnx` | 同上 | 同上 |
| `后端启动超时(30秒)` | 模型首次加载慢,或 Python 环境有问题 | 关闭其他占用内存的程序后重试;确认依赖已安装 |
| `[Errno 2] No such file or directory` | Python 路径填错 | 重新检查 Python 路径配置 |
> **查看详细错误日志**:
> - macOS:`Cmd + Option + I` → Console 标签
> - Windows:`Ctrl + Shift + I` → Console 标签
>
> 将红色报错信息复制后可在 [Issues](https://github.com/garetneda-gif/obsidian-realtime-transcription/issues) 提问。
#### 翻译返回 404 错误
检查 API URL 是否多写了 `/v1`:
```
# 错误
https://api.example.com/v1v1/chat/completions
# 正确
https://api.example.com/v1/chat/completions
```
#### 频繁出现 429 限流
- 换用速率更高的模型或提升 API 套餐额度
- 关闭自动翻译,改为手动触发
- 调大「聚合输出窗口」(减少 API 调用频率)
#### 识别结果分句太碎
在高级设置中调大:
- `VAD 静音阈值`(建议从 1.0 调到 1.5~2.0)
- `聚合输出窗口`(建议从 4 调到 6~8)
#### Windows 后端启动报 NotImplementedError
如果在 v1.0.2 或更早版本遇到 `NotImplementedError: add_signal_handler` 错误,请升级至 v1.0.3+。此问题已在新版本中修复。
#### macOS 首次运行弹出安全警告
macOS 可能拦截未经公证的 Python 脚本,出现「无法验证开发者」提示:
1. 打开「系统设置」→「隐私与安全性」
2. 找到相关提示,点击「仍要打开」或「允许」
3. 返回 Obsidian,重新点击开始录制
---
### 安全提示
- `data.json`(含 API Key)**不要**提交到 Git 或分享给他人
- 换新设备时建议手动在插件设置中重新填写 API Key
---
### Contributing
Pull requests and issues are welcome! Please:
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/my-feature`
3. Commit your changes following [Conventional Commits](https://www.conventionalcommits.org/)
4. Open a Pull Request
---
## English Documentation
### Features
| Feature | Description |
|---------|-------------|
| **Local real-time transcription** | Fully local inference, no internet required, streaming text display |
| **Multi-language recognition** | Chinese / English / Japanese / Korean / Cantonese |
| **Recognition mode** | Limit to Chinese-only, English-only, or mixed mode |
| **Real-time profile** | Stable mode (more accurate) / Fast mode (lower latency) |
| **Auto translation** | Automatically translate non-Chinese speech to Chinese via OpenAI-compatible API |
| **AI text formalization** | On-demand polishing of colloquial transcriptions into formal written text |
| **AI auto-summary** | Generate summaries after a configurable character threshold (default: 500 chars) |
| **Meta-summary** | Automatically generate a comprehensive summary after accumulating multiple summaries |
| **Export to note** | One-click export to Obsidian Markdown note; timestamp / AI-generated / manual title |
| **Persistent history** | Transcription history survives Obsidian restarts |
| **Cross-platform** | Fully compatible with macOS / Windows / Linux |
### Architecture Overview
```
Obsidian Plugin (TypeScript)
├── src/
│ ├── main.ts # Plugin entry point, orchestrates all services
│ ├── settings.ts # Settings UI tab
│ ├── types.ts # TypeScript type definitions
│ ├── constants.ts # Shared constants
│ ├── services/
│ │ ├── BackendManager.ts # Python backend process lifecycle (start/stop)
│ │ ├── WebSocketClient.ts # WebSocket communication with backend
│ │ ├── AudioCapture.ts # Microphone capture (Web Audio API)
│ │ ├── TranslationService.ts # LLM API calls for translation
│ │ ├── SummaryService.ts # LLM API calls for summarization & AI naming
│ │ └── FormalizeService.ts # LLM API calls for text formalization
│ ├── views/
│ │ ├── TranscriptionView.ts # Main sidebar panel view
│ │ └── TitleInputModal.ts # Manual note-naming modal
│ └── utils/
│ ├── pluginPaths.ts # Plugin directory resolution
│ └── entrySerializer.ts # History serialization/deserialization
│
└── backend/ # Python backend
├── server.py # WebSocket server (sherpa-onnx inference)
├── download_model.py # Automatic model downloader
└── requirements.txt # Python dependencies
Data Flow:
Microphone → AudioCapture → WebSocket → server.py → sherpa-onnx
↓
partial/final text → aggregation → translate/summarize/formalize → view
```
---
### Step 1: Install the Obsidian Plugin
#### Option A: From Release (recommended for regular users)
1. Go to [Releases](https://github.com/garetneda-gif/obsidian-realtime-transcription/releases) and download the latest zip file
2. Extract it and **rename the folder** to `realtime-transcription`
3. Copy the folder to your Vault's plugins directory:
- macOS / Linux: `/.obsidian/plugins/realtime-transcription/`
- Windows: `\.obsidian\plugins\realtime-transcription\`
> Not sure where your Vault is? Open Obsidian → Click the vault icon (bottom left) → View the local path.
4. Open Obsidian → **Settings** → **Community Plugins** → Disable Safe Mode → Enable **Realtime Transcription**
#### Option B: Build from Source (for developers)
```bash
git clone https://github.com/garetneda-gif/obsidian-realtime-transcription.git
cd obsidian-realtime-transcription
npm install
npm run build
# Copy manifest.json, main.js, styles.css, backend/ to your Vault plugin directory
```
> If you use remotely-save, sync may overwrite `main.js` with an older version at the end of sync. Run this right after sync:
>
> ```bash
> npm run post-sync-refresh -- --vault "/path/to/your/vault" --vault-name "Your Vault Name"
> ```
>
> This command recopies plugin files and then forces plugin reload via Obsidian CLI (`plugin:reload`).
---
### Step 2: Install Python
> Skip this step if you already have Python 3.10–3.12.
**Check if Python is installed:**
```bash
# macOS / Linux
python3 --version
# Windows (Command Prompt or PowerShell)
python --version
```
If you see `Python 3.11.x` (or similar), you're good. Otherwise, install Python for your OS:
| OS | How to install |
|----|----------------|
| **macOS** | Recommended: `brew install python@3.12` (requires [Homebrew](https://brew.sh))
Or: Download installer from [python.org](https://www.python.org/downloads/) |
| **Windows** | Download installer from [python.org](https://www.python.org/downloads/). **Check "Add Python to PATH"** during install. |
| **Linux** | `sudo apt install python3.12 python3.12-pip` (Ubuntu/Debian) |
> **Recommended versions: 3.10 / 3.11 / 3.12.** Versions 3.13 and 3.14 have not been fully tested.
---
### Step 3: Install Python Dependencies
A one-shot setup script is included in the `backend/` folder. **Run it once** — no manual pip commands needed.
#### macOS / Linux
Double-click `backend/setup.command` (macOS supports direct double-click), or run in terminal:
```bash
cd /.obsidian/plugins/realtime-transcription/backend
bash setup.command
```
#### Windows
Double-click `backend\setup.bat`, or run in Command Prompt:
```bat
cd \.obsidian\plugins\realtime-transcription\backend
setup.bat
```
The script automatically: creates a virtual environment → installs all dependencies → verifies the install → **prints the Python path you need for Step 5**.
> **Errors?** Make sure Python 3.10–3.12 is installed and your terminal has internet access.
---
### Step 4: Prepare Model Files
You need to create a folder to store model files. The plugin will not create it automatically.
**Create the model directory:**
```bash
# macOS / Linux
mkdir -p ~/obsidian-models
# Windows (Command Prompt)
mkdir C:\Users\YourUsername\obsidian-models
```
**Then download the models (choose one method):**
#### Method 1: In-plugin download (recommended)
1. Open Obsidian → Settings → Realtime Transcription → **Model Settings**
2. Enter the directory path you just created in the **Model Directory** field:
- macOS/Linux: `/Users/YourUsername/obsidian-models`
- Windows: `C:\Users\YourUsername\obsidian-models`
3. Click **Download Model** (~240 MB, requires internet, please wait patiently)
4. A notification "Model download complete!" means success
#### Method 2: Manual download
Download each file **individually** into the same directory (all are standalone files, no extraction needed):
| File | Download Link | Size |
|------|---------------|------|
| `model.int8.onnx` | [HuggingFace](https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/model.int8.onnx) · [Mirror (China)](https://hf-mirror.com/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/model.int8.onnx) | ~229 MB |
| `tokens.txt` | [HuggingFace](https://huggingface.co/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/tokens.txt) · [Mirror (China)](https://hf-mirror.com/csukuangfj/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/resolve/main/tokens.txt) | <1 MB |
| `silero_vad.onnx` | [GitHub](https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/silero_vad.onnx) | ~1.8 MB |
> **Tip**: Keep `Use Int8 quantized model` enabled (default). It reduces model size from 895 MB to 229 MB with negligible quality loss.
**Verify all three files are present:**
```bash
ls ~/obsidian-models
# Should show: model.int8.onnx tokens.txt silero_vad.onnx
```
---
### Step 5: Configure the Plugin
Open Obsidian → **Settings** → **Realtime Transcription** and configure in order:
#### Backend Settings
- `Python Path`: Enter your Python path
- macOS / Linux: Enter `python3` (works in most cases)
- Windows: Enter `python` (the plugin auto-detects this default)
> If the default doesn't work, find the full Python path:
> - macOS/Linux: Run `which python3` in terminal
> - Windows: Run `where python` in Command Prompt, use the first result
Path examples by OS:
| OS | Python Path Example |
|----|---------------------|
| macOS (system Python) | `python3` or `/usr/local/bin/python3` |
| macOS (virtual env, recommended) | `/Users/yourname/.../backend/venv/bin/python` |
| Windows | `C:\Users\yourname\AppData\Local\Programs\Python\Python312\python.exe` |
| Linux | `python3` or `/usr/bin/python3` |
- `Backend Port`: Default 18888, usually no need to change
- Click **Check Environment** to verify your setup:
- **Success**: Notification "Environment check passed: Python + sherpa-onnx available" → proceed
- **Failed**: See [Environment Check Failures](#environment-check-failures) below
#### Model Settings
- `Model Directory`: Enter the full path to the directory from Step 4
- `Recognition Mode`: `Chinese+English` (default) / `Chinese only` / `English only`
> Getting Japanese/Korean when speaking Chinese? Switch to `Chinese only`.
#### Translation / Formalization / Summary Settings (all optional)
These features require an AI API. Any **OpenAI-compatible API** works (OpenAI, DeepSeek, Qwen, local Ollama, etc.).
| Field | What to enter |
|-------|---------------|
| API Endpoint | Full URL, e.g. `https://api.deepseek.com/v1/chat/completions` |
| API Key | Your service API key (usually starts with `sk-`) |
| Model Name | e.g. `deepseek-chat`, `qwen-turbo`, `gpt-4o-mini` |
> If you don't need translation/summary now, just skip these sections and leave them disabled.
#### Advanced Settings (optional, defaults work well)
| Setting | Description | Default |
|---------|-------------|---------|
| Realtime Profile | Stable (more accurate) / Fast (lower latency) | Stable |
| Realtime Preview | Show partial results while speaking | On |
| VAD Silence Threshold | Larger = fewer sentence splits | 1.0 s |
| Aggregation Window | Larger = longer paragraphs (more delay) | 4 s |
| Max Chars per Segment | Auto-split when segment exceeds this length | 320 chars |
---
### Usage
1. Click the **microphone icon** in the left Ribbon to open the transcription panel
2. Click **Start Recording**
3. Speak — text appears in real time on the right panel
4. Click **Stop Recording** when done
5. Optionally click the **Formalize** button on any entry to polish the text
6. Click **Export Note** to save as an Obsidian Markdown file
---
### Troubleshooting
#### Environment Check Failures
| Message | Likely Cause | Fix |
|---------|-------------|-----|
| "Environment check failed, please run pip install..." | sherpa-onnx not installed | Follow Step 3 to install dependencies, then retry |
| "Environment check failed" but packages are installed | Used a venv but Python Path still points to system Python | Set Python Path to the venv path, e.g. `/path/to/backend/venv/bin/python` |
| No response, button stays gray | Python Path field is empty | Enter `python3` (macOS/Linux) or `python` (Windows) |
| "No such file or directory" | Python path is wrong | Run `which python3` (macOS/Linux) or `where python` (Windows) to get the correct path |
| Python not found (Windows) | Python not added to PATH | Reinstall Python and check "Add Python to PATH" |
#### Backend Startup Failures
| Error Message | Cause | Fix |
|---------------|-------|-----|
| `Model file missing: model.int8.onnx` | Download incomplete or wrong directory | Check model directory path; re-run the Download Model step |
| `Model file missing: tokens.txt` | Same as above | Same as above |
| `Model file missing: silero_vad.onnx` | Same as above | Same as above |
| `Backend startup timed out (30s)` | Slow first load or bad Python env | Close other memory-heavy apps; verify all dependencies are installed |
| `[Errno 2] No such file or directory` | Python path is wrong | Double-check the Python Path setting |
> **View detailed error logs:**
> - macOS: `Cmd + Option + I` → Console tab
> - Windows: `Ctrl + Shift + I` → Console tab
>
> Copy any red error messages and open a [GitHub Issue](https://github.com/garetneda-gif/obsidian-realtime-transcription/issues) for help.
#### Translation Returns 404
Check for a duplicated `/v1` in your API URL:
```
# Wrong
https://api.example.com/v1v1/chat/completions
# Correct
https://api.example.com/v1/chat/completions
```
#### Frequent 429 Rate-Limit Errors
- Switch to a model with higher rate limits or upgrade your API plan
- Disable auto-translation and translate manually instead
- Increase the Aggregation Window to reduce API call frequency
#### Transcription Segments Are Too Choppy
Increase these two values in Advanced Settings:
- `VAD Silence Threshold` (try 1.5–2.0 s)
- `Aggregation Window` (try 6–8 s)
#### Windows Backend Throws NotImplementedError
If you encounter `NotImplementedError: add_signal_handler` on v1.0.2 or earlier, upgrade to v1.0.3+. This has been fixed.
#### macOS Security Warning on First Run
macOS may block unnotarized Python scripts with an "unidentified developer" alert:
1. Open **System Settings** → **Privacy & Security**
2. Scroll down to find the blocked item and click **Open Anyway**
3. Return to Obsidian and start recording again
---
### Security Notes
- Do **not** commit `data.json` (which contains API keys) to Git or share it with others
- On a new machine, re-enter API keys manually in plugin settings
### Contributing
Pull requests and issues are welcome! Please:
1. Fork the repository
2. Create a feature branch: `git checkout -b feature/my-feature`
3. Commit your changes following [Conventional Commits](https://www.conventionalcommits.org/)
4. Open a Pull Request
### License
MIT License — see [LICENSE](LICENSE) for details.