# HandTranscriptMd

Convert handwritten notes (drawn with a stylus on a canvas) into structured Markdown, directly inside Obsidian. Works on both Windows (desktop) and Android (mobile with stylus).

---

## Table of Contents

1. [User Guide](#user-guide)
2. [Project Structure & Architecture](#project-structure--architecture)
3. [Maintainability Cheat Sheet](#maintainability-cheat-sheet)

---

## User Guide

### What the Plugin Does

This plugin embeds a **handwriting canvas** inside any `.md` file. You draw or write with a stylus (or mouse), and the plugin can:

- **Save the drawing** as an SVG file in your vault, visible as an image even without the plugin installed
- **Convert the handwriting to Markdown** using Google Gemini OCR, replacing the drawing block with structured text (headings, lists, tables, etc.)

The SVG embed is standard Obsidian wiki syntax (`![[_handwriting/hw_xxx.svg]]`), so the image appears in any Obsidian view and is readable by tools like Claude Code.

---

### Inserting a Handwriting Block

1. Open a Markdown file.
2. Click the **pencil ribbon icon** in the left sidebar (or run the command `Insert handwriting block` via `Ctrl+P`).
3. A new block `![[_handwriting/hw_xxx.svg]]` is inserted at the cursor position.
4. A **portal panel** (toolbar overlay) appears on the image. Click the **pencil button** (✏️) to open the drawing editor.


---

### The Drawing Editor

The editor opens differently depending on your platform:

| Platform | How it opens |
|----------|-------------|
| **Windows (desktop)** | Full-screen overlay modal on top of your document |
| **Android (mobile)** | A new Obsidian tab |

#### Toolbar Buttons

| Button | Action |
|--------|--------|
| **Pen** | Switch to drawing mode (stylus or mouse draws strokes) |
| **Eraser** | Switch to eraser mode (drag to erase strokes under the pointer) |
| **Color dots** (4) | Select the current drawing color |
| **Undo** | Undo last stroke or erase action |
| **Redo** | Redo last undone action |
| **Clear** | Remove all strokes and reset canvas to default size |
| **Convert** | Run OCR and replace the drawing block with Markdown text |
| **Save** | Save the current drawing as SVG and update the preview |
| **Delete** (🗑️) | Delete the handwriting block and its SVG file |
| **Close / ←** | Close the editor (Windows: close modal; Android: go back) |

#### Drawing Tips

- **Stylus draws, finger scrolls:** on Android, a finger touch scrolls the canvas and the stylus draws, so the two never conflict.
- **Canvas auto-expands:** as you draw near the bottom edge, the canvas grows automatically.
- **Horizontal lines:** the canvas shows ruled lines (like a notebook) as a visual guide; they appear in the saved SVG too.
- **Colors adapt to theme:** strokes drawn in black on a light theme are automatically remapped to white when you switch to dark theme (and vice versa).


---

### Portal Panel (Inline Controls)

When you hover over a handwriting image in your document, a small floating panel appears with four buttons:

| Button | Action |
|--------|--------|
| ✏️ | Open drawing editor |
| 📄 | Convert drawing to Markdown (OCR) directly from the preview |
| ↕️ | Collapse / expand the image preview |
| ✕ | Delete the block and its SVG file |

---

### OCR Conversion to Markdown

The plugin uses **Google Gemini** to recognize handwritten text and converts it to Markdown based on special keywords you write in the drawing.

#### Supported Keywords

Write these keywords in your drawing to produce structured Markdown output. All keywords start with `//` and are **case-insensitive** (`//list` = `//LIST`). The colon after the keyword name is **optional** (`//H1 Title` and `//H1: Title` both work).

| Keyword | Syntax | Output |
|---------|--------|--------|
| `//H1` | `//H1 My Title` | `# My Title` |
| `//H2` | `//H2 Section` | `## Section` |
| `//H3` | `//H3 Sub` | `### Sub` |
| `//H4` | `//H4 Sub` | `#### Sub` |
| `//LIST` | `//LIST item1, item2, item3` | bullet list |
| `//NUMLIST` | `//NUMLIST item1, item2` | numbered list (starts at 1) |
| `//NUMLIST` (offset) | `//NUMLIST 3 item1, item2` | numbered list starting at 3 |
| `//CHECK` | `//CHECK task1, task2` | checklist (all unchecked) |
| `//CHECK` (mixed) | `//CHECK x done, pending, x also done` | checklist with checked/unchecked items |
| `//QUOTE` | `//QUOTE Text` | `> Text` |
| `//NOTE` | `//NOTE Title` | Obsidian callout `[!NOTE]` |
| `//WARN` | `//WARN Title` | Obsidian callout `[!WARNING]` |
| `//TIP` | `//TIP Title` | Obsidian callout `[!TIP]` |
| `//INFO` | `//INFO Title` | Obsidian callout `[!INFO]` |
| `//ERROR` | `//ERROR Title` | Obsidian callout `[!ERROR]` |
| `//IMPORTANT` | `//IMPORTANT Title` | Obsidian callout `[!IMPORTANT]` |
| `//CODE` | `//CODE snippet` | `` `snippet` `` (inline code) |
| `//CODEBLOCK` | `//CODEBLOCK js` + lines + blank line | fenced code block |
| `//B` / `//BOLD` | `//BOLD text` | `**text**` |
| `//I` | `//I text` | `*text*` |
| `//BI` | `//BI text` | `***text***` |
| `//S` / `//STRIKE` | `//S text` | `~~text~~` |
| `//HL` | `//HL text` | `==text==` (highlight) |
| `//LINK` | `//LINK label, url` | `[label](url)` |
| `//IMG` | `//IMG alt, url` | `![alt](url)` |
| `//TABLE` | `//TABLE Col1, Col2` + rows + `//TABLE` | Markdown table |
| `//HR` / `//SEP` | `//HR` | `---` |
| `//FN` | `//FN footnote text` | `[^1]: footnote text` (auto-numbered) |
| `//MATH` | `//MATH x^2` | `$x^2$` (inline math) |
| `//MATHBLOCK` | `//MATHBLOCK` + lines + blank line | `$$...$$` math block |
| `//TAG` | `//TAG my tag` | `#my_tag` |
| `//DATE` | `//DATE` | today's date (YYYY-MM-DD) |
| `//TIME` | `//TIME` | current time (HH:MM) |
| `//DATETIME` | `//DATETIME` | date + time |
| `//INDENT` | `//INDENT text` | text indented by 2 spaces |

Plain text lines (without a `//` keyword) are inserted as-is.

---

#### Multi-line Continuation

Any keyword that accepts a comma-separated list (`//LIST`, `//NUMLIST`, `//CHECK`, `//TABLE` rows) supports **wrapping across lines**: if a line ends with a comma, the next line is automatically treated as a continuation.

```
//LIST groceries, milk,
bread, butter,
eggs
```

Output:

```markdown
- groceries
- milk
- bread
- butter
- eggs
```

---

#### CHECK with Mixed States

Prefix any item with `x` or `X` (with or without brackets) to mark it as already checked:

```
//CHECK x bought milk, prepare slides, x sent email, review PR
```

Output:

```markdown
- [x] bought milk
- [ ] prepare slides
- [x] sent email
- [ ] review PR
```

---

#### Multi-line Callouts

The text on the keyword line becomes the callout **title**. Any lines that follow (up to the first blank line or next `//` keyword) become the callout **body**:

```
//NOTE Database connection
The connection may fail on an unstable network.
Always verify the timeout in the settings.

Normal paragraph - outside the callout.
```

Output:

```markdown
> [!NOTE] Database connection
> The connection may fail on an unstable network.
> Always verify the timeout in the settings.

Normal paragraph - outside the callout.
```

---

After conversion, the SVG is archived to `_handwriting/_converted/YYYY-MM-DD_HH-MM-SS.svg` and the drawing block is replaced with the generated Markdown.

---

### Settings

Open **Settings → Handwriting to Markdown** to configure:

| Setting | Description |
|---------|-------------|
| **Interface language** | Language for the settings UI. "Auto" follows Obsidian's language. |
| **SVG folder** | Vault subfolder where SVG drawing files are saved (default: `_handwriting`) |
| **Canvas width / height** | Default canvas resolution in pixels |
| **Canvas background** | Light / Dark / Auto (follows Obsidian theme) |
| **Gemini API key** | Required for OCR. Get it free at [aistudio.google.com](https://aistudio.google.com). |
| **OCR languages** | Comma-separated BCP-47 codes (e.g. `it, en, fr`). Tells Gemini which languages to expect. |

> **Note (free API key limitations):** With the free tier of Google AI Studio, your data may be used by Google to improve their models. Additionally, under high traffic you may see a **"Too many requests - please try again later"** error. To avoid rate limits, enable billing on [Google AI Studio](https://aistudio.google.com); costs are minimal for occasional OCR use.

---

### Platform Support

| Feature | Windows | Android |
|---------|---------|---------|
| Drawing (stylus/mouse) | ✅ | ✅ |
| Finger scroll while drawing | - | ✅ |
| OCR conversion | ✅ | ✅ |
| Editor opens as modal | ✅ | - |
| Editor opens as new tab | - | ✅ |
| Collapse/expand preview | ✅ | ✅ |

---

## Project Structure & Architecture

This section explains how the codebase is organized, how Obsidian's plugin system works, and which file to open for any given task.


---

### Folder & File Layout

```
HandTranscriptMd/
│
├── src/                      ← all TypeScript source files
│   ├── main.ts               ← plugin entry point (class HandwritingPlugin)
│   ├── settings.ts           ← settings definition, defaults, settings tab UI
│   ├── i18n.ts               ← translation loader and t() helper
│   ├── locales/              ← one JSON file per language
│   │   ├── en.json           ← English (the fallback, always the reference)
│   │   ├── it.json
│   │   ├── de.json  fr.json  es.json  ru.json  ja.json
│   │   ├── zh-cn.json  pt-br.json  pl.json
│   ├── drawing-canvas.ts     ← HTML Canvas drawing engine (strokes, eraser, undo)
│   ├── svg-utils.ts          ← SVG ↔ strokes serialization, PNG conversion, archive
│   ├── embed.ts              ← inline preview decoration + portal panel
│   ├── editor-view.ts        ← drawing editor (modal on Windows, tab on Android)
│   ├── recognizer.ts         ← Gemini OCR interface + HTTP call
│   ├── md-parser.ts          ← keyword-based OCR text → Markdown converter
│   └── parser.test.ts        ← unit tests for the markdown parser
│
├── main.js                   ← ⚠ compiled output (generated by esbuild, do not edit)
├── styles.css                ← all plugin CSS (classes prefixed with hwm_)
├── manifest.json             ← plugin metadata (id, name, version, minAppVersion)
├── package.json              ← npm dependencies, build scripts
├── esbuild.config.mjs        ← build configuration (entry: src/main.ts → main.js)
├── deploy.sh                 ← copies main.js + manifest.json + styles.css to local vault
├── cloudDeploy.sh            ← same but to Google Drive vault (for Android testing)
├── README.md                 ← this file
├── CLAUDE.md                 ← context notes for Claude Code AI assistant
└── NOTES.md                  ← developer session log, resolved bugs, completed tasks
```

The three files that Obsidian loads are: **`main.js`**, **`manifest.json`**, **`styles.css`**. Everything under `src/` is TypeScript source that gets compiled down to the single `main.js` by esbuild.

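Since only three files are loaded by Obsidian (and copied by `deploy.sh`), a small pre-deploy check can be sketched as follows. This is an illustrative sketch, not code from the repository: `RELEASE_FILES`, `missingReleaseFiles`, `releaseTargets`, and the example plugin directory are all hypothetical names chosen for this example.

```typescript
// Hypothetical helper: the three file names come from the layout above,
// everything else (function names, example path) is an assumption.
const RELEASE_FILES = ["main.js", "manifest.json", "styles.css"] as const;

// Returns the release files that are absent from a directory listing.
function missingReleaseFiles(present: readonly string[]): string[] {
  return RELEASE_FILES.filter((name) => !present.includes(name));
}

// Builds the destination paths inside a plugin directory,
// tolerating trailing slashes on the directory argument.
function releaseTargets(pluginDir: string): string[] {
  const dir = pluginDir.replace(/\/+$/, "");
  return RELEASE_FILES.map((name) => `${dir}/${name}`);
}

// Example: manifest.json was not built or copied yet.
console.log(missingReleaseFiles(["main.js", "styles.css"])); // ["manifest.json"]
console.log(releaseTargets("my-vault/.obsidian/plugins/example-plugin/"));
```

A check like this could run before copying files to a vault, so that a missing `main.js` (for example after a failed build) is caught before Obsidian tries to load the plugin.
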

---

### How an Obsidian Plugin Works

Obsidian plugins are JavaScript modules that run inside the Obsidian Electron app (desktop) or WebView (mobile). The key concepts:

#### 1. The Plugin Class (`src/main.ts`)

Every plugin exports a default class that extends Obsidian's `Plugin`. Obsidian calls **`onload()`** when the plugin is enabled and **`onunload()`** when it is disabled.

```typescript
export default class HandwritingPlugin extends Plugin {
  async onload() { /* register everything here */ }
}
```

Inside `onload()` this plugin registers:

- **A view type** (`registerView`): the drawing editor tab on Android
- **A code block processor** (`registerMarkdownCodeBlockProcessor`): renders `handwriting` code blocks
- **Commands** (`addCommand`): appear in the `Ctrl+P` palette
- **A ribbon icon** (`addRibbonIcon`): the pencil button in the left sidebar
- **A settings tab** (`addSettingTab`)
- **Event listeners** (`registerEvent`): e.g. the right-click file menu

The plugin class also carries four **shared state collections** used to coordinate between the preview (`embed.ts`) and the editor (`editor-view.ts`):

- `previewCallbacks`: after a save, the editor calls `refreshPreview()` to update the inline image
- `embedPaths`: maps embed IDs to SVG file paths, used for color remapping on theme change
- `bgModeListeners`: a `Set` of callbacks notified when the background mode setting changes
- `embedActions`: maps embed IDs to their expand/collapse/convert functions, used by the right-click menu

#### 2. The Vault API

The **Vault** is Obsidian's file system abstraction.
Use `this.app.vault` (or `plugin.app.vault`) to read/write files:

```typescript
// Read a file as text
const content = await plugin.app.vault.read(tFile);

// Write / overwrite a file
await plugin.app.vault.modify(tFile, newContent);

// Create a file
await plugin.app.vault.create(path, content);

// Move / rename
await plugin.app.vault.rename(tFile, newPath);
```

A `TFile` is Obsidian's object for a file. Get one with:

```typescript
// Returns TAbstractFile | null; check `file instanceof TFile` before using it as a file
const file = plugin.app.vault.getAbstractFileByPath('folder/name.md');
```

#### 3. The Workspace API

The **Workspace** manages the layout of open tabs and panels. It is used to open the editor tab on Android:

```typescript
const leaf = plugin.app.workspace.getLeaf('tab'); // open in a new tab
await leaf.setViewState({ type: VIEW_TYPE_HANDWRITING, state: { ... } });
```

#### 4. ItemView: The Drawing Editor Tab (`src/editor-view.ts`)

`DrawingEditorView extends ItemView` is an Obsidian **custom view**: a full tab with its own DOM. Key lifecycle methods:

- `getViewType()`: returns a unique string ID (`'handwriting-editor'`)
- `getDisplayText()`: the tab title
- `onOpen()`: called when the tab opens; here `buildEditorUI()` is called to build the canvas UI
- `onClose()`: called when the tab closes; cleanup (remove listeners, disconnect observers)

The view receives data (which SVG to load, which MD file to update) via `leaf.setViewState({ state: { svgPath, sourcePath, embedId } })`, read back in `getState()`.

#### 5. Modal: The Desktop Drawing Overlay (`src/editor-view.ts`)

`DrawingModal extends Modal` is an Obsidian **modal dialog**: a fullscreen overlay on desktop.
Key methods:

- `onOpen()`: builds the canvas UI by calling `buildEditorUI()`
- `onClose()`: cleanup
- `this.close()`: closes the modal programmatically (used in the ← and ✕ buttons)

`Modal` and `ItemView` are completely different Obsidian base classes, which is why `buildEditorUI()` was extracted as a shared standalone function: both classes call it and pass their specific callbacks for save/close/delete.

#### 6. Code Block Processor (Legacy Format)

`registerMarkdownCodeBlockProcessor('handwriting', callback)` tells Obsidian: "when you render a ` ```handwriting ``` ` block, run my callback instead." The callback receives the block source text and the DOM element to fill. This is the legacy embed format.

#### 7. MutationObserver (Wiki Format)

For the new `![[svg]]` format, Obsidian renders the embed itself, so the plugin cannot intercept it with a code block processor. Instead, a **MutationObserver** watches `document.body` for new nodes and decorates any span whose `src` attribute points to the `_handwriting/` folder. This happens in `registerEmbed()` in `embed.ts`.

#### 8. Settings (`src/settings.ts`)

Settings are stored as a JSON object in Obsidian's `data.json` (inside the plugin folder). `plugin.loadData()` reads it; `plugin.saveData(obj)` writes it.

The `HandwritingSettings` interface defines the shape; `DEFAULT_SETTINGS` provides initial values. `HandwritingSettingTab extends PluginSettingTab` builds the settings UI using `new Setting(containerEl)`.

#### 9. The Build System

esbuild bundles all TypeScript files starting from `src/main.ts` into a single `main.js`. The `obsidian` package is marked **external**: it is provided at runtime by Obsidian itself and must never be bundled.

esbuild does **not** run TypeScript type-checking, so type errors are invisible at build time. To catch them, run `npx tsc --noEmit`.

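As a rough illustration of the bundling rules in this section, the options handed to esbuild might look like the object below. This is a sketch under stated assumptions, not the contents of `esbuild.config.mjs`: `makeBuildOptions` and `BuildOptionsSketch` are hypothetical names, and the `cjs` output format is an assumption that the document does not confirm. The option keys themselves (`entryPoints`, `outfile`, `bundle`, `external`, `format`, `minify`, `sourcemap`) are standard esbuild build options.

```typescript
// Hypothetical shape covering only the options discussed in this section.
interface BuildOptionsSketch {
  entryPoints: string[];
  outfile: string;
  bundle: boolean;
  external: string[];
  format: "cjs"; // assumption: CommonJS output
  minify: boolean;
  sourcemap: "inline" | false;
}

// Returns the options for a watch/dev build (production = false)
// or a single minified build (production = true).
function makeBuildOptions(production: boolean): BuildOptionsSketch {
  return {
    entryPoints: ["src/main.ts"], // bundling starts here
    outfile: "main.js",           // single compiled output
    bundle: true,
    external: ["obsidian"],       // provided by Obsidian at runtime, never bundled
    format: "cjs",
    minify: production,
    sourcemap: production ? false : "inline",
  };
}

console.log(makeBuildOptions(false).sourcemap); // "inline"
console.log(makeBuildOptions(true).minify);     // true
```

Keeping the mode-dependent values (`minify`, `sourcemap`) in one function makes the difference between the two build modes easy to see at a glance.
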
Two build modes:

- `npm run dev` → watch mode, inline sourcemap, not minified
- `node esbuild.config.mjs production` → single build, minified, no sourcemap

---

### What File to Open for a Given Task

| I want to… | Open this file |
|-----------|---------------|
| Change what happens when the plugin loads/unloads | `src/main.ts` → `onload()` / `onunload()` |
| Add or remove a command (`Ctrl+P`) | `src/main.ts` → `this.addCommand(...)` |
| Add or remove the ribbon icon | `src/main.ts` → `this.addRibbonIcon(...)` |
| Add an item to the right-click file menu | `src/main.ts` → `this.app.workspace.on('file-menu', ...)` |
| Change a setting (add field, change default, add UI control) | `src/settings.ts` → `HandwritingSettings`, `DEFAULT_SETTINGS`, `HandwritingSettingTab.display()` |
| Change the color palette for light/dark theme | `src/settings.ts` → `LIGHT_COLORS`, `DARK_COLORS` |
| Change how "is dark mode" is resolved | `src/settings.ts` → `resolveIsDark()` |
| Add or fix a translation string | `src/locales/en.json` first, then all other locale files |
| Add a new interface language | `src/locales/XX.json` + `src/i18n.ts` → `locales` map + `localeNames` |
| Change how the `t()` lookup or fallback works | `src/i18n.ts` |
| Change drawing behavior (stroke, eraser, pressure, auto-expand) | `src/drawing-canvas.ts` → `DrawingCanvas` class |
| Change the ruler line spacing | `src/drawing-canvas.ts` → `export const LINE_SPACING` |
| Change how strokes are saved into / read from SVG | `src/svg-utils.ts` → `strokesToSvg()`, `svgToStrokes()` |
| Change how the SVG is converted to a PNG for OCR | `src/svg-utils.ts` → `svgToBase64Png()` |
| Change where archived SVGs go after conversion | `src/svg-utils.ts` → `archiveSvgFile()` |
| Change how the inline image preview is decorated | `src/embed.ts` → `tryDecorate()`, `decorateWikiEmbed()` |
| Add or change buttons in the portal panel overlay | `src/embed.ts` → `createPortalPanel()` |
| Change the OCR pipeline (what happens when "Convert" is clicked from the preview) | `src/embed.ts` → `runOcrPipeline()` |
| Change the drawing editor toolbar or canvas layout | `src/editor-view.ts` → `buildEditorUI()` |
| Change behavior specific to the desktop modal only | `src/editor-view.ts` → `DrawingModal` class |
| Change behavior specific to the Android tab only | `src/editor-view.ts` → `DrawingEditorView` class |
| Change the save / delete / convert logic inside the editor | `src/editor-view.ts` → `DrawingModal.doSave/doConvert/doDelete` or `DrawingEditorView.doSave/doConvert/doDelete` |
| Change which OCR model is called or the prompt sent to Gemini | `src/recognizer.ts` → `GeminiRecognizer.recognize()` |
| Change how OCR text is parsed into Markdown keywords | `src/md-parser.ts` → `parseHandwritingToMarkdown()`, `expandKeywords()` |
| Change how `//TABLE` blocks are parsed | `src/md-parser.ts` → table handling logic inside `parseHandwritingToMarkdown()` |
| Change plugin CSS (colors, sizes, layout) | `styles.css` |
| Change the plugin version | `manifest.json` + `package.json` (both must match) |
| Change the build configuration | `esbuild.config.mjs` |
| Change the deploy target path (local vault) | `deploy.sh` → `VAULT_PLUGIN` variable |
| Change the deploy target path (Google Drive / Android) | `cloudDeploy.sh` → `VAULT_PLUGIN` variable |

---

### Data Flow: From Drawing to Saved SVG

```
User draws strokes on the canvas
        │
        ▼
DrawingCanvas (drawing-canvas.ts)
  stores strokes as Stroke[] array in memory
        │
        ▼  (on Save button or auto-save debounce)
saveSvgToDisk()  ─── editor-view.ts (module-level helper)
        │
        ▼
strokesToSvg()  ─── svg-utils.ts
  builds an SVG string:
    - elements for each Bézier stroke
    - elements for ruler lines
    - an element with JSON of all strokes (for re-editing)
        │
        ▼
plugin.app.vault.modify(tFile, svgString)
  saves the .svg file to the vault
        │
        ▼
plugin.refreshPreview(embedId, svgString)
  calls the previewCallback registered by embed.ts
        │
        ▼
embed.ts updates img.src with a cache-busting ?t=timestamp
  so the inline preview refreshes without reloading the page
```

---

### Data Flow: From Drawing to Markdown (OCR)

```
User clicks Convert (in editor toolbar or portal panel)
        │
        ▼
runOcrPipeline() / doConvert()
        │
        ├─ reads SVG content from vault
        ├─ parses SVG to DOM via DOMParser
        │
        ▼
svgToBase64Png()  ─── svg-utils.ts
  draws SVG onto a temporary canvas
  exports as base64 PNG via canvas.toDataURL()
        │
        ▼
GeminiRecognizer.recognize(base64)  ─── recognizer.ts
  POST to Gemini REST API with inline_data (image) + text prompt
  returns recognized text as a plain string
        │
        ▼
parseHandwritingToMarkdown(text)  ─── md-parser.ts
  splits text into lines
  maps //keywords → Markdown syntax
  returns final Markdown string
        │
        ▼
replaceInMdFile()  ─── editor-view.ts (module-level helper)
  reads the .md source file
  finds the ![[svg]] embed line via regex
  replaces it with the Markdown text
  writes the .md file back to vault
        │
        ▼
archiveSvgFile()  ─── svg-utils.ts
  moves the .svg from _handwriting/ to
  _handwriting/_converted/YYYY-MM-DD_HH-MM-SS.svg
```

---

### CSS Class Naming Convention

All plugin CSS classes use the `hwm_` prefix (short for **H**and**W**riting **M**arkdown) to avoid collisions with Obsidian's own classes or other plugins.

Examples: `hwm_portal-panel`, `hwm_portal-btn`, `hwm_modal`, `hwm_toolbar`, `hwm-badge-mode`.

All styles live in **`styles.css`** at the project root. There is no CSS-in-JS.

---

## Maintainability Cheat Sheet

This section is a quick reference for developers who need to extend or modify the plugin. It assumes familiarity with TypeScript and the Obsidian Plugin API.

---

### How to Add a Toolbar Button

The entire toolbar for both the desktop modal and the Android tab is built by the shared function `buildEditorUI()` in `src/editor-view.ts`. You only need to edit **one place**.

1. **Add the i18n key** (see [How to Add a Language Key](#how-to-add-a-language-key)).
2. Inside `buildEditorUI()`, find the toolbar section and call `mkBtn(toolbar, 'icon-name', 'your_i18n_key')`.
   - `mkBtn` returns the button element if you need to attach a click handler.
3. Add the click handler immediately after: `btn.addEventListener('click', () => { ... })`.

`mkBtn(parent, icon, key)` is a module-level helper that creates a `