Implement Karpathy’s LLM Wiki pattern to transform Claude conversations into a structured Obsidian vault with custom commands and bidirectional sync.

Building a Personal Knowledge Graph with Claude and Obsidian: Implementing Karpathy’s LLM Wiki Pattern

Every developer accumulates knowledge through countless conversations with AI assistants—debugging sessions, architecture decisions, research explorations. Yet this knowledge evaporates the moment you close the chat window. Andrej Karpathy’s “LLM Wiki” pattern addresses this directly: treat your AI conversations as first-class knowledge artifacts that persist, interconnect, and compound over time.

This tutorial implements a complete system that transforms Claude conversations into a structured Obsidian vault. You’ll build custom command patterns (/wiki, /save, /autoresearch) that capture insights as they emerge, sync bidirectionally between Claude and your markdown files, and automatically enrich entries with related concepts and code examples.

The result is a personal knowledge graph that grows smarter with every conversation—accessible across sessions, searchable, and deeply integrated with your existing note-taking workflow.

Prerequisites

Before starting, ensure you have:

Obsidian 1.4+ installed with a vault already configured
Node.js 18+ for the sync server and automation scripts
Claude API access with an active API key
Basic familiarity with Obsidian’s folder structure and markdown linking syntax
Git for version control of your knowledge base (recommended)

You’ll also need these npm packages installed in your project (zod is used by the production configuration later on):

npm install @anthropic-ai/sdk chokidar gray-matter marked glob dotenv express zod

💡 If you’re using Obsidian Sync, disable it for the wiki folder during initial setup to avoid conflicts with our bidirectional sync system.

Architecture and Key Concepts

The system operates on three layers: a command interpreter that parses natural language instructions, a sync engine that maintains consistency between Claude’s context and your vault, and a research pipeline that enriches entries autonomously.

flowchart TD
    subgraph Claude["Claude Conversation Layer"]
        CMD[Command Parser]
        CTX[Context Manager]
        GEN[Content Generator]
    end
    
    subgraph Sync["Bidirectional Sync Engine"]
        WATCH[File Watcher]
        DIFF[Diff Calculator]
        MERGE[Conflict Resolver]
    end
    
    subgraph Vault["Obsidian Vault"]
        WIKI[(Wiki Entries)]
        META[(Metadata Index)]
        LINKS[(Backlink Graph)]
    end
    
    subgraph Research["Auto-Research Pipeline"]
        ENRICH[Enrichment Queue]
        FETCH[Related Concepts Fetcher]
        CODE[Code Example Generator]
    end
    
    CMD --> |/wiki, /save| GEN
    GEN --> |Create/Update| DIFF
    CTX --> |Load Context| META
    
    DIFF --> |Write| WIKI
    WATCH --> |Detect Changes| DIFF
    WIKI --> |Read| MERGE
    MERGE --> |Update| CTX
    
    GEN --> |Queue| ENRICH
    ENRICH --> FETCH
    FETCH --> CODE
    CODE --> |Append| WIKI
    
    WIKI <--> LINKS
    META <--> WIKI

Key concepts to understand:

Command patterns are structured prefixes (/wiki, /save, /autoresearch) that trigger specific behaviors in the system
Context windows in Claude have limits—the sync engine maintains a compressed index that fits within these limits while preserving semantic richness
Backlinks in Obsidian create bidirectional relationships; our system generates these automatically based on content analysis
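
To make the compressed index idea concrete, here is a minimal sketch of how one line per entry could be packed into a fixed token budget. The `IndexEntry` shape, the `buildCompactIndex` name, and the 4-characters-per-token estimate are assumptions for illustration, not part of the system built below.

```typescript
// Hypothetical shape: the kind of per-entry metadata an index might hold.
interface IndexEntry {
  title: string;
  tags: string[];
  links: string[];
}

// Rough budgeting heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Render one line per entry and stop before the token budget is exceeded.
function buildCompactIndex(entries: IndexEntry[], maxTokens: number): string {
  const lines: string[] = [];
  let used = 0;

  for (const entry of entries) {
    const tags = entry.tags.map(t => `#${t}`).join(' ');
    const links = entry.links.map(l => `[[${l}]]`).join(' ');
    const line = `${entry.title} | ${tags} | ${links}`;
    const cost = estimateTokens(line) + 1; // +1 approximates the newline separator

    if (used + cost > maxTokens) break;
    lines.push(line);
    used += cost;
  }

  return lines.join('\n');
}

const compactIndex = buildCompactIndex(
  [
    { title: 'Event Sourcing', tags: ['architecture'], links: ['CQRS'] },
    { title: 'CQRS', tags: ['architecture', 'patterns'], links: ['Event Sourcing'] },
  ],
  8000
);
// compactIndex holds two lines, one per entry:
// Event Sourcing | #architecture | [[CQRS]]
// CQRS | #architecture #patterns | [[Event Sourcing]]
```

Titles, tags, and links are usually enough for Claude to decide which full entries are worth loading, which is why the sketch leaves note bodies out of the index entirely.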

Step-by-Step Implementation

Setting Up the Project Structure and Configuration

Create the project skeleton that will house your sync server and command handlers:

mkdir claude-obsidian-wiki && cd claude-obsidian-wiki
mkdir -p src/{commands,sync,research} config scripts
touch src/index.ts src/commands/parser.ts src/commands/router.ts src/sync/engine.ts src/research/pipeline.ts
touch config/settings.ts config/settings.json .env

Configure your environment variables and core settings:

// config/settings.ts
import { config } from 'dotenv';
import * as path from 'path';
import * as fs from 'fs';

config();

interface WikiSettings {
  vaultPath: string;
  wikiFolder: string;
  metadataFile: string;
  syncInterval: number;
  maxContextTokens: number;
  autoResearchEnabled: boolean;
  researchDepth: 'shallow' | 'medium' | 'deep';
}

// Load and validate settings from JSON config
const settingsPath = path.join(__dirname, 'settings.json');
const rawSettings = JSON.parse(fs.readFileSync(settingsPath, 'utf-8'));

export const settings: WikiSettings = {
  vaultPath: process.env.OBSIDIAN_VAULT_PATH || rawSettings.vaultPath,
  wikiFolder: rawSettings.wikiFolder || 'claude-wiki',
  metadataFile: rawSettings.metadataFile || '.wiki-metadata.json',
  syncInterval: rawSettings.syncInterval || 5000,
  maxContextTokens: rawSettings.maxContextTokens || 8000,
  autoResearchEnabled: rawSettings.autoResearchEnabled ?? true,
  researchDepth: rawSettings.researchDepth || 'medium'
};

// Validate vault path exists
export function validateSettings(): void {
  const fullWikiPath = path.join(settings.vaultPath, settings.wikiFolder);
  
  if (!fs.existsSync(settings.vaultPath)) {
    throw new Error(`Vault path does not exist: ${settings.vaultPath}`);
  }
  
  // Create wiki folder if missing
  if (!fs.existsSync(fullWikiPath)) {
    fs.mkdirSync(fullWikiPath, { recursive: true });
    console.log(`Created wiki folder: ${fullWikiPath}`);
  }
}

Create the settings JSON file:

{
  "vaultPath": "/Users/yourname/Documents/ObsidianVault",
  "wikiFolder": "claude-wiki",
  "metadataFile": ".wiki-metadata.json",
  "syncInterval": 5000,
  "maxContextTokens": 8000,
  "autoResearchEnabled": true,
  "researchDepth": "medium"
}

⚠️ Never commit your .env file to version control. Add it to .gitignore immediately after creation.

Building the Command Parser and Router

The command parser extracts structured instructions from natural language messages. It recognizes three primary patterns and routes them to appropriate handlers:

// src/commands/parser.ts

// Command types supported by the system
export type CommandType = 'wiki' | 'save' | 'autoresearch' | 'none';

export interface ParsedCommand {
  type: CommandType;
  title?: string;
  content?: string;
  tags?: string[];
  linkedConcepts?: string[];
  researchQuery?: string;
  rawMessage: string;
}

// Regex patterns for command detection
const COMMAND_PATTERNS = {
  wiki: /^\/wiki\s+(?:"([^"]+)"|(\S+))(?:\s+(.*))?$/is,
  save: /^\/save\s+(?:"([^"]+)"|(\S+))(?:\s+\[([^\]]+)\])?(?:\s+(.*))?$/is,
  autoresearch: /^\/autoresearch\s+(.+)$/is
};

export function parseCommand(message: string): ParsedCommand {
  const trimmed = message.trim();
  
  // Check for /wiki command
  // Format: /wiki "Title" optional content
  const wikiMatch = trimmed.match(COMMAND_PATTERNS.wiki);
  if (wikiMatch) {
    const title = wikiMatch[1] || wikiMatch[2];
    const content = wikiMatch[3] || '';
    
    return {
      type: 'wiki',
      title: sanitizeTitle(title),
      content: content,
      tags: extractInlineTags(content),
      linkedConcepts: extractWikiLinks(content),
      rawMessage: message
    };
  }
  
  // Check for /save command
  // Format: /save "Title" [tag1, tag2] content
  const saveMatch = trimmed.match(COMMAND_PATTERNS.save);
  if (saveMatch) {
    const title = saveMatch[1] || saveMatch[2];
    const tagsRaw = saveMatch[3] || '';
    const content = saveMatch[4] || '';
    
    return {
      type: 'save',
      title: sanitizeTitle(title),
      content: content,
      tags: tagsRaw.split(',').map(t => t.trim()).filter(Boolean),
      linkedConcepts: extractWikiLinks(content),
      rawMessage: message
    };
  }
  
  // Check for /autoresearch command
  // Format: /autoresearch topic or question
  const researchMatch = trimmed.match(COMMAND_PATTERNS.autoresearch);
  if (researchMatch) {
    return {
      type: 'autoresearch',
      researchQuery: researchMatch[1].trim(),
      rawMessage: message
    };
  }
  
  // No command detected - return raw message for normal processing
  return {
    type: 'none',
    rawMessage: message
  };
}

// Sanitize title for use as filename
function sanitizeTitle(title: string): string {
  return title
    .replace(/[<>:"/\\|?*]/g, '-')  // Replace characters that are invalid in filenames
    .replace(/\s+/g, ' ')            // Normalize whitespace
    .trim()
    .slice(0, 100);                  // Limit length
}

// Extract [[wiki links]] from content
function extractWikiLinks(content: string): string[] {
  const linkPattern = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;
  const links: string[] = [];
  let match;
  
  while ((match = linkPattern.exec(content)) !== null) {
    links.push(match[1].trim());
  }
  
  return [...new Set(links)]; // Deduplicate
}

// Extract #tags from content
function extractInlineTags(content: string): string[] {
  const tagPattern = /#([a-zA-Z][a-zA-Z0-9_-]*)/g;
  const tags: string[] = [];
  let match;
  
  while ((match = tagPattern.exec(content)) !== null) {
    tags.push(match[1]);
  }
  
  return [...new Set(tags)];
}

Now create the command router that orchestrates execution:

// src/commands/router.ts
import { ParsedCommand, parseCommand } from './parser';
import { createWikiEntry, updateWikiEntry } from '../sync/engine';
import { queueResearch } from '../research/pipeline';
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic();

export interface CommandResult {
  success: boolean;
  message: string;
  createdFiles?: string[];
  updatedFiles?: string[];
  queuedResearch?: string[];
}

export async function routeCommand(
  userMessage: string,
  conversationContext: string[]
): Promise<CommandResult> {
  const parsed = parseCommand(userMessage);
  
  switch (parsed.type) {
    case 'wiki':
      return handleWikiCommand(parsed, conversationContext);
    
    case 'save':
      return handleSaveCommand(parsed);
    
    case 'autoresearch':
      return handleAutoresearchCommand(parsed);
    
    case 'none':
      // Pass through to normal Claude conversation
      return { success: true, message: 'No command detected' };
  }
}

async function handleWikiCommand(
  cmd: ParsedCommand,
  context: string[]
): Promise<CommandResult> {
  // Generate structured wiki content using Claude
  const generationPrompt = buildWikiGenerationPrompt(cmd, context);
  
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 2000,
    system: `You are a technical documentation expert. Generate structured wiki entries 
in markdown format. Include: a clear definition, key concepts, practical examples, 
and relevant links to related topics using [[wiki link]] syntax.`,
    messages: [{ role: 'user', content: generationPrompt }]
  });
  
  const generatedContent = response.content[0].type === 'text' 
    ? response.content[0].text 
    : '';
  
  // Create the wiki entry file
  const filePath = await createWikiEntry({
    title: cmd.title!,
    content: generatedContent,
    tags: cmd.tags || [],
    linkedConcepts: extractAllLinks(generatedContent),
    source: 'claude-conversation',
    createdAt: new Date().toISOString()
  });
  
  return {
    success: true,
    message: `Created wiki entry: ${cmd.title}`,
    createdFiles: [filePath]
  };
}

async function handleSaveCommand(cmd: ParsedCommand): Promise<CommandResult> {
  // Direct save without AI generation - just format and store
  const filePath = await createWikiEntry({
    title: cmd.title!,
    content: cmd.content || '',
    tags: cmd.tags || [],
    linkedConcepts: cmd.linkedConcepts || [],
    source: 'manual-save',
    createdAt: new Date().toISOString()
  });
  
  return {
    success: true,
    message: `Saved note: ${cmd.title}`,
    createdFiles: [filePath]
  };
}

async function handleAutoresearchCommand(cmd: ParsedCommand): Promise<CommandResult> {
  // Queue the research topic for async processing
  const queueId = await queueResearch(cmd.researchQuery!);
  
  return {
    success: true,
    message: `Queued research on: ${cmd.researchQuery}`,
    queuedResearch: [queueId]
  };
}

function buildWikiGenerationPrompt(cmd: ParsedCommand, context: string[]): string {
  const recentContext = context.slice(-5).join('\n---\n');
  
  return `Create a wiki entry for: "${cmd.title}"

${cmd.content ? `Additional context from user: ${cmd.content}` : ''}

Recent conversation context:
${recentContext}

Generate a comprehensive wiki entry that:
1. Starts with a one-sentence definition
2. Explains core concepts with examples
3. Includes code snippets where relevant
4. Links to related concepts using [[concept name]] syntax
5. Adds practical tips or warnings where appropriate`;
}

function extractAllLinks(content: string): string[] {
  const linkPattern = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;
  const links: string[] = [];
  let match;
  
  while ((match = linkPattern.exec(content)) !== null) {
    links.push(match[1].trim());
  }
  
  return [...new Set(links)];
}

📝 The command parser uses permissive regex patterns intentionally. Users can omit quotes around single-word titles and tags are optional. This reduces friction during rapid knowledge capture.
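
To see how permissive the patterns are in practice, the sketch below copies the /wiki and /save regular expressions from the parser and runs them against a few messages. The `matchWiki` and `matchSave` helpers are hypothetical stand-ins for `parseCommand`, written only so the example runs on its own.

```typescript
// Copies of the wiki and save patterns from src/commands/parser.ts,
// repeated here so the sketch is self-contained.
const WIKI_PATTERN = /^\/wiki\s+(?:"([^"]+)"|(\S+))(?:\s+(.*))?$/is;
const SAVE_PATTERN = /^\/save\s+(?:"([^"]+)"|(\S+))(?:\s+\[([^\]]+)\])?(?:\s+(.*))?$/is;

// Hypothetical helper: captured parts of a /wiki message, or null if it is not one.
function matchWiki(message: string): { title: string; content: string } | null {
  const m = message.trim().match(WIKI_PATTERN);
  if (!m) return null;
  return { title: m[1] ?? m[2] ?? '', content: m[3] ?? '' };
}

// Hypothetical helper: captured parts of a /save message, or null if it is not one.
function matchSave(
  message: string
): { title: string; tags: string[]; content: string } | null {
  const m = message.trim().match(SAVE_PATTERN);
  if (!m) return null;
  const tags = (m[3] ?? '').split(',').map(t => t.trim()).filter(Boolean);
  return { title: m[1] ?? m[2] ?? '', tags, content: m[4] ?? '' };
}

// Quoted multi-word title followed by free text
const quoted = matchWiki('/wiki "Event Sourcing" append-only log of state changes');
// quoted.title is "Event Sourcing"

// Single-word title without quotes and without content
const bare = matchWiki('/wiki CQRS');
// bare.title is "CQRS" and bare.content is ""

// Optional tag list in square brackets
const tagged = matchSave('/save Idempotency [api, reliability] Retries must be safe to repeat.');
// tagged.tags is ["api", "reliability"]

// Ordinary messages fall through to the normal conversation
const notACommand = matchWiki('what is event sourcing?');
// notACommand is null
```

One consequence of the permissive matching is that an unterminated quote does not cause a failure: the parser falls back to treating the first whitespace-delimited word as the title.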

Implementing the Bidirectional Sync Engine

The sync engine is the critical component that maintains consistency between your Obsidian vault and Claude’s conversation context. It watches for file changes, compares checksums to detect edits made outside the service, and attempts a simple append-style merge when both systems modify the same entry:

// src/sync/engine.ts
import * as fs from 'fs';
import * as path from 'path';
import * as chokidar from 'chokidar';
import matter from 'gray-matter';
import { settings } from '../../config/settings';

// In-memory representation of a wiki entry
export interface WikiEntry {
  title: string;
  content: string;
  tags: string[];
  linkedConcepts: string[];
  source: 'claude-conversation' | 'manual-save' | 'obsidian-edit' | 'auto-research';
  createdAt: string;
  updatedAt?: string;
  checksum?: string;
}

// Metadata index for fast lookups without reading all files
interface MetadataIndex {
  entries: Record<string, {
    path: string;
    checksum: string;
    tags: string[];
    linkedConcepts: string[];
    lastModified: string;
  }>;
  lastSync: string;
}

let metadataIndex: MetadataIndex = { entries: {}, lastSync: '' };
let fileWatcher: chokidar.FSWatcher | null = null;

// Initialize sync engine and load existing metadata
export async function initializeSyncEngine(): Promise<void> {
  const wikiPath = getWikiPath();
  const metadataPath = getMetadataPath();
  
  // Load existing metadata index if present
  if (fs.existsSync(metadataPath)) {
    const raw = fs.readFileSync(metadataPath, 'utf-8');
    metadataIndex = JSON.parse(raw);
    console.log(`Loaded metadata index with ${Object.keys(metadataIndex.entries).length} entries`);
  }
  
  // Scan vault for any untracked files
  await reconcileVaultState();
  
  // Start file watcher for real-time sync
  startFileWatcher();
}

// Create a new wiki entry file
export async function createWikiEntry(entry: WikiEntry): Promise<string> {
  const filename = `${entry.title}.md`;
  const filePath = path.join(getWikiPath(), filename);
  
  // Build frontmatter
  const frontmatter = {
    title: entry.title,
    tags: entry.tags,
    created: entry.createdAt,
    source: entry.source,
    links: entry.linkedConcepts
  };
  
  // Combine frontmatter with content
  const fileContent = matter.stringify(entry.content, frontmatter);
  const checksum = computeChecksum(fileContent);
  
  // Write file
  fs.writeFileSync(filePath, fileContent, 'utf-8');
  
  // Update metadata index
  metadataIndex.entries[entry.title] = {
    path: filePath,
    checksum: checksum,
    tags: entry.tags,
    linkedConcepts: entry.linkedConcepts,
    lastModified: new Date().toISOString()
  };
  
  await persistMetadataIndex();
  
  console.log(`Created wiki entry: ${filename}`);
  return filePath;
}

// Update existing wiki entry with conflict detection
export async function updateWikiEntry(
  title: string, 
  updates: Partial<WikiEntry>,
  forceOverwrite: boolean = false
): Promise<{ success: boolean; conflict?: boolean; merged?: string }> {
  const existing = metadataIndex.entries[title];
  
  if (!existing) {
    // Entry doesn't exist - create it instead
    await createWikiEntry({
      title,
      content: updates.content || '',
      tags: updates.tags || [],
      linkedConcepts: updates.linkedConcepts || [],
      source: updates.source || 'claude-conversation',
      createdAt: new Date().toISOString()
    });
    return { success: true };
  }
  
  // Read current file content
  const currentContent = fs.readFileSync(existing.path, 'utf-8');
  const currentChecksum = computeChecksum(currentContent);
  
  // Detect if file was modified externally
  if (currentChecksum !== existing.checksum && !forceOverwrite) {
    // Conflict detected - attempt an automatic merge (append-only, see attemptMerge)
    const mergeResult = attemptMerge(existing.path, updates.content || '');
    
    if (mergeResult.success) {
      fs.writeFileSync(existing.path, mergeResult.merged!, 'utf-8');
      existing.checksum = computeChecksum(mergeResult.merged!);
      await persistMetadataIndex();
      return { success: true, merged: mergeResult.merged };
    }
    
    return { success: false, conflict: true };
  }
  
  // No conflict - apply updates directly
  const parsed = matter(currentContent);
  
  if (updates.content) {
    parsed.content = updates.content;
  }
  if (updates.tags) {
    // Frontmatter may lack a tags field (e.g. notes created directly in Obsidian)
    parsed.data.tags = [...new Set([...(parsed.data.tags ?? []), ...updates.tags])];
  }
  if (updates.linkedConcepts) {
    parsed.data.links = [...new Set([...(parsed.data.links ?? []), ...updates.linkedConcepts])];
  }
  
  parsed.data.updated = new Date().toISOString();
  
  const newContent = matter.stringify(parsed.content, parsed.data);
  fs.writeFileSync(existing.path, newContent, 'utf-8');
  
  existing.checksum = computeChecksum(newContent);
  existing.lastModified = parsed.data.updated;
  await persistMetadataIndex();
  
  return { success: true };
}

// Load entries for context injection into Claude conversations
export function loadContextEntries(
  relevantTitles: string[],
  maxTokens: number = settings.maxContextTokens
): string {
  const entries: string[] = [];
  let estimatedTokens = 0;
  
  for (const title of relevantTitles) {
    const meta = metadataIndex.entries[title];
    if (!meta) continue;
    
    const content = fs.readFileSync(meta.path, 'utf-8');
    const parsed = matter(content);
    
    // Rough token estimation (4 chars ≈ 1 token)
    const entryTokens = Math.ceil(parsed.content.length / 4);
    
    if (estimatedTokens + entryTokens > maxTokens) {
      break;
    }
    
    entries.push(`## ${title}\n${parsed.content}`);
    estimatedTokens += entryTokens;
  }
  
  return entries.join('\n\n---\n\n');
}

// Start watching for file changes
function startFileWatcher(): void {
  const wikiPath = getWikiPath();
  
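  // Watch the wiki directory itself and filter for markdown files.
  // chokidar v4+ no longer accepts glob patterns such as `${wikiPath}/*.md`.
  fileWatcher = chokidar.watch(wikiPath, {
    persistent: true,
    ignoreInitial: true,
    depth: 0, // top-level files only, matching the original *.md pattern
    ignored: (watchedPath: string, stats?: fs.Stats) =>
      Boolean(stats?.isFile()) && !watchedPath.endsWith('.md'),
    awaitWriteFinish: {
      stabilityThreshold: 500,
      pollInterval: 100
    }
  });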
  
  fileWatcher.on('change', handleFileChange);
  fileWatcher.on('add', handleFileAdd);
  fileWatcher.on('unlink', handleFileDelete);
  
  console.log(`Watching for changes in: ${wikiPath}`);
}

async function handleFileChange(filePath: string): Promise<void> {
  const filename = path.basename(filePath, '.md');
  const content = fs.readFileSync(filePath, 'utf-8');
  const checksum = computeChecksum(content);
  
  const existing = metadataIndex.entries[filename];
  
  if (existing && existing.checksum !== checksum) {
    // File was modified in Obsidian
    const parsed = matter(content);
    
    existing.checksum = checksum;
    existing.tags = parsed.data.tags || [];
    existing.linkedConcepts = parsed.data.links || [];
    existing.lastModified = new Date().toISOString();
    
    await persistMetadataIndex();
    console.log(`Synced external changes: ${filename}`);
  }
}

async function handleFileAdd(filePath: string): Promise<void> {
  const filename = path.basename(filePath, '.md');
  const content = fs.readFileSync(filePath, 'utf-8');
  const parsed = matter(content);
  
  metadataIndex.entries[filename] = {
    path: filePath,
    checksum: computeChecksum(content),
    tags: parsed.data.tags || [],
    linkedConcepts: parsed.data.links || [],
    lastModified: new Date().toISOString()
  };
  
  await persistMetadataIndex();
  console.log(`Indexed new file: ${filename}`);
}

async function handleFileDelete(filePath: string): Promise<void> {
  const filename = path.basename(filePath, '.md');
  delete metadataIndex.entries[filename];
  await persistMetadataIndex();
  console.log(`Removed from index: ${filename}`);
}

// Scan vault and reconcile with metadata index
async function reconcileVaultState(): Promise<void> {
  const wikiPath = getWikiPath();
  const files = fs.readdirSync(wikiPath).filter(f => f.endsWith('.md'));
  
  for (const file of files) {
    const filePath = path.join(wikiPath, file);
    const title = path.basename(file, '.md');
    
    if (!metadataIndex.entries[title]) {
      await handleFileAdd(filePath);
    }
  }
  
  // Remove entries for deleted files
  for (const title of Object.keys(metadataIndex.entries)) {
    const entry = metadataIndex.entries[title];
    if (!fs.existsSync(entry.path)) {
      delete metadataIndex.entries[title];
    }
  }
  
  await persistMetadataIndex();
}

// Simple checksum for change detection
function computeChecksum(content: string): string {
  let hash = 0;
  for (let i = 0; i < content.length; i++) {
    const char = content.charCodeAt(i);
    hash = ((hash << 5) - hash) + char;
    hash = hash & hash;
  }
  return hash.toString(16);
}

// Basic merge attempt (append-only, not a true three-way merge)
function attemptMerge(
  filePath: string, 
  newContent: string
): { success: boolean; merged?: string } {
  // For MVP, we use a simple strategy: append new content as a section
  const current = fs.readFileSync(filePath, 'utf-8');
  const parsed = matter(current);
  
  // Skip the merge if the file already contains the start of the new content
  if (parsed.content.includes(newContent.slice(0, 100))) {
    // New content already exists - skip
    return { success: true, merged: current };
  }
  
  // Append new content with separator
  const mergedContent = `${parsed.content}\n\n---\n*Added from Claude conversation (${new Date().toISOString()})*\n\n${newContent}`;
  
  return {
    success: true,
    merged: matter.stringify(mergedContent, parsed.data)
  };
}

async function persistMetadataIndex(): Promise<void> {
  metadataIndex.lastSync = new Date().toISOString();
  const metadataPath = getMetadataPath();
  fs.writeFileSync(metadataPath, JSON.stringify(metadataIndex, null, 2), 'utf-8');
}

function getWikiPath(): string {
  return path.join(settings.vaultPath, settings.wikiFolder);
}

function getMetadataPath(): string {
  return path.join(getWikiPath(), settings.metadataFile);
}

// Export for testing
export { metadataIndex, reconcileVaultState };

⚠️ The file watcher uses awaitWriteFinish to prevent reading partially written files. If you experience issues with large files, increase the stabilityThreshold value.

💡 The metadata index file (.wiki-metadata.json) is prefixed with a dot to hide it in most file explorers. Obsidian will ignore it by default, keeping your vault clean.
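
The conflict check inside updateWikiEntry comes down to comparing two checksums. The sketch below isolates that decision as a pure function so it can be tested without touching the filesystem. The `decideWrite` name and the extra `skip` outcome are assumptions added for illustration; `computeChecksum` is copied from the engine.

```typescript
// Copy of the engine's 32-bit string hash so this sketch runs standalone.
function computeChecksum(content: string): string {
  let hash = 0;
  for (let i = 0; i < content.length; i++) {
    hash = ((hash << 5) - hash) + content.charCodeAt(i);
    hash = hash & hash; // keep the value within 32-bit integer range
  }
  return hash.toString(16);
}

type WriteDecision = 'apply' | 'skip' | 'conflict';

// indexedChecksum: checksum stored in the metadata index at the last sync
// onDisk:          current file content in the vault
// incoming:        content the service wants to write
function decideWrite(
  indexedChecksum: string,
  onDisk: string,
  incoming: string
): WriteDecision {
  const diskChecksum = computeChecksum(onDisk);

  if (diskChecksum === computeChecksum(incoming)) return 'skip'; // nothing would change
  if (diskChecksum === indexedChecksum) return 'apply';          // untouched since last sync
  return 'conflict';                                             // edited outside the service
}

const indexed = computeChecksum('v1');

const cleanUpdate = decideWrite(indexed, 'v1', 'v2');    // 'apply'
const externalEdit = decideWrite(indexed, 'v1x', 'v2');  // 'conflict'
const alreadySame = decideWrite(indexed, 'v2', 'v2');    // 'skip'
```

Because the hash is only 32 bits wide, two different files can occasionally share a checksum. For a personal vault that risk is small, but a cryptographic digest would be a reasonable swap if missed edits matter to you.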

Production Configuration

A robust production setup requires proper configuration management. Create a dedicated config file that handles environment-specific settings:

// config/wiki-config.ts
import { z } from 'zod';
import * as fs from 'fs';
import * as path from 'path';

// Schema validation ensures config integrity
const ConfigSchema = z.object({
  vault: z.object({
    path: z.string().min(1),
    wikiFolder: z.string().default('wiki'),
    metadataFile: z.string().default('.wiki-metadata.json'),
  }),
  claude: z.object({
    apiKey: z.string().min(1),
    model: z.string().default('claude-sonnet-4-20250514'),
    maxTokens: z.number().default(4096),
    temperature: z.number().min(0).max(1).default(0.3),
  }),
  processing: z.object({
    batchSize: z.number().default(10),
    rateLimitMs: z.number().default(1000),
    maxRetries: z.number().default(3),
    concurrentRequests: z.number().default(2),
  }),
  graph: z.object({
    minSimilarityScore: z.number().default(0.7),
    maxLinksPerNote: z.number().default(10),
    enableBacklinks: z.boolean().default(true),
  }),
});

type WikiConfig = z.infer<typeof ConfigSchema>;

function loadConfig(): WikiConfig {
  const configPath = process.env.WIKI_CONFIG_PATH || './wiki-config.json';
  
  if (!fs.existsSync(configPath)) {
    throw new Error(`Config file not found: ${configPath}`);
  }

  const rawConfig = JSON.parse(fs.readFileSync(configPath, 'utf-8'));
  
  // Override with environment variables
  const merged = {
    ...rawConfig,
    claude: {
      ...rawConfig.claude,
      apiKey: process.env.ANTHROPIC_API_KEY || rawConfig.claude?.apiKey,
    },
  };

  return ConfigSchema.parse(merged);
}

export const config = loadConfig();

Create the corresponding JSON configuration file:

{
  "vault": {
    "path": "/Users/yourname/Documents/Obsidian/KnowledgeBase",
    "wikiFolder": "wiki",
    "metadataFile": ".wiki-metadata.json"
  },
  "claude": {
    "model": "claude-sonnet-4-20250514",
    "maxTokens": 4096,
    "temperature": 0.3
  },
  "processing": {
    "batchSize": 10,
    "rateLimitMs": 1000,
    "maxRetries": 3,
    "concurrentRequests": 2
  },
  "graph": {
    "minSimilarityScore": 0.7,
    "maxLinksPerNote": 10,
    "enableBacklinks": true
  }
}

⚠️ Never commit your API key to version control. Use environment variables or a .env file with dotenv. Add wiki-config.json to .gitignore if it contains sensitive data.

For systemd deployment on Linux servers, create a service file:

# /etc/systemd/system/knowledge-wiki.service
[Unit]
Description=Personal Knowledge Wiki Service
After=network.target

[Service]
Type=simple
User=wiki
WorkingDirectory=/opt/knowledge-wiki
ExecStart=/usr/bin/node dist/index.js
Restart=on-failure
RestartSec=10
Environment=NODE_ENV=production
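# Placeholder value. For real deployments, prefer EnvironmentFile= pointing to a file readable only by the service user.
Environment=ANTHROPIC_API_KEY=your-api-key
Environment=WIKI_CONFIG_PATH=/opt/knowledge-wiki/wiki-config.json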

# Resource limits
MemoryMax=512M
CPUQuota=50%

[Install]
WantedBy=multi-user.target

The following diagram illustrates the production deployment architecture:

flowchart TD
    subgraph Client["Client Layer"]
        OBS[Obsidian App]
        SYNC[Sync Service]
    end

    subgraph Server["Server Layer"]
        SYSTEMD[systemd]
        WIKI[Wiki Service]
        WATCHER[File Watcher]
    end

    subgraph Storage["Storage Layer"]
        VAULT[(Obsidian Vault)]
        META[(Metadata Index)]
        LOGS[(Log Files)]
    end

    subgraph External["External Services"]
        CLAUDE[Claude API]
    end

    OBS -->|edit files| SYNC
    SYNC -->|sync| VAULT
    SYSTEMD -->|manages| WIKI
    WIKI --> WATCHER
    WATCHER -->|monitors| VAULT
    WIKI -->|read/write| META
    WIKI -->|structured logging| LOGS
    WIKI -->|API calls| CLAUDE
    CLAUDE -->|generated content & analysis| WIKI

Common Mistakes and Troubleshooting

Problem: Duplicate links appearing in notes

This happens when the link injection runs before the metadata index updates. Implement idempotent link insertion:

// utils/link-injector.ts
export function injectLinks(
  content: string,
  newLinks: string[],
  existingLinks: Set<string> // lowercased link targets already recorded for this note
): string {
  // Extract current wiki links from content
  const linkPattern = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;
  const currentLinks = new Set<string>();
  
  let match;
  while ((match = linkPattern.exec(content)) !== null) {
    currentLinks.add(match[1].toLowerCase());
  }

  // Filter out links that already exist
  const linksToAdd = newLinks.filter(
    link => !currentLinks.has(link.toLowerCase()) && 
            !existingLinks.has(link.toLowerCase())
  );

  if (linksToAdd.length === 0) {
    return content; // No changes needed
  }

  // Find or create the "Related" section
  const relatedSection = '\n\n## Related\n';
  const formattedLinks = linksToAdd.map(l => `- [[${l}]]`).join('\n');

  if (content.includes('## Related')) {
    // Append to existing section
    return content.replace(
      /(## Related\n)/,
      `$1${formattedLinks}\n`
    );
  }

  // Add new section at the end
  return content.trimEnd() + relatedSection + formattedLinks + '\n';
}

Problem: API rate limiting causing failures

Implement exponential backoff with jitter:

// utils/retry.ts
export async function withRetry<T>(
  fn: () => Promise<T>,
  options: {
    maxRetries: number;
    baseDelayMs: number;
    maxDelayMs: number;
  }
): Promise<T> {
  let lastError: Error;

  for (let attempt = 0; attempt <= options.maxRetries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error as Error;

      if (attempt === options.maxRetries) {
        break;
      }

      // Check if error is retryable
      if (error instanceof Error && error.message.includes('rate_limit')) {
        const delay = Math.min(
          options.baseDelayMs * Math.pow(2, attempt),
          options.maxDelayMs
        );
        // Add jitter to prevent thundering herd
        const jitter = Math.random() * 0.3 * delay;
        
        console.log(`Rate limited. Retrying in ${Math.round(delay + jitter)}ms...`);
        await sleep(delay + jitter);
      } else {
        throw error; // Non-retryable error
      }
    }
  }

  throw lastError!;
}

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}
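
To see what the retry loop actually waits, it helps to print the schedule. The sketch below recomputes the same formula (base delay doubled per attempt, capped, plus up to 30% jitter) with an injectable random source so it can be checked deterministically. The function names and the 1000 ms / 10000 ms values are illustrative assumptions.

```typescript
// Deterministic part of the backoff: base * 2^attempt, capped at maxDelayMs.
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(baseDelayMs * Math.pow(2, attempt), maxDelayMs);
}

// Adds up to 30% jitter on top, matching the 0.3 factor used in withRetry.
// `random` is injectable so the result can be tested without randomness.
function jitteredDelayMs(
  attempt: number,
  baseDelayMs: number,
  maxDelayMs: number,
  random: () => number = Math.random
): number {
  const delay = backoffDelayMs(attempt, baseDelayMs, maxDelayMs);
  return Math.round(delay + random() * 0.3 * delay);
}

// Example values (assumptions): 1 s base delay, 10 s cap, five attempts.
const delaySchedule = [0, 1, 2, 3, 4].map(attempt => backoffDelayMs(attempt, 1000, 10000));
console.log(delaySchedule); // [1000, 2000, 4000, 8000, 10000]
```

Because jitter is applied after the cap, a single wait can exceed maxDelayMs by up to 30%. If the cap must be strict, apply Math.min once more after adding the jitter.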

💡 Set concurrentRequests to 2 or lower as a conservative starting point. Actual limits depend on your API usage tier, and raising concurrency beyond what your tier allows yields diminishing returns because requests are throttled.

Problem: File watcher triggers infinite loops

When your service writes to watched files, it can trigger its own watcher. Use a write lock mechanism:

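import * as fs from 'fs';
import * as path from 'path';

// State tracking for write operations
const pendingWrites = new Set<string>();

// Small delay helper (same shape as the one in utils/retry.ts)
function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}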

async function safeWriteFile(filePath: string, content: string): Promise<void> {
  const normalizedPath = path.normalize(filePath);
  pendingWrites.add(normalizedPath);

  try {
    await fs.promises.writeFile(filePath, content, 'utf-8');
    // Keep the lock briefly to handle watcher debounce
    await sleep(100);
  } finally {
    pendingWrites.delete(normalizedPath);
  }
}

// In your watcher handler
watcher.on('change', async (filePath) => {
  if (pendingWrites.has(path.normalize(filePath))) {
    return; // Ignore our own writes
  }
  // Process the change...
});

📝 Common symptoms of infinite loops include high CPU usage, rapidly growing log files, and Obsidian becoming unresponsive. Check your logs for repeated processing of the same file.
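
The timed lock above depends on the watcher event arriving within the 100 ms window. A complementary approach, sketched here under the assumption that the handler reads the file before processing it, is to remember what the service last wrote and ignore events whose on-disk content matches. SelfWriteTracker is a hypothetical class, not part of the system built earlier.

```typescript
// Hypothetical alternative to the timed lock: remember the content we last
// wrote per file and treat matching watcher events as echoes of our own write.
class SelfWriteTracker {
  private lastWritten = new Map<string, string>();

  // Call right before writing `content` to `filePath`.
  recordWrite(filePath: string, content: string): void {
    this.lastWritten.set(filePath, content);
  }

  // Call from the watcher handler with the content just read from disk.
  // Returns true when the file still holds exactly what we wrote.
  isOwnWrite(filePath: string, contentOnDisk: string): boolean {
    return this.lastWritten.get(filePath) === contentOnDisk;
  }
}

const writeTracker = new SelfWriteTracker();
writeTracker.recordWrite('wiki/example.md', '# Example\n');
console.log(writeTracker.isOwnWrite('wiki/example.md', '# Example\n'));        // true
console.log(writeTracker.isOwnWrite('wiki/example.md', '# Example\nedited\n')); // false
```

This version keeps full file contents in memory for simplicity. Storing a content hash instead would reduce the footprint for large vaults.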

Performance and Scalability

For vaults exceeding 1,000 notes, batch processing becomes essential. The following implementation processes notes in parallel while respecting rate limits:

// services/batch-processor.ts
// p-limit is not part of the earlier install command: npm install p-limit
import pLimit from 'p-limit';
import { config } from '../config/wiki-config';

interface ProcessingResult {
  path: string;
  success: boolean;
  error?: string;
  processingTimeMs: number;
}

export class BatchProcessor {
  // ReturnType avoids depending on how the installed p-limit version
  // names its limiter type
  private limit: ReturnType<typeof pLimit>;

  constructor() {
    this.limit = pLimit(config.processing.concurrentRequests);
  }

  async processNotes(notePaths: string[]): Promise<ProcessingResult[]> {
    // Collect results per call so a second run does not return stale entries
    const results: ProcessingResult[] = [];
    const batches = this.chunkArray(notePaths, config.processing.batchSize);
    
    console.log(`Processing ${notePaths.length} notes in ${batches.length} batches`);

    for (let i = 0; i < batches.length; i++) {
      const batch = batches[i];
      console.log(`Batch ${i + 1}/${batches.length}: ${batch.length} notes`);

      // processNote never rejects, so Promise.all cannot fail the whole batch
      const batchResults = await Promise.all(
        batch.map(notePath => 
          this.limit(() => this.processNote(notePath))
        )
      );

      results.push(...batchResults);

      // Rate limiting between batches
      if (i < batches.length - 1) {
        await this.sleep(config.processing.rateLimitMs);
      }
    }

    return results;
  }

  private async processNote(notePath: string): Promise<ProcessingResult> {
    const startTime = Date.now();

    try {
      // Your note processing logic here
      await this.analyzeAndLinkNote(notePath);

      return {
        path: notePath,
        success: true,
        processingTimeMs: Date.now() - startTime,
      };
    } catch (error) {
      return {
        path: notePath,
        success: false,
        error: (error as Error).message,
        processingTimeMs: Date.now() - startTime,
      };
    }
  }

  private chunkArray<T>(array: T[], size: number): T[][] {
    const chunks: T[][] = [];
    for (let i = 0; i < array.length; i += size) {
      chunks.push(array.slice(i, i + size));
    }
    return chunks;
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }

  private async analyzeAndLinkNote(notePath: string): Promise<void> {
    // Implementation depends on your specific requirements
  }
}

For very large vaults, consider implementing incremental processing with checkpoints:

# checkpoint.yml - tracks processing state
lastRun: "2024-01-15T10:30:00Z"
processedFiles:
  - path: "wiki/machine-learning.md"
    hash: "a1b2c3d4"
    timestamp: "2024-01-15T10:25:00Z"
  - path: "wiki/neural-networks.md"
    hash: "e5f6a7b8"
    timestamp: "2024-01-15T10:28:00Z"
pendingFiles:
  - "wiki/transformers.md"
  - "wiki/attention-mechanisms.md"
failedFiles:
  - path: "wiki/corrupted-note.md"
    error: "Invalid frontmatter syntax"
    attempts: 3
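
A checkpoint is only useful if the next run consults it. The sketch below shows one way to turn the structure above into a work list: skip files whose hash is unchanged, skip files that have exhausted their retries, and process everything else. The selectFilesToProcess name and the maxAttempts cap are assumptions for illustration.

```typescript
interface ProcessedEntry { path: string; hash: string; }
interface FailedEntry { path: string; attempts: number; }
interface Checkpoint {
  processedFiles: ProcessedEntry[];
  pendingFiles: string[];
  failedFiles: FailedEntry[];
}

// Decide which files need (re)processing given the vault's current hashes.
// maxAttempts is an assumed retry cap, not something the checkpoint file defines.
function selectFilesToProcess(
  checkpoint: Checkpoint,
  currentHashes: Map<string, string>,
  maxAttempts = 3
): string[] {
  const processed = new Map<string, string>();
  for (const entry of checkpoint.processedFiles) {
    processed.set(entry.path, entry.hash);
  }

  const exhausted = new Set<string>();
  for (const entry of checkpoint.failedFiles) {
    if (entry.attempts >= maxAttempts) exhausted.add(entry.path);
  }

  const selected: string[] = [];
  currentHashes.forEach((hash, filePath) => {
    if (exhausted.has(filePath)) return;          // gave up on this file
    if (processed.get(filePath) === hash) return; // unchanged since last run
    selected.push(filePath);                      // new, changed, pending, or retryable
  });
  return selected;
}
```

Files listed under pendingFiles need no special handling here: they have no processed hash yet, so they are selected automatically.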

Memory optimization matters when building large graphs. Use streaming for file operations:

// Efficient file reading for large vaults
import * as readline from 'readline';
import * as fs from 'fs';

async function extractFrontmatterStream(filePath: string): Promise<Record<string, unknown> | null> {
  const fileStream = fs.createReadStream(filePath);
  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity,
  });

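  let isFirstLine = true;
  let closed = false;
  const frontmatterLines: string[] = [];

  for await (const line of rl) {
    if (isFirstLine) {
      isFirstLine = false;
      // Frontmatter must open on the very first line; a '---' further down
      // is a horizontal rule, so stop without reading the rest of the file
      if (line.trimEnd() !== '---') break;
      continue;
    }
    if (line.trimEnd() === '---') {
      closed = true; // End of frontmatter
      break;
    }
    frontmatterLines.push(line);
  }

  rl.close();
  fileStream.close();

  // No frontmatter, an empty block, or an opening '---' that never closes
  if (!closed || frontmatterLines.length === 0) {
    return null;
  }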

  // Parse YAML frontmatter
  const yaml = await import('yaml');
  return yaml.parse(frontmatterLines.join('\n'));
}

📝 Processing 10,000 notes with the batch processor takes approximately 3-4 hours with conservative rate limiting. Run initial indexing overnight or during low-usage periods.
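
The estimate above can be reproduced with simple arithmetic. The sketch below assumes an average of 2.5 seconds per request, two concurrent requests, batches of 10, and a one-second pause between batches. All four values are illustrative assumptions, so substitute measurements from your own runs.

```typescript
interface RunEstimateInput {
  noteCount: number;
  secondsPerRequest: number;     // assumed average Claude round trip
  concurrentRequests: number;
  batchSize: number;
  pauseMsBetweenBatches: number; // plays the role of config.processing.rateLimitMs
}

// Rough wall-clock estimate: request time spread over the concurrency level,
// plus the fixed pauses inserted between batches.
function estimateRunHours(input: RunEstimateInput): number {
  const requestSeconds =
    (input.noteCount * input.secondsPerRequest) / input.concurrentRequests;
  const batchCount = Math.ceil(input.noteCount / input.batchSize);
  const pauseSeconds =
    Math.max(0, batchCount - 1) * (input.pauseMsBetweenBatches / 1000);
  return (requestSeconds + pauseSeconds) / 3600;
}

const estimatedHours = estimateRunHours({
  noteCount: 10000,
  secondsPerRequest: 2.5,
  concurrentRequests: 2,
  batchSize: 10,
  pauseMsBetweenBatches: 1000,
});
console.log(estimatedHours.toFixed(2)); // "3.75"
```

The model treats latency as uniform and ignores retries, so real runs with rate-limit backoff will take longer.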

Conclusion and Next Steps

You now have a working personal knowledge graph that implements Karpathy’s LLM wiki pattern. The system watches your Obsidian vault, uses Claude to analyze content semantics, and automatically maintains relationships between notes.

Key capabilities you’ve built:

Real-time file monitoring with conflict-free updates
Semantic analysis via Claude for intelligent linking
Persistent metadata index for instant lookups
Production-ready configuration with validation
Batch processing for large vault initialization

To extend this foundation, consider these next steps:

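Add vector embeddings: Generate embeddings with a dedicated embedding model (the Claude API does not currently expose an embeddings endpoint) and store them in a local vector database like Chroma or LanceDB for similarity search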
Build a query interface: Create a command-line tool or Obsidian plugin to ask questions across your knowledge base
Implement spaced repetition: Use the graph structure to surface notes for periodic review
Add concept extraction: Have Claude identify key concepts and automatically create hub notes

The modular architecture makes these additions straightforward. Start with embeddings—they enable powerful semantic search that transforms how you navigate your knowledge.
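
As a starting point before adopting a vector database, similarity search over a small vault can be done in memory. The sketch below ranks notes by cosine similarity and assumes the embeddings were already produced by whichever embedding model you choose. The EmbeddedNote shape and function names are hypothetical.

```typescript
interface EmbeddedNote {
  noteId: string;
  embedding: number[]; // produced elsewhere by an embedding model of your choice
}

// Cosine similarity: dot product divided by the product of vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embedding dimensions differ');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / Math.sqrt(normA * normB);
}

// Return the ids of the topK notes most similar to the query embedding.
function topKSimilar(queryEmbedding: number[], notes: EmbeddedNote[], topK = 5): string[] {
  return notes
    .map(note => ({
      noteId: note.noteId,
      score: cosineSimilarity(queryEmbedding, note.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map(scored => scored.noteId);
}
```

A linear scan like this is adequate for a few thousand notes. Beyond that, the approximate nearest-neighbor indexes in Chroma or LanceDB become worthwhile.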

Additional Resources

Obsidian Plugin Development Documentation - Official guide for building native Obsidian plugins if you want tighter integration
Anthropic Claude API Reference - Complete API documentation including rate limits and best practices
Building a Second Brain - Tiago Forte’s methodology that complements the technical implementation
Zettelkasten Method - The original knowledge management system that inspired modern note-linking approaches
LangChain Documentation - Framework for building more complex LLM applications if you need advanced chaining or agents

Common Mistakes and Troubleshooting

1. Context Window Overflow

The most common mistake when building a knowledge graph with Claude is exceeding the context window with too many linked notes.

// ❌ BAD: Loading all linked notes without limit
async function getContext(noteId: string): Promise<string> {
  const note = await getNote(noteId);
  const links = extractLinks(note.content);
  
  // This can explode exponentially
  const linkedContent = await Promise.all(
    links.map(link => getContext(link)) // Recursive without depth limit!
  );
  
  return note.content + linkedContent.join('\n');
}

// ✅ GOOD: Controlled depth and token budget
interface ContextOptions {
  maxDepth: number;
  maxTokens: number;
  priorityTags?: string[];
}

async function getContextWithLimits(
  noteId: string,
  options: ContextOptions,
  currentDepth = 0,
  tokenCount = 0,
  visited: Set<string> = new Set()
): Promise<{ content: string; tokens: number }> {
  // Stop at the depth limit, when the budget is spent, or on a repeat visit
  // (the visited set keeps circular links from being loaded twice)
  if (
    currentDepth >= options.maxDepth ||
    tokenCount >= options.maxTokens ||
    visited.has(noteId)
  ) {
    return { content: '', tokens: tokenCount };
  }
  visited.add(noteId);

  const note = await getNote(noteId);
  const noteTokens = estimateTokens(note.content);
  
  // Check if adding this note exceeds budget
  if (tokenCount + noteTokens > options.maxTokens) {
    // Return truncated version
    const availableTokens = options.maxTokens - tokenCount;
    return {
      content: truncateToTokens(note.content, availableTokens),
      tokens: options.maxTokens
    };
  }

  let result = note.content;
  let newTokenCount = tokenCount + noteTokens;

  // Only follow links if we have budget remaining
  const links = extractLinks(note.content);
  const prioritizedLinks = prioritizeLinks(links, options.priorityTags);

  for (const link of prioritizedLinks) {
    if (newTokenCount >= options.maxTokens) break;
    
    const linked = await getContextWithLimits(
      link,
      options,
      currentDepth + 1,
      newTokenCount,
      visited
    );
    
    // Skip links that contributed nothing (depth limit or already visited)
    if (linked.content) {
      result += `\n\n---\nLinked: ${link}\n${linked.content}`;
    }
    newTokenCount = linked.tokens;
  }

  return { content: result, tokens: newTokenCount };
}

function estimateTokens(text: string): number {
  // Rough estimation: ~4 characters per token for English
  return Math.ceil(text.length / 4);
}

⚠️ Warning: Claude’s context window is large but not infinite. A 200k token limit sounds like a lot until you try to load 50 interconnected notes with code blocks and embedded images.
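
The token-budgeted loader calls truncateToTokens without defining it. One possible implementation is sketched below, using the same rough four-characters-per-token heuristic as estimateTokens. The truncation marker and the preference for cutting at a line break are assumptions, not behavior the loader requires.

```typescript
const CHARS_PER_TOKEN = 4; // same rough heuristic as estimateTokens

function estimateTokens(text: string): number {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// Hypothetical implementation of the truncateToTokens helper used by the loader.
function truncateToTokens(text: string, maxTokens: number): string {
  if (maxTokens <= 0) return '';
  if (estimateTokens(text) <= maxTokens) return text;

  const marker = '\n[truncated]';
  const budgetChars = maxTokens * CHARS_PER_TOKEN - marker.length;
  if (budgetChars <= 0) {
    // Budget too small for the marker; return a bare slice instead
    return text.slice(0, maxTokens * CHARS_PER_TOKEN);
  }

  const cut = text.slice(0, budgetChars);
  // Prefer ending on a line break when one falls in the second half of the slice
  const lastBreak = cut.lastIndexOf('\n');
  const body = lastBreak > budgetChars / 2 ? cut.slice(0, lastBreak) : cut;
  return body + marker;
}
```

Because the estimate is character-based, the result is only approximately within budget in real tokens. Leave headroom in maxTokens, or count tokens with a real tokenizer, when the margin matters.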

2. Obsidian Plugin Configuration Errors

// ❌ BAD: Common .obsidian/plugins/knowledge-graph-claude/data.json mistakes
// (comments are annotations only; JSON does not allow them in the real file)
{
  "apiKey": "sk-ant-...",  // Never store API keys in plugin settings!
  "maxDepth": 10,          // Too deep - will timeout
  "includeBacklinks": true,
  "includeTags": true,
  "cacheTimeout": 0        // No cache = constant API calls = expensive
}

// ✅ GOOD: Secure and performant configuration
{
  "apiKeyEnvVar": "CLAUDE_API_KEY",  // Reference env var instead
  "maxDepth": 3,                      // Reasonable depth
  "maxNotesPerQuery": 15,             // Hard limit on notes
  "includeBacklinks": true,
  "includeTags": true,
  "cacheTimeout": 3600,               // 1 hour cache
  "excludeFolders": ["templates", "daily-notes", "attachments"],
  "priorityFolders": ["projects", "concepts"]
}
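
These mistakes are easy to catch at startup. The sketch below checks a loaded configuration against the points made above. The findConfigProblems name, the PluginConfig shape, and the maxDepth ceiling of 5 are illustrative assumptions rather than limits enforced by Obsidian or the Claude API.

```typescript
interface PluginConfig {
  apiKey?: string;
  apiKeyEnvVar?: string;
  maxDepth: number;
  cacheTimeout: number; // seconds
  maxNotesPerQuery?: number;
}

// Returns human-readable problems; an empty array means the config looks sane.
// The environment is passed in so the check stays easy to test.
function findConfigProblems(
  cfg: PluginConfig,
  env: Record<string, string | undefined>
): string[] {
  const problems: string[] = [];

  if (cfg.apiKey) {
    problems.push('apiKey is stored in plain text; use apiKeyEnvVar instead');
  }
  if (!cfg.apiKeyEnvVar) {
    problems.push('apiKeyEnvVar is not set');
  } else if (!env[cfg.apiKeyEnvVar]) {
    problems.push(`environment variable ${cfg.apiKeyEnvVar} is empty or missing`);
  }
  // Assumed ceiling: deeper traversals tend to blow the token budget
  if (!Number.isInteger(cfg.maxDepth) || cfg.maxDepth < 1 || cfg.maxDepth > 5) {
    problems.push('maxDepth should be an integer between 1 and 5');
  }
  if (cfg.cacheTimeout <= 0) {
    problems.push('cacheTimeout of 0 disables caching');
  }
  if (cfg.maxNotesPerQuery === undefined) {
    problems.push('maxNotesPerQuery is not set');
  }
  return problems;
}
```

In a Node.js service you would call it as findConfigProblems(loadedConfig, process.env) and refuse to start when the result is non-empty.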

3. Rate Limiting and API Errors

// Robust API client with retry logic
import Anthropic from '@anthropic-ai/sdk';

class ResilientClaudeClient {
  private client: Anthropic;
  private lastRequestTime = 0;
  private minRequestInterval = 100; // ms between requests

  constructor() {
    // With no arguments the SDK reads the key from ANTHROPIC_API_KEY;
    // pass { apiKey: process.env.CLAUDE_API_KEY } if you use a different variable
    this.client = new Anthropic();
  }

  async query(
    messages: Anthropic.MessageParam[],
    retries = 3
  ): Promise<string> {
    for (let attempt = 1; attempt <= retries; attempt++) {
      try {
        // Rate limiting
        const now = Date.now();
        const timeSinceLastRequest = now - this.lastRequestTime;
        if (timeSinceLastRequest < this.minRequestInterval) {
          await this.sleep(this.minRequestInterval - timeSinceLastRequest);
        }
        
        this.lastRequestTime = Date.now();

        const response = await this.client.messages.create({
          model: 'claude-sonnet-4-20250514',
          max_tokens: 4096,
          messages
        });

        const textBlock = response.content.find(block => block.type === 'text');
        return textBlock?.text || '';

      } catch (error) {
        // Out of attempts: surface the original error instead of sleeping again
        if (attempt === retries) {
          throw error;
        }

        if (this.isRateLimitError(error)) {
          const backoff = Math.pow(2, attempt) * 1000; // Exponential backoff
          console.warn(`Rate limited. Waiting ${backoff}ms before retry ${attempt}/${retries}`);
          await this.sleep(backoff);
          continue;
        }

        if (this.isOverloadedError(error)) {
          // Claude is overloaded, wait longer
          await this.sleep(5000 * attempt);
          continue;
        }

        throw error;
      }
    }

    // Only reachable when retries is less than 1
    throw new Error('Max retries exceeded');
  }

  private isRateLimitError(error: unknown): boolean {
    return error instanceof Anthropic.RateLimitError;
  }

  private isOverloadedError(error: unknown): boolean {
    return error instanceof Anthropic.APIError && error.status === 529;
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
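
Fixed exponential backoff ignores any hint the server gives. HTTP 429 responses commonly carry a Retry-After header, expressed either in seconds or as an HTTP date, and honoring it is usually better than guessing. The sketch below converts such a value into a wait time and falls back to the computed backoff otherwise. How you read the header from a caught error depends on your SDK version, so the function takes the raw value as a parameter, and the retryAfterMs name is hypothetical.

```typescript
// Convert a Retry-After header value into milliseconds to wait.
// Falls back to `fallbackMs` when the header is absent or unparseable.
function retryAfterMs(
  headerValue: string | undefined,
  fallbackMs: number,
  nowMs: number = Date.now()
): number {
  if (headerValue === undefined || headerValue.trim() === '') {
    return fallbackMs;
  }

  // Form 1: delay in seconds, e.g. "12"
  const seconds = Number(headerValue);
  if (Number.isFinite(seconds) && seconds >= 0) {
    return Math.ceil(seconds * 1000);
  }

  // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(headerValue);
  if (!Number.isNaN(dateMs)) {
    return Math.max(0, dateMs - nowMs);
  }

  return fallbackMs;
}
```

Inside the retry loop, the exponential value would be passed as fallbackMs so behavior is unchanged whenever the header is missing.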

4. Graph Traversal Debugging

When your knowledge graph queries return unexpected results, use this diagnostic flow:

flowchart TD
    A[Query Returns Wrong Context] --> B{Check Note Links}
    B -->|Links Missing| C[Verify Obsidian Syntax]
    C --> D["Use [[exact-filename]] format"]
    B -->|Links Present| E{Check Depth Setting}
    E -->|Too Shallow| F[Increase maxDepth]
    E -->|Correct| G{Check Token Budget}
    G -->|Exceeded| H[Increase maxTokens or reduce depth]
    G -->|Within Budget| I{Check Priority Rules}
    I -->|Wrong Priority| J[Adjust priorityTags/priorityFolders]
    I -->|Correct| K[Enable Debug Logging]
    K --> L[Review traversal order in logs]
    L --> M{Found Issue?}
    M -->|Yes| N[Fix configuration]
    M -->|No| O[Check for circular references]
    O --> P[Add visited set to traversal]
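
The last step of the flow, adding a visited set, can be illustrated on an in-memory link graph. This is a minimal sketch assuming links are already extracted into an adjacency map. LinkGraph and collectReachable are hypothetical names, and the depth check mirrors the loader's convention that the start note sits at depth 0 and traversal stops when depth reaches maxDepth.

```typescript
type LinkGraph = Record<string, string[]>;

// Breadth-first traversal that visits each note at most once,
// so circular links cannot cause repeats or unbounded work.
function collectReachable(graph: LinkGraph, startId: string, maxDepth: number): string[] {
  const visited = new Set<string>([startId]);
  const order: string[] = [];
  let frontier: string[] = [startId];

  for (let depth = 0; depth < maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const noteId of frontier) {
      order.push(noteId);
      for (const linked of graph[noteId] ?? []) {
        if (!visited.has(linked)) {
          visited.add(linked);
          next.push(linked);
        }
      }
    }
    frontier = next;
  }
  return order;
}

// A small graph with cycles: a -> b -> c -> a, plus c -> d
const cyclicGraph: LinkGraph = { a: ['b'], b: ['c', 'a'], c: ['a', 'd'], d: [] };
console.log(collectReachable(cyclicGraph, 'a', 3)); // ['a', 'b', 'c']
```

Breadth-first order also has a useful side effect: notes closer to the starting note are collected first, so a token budget is spent on the nearest context before distant notes.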

💡 Tip: Add a debug mode that logs every note visited during graph traversal. This makes it trivial to understand why certain notes are or aren’t included in context.

// Debug logging for graph traversal
interface TraversalLog {
  noteId: string;
  depth: number;
  tokensBefore: number;
  tokensAfter: number;
  included: boolean;
  reason?: string;
}

function createTraversalLogger() {
  const logs: TraversalLog[] = [];

  return {
    log(entry: TraversalLog) {
      logs.push(entry);
      if (process.env.DEBUG_TRAVERSAL) {
        console.log(
          `[Depth ${entry.depth}] ${entry.included ? '✓' : '✗'} ${entry.noteId} ` +
          `(${entry.tokensAfter - entry.tokensBefore} tokens) ${entry.reason || ''}`
        );
      }
    },
    
    getSummary() {
      return {
        totalNotes: logs.length,
        included: logs.filter(l => l.included).length,
        excluded: logs.filter(l => !l.included).length,
        totalTokens: logs[logs.length - 1]?.tokensAfter || 0,
        byDepth: logs.reduce((acc, l) => {
          acc[l.depth] = (acc[l.depth] || 0) + 1;
          return acc;
        }, {} as Record<number, number>)
      };
    }
  };
}

5. Stale Cache Issues

// Cache invalidation strategy for Obsidian vault changes
import { createHash } from 'crypto';
import * as fs from 'fs';
import * as path from 'path';

interface CacheEntry {
  content: string;
  hash: string;
  timestamp: number;
  linkedNoteHashes: Record<string, string>;
}

class SmartCache {
  private cache: Map<string, CacheEntry> = new Map();
  private vaultPath: string;

  constructor(vaultPath: string) {
    this.vaultPath = vaultPath;
  }

  private getFileHash(filePath: string): string {
    try {
      const content = fs.readFileSync(filePath, 'utf-8');
      return createHash('md5').update(content).digest('hex');
    } catch {
      return '';
    }
  }

  async get(noteId: string): Promise<string | null> {
    const entry = this.cache.get(noteId);
    if (!entry) return null;

    const filePath = path.join(this.vaultPath, `${noteId}.md`);
    const currentHash = this.getFileHash(filePath);

    // Check if main file changed
    if (currentHash !== entry.hash) {
      this.cache.delete(noteId);
      return null;
    }

    // Check if any linked notes changed
    for (const [linkedNote, linkedHash] of Object.entries(entry.linkedNoteHashes)) {
      const linkedPath = path.join(this.vaultPath, `${linkedNote}.md`);
      if (this.getFileHash(linkedPath) !== linkedHash) {
        this.cache.delete(noteId);
        return null;
      }
    }

    return entry.content;
  }

  set(noteId: string, content: string, linkedNotes: string[]): void {
    const filePath = path.join(this.vaultPath, `${noteId}.md`);
    const linkedNoteHashes: Record<string, string> = {};

    for (const linked of linkedNotes) {
      const linkedPath = path.join(this.vaultPath, `${linked}.md`);
      linkedNoteHashes[linked] = this.getFileHash(linkedPath);
    }

    this.cache.set(noteId, {
      content,
      hash: this.getFileHash(filePath),
      timestamp: Date.now(),
      linkedNoteHashes
    });
  }
}

📝 Note: Watch for the case where you edit a note that’s linked by many others. Your cache invalidation needs to cascade properly, or you’ll get stale context in queries that start from different entry points.
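
One way to make that cascade explicit is a reverse index from each note to the cache entries that depend on it. The sketch below is a hypothetical companion to the cache, not part of it: register is called whenever an entry is stored, and affectedBy returns every cache key to drop when a note changes, following dependencies transitively.

```typescript
class DependencyIndex {
  // linked note -> notes whose cached context includes it
  private dependents = new Map<string, Set<string>>();

  // Record that the cached context for `noteId` was built from `linkedNotes`.
  register(noteId: string, linkedNotes: string[]): void {
    for (const linked of linkedNotes) {
      let owners = this.dependents.get(linked);
      if (!owners) {
        owners = new Set<string>();
        this.dependents.set(linked, owners);
      }
      owners.add(noteId);
    }
  }

  // Every cache key that must be invalidated when `changedNote` is edited,
  // including entries that depend on it only indirectly.
  affectedBy(changedNote: string): string[] {
    const affected = new Set<string>([changedNote]);
    const queue: string[] = [changedNote];

    while (queue.length > 0) {
      const current = queue.pop() as string;
      const owners = this.dependents.get(current);
      if (!owners) continue;
      owners.forEach(owner => {
        if (!affected.has(owner)) {
          affected.add(owner);
          queue.push(owner);
        }
      });
    }
    return Array.from(affected);
  }
}
```

Wired into the file watcher, a change event for one note would delete each returned key from the cache up front, instead of waiting for the hash comparison inside get to notice.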

Conclusion and Next Steps

You’ve now built a functional personal knowledge graph system that combines Obsidian’s excellent note-taking capabilities with Claude’s reasoning power—implementing what Karpathy described as an “LLM Wiki” pattern.

The key architectural decisions we made:

Bidirectional link extraction from Obsidian’s wiki-link syntax enables automatic relationship discovery
Token-aware context loading prevents context window overflow while maximizing relevant information
Priority-based traversal ensures the most relevant notes are included first when budget is limited
Robust caching with smart invalidation keeps costs manageable without serving stale data

Immediate Next Steps

Start with these enhancements to make the system more powerful:

// 1. Add semantic search for better note discovery
import Anthropic from '@anthropic-ai/sdk';

interface EmbeddingStore {
  noteId: string;
  embedding: number[];
  content: string;
}

async function findSemanticallySimilar(
  query: string,
  store: EmbeddingStore[],
  topK = 5
): Promise<string[]> {
  // Use Claude to extract key concepts from the query (no embeddings are computed here)
  const client = new Anthropic();
  
  // For production, use a dedicated embedding model
  // This is a simplified approach using Claude for concept extraction
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 200,
    messages: [{
      role: 'user',
      content: `Extract 5 key concepts from this query as a JSON array of strings: "${query}"`
    }]
  });

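  const textBlock = response.content.find(block => block.type === 'text');
  let concepts: string[] = [];
  try {
    concepts = JSON.parse(textBlock?.text || '[]');
  } catch {
    // The reply was not bare JSON; fall back to no concepts rather than crash
  }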
  
  // Score notes by concept overlap (simplified)
  const scored = store.map(note => ({
    noteId: note.noteId,
    score: concepts.filter((c: string) => 
      note.content.toLowerCase().includes(c.toLowerCase())
    ).length
  }));

  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(s => s.noteId);
}

// 2. Implement automatic note suggestions
async function suggestConnections(
  noteContent: string,
  existingNotes: string[]
): Promise<Array<{ note: string; reason: string }>> {
  const client = new Anthropic();
  
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [{
      role: 'user',
      content: `Given this note content:
---
${noteContent}
---

And these existing notes in my knowledge base:
${existingNotes.map(n => `- ${n}`).join('\n')}

Suggest 3-5 notes that should be linked, with brief reasons why. 
Return as JSON: [{"note": "note-name", "reason": "why to link"}]`
    }]
  });

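  const textBlock = response.content.find(block => block.type === 'text');
  try {
    return JSON.parse(textBlock?.text || '[]');
  } catch {
    // The reply was not bare JSON; return no suggestions rather than crash
    return [];
  }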
}
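
Both functions above parse the model's reply as JSON, but replies sometimes arrive wrapped in a code fence or surrounded by a sentence of explanation. The sketch below shows a more tolerant parser that extracts the outermost array and validates each element's shape. The helper names are hypothetical, and the approach assumes the surrounding prose contains no square brackets of its own.

```typescript
interface LinkSuggestion {
  note: string;
  reason: string;
}

// Extract the outermost [...] from a reply and parse it; returns [] on failure.
function parseJsonArrayReply(reply: string): unknown[] {
  const start = reply.indexOf('[');
  const end = reply.lastIndexOf(']');
  if (start === -1 || end <= start) return [];
  try {
    const parsed: unknown = JSON.parse(reply.slice(start, end + 1));
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

// Type guard: keep only elements that match the requested shape.
function isLinkSuggestion(value: unknown): value is LinkSuggestion {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  return typeof candidate['note'] === 'string' && typeof candidate['reason'] === 'string';
}

function parseSuggestions(reply: string): LinkSuggestion[] {
  return parseJsonArrayReply(reply).filter(isLinkSuggestion);
}
```

Validating the shape matters as much as parsing: a suggestion missing its note field would otherwise propagate into link injection and produce broken links.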

Long-term Enhancements

Once your basic system is stable, consider these advanced features:

# Feature roadmap for knowledge graph evolution

phase_1_foundations:
  - Basic link extraction and traversal ✓
  - Token-aware context loading ✓
  - Caching layer ✓

phase_2_intelligence:
  - Semantic similarity search
  - Automatic link suggestions
  - Contradiction detection across notes
  - Gap analysis ("what's missing from my knowledge?")

phase_3_automation:
  - Daily digest of new connections discovered
  - Auto-tagging based on content analysis
  - Spaced repetition integration
  - Export to Anki for key concepts

phase_4_collaboration:
  - Multi-vault federation
  - Shared knowledge bases with access control
  - Version history with semantic diffs

The power of this approach lies in its compounding returns: every note you add makes every query smarter. Unlike traditional search, Claude can synthesize information across your notes, find non-obvious connections, and
