Implement Karpathy’s LLM Wiki pattern to transform Claude conversations into a structured Obsidian vault with custom commands and bidirectional sync.
Building a Personal Knowledge Graph with Claude and Obsidian: Implementing Karpathy’s LLM Wiki Pattern
Every developer accumulates knowledge through countless conversations with AI assistants—debugging sessions, architecture decisions, research explorations. Yet this knowledge evaporates the moment you close the chat window. Andrej Karpathy’s “LLM Wiki” pattern addresses this directly: treat your AI conversations as first-class knowledge artifacts that persist, interconnect, and compound over time.
This tutorial implements a complete system that transforms Claude conversations into a structured Obsidian vault. You’ll build custom command patterns (/wiki, /save, /autoresearch) that capture insights as they emerge, sync bidirectionally between Claude and your markdown files, and automatically enrich entries with related concepts and code examples.
The result is a personal knowledge graph that grows smarter with every conversation—accessible across sessions, searchable, and deeply integrated with your existing note-taking workflow.
Prerequisites
Before starting, ensure you have:
- Obsidian 1.4+ installed with a vault already configured
- Node.js 18+ for the sync server and automation scripts
- Claude API access with an active API key
- Basic familiarity with Obsidian’s folder structure and markdown linking syntax
- Git for version control of your knowledge base (recommended)
You’ll also need these npm packages installed in your project (zod and p-limit are imported by the production configuration and the batch processor later on):
npm install @anthropic-ai/sdk chokidar gray-matter marked glob dotenv express zod p-limit
💡 If you’re using Obsidian Sync, disable it for the wiki folder during initial setup to avoid conflicts with our bidirectional sync system.
Architecture and Key Concepts
The system operates on three layers: a command interpreter that parses natural language instructions, a sync engine that maintains consistency between Claude’s context and your vault, and a research pipeline that enriches entries autonomously.
flowchart TD
subgraph Claude["Claude Conversation Layer"]
CMD[Command Parser]
CTX[Context Manager]
GEN[Content Generator]
end
subgraph Sync["Bidirectional Sync Engine"]
WATCH[File Watcher]
DIFF[Diff Calculator]
MERGE[Conflict Resolver]
end
subgraph Vault["Obsidian Vault"]
WIKI[(Wiki Entries)]
META[(Metadata Index)]
LINKS[(Backlink Graph)]
end
subgraph Research["Auto-Research Pipeline"]
ENRICH[Enrichment Queue]
FETCH[Related Concepts Fetcher]
CODE[Code Example Generator]
end
CMD --> |/wiki, /save| GEN
GEN --> |Create/Update| DIFF
CTX --> |Load Context| META
DIFF --> |Write| WIKI
WATCH --> |Detect Changes| DIFF
WIKI --> |Read| MERGE
MERGE --> |Update| CTX
GEN --> |Queue| ENRICH
ENRICH --> FETCH
FETCH --> CODE
CODE --> |Append| WIKI
WIKI <--> LINKS
META <--> WIKI
Key concepts to understand:
- Command patterns are structured prefixes (/wiki, /save, /autoresearch) that trigger specific behaviors in the system
- Context windows in Claude have limits; the sync engine maintains a compressed index that fits within these limits while preserving semantic richness
- Backlinks in Obsidian create bidirectional relationships; our system generates these automatically based on content analysis
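The backlink idea in the last bullet can be illustrated with a small sketch. It assumes each note is available as a title plus its markdown body and treats every [[Target]] or [[Target|alias]] as a forward link. The `Note` shape and the `buildBacklinks` helper are hypothetical names used only for this illustration, not part of the system built below.

```typescript
// Hypothetical note shape for this sketch
interface Note {
  title: string;
  body: string;
}

// Collect the targets of [[Target]] and [[Target|alias]] links in a note body
function extractLinks(body: string): string[] {
  const pattern = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;
  const found = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(body)) !== null) {
    found.add(match[1].trim());
  }
  return Array.from(found);
}

// Invert forward links: for each target, list the notes that point to it
function buildBacklinks(notes: Note[]): Map<string, string[]> {
  const backlinks = new Map<string, string[]>();
  for (const note of notes) {
    for (const target of extractLinks(note.body)) {
      const sources = backlinks.get(target) ?? [];
      sources.push(note.title);
      backlinks.set(target, sources);
    }
  }
  return backlinks;
}

const demoNotes: Note[] = [
  { title: 'Event Loop', body: 'Runs callbacks queued by [[Promises]] and [[Timers|setTimeout]].' },
  { title: 'Async Await', body: 'Syntactic sugar over [[Promises]].' },
];
const demoBacklinks = buildBacklinks(demoNotes);
console.log(demoBacklinks.get('Promises')); // [ 'Event Loop', 'Async Await' ]
```

The alias after the pipe is display text only, so `[[Timers|setTimeout]]` counts as a link to "Timers".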
Step-by-Step Implementation
Setting Up the Project Structure and Configuration
Create the project skeleton that will house your sync server and command handlers:
mkdir claude-obsidian-wiki && cd claude-obsidian-wiki
mkdir -p src/{commands,sync,research} config scripts
touch src/index.ts src/commands/parser.ts src/commands/router.ts src/sync/engine.ts src/research/pipeline.ts
touch config/settings.ts config/settings.json .env
Configure your environment variables and core settings:
// config/settings.ts
import { config } from 'dotenv';
import * as path from 'path';
import * as fs from 'fs';
config();
interface WikiSettings {
vaultPath: string;
wikiFolder: string;
metadataFile: string;
syncInterval: number;
maxContextTokens: number;
autoResearchEnabled: boolean;
researchDepth: 'shallow' | 'medium' | 'deep';
}
// Load and validate settings from JSON config
const settingsPath = path.join(__dirname, 'settings.json');
const rawSettings = JSON.parse(fs.readFileSync(settingsPath, 'utf-8'));
export const settings: WikiSettings = {
vaultPath: process.env.OBSIDIAN_VAULT_PATH || rawSettings.vaultPath,
wikiFolder: rawSettings.wikiFolder || 'claude-wiki',
metadataFile: rawSettings.metadataFile || '.wiki-metadata.json',
syncInterval: rawSettings.syncInterval || 5000,
maxContextTokens: rawSettings.maxContextTokens || 8000,
autoResearchEnabled: rawSettings.autoResearchEnabled ?? true,
researchDepth: rawSettings.researchDepth || 'medium'
};
// Validate vault path exists
export function validateSettings(): void {
const fullWikiPath = path.join(settings.vaultPath, settings.wikiFolder);
if (!fs.existsSync(settings.vaultPath)) {
throw new Error(`Vault path does not exist: ${settings.vaultPath}`);
}
// Create wiki folder if missing
if (!fs.existsSync(fullWikiPath)) {
fs.mkdirSync(fullWikiPath, { recursive: true });
console.log(`Created wiki folder: ${fullWikiPath}`);
}
}
Create the settings JSON file:
{
"vaultPath": "/Users/yourname/Documents/ObsidianVault",
"wikiFolder": "claude-wiki",
"metadataFile": ".wiki-metadata.json",
"syncInterval": 5000,
"maxContextTokens": 8000,
"autoResearchEnabled": true,
"researchDepth": "medium"
}
⚠️ Never commit your .env file to version control. Add it to .gitignore immediately after creation.
Building the Command Parser and Router
The command parser extracts structured instructions from natural language messages. It recognizes three primary patterns and routes them to appropriate handlers:
// src/commands/parser.ts
import Anthropic from '@anthropic-ai/sdk';
// Command types supported by the system
export type CommandType = 'wiki' | 'save' | 'autoresearch' | 'none';
export interface ParsedCommand {
type: CommandType;
title?: string;
content?: string;
tags?: string[];
linkedConcepts?: string[];
researchQuery?: string;
rawMessage: string;
}
// Regex patterns for command detection
const COMMAND_PATTERNS = {
wiki: /^\/wiki\s+(?:"([^"]+)"|(\S+))(?:\s+(.*))?$/is,
save: /^\/save\s+(?:"([^"]+)"|(\S+))(?:\s+\[([^\]]+)\])?(?:\s+(.*))?$/is,
autoresearch: /^\/autoresearch\s+(.+)$/is
};
export function parseCommand(message: string): ParsedCommand {
const trimmed = message.trim();
// Check for /wiki command
// Format: /wiki "Title" optional content
const wikiMatch = trimmed.match(COMMAND_PATTERNS.wiki);
if (wikiMatch) {
const title = wikiMatch[1] || wikiMatch[2];
const content = wikiMatch[3] || '';
return {
type: 'wiki',
title: sanitizeTitle(title),
content: content,
tags: extractInlineTags(content),
linkedConcepts: extractWikiLinks(content),
rawMessage: message
};
}
// Check for /save command
// Format: /save "Title" [tag1, tag2] content
const saveMatch = trimmed.match(COMMAND_PATTERNS.save);
if (saveMatch) {
const title = saveMatch[1] || saveMatch[2];
const tagsRaw = saveMatch[3] || '';
const content = saveMatch[4] || '';
return {
type: 'save',
title: sanitizeTitle(title),
content: content,
tags: tagsRaw.split(',').map(t => t.trim()).filter(Boolean),
linkedConcepts: extractWikiLinks(content),
rawMessage: message
};
}
// Check for /autoresearch command
// Format: /autoresearch topic or question
const researchMatch = trimmed.match(COMMAND_PATTERNS.autoresearch);
if (researchMatch) {
return {
type: 'autoresearch',
researchQuery: researchMatch[1].trim(),
rawMessage: message
};
}
// No command detected - return raw message for normal processing
return {
type: 'none',
rawMessage: message
};
}
// Sanitize title for use as filename
function sanitizeTitle(title: string): string {
return title
.replace(/[<>:"/\\|?*]/g, '-') // Replace characters that are invalid in filenames with '-'
.replace(/\s+/g, ' ') // Normalize whitespace
.trim()
.slice(0, 100); // Limit length
}
// Extract [[wiki links]] from content
function extractWikiLinks(content: string): string[] {
const linkPattern = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;
const links: string[] = [];
let match;
while ((match = linkPattern.exec(content)) !== null) {
links.push(match[1].trim());
}
return [...new Set(links)]; // Deduplicate
}
// Extract #tags from content
function extractInlineTags(content: string): string[] {
const tagPattern = /#([a-zA-Z][a-zA-Z0-9_-]*)/g;
const tags: string[] = [];
let match;
while ((match = tagPattern.exec(content)) !== null) {
tags.push(match[1]);
}
return [...new Set(tags)];
}
Now create the command router that orchestrates execution:
// src/commands/router.ts
import { ParsedCommand, parseCommand } from './parser';
import { createWikiEntry, updateWikiEntry } from '../sync/engine';
import { queueResearch } from '../research/pipeline';
import Anthropic from '@anthropic-ai/sdk';
const client = new Anthropic();
export interface CommandResult {
success: boolean;
message: string;
createdFiles?: string[];
updatedFiles?: string[];
queuedResearch?: string[];
}
export async function routeCommand(
userMessage: string,
conversationContext: string[]
): Promise<CommandResult> {
const parsed = parseCommand(userMessage);
switch (parsed.type) {
case 'wiki':
return handleWikiCommand(parsed, conversationContext);
case 'save':
return handleSaveCommand(parsed);
case 'autoresearch':
return handleAutoresearchCommand(parsed);
case 'none':
// Pass through to normal Claude conversation
return { success: true, message: 'No command detected' };
}
}
async function handleWikiCommand(
cmd: ParsedCommand,
context: string[]
): Promise<CommandResult> {
// Generate structured wiki content using Claude
const generationPrompt = buildWikiGenerationPrompt(cmd, context);
const response = await client.messages.create({
model: 'claude-sonnet-4-20250514',
max_tokens: 2000,
system: `You are a technical documentation expert. Generate structured wiki entries
in markdown format. Include: a clear definition, key concepts, practical examples,
and relevant links to related topics using [[wiki link]] syntax.`,
messages: [{ role: 'user', content: generationPrompt }]
});
const generatedContent = response.content[0].type === 'text'
? response.content[0].text
: '';
// Create the wiki entry file
const filePath = await createWikiEntry({
title: cmd.title!,
content: generatedContent,
tags: cmd.tags || [],
linkedConcepts: extractAllLinks(generatedContent),
source: 'claude-conversation',
createdAt: new Date().toISOString()
});
return {
success: true,
message: `Created wiki entry: ${cmd.title}`,
createdFiles: [filePath]
};
}
async function handleSaveCommand(cmd: ParsedCommand): Promise<CommandResult> {
// Direct save without AI generation - just format and store
const filePath = await createWikiEntry({
title: cmd.title!,
content: cmd.content || '',
tags: cmd.tags || [],
linkedConcepts: cmd.linkedConcepts || [],
source: 'manual-save',
createdAt: new Date().toISOString()
});
return {
success: true,
message: `Saved note: ${cmd.title}`,
createdFiles: [filePath]
};
}
async function handleAutoresearchCommand(cmd: ParsedCommand): Promise<CommandResult> {
// Queue the research topic for async processing
const queueId = await queueResearch(cmd.researchQuery!);
return {
success: true,
message: `Queued research on: ${cmd.researchQuery}`,
queuedResearch: [queueId]
};
}
function buildWikiGenerationPrompt(cmd: ParsedCommand, context: string[]): string {
const recentContext = context.slice(-5).join('\n---\n');
return `Create a wiki entry for: "${cmd.title}"
${cmd.content ? `Additional context from user: ${cmd.content}` : ''}
Recent conversation context:
${recentContext}
Generate a comprehensive wiki entry that:
1. Starts with a one-sentence definition
2. Explains core concepts with examples
3. Includes code snippets where relevant
4. Links to related concepts using [[concept name]] syntax
5. Adds practical tips or warnings where appropriate`;
}
function extractAllLinks(content: string): string[] {
const linkPattern = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;
const links: string[] = [];
let match;
while ((match = linkPattern.exec(content)) !== null) {
links.push(match[1].trim());
}
return [...new Set(links)];
}
📝 The command parser uses permissive regex patterns intentionally. Users can omit quotes around single-word titles and tags are optional. This reduces friction during rapid knowledge capture.
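To see what "permissive" means in practice, the sketch below runs the two title-capturing patterns against a few inputs. The regular expressions are the ones from COMMAND_PATTERNS, built with the RegExp constructor so the dotAll flag does not depend on the TypeScript compile target. The helpers `captureWiki` and `captureSave` are made up for this illustration and skip the sanitizing and link extraction that parseCommand performs.

```typescript
// Same expressions as COMMAND_PATTERNS.wiki and COMMAND_PATTERNS.save
const wikiPattern = new RegExp('^/wiki\\s+(?:"([^"]+)"|(\\S+))(?:\\s+(.*))?$', 'is');
const savePattern = new RegExp(
  '^/save\\s+(?:"([^"]+)"|(\\S+))(?:\\s+\\[([^\\]]+)\\])?(?:\\s+(.*))?$',
  'is'
);

// Hypothetical result shape for this sketch
interface Capture {
  title: string;
  tags: string[];
  content: string;
}

function captureWiki(message: string): Capture | null {
  const m = message.trim().match(wikiPattern);
  if (!m) return null;
  // Group 1 is a quoted title, group 2 an unquoted single word
  return { title: m[1] || m[2], tags: [], content: m[3] || '' };
}

function captureSave(message: string): Capture | null {
  const m = message.trim().match(savePattern);
  if (!m) return null;
  const tags = (m[3] || '').split(',').map(t => t.trim()).filter(Boolean);
  return { title: m[1] || m[2], tags, content: m[4] || '' };
}

console.log(captureWiki('/wiki "Event Loop" how Node schedules callbacks'));
// { title: 'Event Loop', tags: [], content: 'how Node schedules callbacks' }
console.log(captureWiki('/wiki closures'));
// { title: 'closures', tags: [], content: '' }
console.log(captureWiki('/wiki event loop basics'));
// { title: 'event', tags: [], content: 'loop basics' }
console.log(captureSave('/save "Retry tips" [node, http] use backoff with jitter'));
// { title: 'Retry tips', tags: [ 'node', 'http' ], content: 'use backoff with jitter' }
```

Note the third call: an unquoted multi-word title is split after the first word, so quotes are optional only for single-word titles.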
Implementing the Bidirectional Sync Engine
The sync engine is the critical component that maintains consistency between your Obsidian vault and Claude’s conversation context. It watches for file changes, calculates diffs, and resolves conflicts when both systems modify the same entry:
// src/sync/engine.ts
import * as fs from 'fs';
import * as path from 'path';
import * as chokidar from 'chokidar';
import matter from 'gray-matter';
import { settings } from '../../config/settings';
// In-memory representation of a wiki entry
export interface WikiEntry {
title: string;
content: string;
tags: string[];
linkedConcepts: string[];
source: 'claude-conversation' | 'manual-save' | 'obsidian-edit' | 'auto-research';
createdAt: string;
updatedAt?: string;
checksum?: string;
}
// Metadata index for fast lookups without reading all files
interface MetadataIndex {
entries: Record<string, {
path: string;
checksum: string;
tags: string[];
linkedConcepts: string[];
lastModified: string;
}>;
lastSync: string;
}
let metadataIndex: MetadataIndex = { entries: {}, lastSync: '' };
let fileWatcher: chokidar.FSWatcher | null = null;
// Initialize sync engine and load existing metadata
export async function initializeSyncEngine(): Promise<void> {
const wikiPath = getWikiPath();
const metadataPath = getMetadataPath();
// Load existing metadata index if present
if (fs.existsSync(metadataPath)) {
const raw = fs.readFileSync(metadataPath, 'utf-8');
metadataIndex = JSON.parse(raw);
console.log(`Loaded metadata index with ${Object.keys(metadataIndex.entries).length} entries`);
}
// Scan vault for any untracked files
await reconcileVaultState();
// Start file watcher for real-time sync
startFileWatcher();
}
// Create a new wiki entry file
export async function createWikiEntry(entry: WikiEntry): Promise<string> {
const filename = `${entry.title}.md`;
const filePath = path.join(getWikiPath(), filename);
// Build frontmatter
const frontmatter = {
title: entry.title,
tags: entry.tags,
created: entry.createdAt,
source: entry.source,
links: entry.linkedConcepts
};
// Combine frontmatter with content
const fileContent = matter.stringify(entry.content, frontmatter);
const checksum = computeChecksum(fileContent);
// Write file
fs.writeFileSync(filePath, fileContent, 'utf-8');
// Update metadata index
metadataIndex.entries[entry.title] = {
path: filePath,
checksum: checksum,
tags: entry.tags,
linkedConcepts: entry.linkedConcepts,
lastModified: new Date().toISOString()
};
await persistMetadataIndex();
console.log(`Created wiki entry: ${filename}`);
return filePath;
}
// Update existing wiki entry with conflict detection
export async function updateWikiEntry(
title: string,
updates: Partial<WikiEntry>,
forceOverwrite: boolean = false
): Promise<{ success: boolean; conflict?: boolean; merged?: string }> {
const existing = metadataIndex.entries[title];
if (!existing) {
// Entry doesn't exist - create it instead
await createWikiEntry({
title,
content: updates.content || '',
tags: updates.tags || [],
linkedConcepts: updates.linkedConcepts || [],
source: updates.source || 'claude-conversation',
createdAt: new Date().toISOString()
});
return { success: true };
}
// Read current file content
const currentContent = fs.readFileSync(existing.path, 'utf-8');
const currentChecksum = computeChecksum(currentContent);
// Detect if file was modified externally
if (currentChecksum !== existing.checksum && !forceOverwrite) {
// Conflict detected - attempt an append-style merge (see attemptMerge below)
const mergeResult = attemptMerge(existing.path, updates.content || '');
if (mergeResult.success) {
fs.writeFileSync(existing.path, mergeResult.merged!, 'utf-8');
existing.checksum = computeChecksum(mergeResult.merged!);
await persistMetadataIndex();
return { success: true, merged: mergeResult.merged };
}
return { success: false, conflict: true };
}
// No conflict - apply updates directly
const parsed = matter(currentContent);
if (updates.content) {
parsed.content = updates.content;
}
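if (updates.tags) {
// Notes created in Obsidian may have no tags in their frontmatter
parsed.data.tags = [...new Set([...(parsed.data.tags || []), ...updates.tags])];
}
if (updates.linkedConcepts) {
// Same guard for a missing links field
parsed.data.links = [...new Set([...(parsed.data.links || []), ...updates.linkedConcepts])];
}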
parsed.data.updated = new Date().toISOString();
const newContent = matter.stringify(parsed.content, parsed.data);
fs.writeFileSync(existing.path, newContent, 'utf-8');
existing.checksum = computeChecksum(newContent);
existing.lastModified = parsed.data.updated;
await persistMetadataIndex();
return { success: true };
}
// Load entries for context injection into Claude conversations
export function loadContextEntries(
relevantTitles: string[],
maxTokens: number = settings.maxContextTokens
): string {
const entries: string[] = [];
let estimatedTokens = 0;
for (const title of relevantTitles) {
const meta = metadataIndex.entries[title];
if (!meta) continue;
const content = fs.readFileSync(meta.path, 'utf-8');
const parsed = matter(content);
// Rough token estimation (4 chars ≈ 1 token)
const entryTokens = Math.ceil(parsed.content.length / 4);
if (estimatedTokens + entryTokens > maxTokens) {
break;
}
entries.push(`## ${title}\n${parsed.content}`);
estimatedTokens += entryTokens;
}
return entries.join('\n\n---\n\n');
}
// Start watching for file changes
function startFileWatcher(): void {
const wikiPath = getWikiPath();
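// Watch the folder itself and filter to top-level .md files. chokidar 4 and later
// no longer accept glob patterns such as `${wikiPath}/*.md`.
fileWatcher = chokidar.watch(wikiPath, {
persistent: true,
ignoreInitial: true,
depth: 0,
ignored: (watchedPath: string, stats?: fs.Stats) =>
!!stats?.isFile() && !watchedPath.endsWith('.md'),
awaitWriteFinish: {
stabilityThreshold: 500,
pollInterval: 100
}
});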
fileWatcher.on('change', handleFileChange);
fileWatcher.on('add', handleFileAdd);
fileWatcher.on('unlink', handleFileDelete);
console.log(`Watching for changes in: ${wikiPath}`);
}
async function handleFileChange(filePath: string): Promise<void> {
const filename = path.basename(filePath, '.md');
const content = fs.readFileSync(filePath, 'utf-8');
const checksum = computeChecksum(content);
const existing = metadataIndex.entries[filename];
if (existing && existing.checksum !== checksum) {
// File was modified in Obsidian
const parsed = matter(content);
existing.checksum = checksum;
existing.tags = parsed.data.tags || [];
existing.linkedConcepts = parsed.data.links || [];
existing.lastModified = new Date().toISOString();
await persistMetadataIndex();
console.log(`Synced external changes: ${filename}`);
}
}
async function handleFileAdd(filePath: string): Promise<void> {
const filename = path.basename(filePath, '.md');
const content = fs.readFileSync(filePath, 'utf-8');
const parsed = matter(content);
metadataIndex.entries[filename] = {
path: filePath,
checksum: computeChecksum(content),
tags: parsed.data.tags || [],
linkedConcepts: parsed.data.links || [],
lastModified: new Date().toISOString()
};
await persistMetadataIndex();
console.log(`Indexed new file: ${filename}`);
}
async function handleFileDelete(filePath: string): Promise<void> {
const filename = path.basename(filePath, '.md');
delete metadataIndex.entries[filename];
await persistMetadataIndex();
console.log(`Removed from index: ${filename}`);
}
// Scan vault and reconcile with metadata index
async function reconcileVaultState(): Promise<void> {
const wikiPath = getWikiPath();
const files = fs.readdirSync(wikiPath).filter(f => f.endsWith('.md'));
for (const file of files) {
const filePath = path.join(wikiPath, file);
const title = path.basename(file, '.md');
if (!metadataIndex.entries[title]) {
await handleFileAdd(filePath);
}
}
// Remove entries for deleted files
for (const title of Object.keys(metadataIndex.entries)) {
const entry = metadataIndex.entries[title];
if (!fs.existsSync(entry.path)) {
delete metadataIndex.entries[title];
}
}
await persistMetadataIndex();
}
// Simple checksum for change detection
function computeChecksum(content: string): string {
let hash = 0;
for (let i = 0; i < content.length; i++) {
const char = content.charCodeAt(i);
hash = ((hash << 5) - hash) + char;
hash = hash & hash;
}
return hash.toString(16);
}
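// Basic merge attempt. This is append-only, not a true three-way merge,
// which would need the last synced version of the file as a common base.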
function attemptMerge(
filePath: string,
newContent: string
): { success: boolean; merged?: string } {
// For MVP, we use a simple strategy: append new content as a section
const current = fs.readFileSync(filePath, 'utf-8');
const parsed = matter(current);
// Skip the merge if the first 100 characters of the new content are already in the file
if (parsed.content.includes(newContent.slice(0, 100))) {
// New content already exists - skip
return { success: true, merged: current };
}
// Append new content with separator
const mergedContent = `${parsed.content}\n\n---\n*Added from Claude conversation (${new Date().toISOString()})*\n\n${newContent}`;
return {
success: true,
merged: matter.stringify(mergedContent, parsed.data)
};
}
async function persistMetadataIndex(): Promise<void> {
metadataIndex.lastSync = new Date().toISOString();
const metadataPath = getMetadataPath();
fs.writeFileSync(metadataPath, JSON.stringify(metadataIndex, null, 2), 'utf-8');
}
function getWikiPath(): string {
return path.join(settings.vaultPath, settings.wikiFolder);
}
function getMetadataPath(): string {
return path.join(getWikiPath(), settings.metadataFile);
}
// Export for testing
export { metadataIndex, reconcileVaultState };
⚠️ The file watcher uses awaitWriteFinish to prevent reading partially written files. If you experience issues with large files, increase the stabilityThreshold value.
💡 The metadata index file (.wiki-metadata.json) is prefixed with a dot to hide it in most file explorers. Obsidian will ignore it by default, keeping your vault clean.
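One caveat about change detection: computeChecksum is a 32-bit rolling hash (equivalent to multiplying by 31 and adding each character code), so distinct contents can share a checksum, and an edit that happens to collide would not be noticed by handleFileChange. The sketch below reproduces that hash under the hypothetical name `rollingChecksum`, shows a well-known two-character collision, and offers SHA-256 from Node's built-in crypto module as one possible replacement. Swapping it in is a suggestion, not something the engine above does.

```typescript
import { createHash } from 'crypto';

// The 32-bit rolling hash used by computeChecksum in src/sync/engine.ts
function rollingChecksum(content: string): string {
  let hash = 0;
  for (let i = 0; i < content.length; i++) {
    hash = ((hash << 5) - hash) + content.charCodeAt(i);
    hash = hash & hash; // force a 32-bit integer
  }
  return hash.toString(16);
}

// Possible alternative: SHA-256 via Node's built-in crypto module
function sha256Checksum(content: string): string {
  return createHash('sha256').update(content, 'utf8').digest('hex');
}

console.log(rollingChecksum('Aa'), rollingChecksum('BB')); // 840 840
console.log(sha256Checksum('Aa') === sha256Checksum('BB')); // false
```

Hashing a markdown note with SHA-256 is cheap compared with reading it from disk, so the stronger hash costs little here.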
Production Configuration
A robust production setup requires proper configuration management. Create a dedicated config file that handles environment-specific settings:
// config/wiki-config.ts
import { z } from 'zod';
import * as fs from 'fs';
import * as path from 'path';
// Schema validation ensures config integrity
const ConfigSchema = z.object({
vault: z.object({
path: z.string().min(1),
wikiFolder: z.string().default('wiki'),
metadataFile: z.string().default('.wiki-metadata.json'),
}),
claude: z.object({
apiKey: z.string().min(1),
model: z.string().default('claude-sonnet-4-20250514'),
maxTokens: z.number().default(4096),
temperature: z.number().min(0).max(1).default(0.3),
}),
processing: z.object({
batchSize: z.number().default(10),
rateLimitMs: z.number().default(1000),
maxRetries: z.number().default(3),
concurrentRequests: z.number().default(2),
}),
graph: z.object({
minSimilarityScore: z.number().default(0.7),
maxLinksPerNote: z.number().default(10),
enableBacklinks: z.boolean().default(true),
}),
});
type WikiConfig = z.infer<typeof ConfigSchema>;
function loadConfig(): WikiConfig {
const configPath = process.env.WIKI_CONFIG_PATH || './wiki-config.json';
if (!fs.existsSync(configPath)) {
throw new Error(`Config file not found: ${configPath}`);
}
const rawConfig = JSON.parse(fs.readFileSync(configPath, 'utf-8'));
// Override with environment variables
const merged = {
...rawConfig,
claude: {
...rawConfig.claude,
apiKey: process.env.ANTHROPIC_API_KEY || rawConfig.claude?.apiKey,
},
};
return ConfigSchema.parse(merged);
}
export const config = loadConfig();
Create the corresponding JSON configuration file:
{
"vault": {
"path": "/Users/yourname/Documents/Obsidian/KnowledgeBase",
"wikiFolder": "wiki",
"metadataFile": ".wiki-metadata.json"
},
"claude": {
"model": "claude-sonnet-4-20250514",
"maxTokens": 4096,
"temperature": 0.3
},
"processing": {
"batchSize": 10,
"rateLimitMs": 1000,
"maxRetries": 3,
"concurrentRequests": 2
},
"graph": {
"minSimilarityScore": 0.7,
"maxLinksPerNote": 10,
"enableBacklinks": true
}
}
⚠️ Never commit your API key to version control. Use environment variables or a .env file with dotenv. Add wiki-config.json to .gitignore if it contains sensitive data.
For systemd deployment on Linux servers, create a service file:
# /etc/systemd/system/knowledge-wiki.service
[Unit]
Description=Personal Knowledge Wiki Service
After=network.target
[Service]
Type=simple
User=wiki
WorkingDirectory=/opt/knowledge-wiki
ExecStart=/usr/bin/node dist/index.js
Restart=on-failure
RestartSec=10
Environment=NODE_ENV=production
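# Keep the API key out of the unit file, which is usually world-readable.
# The path below is an example; the file should contain ANTHROPIC_API_KEY=...
# and be readable only by root.
EnvironmentFile=/etc/knowledge-wiki/env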
Environment=WIKI_CONFIG_PATH=/opt/knowledge-wiki/config.json
# Resource limits
MemoryMax=512M
CPUQuota=50%
[Install]
WantedBy=multi-user.target
The following diagram illustrates the production deployment architecture:
flowchart TD
subgraph Client["Client Layer"]
OBS[Obsidian App]
SYNC[Sync Service]
end
subgraph Server["Server Layer"]
SYSTEMD[systemd]
WIKI[Wiki Service]
WATCHER[File Watcher]
end
subgraph Storage["Storage Layer"]
VAULT[(Obsidian Vault)]
META[(Metadata Index)]
LOGS[(Log Files)]
end
subgraph External["External Services"]
CLAUDE[Claude API]
end
OBS -->|edit files| SYNC
SYNC -->|sync| VAULT
SYSTEMD -->|manages| WIKI
WIKI --> WATCHER
WATCHER -->|monitors| VAULT
WIKI -->|read/write| META
WIKI -->|structured logging| LOGS
WIKI -->|API calls| CLAUDE
CLAUDE -->|generated content & analysis| WIKI
Common Mistakes and Troubleshooting
Problem: Duplicate links appearing in notes
This happens when the link injection runs before the metadata index updates. Implement idempotent link insertion:
// utils/link-injector.ts
export function injectLinks(
content: string,
newLinks: string[],
existingLinks: Set<string>
): string {
// Extract current wiki links from content
const linkPattern = /\[\[([^\]|]+)(?:\|[^\]]+)?\]\]/g;
const currentLinks = new Set<string>();
let match;
while ((match = linkPattern.exec(content)) !== null) {
currentLinks.add(match[1].toLowerCase());
}
// Filter out links that already exist
const linksToAdd = newLinks.filter(
link => !currentLinks.has(link.toLowerCase()) &&
!existingLinks.has(link.toLowerCase())
);
if (linksToAdd.length === 0) {
return content; // No changes needed
}
// Find or create the "Related" section
const relatedSection = '\n\n## Related\n';
const formattedLinks = linksToAdd.map(l => `- [[${l}]]`).join('\n');
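if (/^## Related[ \t]*$/m.test(content)) {
// Insert directly under the existing heading. Matching the whole heading line
// also works when the heading is the last line of the file and avoids
// false positives such as "### Related Work".
return content.replace(
/^(## Related[ \t]*)$/m,
`$1\n${formattedLinks}`
);
}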
// Add new section at the end
return content.trimEnd() + relatedSection + formattedLinks + '\n';
}
Problem: API rate limiting causing failures
Implement exponential backoff with jitter:
// utils/retry.ts
export async function withRetry<T>(
fn: () => Promise<T>,
options: {
maxRetries: number;
baseDelayMs: number;
maxDelayMs: number;
}
): Promise<T> {
let lastError: Error;
for (let attempt = 0; attempt <= options.maxRetries; attempt++) {
try {
return await fn();
} catch (error) {
lastError = error as Error;
if (attempt === options.maxRetries) {
break;
}
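// Check if error is retryable (the Anthropic SDK sets status 429 on rate-limit errors)
const status = (error as { status?: number }).status;
if (status === 429 || (error instanceof Error && error.message.includes('rate_limit'))) {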
const delay = Math.min(
options.baseDelayMs * Math.pow(2, attempt),
options.maxDelayMs
);
// Add jitter to prevent thundering herd
const jitter = Math.random() * 0.3 * delay;
console.log(`Rate limited. Retrying in ${delay + jitter}ms...`);
await sleep(delay + jitter);
} else {
throw error; // Non-retryable error
}
}
}
throw lastError!;
}
function sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
💡 Rate limits vary by API usage tier, so treat a concurrentRequests value of 2 as a conservative starting point and raise it only if you are not seeing 429 responses. Once the API starts throttling, extra concurrency yields diminishing returns.
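The delay schedule produced by withRetry can be checked in isolation. This sketch pulls the two formulas out into hypothetical helpers, `backoffDelay` and `withJitter`; the option values (1000 ms base, 8000 ms cap) are example numbers, not defaults from the configuration.

```typescript
// Capped exponential delay, the same formula withRetry uses
function backoffDelay(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(baseDelayMs * Math.pow(2, attempt), maxDelayMs);
}

// Adds up to 30% jitter; `random` is injectable so the result can be tested
function withJitter(delayMs: number, random: () => number = Math.random): number {
  return delayMs + random() * 0.3 * delayMs;
}

const schedule = [0, 1, 2, 3, 4].map(attempt => backoffDelay(attempt, 1000, 8000));
console.log(schedule); // [ 1000, 2000, 4000, 8000, 8000 ]

// Upper bound on a single jittered delay: random() approaching 1
console.log(withJitter(4000, () => 1)); // 5200
```

With maxRetries set to 3, withRetry sleeps after attempts 0, 1, and 2 only, so these example values give a base wait of 7 seconds and at most about 9.1 seconds with full jitter.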
Problem: File watcher triggers infinite loops
When your service writes to watched files, it can trigger its own watcher. Use a write lock mechanism:
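// State tracking for write operations
import * as fs from 'fs';
import * as path from 'path';

const pendingWrites = new Set<string>();

// The lock must outlive the watcher's awaitWriteFinish delay (stabilityThreshold 500 ms
// plus pollInterval 100 ms in the sync engine). A shorter hold is released before
// the delayed 'change' event arrives.
const LOCK_HOLD_MS = 750;

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms));
}

async function safeWriteFile(filePath: string, content: string): Promise<void> {
  const normalizedPath = path.normalize(filePath);
  pendingWrites.add(normalizedPath);
  try {
    await fs.promises.writeFile(filePath, content, 'utf-8');
    // Keep the lock until the watcher has reported this write
    await sleep(LOCK_HOLD_MS);
  } finally {
    pendingWrites.delete(normalizedPath);
  }
}

// In your watcher handler (watcher is the chokidar FSWatcher created in startFileWatcher)
watcher.on('change', async (filePath: string) => {
  if (pendingWrites.has(path.normalize(filePath))) {
    return; // Ignore our own writes
  }
  // Process the change...
});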
📝 Common symptoms of infinite loops include high CPU usage, rapidly growing log files, and Obsidian becoming unresponsive. Check your logs for repeated processing of the same file.
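A timing-free alternative to a write lock is to remember what the service last wrote and compare content when the watcher fires, which is close to what handleFileChange already does with the metadata index. The sketch below is a minimal version of that idea; `recordWrite`, `isExternalChange`, and the stand-in `checksum` function are hypothetical names for illustration.

```typescript
// Stand-in checksum for the sketch; any stable content hash would do
function checksum(content: string): string {
  let hash = 0;
  for (let i = 0; i < content.length; i++) {
    hash = (hash * 31 + content.charCodeAt(i)) | 0;
  }
  return hash.toString(16);
}

// Checksum of the content this service last wrote, keyed by file path
const lastWritten = new Map<string, string>();

// Call whenever the service writes a file
function recordWrite(filePath: string, content: string): void {
  lastWritten.set(filePath, checksum(content));
}

// Call from the watcher's change handler with the content read from disk
function isExternalChange(filePath: string, contentOnDisk: string): boolean {
  return lastWritten.get(filePath) !== checksum(contentOnDisk);
}

recordWrite('wiki/Closures.md', 'Closures v1');
console.log(isExternalChange('wiki/Closures.md', 'Closures v1')); // false: our own write
console.log(isExternalChange('wiki/Closures.md', 'Closures v2')); // true: edited elsewhere
console.log(isExternalChange('wiki/New.md', 'created in Obsidian')); // true: never written by us
```

Because the decision depends on content rather than on when the event arrives, it is unaffected by watcher debounce settings.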
For vaults exceeding 1,000 notes, batch processing becomes essential. The following implementation processes notes in parallel while respecting rate limits:
// services/batch-processor.ts
import pLimit from 'p-limit';
import { config } from '../config/wiki-config';
interface ProcessingResult {
path: string;
success: boolean;
error?: string;
processingTimeMs: number;
}
export class BatchProcessor {
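// ReturnType avoids depending on the exported type name, which differs between p-limit versions
private limit: ReturnType<typeof pLimit>;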
private results: ProcessingResult[] = [];
constructor() {
this.limit = pLimit(config.processing.concurrentRequests);
}
async processNotes(notePaths: string[]): Promise<ProcessingResult[]> {
const batches = this.chunkArray(notePaths, config.processing.batchSize);
console.log(`Processing ${notePaths.length} notes in ${batches.length} batches`);
for (let i = 0; i < batches.length; i++) {
const batch = batches[i];
console.log(`Batch ${i + 1}/${batches.length}: ${batch.length} notes`);
const batchResults = await Promise.all(
batch.map(notePath =>
this.limit(() => this.processNote(notePath))
)
);
this.results.push(...batchResults);
// Rate limiting between batches
if (i < batches.length - 1) {
await this.sleep(config.processing.rateLimitMs);
}
}
return this.results;
}
private async processNote(notePath: string): Promise<ProcessingResult> {
const startTime = Date.now();
try {
// Your note processing logic here
await this.analyzeAndLinkNote(notePath);
return {
path: notePath,
success: true,
processingTimeMs: Date.now() - startTime,
};
} catch (error) {
return {
path: notePath,
success: false,
error: (error as Error).message,
processingTimeMs: Date.now() - startTime,
};
}
}
private chunkArray<T>(array: T[], size: number): T[][] {
const chunks: T[][] = [];
for (let i = 0; i < array.length; i += size) {
chunks.push(array.slice(i, i + size));
}
return chunks;
}
private sleep(ms: number): Promise<void> {
return new Promise(resolve => setTimeout(resolve, ms));
}
private async analyzeAndLinkNote(notePath: string): Promise<void> {
// Implementation depends on your specific requirements
}
}
|
For very large vaults, consider implementing incremental processing with checkpoints:
```yaml
# checkpoint.yml - tracks processing state
lastRun: "2024-01-15T10:30:00Z"
processedFiles:
  - path: "wiki/machine-learning.md"
    hash: "a1b2c3d4"
    timestamp: "2024-01-15T10:25:00Z"
  - path: "wiki/neural-networks.md"
    hash: "e5f6g7h8"
    timestamp: "2024-01-15T10:28:00Z"
pendingFiles:
  - "wiki/transformers.md"
  - "wiki/attention-mechanisms.md"
failedFiles:
  - path: "wiki/corrupted-note.md"
    error: "Invalid frontmatter syntax"
    attempts: 3
```
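To show how a checkpoint like this could drive an incremental run, here is a minimal sketch. It assumes the YAML has already been parsed into an object and that current content hashes are computed elsewhere. The `Checkpoint` shape mirrors the file above, while `selectFilesToProcess` and the `maxAttempts` cap are hypothetical names introduced for illustration.

```typescript
// Shapes mirror checkpoint.yml; the names are illustrative assumptions.
interface ProcessedFile { path: string; hash: string; timestamp: string; }
interface FailedFile { path: string; error: string; attempts: number; }
interface Checkpoint {
  lastRun: string;
  processedFiles: ProcessedFile[];
  pendingFiles: string[];
  failedFiles: FailedFile[];
}

// Returns the paths that still need processing: new files, files whose
// hash changed since the checkpoint, and failed files below the retry cap.
function selectFilesToProcess(
  checkpoint: Checkpoint,
  currentHashes: Record<string, string>, // path -> hash of current content
  maxAttempts = 3
): string[] {
  const processed: Record<string, string> = {};
  for (const file of checkpoint.processedFiles) {
    processed[file.path] = file.hash;
  }

  const exhausted: Record<string, boolean> = {};
  for (const file of checkpoint.failedFiles) {
    if (file.attempts >= maxAttempts) exhausted[file.path] = true;
  }

  return Object.keys(currentHashes).filter(path => {
    if (exhausted[path]) return false;                           // Gave up on this file
    if (processed[path] === currentHashes[path]) return false;   // Unchanged since last run
    return true;
  });
}
```

Files that are skipped because they hit the retry cap stay listed in `failedFiles`, so they remain visible for manual inspection.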
Memory optimization matters when building large graphs. Use streaming for file operations:
```typescript
// Efficient file reading for large vaults
import * as readline from 'readline';
import * as fs from 'fs';
import { parse as parseYaml } from 'yaml'; // requires `npm install yaml`

async function extractFrontmatterStream(
  filePath: string
): Promise<Record<string, unknown> | null> {
  const fileStream = fs.createReadStream(filePath, { encoding: 'utf-8' });
  const rl = readline.createInterface({
    input: fileStream,
    crlfDelay: Infinity,
  });

  const frontmatterLines: string[] = [];
  let lineNumber = 0;
  let closed = false;

  try {
    for await (const line of rl) {
      lineNumber++;
      if (lineNumber === 1) {
        // Frontmatter must open on the first line. Without this check, a
        // later '---' horizontal rule would be mistaken for frontmatter.
        if (line.trim() !== '---') return null;
        continue;
      }
      if (line.trim() === '---') {
        closed = true;
        break; // End of frontmatter
      }
      frontmatterLines.push(line);
    }
  } finally {
    rl.close();
    fileStream.destroy(); // Stop reading the rest of the file
  }

  if (!closed || frontmatterLines.length === 0) {
    return null;
  }

  // Parse YAML frontmatter
  return parseYaml(frontmatterLines.join('\n'));
}
```
📝 Processing 10,000 notes with the batch processor takes approximately 3-4 hours with conservative rate limiting. Run initial indexing overnight or during low-usage periods.
Conclusion and Next Steps
You now have a working personal knowledge graph that implements Karpathy’s LLM wiki pattern. The system watches your Obsidian vault, uses Claude to analyze content semantics, and automatically maintains relationships between notes.
Key capabilities you’ve built:
- Real-time file monitoring with conflict-free updates
- Semantic analysis via Claude for intelligent linking
- Persistent metadata index for instant lookups
- Production-ready configuration with validation
- Batch processing for large vault initialization
To extend this foundation, consider these next steps:
- Add vector embeddings: Generate embeddings with a dedicated embedding model (the Claude API does not expose an embeddings endpoint) and store them in a local vector database like Chroma or LanceDB for similarity search
- Build a query interface: Create a command-line tool or Obsidian plugin to ask questions across your knowledge base
- Implement spaced repetition: Use the graph structure to surface notes for periodic review
- Add concept extraction: Have Claude identify key concepts and automatically create hub notes
The modular architecture makes these additions straightforward. Start with embeddings—they enable powerful semantic search that transforms how you navigate your knowledge.
Common Mistakes and Troubleshooting
1. Context Window Overflow
The most common mistake when building a knowledge graph with Claude is exceeding the context window with too many linked notes.
```typescript
// ❌ BAD: Loading all linked notes without limit
async function getContext(noteId: string): Promise<string> {
  const note = await getNote(noteId);
  const links = extractLinks(note.content);
  // This can explode exponentially
  const linkedContent = await Promise.all(
    links.map(link => getContext(link)) // Recursive without depth limit!
  );
  return note.content + linkedContent.join('\n');
}

// ✅ GOOD: Controlled depth and token budget
// (getNote, extractLinks, prioritizeLinks, and truncateToTokens are assumed
// to be defined elsewhere in your project)
interface ContextOptions {
  maxDepth: number;
  maxTokens: number;
  priorityTags?: string[];
}

async function getContextWithLimits(
  noteId: string,
  options: ContextOptions,
  currentDepth = 0,
  tokenCount = 0
): Promise<{ content: string; tokens: number }> {
  if (currentDepth >= options.maxDepth || tokenCount >= options.maxTokens) {
    return { content: '', tokens: tokenCount };
  }

  const note = await getNote(noteId);
  const noteTokens = estimateTokens(note.content);

  // Check if adding this note exceeds budget
  if (tokenCount + noteTokens > options.maxTokens) {
    // Return truncated version
    const availableTokens = options.maxTokens - tokenCount;
    return {
      content: truncateToTokens(note.content, availableTokens),
      tokens: options.maxTokens,
    };
  }

  let result = note.content;
  let newTokenCount = tokenCount + noteTokens;

  // Only follow links if we have budget remaining
  const links = extractLinks(note.content);
  const prioritizedLinks = prioritizeLinks(links, options.priorityTags);

  for (const link of prioritizedLinks) {
    if (newTokenCount >= options.maxTokens) break;
    const linked = await getContextWithLimits(
      link,
      options,
      currentDepth + 1,
      newTokenCount
    );
    // Skip the header when the depth or token limit produced no content
    if (linked.content) {
      result += `\n\n---\nLinked: ${link}\n${linked.content}`;
    }
    newTokenCount = linked.tokens;
  }

  return { content: result, tokens: newTokenCount };
}

function estimateTokens(text: string): number {
  // Rough estimation: ~4 characters per token for English
  return Math.ceil(text.length / 4);
}
```
⚠️ Warning: Claude’s context window is large but not infinite. A 200k token limit sounds like a lot until you try to load 50 interconnected notes with code blocks and embedded images.
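The example above calls `truncateToTokens` without defining it. One possible sketch, assuming the same rough 4-characters-per-token heuristic as `estimateTokens`, is shown here. The preference for cutting at a paragraph break is an illustrative design choice, not something the system requires, and a real tokenizer would give more accurate budgets.

```typescript
// Same heuristic as estimateTokens: roughly 4 characters per token.
const CHARS_PER_TOKEN = 4;

// Hypothetical helper: trims text to fit a token budget, preferring to cut
// at the last paragraph break so the excerpt does not end mid-paragraph.
function truncateToTokens(text: string, maxTokens: number): string {
  if (maxTokens <= 0) return '';
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  if (text.length <= maxChars) return text;

  const slice = text.slice(0, maxChars);
  const lastBreak = slice.lastIndexOf('\n\n');
  // Only cut at the paragraph break if it keeps at least half the budget
  const cut = lastBreak >= maxChars / 2 ? lastBreak : maxChars;
  return slice.slice(0, cut).replace(/\s+$/, '');
}
```

Because the result never exceeds `maxTokens * 4` characters, it stays within the estimated budget that `getContextWithLimits` tracks.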
2. Obsidian Plugin Configuration Errors
```jsonc
// Comments are annotations only: remove them from the real data.json,
// because plain JSON does not allow comments.

// ❌ BAD: Common .obsidian/plugins/knowledge-graph-claude/data.json mistakes
{
  "apiKey": "sk-ant-...",       // Never store API keys in plugin settings!
  "maxDepth": 10,               // Too deep - will timeout
  "includeBacklinks": true,
  "includeTags": true,
  "cacheTimeout": 0             // No cache = constant API calls = expensive
}

// ✅ GOOD: Secure and performant configuration
{
  "apiKeyEnvVar": "CLAUDE_API_KEY",   // Reference env var instead
  "maxDepth": 3,                      // Reasonable depth
  "maxNotesPerQuery": 15,             // Hard limit on notes
  "includeBacklinks": true,
  "includeTags": true,
  "cacheTimeout": 3600,               // 1 hour cache
  "excludeFolders": ["templates", "daily-notes", "attachments"],
  "priorityFolders": ["projects", "concepts"]
}
```
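The `apiKeyEnvVar` setting only names an environment variable, so something has to resolve it at startup. A minimal sketch of that lookup follows, assuming the configuration has been loaded into an object. The `resolveApiKey` function and `PluginConfig` type are hypothetical names, and the environment is passed in as a plain record so the function is easy to test.

```typescript
// Only the field needed for this sketch; the real config has more settings.
interface PluginConfig {
  apiKeyEnvVar: string;
}

// Hypothetical helper: looks up the API key in the given environment and
// fails fast with a clear message when the variable is missing or blank.
function resolveApiKey(
  config: PluginConfig,
  env: Record<string, string | undefined>
): string {
  const value = env[config.apiKeyEnvVar];
  if (!value || value.trim() === '') {
    throw new Error(`Environment variable ${config.apiKeyEnvVar} is not set`);
  }
  return value.trim();
}
```

In a Node.js process you would call it as `resolveApiKey(config, process.env)` and pass the result to the client constructor.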
3. Rate Limiting and API Errors
```typescript
// Robust API client with retry logic
import Anthropic from '@anthropic-ai/sdk';

class ResilientClaudeClient {
  private client: Anthropic;
  private lastRequestTime = 0;
  private minRequestInterval = 100; // ms between requests

  constructor() {
    // With no arguments, the SDK reads the key from ANTHROPIC_API_KEY
    this.client = new Anthropic();
  }

  async query(
    messages: Anthropic.MessageParam[],
    retries = 3
  ): Promise<string> {
    for (let attempt = 1; attempt <= retries; attempt++) {
      try {
        // Rate limiting
        const now = Date.now();
        const timeSinceLastRequest = now - this.lastRequestTime;
        if (timeSinceLastRequest < this.minRequestInterval) {
          await this.sleep(this.minRequestInterval - timeSinceLastRequest);
        }
        this.lastRequestTime = Date.now();

        const response = await this.client.messages.create({
          model: 'claude-sonnet-4-20250514',
          max_tokens: 4096,
          messages,
        });

        // Return the first text block; checking `type` narrows the union
        for (const block of response.content) {
          if (block.type === 'text') return block.text;
        }
        return '';
      } catch (error) {
        // Only wait when another attempt is actually left
        if (this.isRateLimitError(error) && attempt < retries) {
          const backoff = Math.pow(2, attempt) * 1000; // Exponential backoff
          console.warn(`Rate limited. Waiting ${backoff}ms before retry ${attempt}/${retries}`);
          await this.sleep(backoff);
          continue;
        }
        if (this.isOverloadedError(error) && attempt < retries) {
          // Claude is overloaded, wait longer
          await this.sleep(5000 * attempt);
          continue;
        }
        throw error;
      }
    }
    // Only reached when retries < 1
    throw new Error('Max retries exceeded');
  }

  private isRateLimitError(error: unknown): boolean {
    return error instanceof Anthropic.RateLimitError;
  }

  private isOverloadedError(error: unknown): boolean {
    return error instanceof Anthropic.APIError && error.status === 529;
  }

  private sleep(ms: number): Promise<void> {
    return new Promise(resolve => setTimeout(resolve, ms));
  }
}
```
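The client above doubles its wait on every rate-limit retry. When several workers retry at once, a fixed schedule can make them all hit the API again at the same moment. A common variation is capped exponential backoff with "full jitter", sketched here as a standalone function. The name `backoffDelayMs`, the 30-second cap, and the injectable `random` parameter are assumptions made for illustration.

```typescript
// Capped exponential backoff with full jitter: pick a random delay in
// [0, min(capMs, baseMs * 2^attempt)). `random` is injectable for testing.
function backoffDelayMs(
  attempt: number,
  baseMs = 1000,
  capMs = 30000,
  random: () => number = Math.random
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(random() * ceiling);
}
```

To try it, the fixed `Math.pow(2, attempt) * 1000` in the rate-limit branch could be swapped for `backoffDelayMs(attempt)`.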
4. Graph Traversal Debugging
When your knowledge graph queries return unexpected results, use this diagnostic flow:
```mermaid
flowchart TD
    A[Query Returns Wrong Context] --> B{Check Note Links}
    B -->|Links Missing| C[Verify Obsidian Syntax]
    C --> D["Use [[exact-filename]] format"]
    B -->|Links Present| E{Check Depth Setting}
    E -->|Too Shallow| F[Increase maxDepth]
    E -->|Correct| G{Check Token Budget}
    G -->|Exceeded| H[Increase maxTokens or reduce depth]
    G -->|Within Budget| I{Check Priority Rules}
    I -->|Wrong Priority| J[Adjust priorityTags/priorityFolders]
    I -->|Correct| K[Enable Debug Logging]
    K --> L[Review traversal order in logs]
    L --> M{Found Issue?}
    M -->|Yes| N[Fix configuration]
    M -->|No| O[Check for circular references]
    O --> P[Add visited set to traversal]
```
💡 Tip: Add a debug mode that logs every note visited during graph traversal. This makes it trivial to understand why certain notes are or aren’t included in context.
```typescript
// Debug logging for graph traversal
interface TraversalLog {
  noteId: string;
  depth: number;
  tokensBefore: number;
  tokensAfter: number;
  included: boolean;
  reason?: string;
}

function createTraversalLogger() {
  const logs: TraversalLog[] = [];

  return {
    log(entry: TraversalLog) {
      logs.push(entry);
      if (process.env.DEBUG_TRAVERSAL) {
        console.log(
          `[Depth ${entry.depth}] ${entry.included ? '✓' : '✗'} ${entry.noteId} ` +
          `(${entry.tokensAfter - entry.tokensBefore} tokens) ${entry.reason || ''}`
        );
      }
    },

    getSummary() {
      return {
        totalNotes: logs.length,
        included: logs.filter(l => l.included).length,
        excluded: logs.filter(l => !l.included).length,
        totalTokens: logs[logs.length - 1]?.tokensAfter || 0,
        byDepth: logs.reduce((acc, l) => {
          acc[l.depth] = (acc[l.depth] || 0) + 1;
          return acc;
        }, {} as Record<number, number>),
      };
    },
  };
}
```
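The last step of the diagnostic flow is adding a visited set to the traversal. As a rough sketch, the following breadth-first walk expands each note at most once, even when notes link back to each other. The in-memory `LinkGraph` shape and the `collectReachable` name are assumptions for illustration. In the real system the links would come from parsing each note.

```typescript
// noteId -> outgoing link targets (an assumed in-memory representation)
type LinkGraph = Record<string, string[]>;

// Breadth-first traversal with a visited set. Depth 0 is the start note;
// notes up to maxDepth hops away are included, each exactly once.
function collectReachable(
  graph: LinkGraph,
  startId: string,
  maxDepth: number
): string[] {
  const visited = new Set<string>([startId]);
  const order: string[] = [];
  let frontier: string[] = [startId];

  for (let depth = 0; depth <= maxDepth && frontier.length > 0; depth++) {
    const next: string[] = [];
    for (const id of frontier) {
      order.push(id);
      for (const link of graph[id] || []) {
        if (!visited.has(link)) {
          visited.add(link); // Mark on discovery so cycles are not re-queued
          next.push(link);
        }
      }
    }
    frontier = next;
  }
  return order;
}
```

Marking notes as visited when they are discovered, rather than when they are expanded, keeps a note from being queued twice when two notes at the same depth link to it.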
5. Stale Cache Issues
```typescript
// Cache invalidation strategy for Obsidian vault changes
import { createHash } from 'crypto';
import * as fs from 'fs';
import * as path from 'path';

interface CacheEntry {
  content: string;
  hash: string;
  timestamp: number;
  linkedNoteHashes: Record<string, string>;
}

class SmartCache {
  private cache: Map<string, CacheEntry> = new Map();
  private vaultPath: string;

  constructor(vaultPath: string) {
    this.vaultPath = vaultPath;
  }

  // Returns '' for missing or unreadable files
  private getFileHash(filePath: string): string {
    try {
      const content = fs.readFileSync(filePath, 'utf-8');
      return createHash('md5').update(content).digest('hex');
    } catch {
      return '';
    }
  }

  async get(noteId: string): Promise<string | null> {
    const entry = this.cache.get(noteId);
    if (!entry) return null;

    const filePath = path.join(this.vaultPath, `${noteId}.md`);
    const currentHash = this.getFileHash(filePath);

    // Check if main file changed
    if (currentHash !== entry.hash) {
      this.cache.delete(noteId);
      return null;
    }

    // Check if any linked notes changed
    for (const [linkedNote, linkedHash] of Object.entries(entry.linkedNoteHashes)) {
      const linkedPath = path.join(this.vaultPath, `${linkedNote}.md`);
      if (this.getFileHash(linkedPath) !== linkedHash) {
        this.cache.delete(noteId);
        return null;
      }
    }

    return entry.content;
  }

  set(noteId: string, content: string, linkedNotes: string[]): void {
    const filePath = path.join(this.vaultPath, `${noteId}.md`);
    const linkedNoteHashes: Record<string, string> = {};

    for (const linked of linkedNotes) {
      const linkedPath = path.join(this.vaultPath, `${linked}.md`);
      linkedNoteHashes[linked] = this.getFileHash(linkedPath);
    }

    this.cache.set(noteId, {
      content,
      hash: this.getFileHash(filePath),
      timestamp: Date.now(),
      linkedNoteHashes,
    });
  }
}
```
📝 Note: Watch for the case where you edit a note that’s linked by many others. Your cache invalidation needs to cascade properly, or you’ll get stale context in queries that start from different entry points.
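One way to make that cascade explicit is a reverse-dependency index: for every note, track which cached entries included it, then walk those dependents when the note changes. The sketch below is an assumption-laden illustration, and `DependencyIndex`, `register`, and `invalidationSet` are hypothetical names rather than part of `SmartCache`.

```typescript
// Reverse-dependency index: for each note, which cached entries included it.
class DependencyIndex {
  private dependents = new Map<string, Set<string>>();

  // Record that the cached context for entryId includes linkedNotes.
  register(entryId: string, linkedNotes: string[]): void {
    for (const linked of linkedNotes) {
      let entries = this.dependents.get(linked);
      if (!entries) {
        entries = new Set<string>();
        this.dependents.set(linked, entries);
      }
      entries.add(entryId);
    }
  }

  // Returns every cache key to drop when changedNote is edited, following
  // dependents transitively. The seen set guards against link cycles.
  invalidationSet(changedNote: string): string[] {
    const seen = new Set<string>([changedNote]);
    const queue: string[] = [changedNote];
    while (queue.length > 0) {
      const current = queue.shift() as string;
      const deps = this.dependents.get(current);
      if (!deps) continue;
      deps.forEach(dep => {
        if (!seen.has(dep)) {
          seen.add(dep);
          queue.push(dep);
        }
      });
    }
    return Array.from(seen);
  }
}
```

A file watcher could call `invalidationSet` on each change and delete the returned keys from the cache, which avoids re-hashing every linked file on each read.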
Conclusion and Next Steps
You’ve now built a functional personal knowledge graph system that combines Obsidian’s excellent note-taking capabilities with Claude’s reasoning power—implementing what Karpathy described as an “LLM Wiki” pattern.
The key architectural decisions we made:
- Bidirectional link extraction from Obsidian’s wiki-link syntax enables automatic relationship discovery
- Token-aware context loading prevents context window overflow while maximizing relevant information
- Priority-based traversal ensures the most relevant notes are included first when budget is limited
- Robust caching with smart invalidation keeps costs manageable without serving stale data
Start with these enhancements to make the system more powerful:
```typescript
// 1. Add semantic search for better note discovery
import Anthropic from '@anthropic-ai/sdk';

interface EmbeddingStore {
  noteId: string;
  embedding: number[]; // Not used by the simplified scoring below
  content: string;
}

// Returns the first text block of a response, or '' if there is none
function firstText(response: Anthropic.Message): string {
  for (const block of response.content) {
    if (block.type === 'text') return block.text;
  }
  return '';
}

// Claude may wrap JSON in prose or code fences, so parse defensively
function parseJsonArray<T>(text: string): T[] {
  const match = text.match(/\[[\s\S]*\]/);
  if (!match) return [];
  try {
    const parsed = JSON.parse(match[0]);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

async function findSemanticallySimilar(
  query: string,
  store: EmbeddingStore[],
  topK = 5
): Promise<string[]> {
  const client = new Anthropic();

  // For production, use a dedicated embedding model
  // This is a simplified approach using Claude for concept extraction
  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 200,
    messages: [{
      role: 'user',
      content: `Extract 5 key concepts from this query as a JSON array of strings: "${query}"`,
    }],
  });

  const concepts = parseJsonArray<string>(firstText(response));

  // Score notes by concept overlap (simplified)
  const scored = store.map(note => ({
    noteId: note.noteId,
    score: concepts.filter(c =>
      note.content.toLowerCase().includes(String(c).toLowerCase())
    ).length,
  }));

  return scored
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(s => s.noteId);
}

// 2. Implement automatic note suggestions
async function suggestConnections(
  noteContent: string,
  existingNotes: string[]
): Promise<Array<{ note: string; reason: string }>> {
  const client = new Anthropic();

  const response = await client.messages.create({
    model: 'claude-sonnet-4-20250514',
    max_tokens: 1024,
    messages: [{
      role: 'user',
      content: `Given this note content:
---
${noteContent}
---
And these existing notes in my knowledge base:
${existingNotes.map(n => `- ${n}`).join('\n')}
Suggest 3-5 notes that should be linked, with brief reasons why.
Return as JSON: [{"note": "note-name", "reason": "why to link"}]`,
    }],
  });

  return parseJsonArray<{ note: string; reason: string }>(firstText(response));
}
```
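The `EmbeddingStore` entries carry an `embedding` vector that the concept-overlap scoring never reads. If you do generate embeddings with a dedicated model, ranking by cosine similarity is one straightforward option. The sketch below assumes the query has already been embedded by that same model. `EmbeddedNote`, `cosineSimilarity`, and `topKBySimilarity` are hypothetical names introduced here.

```typescript
// Minimal shape for this sketch; vectors are assumed to come from the
// same embedding model and therefore share one dimension.
interface EmbeddedNote {
  noteId: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Embedding dimensions differ');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Ranks notes by similarity to the query vector and returns the top K ids.
function topKBySimilarity(
  queryEmbedding: number[],
  store: EmbeddedNote[],
  topK = 5
): string[] {
  return store
    .map(note => ({
      noteId: note.noteId,
      score: cosineSimilarity(queryEmbedding, note.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map(scored => scored.noteId);
}
```

A linear scan like this is fine for a few thousand notes. Beyond that, a vector database with an approximate nearest-neighbor index is the usual next step.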
Long-term Enhancements
Once your basic system is stable, consider these advanced features:
```yaml
# Feature roadmap for knowledge graph evolution
phase_1_foundations:
  - Basic link extraction and traversal ✓
  - Token-aware context loading ✓
  - Caching layer ✓
phase_2_intelligence:
  - Semantic similarity search
  - Automatic link suggestions
  - Contradiction detection across notes
  - Gap analysis ("what's missing from my knowledge?")
phase_3_automation:
  - Daily digest of new connections discovered
  - Auto-tagging based on content analysis
  - Spaced repetition integration
  - Export to Anki for key concepts
phase_4_collaboration:
  - Multi-vault federation
  - Shared knowledge bases with access control
  - Version history with semantic diffs
```
The power of this approach lies in its compounding returns: every note you add makes every query smarter. Unlike traditional search, Claude can synthesize information across your notes, find non-obvious connections, and