Deepen
A second brain that doesn't just store what you know — it surfaces what you've forgotten you understood.
The question that started everything
Why do we forget the most important connections between our own ideas?
I'd been journaling, bookmarking, note-taking for years. Thousands of entries across Notion, Obsidian, scattered markdown files. But when I needed to recall why I'd changed my mind about something six months ago — the exact thread of reasoning — it was gone. Buried under layers of newer, louder thoughts.
The tools were great at storing. Terrible at remembering.
I wanted something different: a system that doesn't wait for you to search. One that watches what you're thinking about right now and quietly says, "you wrote something related to this 4 months ago."
Context is not just data — it's the timing of retrieval.
The walls I had to design around
The bets I placed
Embeddings at the edge. I sacrificed perfect semantic accuracy for speed. Moved the vector embedding pipeline to run locally using lightweight models, with optional cloud sync for heavier processing. The tradeoff: slightly less nuanced matches, but retrieval in under 200ms.
Chunking strategy. Early versions used fixed-size text chunks for the vector store. They were too small — the LLM hallucinated context that didn't exist because it was working with sentence fragments instead of complete thoughts. I switched to semantic chunking: splitting on paragraph boundaries and topic shifts instead of character counts.
```typescript
// Semantic chunking for better retrieval context.
// Each paragraph is embedded in an async callback and the results
// are gathered with Promise.all (await isn't valid in a plain map callback).
async function chunkDocument(text: string) {
  const paragraphs = text.split("\n\n").filter(p => p.trim().length > 0);
  return Promise.all(
    paragraphs.map(async p => ({
      content: p,
      embedding: await generateEmbedding(p),
      metadata: {
        timestamp: Date.now(),
        source: "journal"
      }
    }))
  );
}
```

Progressive disclosure. Instead of dumping all related notes at once, the system reveals connections gradually. First, a subtle indicator that related notes exist. Then, on hover, a preview. Then, the full context. This respects the user's flow instead of interrupting it.
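Progressive disclosure can be sketched as a tiny state function. This is a hypothetical illustration, not the project's actual code: the level names, the 0.75 threshold, and `disclosureLevel` are my own assumptions.

```typescript
// Hypothetical sketch: map a match's similarity score and the user's
// interaction state to one of the gradual disclosure stages.
type Disclosure = "hidden" | "indicator" | "preview" | "full";

function disclosureLevel(
  similarity: number,
  hovered: boolean,
  expanded: boolean
): Disclosure {
  if (similarity < 0.75) return "hidden"; // below threshold: stay silent
  if (expanded) return "full";            // user asked for the full context
  if (hovered) return "preview";          // hover reveals a preview
  return "indicator";                     // default: just a subtle indicator
}
```

The key design property is that the default path is the quietest one: stronger signals from the user, not from the system, are what unlock more screen presence.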
I sacrificed perfect accuracy for speed. In a thought tool, a fast approximate answer beats a slow perfect one every time.
What broke (and taught me the most)
The first version hallucinated.
Not in the dramatic, obvious way. In the subtle, dangerous way. The system would surface a "related note" and present it with such confidence that users assumed the connection was real — even when the chunks were too fragmented to carry the original meaning.
One user told me: "It reminded me of something I never actually wrote."
That sentence haunted me. A second brain that creates false memories is worse than no second brain at all.
The fix wasn't technical. It was philosophical. I added confidence indicators — the system now shows how it found the connection (which words matched, what the similarity score was). Transparency over magic.
How it actually works
The system has three layers:
Capture layer. Markdown-native editor with real-time save. Every keystroke updates a local SQLite database. No cloud dependency for basic writing.
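The "every keystroke updates the database" behavior is typically debounced so edits coalesce into one write per quiet period. A hypothetical sketch; `makeAutosaver` and the 300 ms window are my assumptions, and the `save` callback stands in for the actual SQLite write:

```typescript
// Coalesce a burst of keystrokes into a single save after a quiet period,
// so every edit persists without issuing a write per keypress.
function makeAutosaver(save: (text: string) => void, quietMs = 300) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return (text: string) => {
    if (timer) clearTimeout(timer); // a newer keystroke resets the window
    timer = setTimeout(() => save(text), quietMs);
  };
}
```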
Intelligence layer. On save, text is chunked semantically and embedded using a local lightweight model. Embeddings are stored in a vector index (HNSW). A background worker continuously re-indexes as the knowledge base grows.
Surfacing layer. While writing, the system queries the vector index against the current paragraph. Matches above a threshold trigger subtle UI indicators. The user can expand them or ignore them. No pop-ups, no interruptions.
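The surfacing query can be sketched as a scored, thresholded lookup. This is an illustrative brute-force scan, not the real HNSW index; `surfaceMatches` and the threshold and limit values are my own assumptions:

```typescript
// Score the current paragraph's embedding against stored chunks and keep
// only matches above a threshold. A real build would query an HNSW index
// instead of scanning every chunk.
interface StoredChunk { content: string; embedding: number[]; }

function surfaceMatches(
  current: number[],
  chunks: StoredChunk[],
  threshold = 0.75,
  limit = 3
): { chunk: StoredChunk; score: number }[] {
  const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (a: number[]) => Math.sqrt(dot(a, a));
  const score = (a: number[], b: number[]) => dot(a, b) / (norm(a) * norm(b));
  return chunks
    .map(chunk => ({ chunk, score: score(current, chunk.embedding) }))
    .filter(m => m.score >= threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}
```

The threshold is what keeps the UI quiet: anything below it never reaches the indicator stage at all.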

What I'd tell myself before starting
Where this is going
The current version works for individual use. But the interesting question is: what happens when second brains talk to each other?
Imagine a team where everyone has their own knowledge graph, and the system can find connections across people's thinking — without exposing private notes. Federated knowledge retrieval.
That's the next bet.
- Multi-user knowledge graphs with privacy-preserving retrieval
- CRDT-based sync for offline-first collaboration
- Visual debugging tools for the embedding pipeline
- Voice-to-thought capture for mobile

A bored developer is a dangerous developer.