2024 · Full Stack Engineer · Active · Ongoing

AI File Brain

A local-first file assistant that lets you search your disk by meaning — without sending your files to the cloud.

01 — Origin

The problem wasn’t search — it was trust

I wanted a way to ask my computer questions like:

  • “Where is that PDF about vector databases that mentioned cosine similarity?”
  • “Show me the doc where I explained the onboarding flow tradeoffs.”
  • “Find the thing that’s basically an invoice template, even if it’s not called invoice.”

Traditional file search fails when you don’t remember filenames. Cloud AI tools solve this—by uploading your data.

But the value of “AI over files” collapses if I can’t trust where my files go.

So the obsession became simple:

Semantic search, but local. No cloud. No “we don’t store your data.” No uncertainty.

Obsession

If the tool requires you to compromise privacy to be useful, it’s not a tool — it’s a trade.

02 — Constraints

The walls I had to design around

Privacy — The non-negotiable: files never leave the machine. The only acceptable network traffic is to localhost services (Ollama).
Recall vs. precision — Keyword search is precise but brittle. Semantic search is flexible but can feel ‘mushy’ without good UX.
Indexing cost — Disk-scale scanning can explode in time and storage if you index everything (node_modules, builds, hidden dirs, huge binaries).
Explainability — Users trust keyword matches. Semantic matches need justification: why did this file show up?
CLI UX — If the interface feels heavy, you won’t use it. The output has to be fast, readable, and confidence-building.
03 — Decisions

The bets I placed

Two search modes, not one. I made keyword search and semantic search first-class citizens.

  • ai search uses SQLite FTS5 + BM25 for determinism and speed.
  • ai find uses local embeddings (Ollama) + vector search (LanceDB) for meaning-based recall.

This wasn’t just technical—it was about trust: keyword search proves the system is grounded, semantic search expands what you can ask.

Local embeddings via Ollama. Embeddings are generated on-device using Ollama (e.g., nomic-embed-text). No files are uploaded, and the system stays offline-first.
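A minimal sketch of that call, assuming Ollama's standard HTTP API on localhost (`/api/embeddings`) and the `nomic-embed-text` model; the function names are illustrative, not the project's actual code:

```python
import json
import urllib.request

# Localhost only: the text being embedded never leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/embeddings"

def build_payload(model: str, text: str) -> dict:
    """Request body for Ollama's embeddings endpoint."""
    return {"model": model, "prompt": text}

def embed(text: str, model: str = "nomic-embed-text") -> list[float]:
    """POST a chunk of text to the local Ollama server, return its embedding vector."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(model, text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```

Because the endpoint is pinned to localhost, "offline-first" is enforced by construction rather than by policy.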

Smart scanning rules. Indexing “everything” is the easiest way to build a useless system. So scanning intentionally skips obvious trash:

  • code projects (via marker files),
  • hidden dirs,
  • heavy folders like node_modules,
  • large files that don’t produce good text.
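The skip rules above could be sketched roughly like this; the specific ignore lists, marker files, extensions, and size cap are illustrative assumptions, not the tool's real configuration:

```python
from pathlib import Path

IGNORED_DIRS = {"node_modules", "dist", "build", ".git", "__pycache__"}  # assumed heavy/hidden dirs
PROJECT_MARKERS = {"package.json", "Cargo.toml", "pyproject.toml"}       # "this is a code project" signals
ALLOWED_EXTS = {".pdf", ".docx", ".md", ".txt"}                          # document-like files only
MAX_BYTES = 10 * 1024 * 1024                                             # skip files unlikely to yield good text

def skip_dir(dirname: str, entries: set[str]) -> bool:
    """Prune a directory before descending: hidden, known-heavy, or a code project root."""
    return (
        dirname.startswith(".")
        or dirname in IGNORED_DIRS
        or bool(entries & PROJECT_MARKERS)
    )

def should_index(path: Path, size: int) -> bool:
    """Index only document-like files under the size cap."""
    return path.suffix.lower() in ALLOWED_EXTS and size <= MAX_BYTES
```

Pruning at the directory level (before descending) is what keeps scans cheap: a `node_modules` tree is never walked at all.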

Chunking as a product decision. Embeddings are only as good as the text you feed them. Chunking has to preserve meaning, but also stay small enough for retrieval and ranking.
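One way to sketch that tradeoff is a greedy paragraph packer with a small trailing overlap, so chunks stay retrieval-sized without severing a thought at the boundary. The size and overlap numbers here are illustrative, not the project's actual parameters:

```python
def chunk_text(text: str, max_chars: int = 1200, overlap: int = 200) -> list[str]:
    """Pack whole paragraphs into chunks up to max_chars; when a chunk fills up,
    seed the next one with the tail of the previous so context carries over."""
    paras = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for p in paras:
        if current and len(current) + len(p) + 2 > max_chars:
            chunks.append(current)
            current = current[-overlap:]  # trailing context for the next chunk
        current = (current + "\n\n" + p) if current else p
    if current:
        chunks.append(current)
    return chunks
```

A sketch like this keeps paragraphs intact when possible; a single paragraph longer than `max_chars` would still pass through whole, which a production chunker would split further.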

Key Decision

Retrieval isn’t one technique — it’s a toolbox. Keyword builds certainty; semantic finds what you didn’t know how to ask for.

04 — The Failure

What broke (and taught me the most)

The first scans were noisy.

Not “crash” noisy—signal-to-noise noisy.

Indexing codebases, build folders, and hidden directories inflated the index, slowed scans, and made results less relevant. The system worked, but it didn’t feel smart—because it spent its intelligence on the wrong inputs.

The fix was not more AI. It was better boundaries:

  • clear inclusion (allowed paths),
  • explicit file extensions,
  • aggressive ignore lists,
  • and a bias toward indexing documents rather than artifacts.

That failure reinforced a principle I keep relearning:

Most “AI quality” problems are actually data-shaping problems.

05 — Architecture

How it actually works

AI File Brain has a clear split between indexing and retrieval:

Indexing pipeline

  1. Walk allowed directories (recursive)
  2. Extract text from documents (PDF/DOCX/MD/TXT)
  3. Chunk text
  4. Create embeddings locally (Ollama)
  5. Store:
    • metadata + FTS index in SQLite
    • vectors in LanceDB
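The five steps above can be sketched as one pass over a single file; every callable name here is an illustrative stand-in, not the project's real API:

```python
from pathlib import Path

def index_file(path: Path, extract, chunk, embed, store_meta, store_vec) -> None:
    """One pipeline pass for one file. The callables mirror the numbered steps:
    extract (2), chunk (3), embed (4), and the two stores (5a: SQLite, 5b: LanceDB)."""
    text = extract(path)                     # 2. extract text
    for i, piece in enumerate(chunk(text)):  # 3. chunk
        vec = embed(piece)                   # 4. embed locally (e.g., via Ollama)
        store_meta(path, i, piece)           # 5a. metadata + FTS row in SQLite
        store_vec(path, i, vec)              # 5b. vector row in LanceDB
```

Keeping the pipeline this linear is what makes the later "incremental scanning" goal tractable: skipping an unchanged file skips every downstream step.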

Keyword search (ai search)
Query → FTS5 MATCH → BM25 rank → highlighted snippets
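That path can be demonstrated end to end with Python's built-in sqlite3 (assuming it was compiled with FTS5, as most builds are); the schema here is a toy, not the tool's actual one:

```python
import sqlite3

# In-memory demo of the keyword path: FTS5 MATCH, bm25() ranking, snippet() highlighting.
con = sqlite3.connect(":memory:")
con.execute("CREATE VIRTUAL TABLE docs USING fts5(path, body)")
con.executemany(
    "INSERT INTO docs VALUES (?, ?)",
    [
        ("notes/vectors.md", "Vector databases rank results by cosine similarity."),
        ("notes/onboarding.md", "Tradeoffs in the onboarding flow."),
    ],
)
# snippet(table, column_index, start_mark, end_mark, ellipsis, max_tokens)
rows = con.execute(
    """SELECT path, snippet(docs, 1, '[', ']', '…', 8) AS snip
       FROM docs WHERE docs MATCH ? ORDER BY bm25(docs)""",
    ("cosine",),
).fetchall()
```

The `[…]` markers around matched terms are exactly the "highlighted snippets" the flow describes: the user can see why a result matched.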

Semantic search (ai find)
Query → embedding → vector similarity → re-ranking (vector + filename/path signals) → results
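The re-ranking step might look like this sketch: plain cosine similarity blended with a cheap filename/path signal. The boost weight is an assumption for illustration, not a tuned value from the project:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def rerank(query: str, hits: list[tuple[str, float]],
           path_boost: float = 0.15) -> list[tuple[str, float]]:
    """Nudge a hit's vector-similarity score upward when a query term
    appears in its filename or path, then sort best-first."""
    terms = query.lower().split()
    scored = []
    for path, sim in hits:
        boost = path_boost if any(t in path.lower() for t in terms) else 0.0
        scored.append((path, sim + boost))
    return sorted(scored, key=lambda x: x[1], reverse=True)
```

Blending a deterministic signal (the path) into the vector score is one small way to make semantic results feel less ‘mushy’.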

Metadata + FTS — SQLite (FTS5 + BM25) for speed, determinism, and snippets
Vectors — LanceDB for cosine similarity over embeddings
Embeddings — Ollama local embedding model (offline-first)
Interface — CLI with branded output, progress, and relevance cues
06 — Learnings

What I’d tell myself before starting

Boundaries beat cleverness — The best relevance improvements came from what I refused to index: build artifacts, hidden folders, and code projects. Better data discipline > more AI.
Trust needs determinism — Semantic search feels magical when it works — but keyword search is what convinces users the system is grounded. Having both is a UX strategy.
Indexing is a product surface — Scanning isn’t a background task. It’s where users decide if the tool respects their machine, their time, and their privacy.
Explainability isn’t optional — When a result appears, users want to know why. Snippets and clear ranking cues turn ‘AI guesses’ into ‘retrieval I can verify.’
07 — Future

Where this goes next

The foundation is there: local indexing, hybrid retrieval primitives, and a UX that aims to be trustworthy.

The next step is moving from search to assistance without betraying the local-first premise:

  • Hybrid ranking (merge FTS + vector results with weighted scoring)
  • Incremental scanning (only re-embed changed content)
  • Watch mode (react to filesystem changes)
  • ai ask: local Q&A over your files with citations
  • Clustering + “smart folders” (suggest organization, never auto-move)
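For the hybrid-ranking item, one candidate approach is reciprocal rank fusion, which merges the FTS and vector result lists without having to calibrate their incomparable scores. This is a sketch of an option, not the shipped design:

```python
def rrf_merge(keyword: list[str], semantic: list[str], k: int = 60) -> list[str]:
    """Reciprocal rank fusion: each list contributes 1/(k + rank) per result,
    so items ranked well by both retrieval modes rise to the top."""
    scores: dict[str, float] = {}
    for ranking in (keyword, semantic):
        for rank, path in enumerate(ranking):
            scores[path] = scores.get(path, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)
```

RRF's appeal here is that it preserves the "keyword builds certainty" property: a file that both modes agree on outranks a file only one mode liked.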

The guiding constraint stays the same:

If it can’t be private, it doesn’t ship.

Nathanim
Full Stack & AI Engineer

A bored developer is a dangerous developer.