
Foxhound

A local, privacy-first document search and analysis pipeline. Ingest emails, diary entries, meeting notes, and documents into a single searchable corpus, then explore it with semantic search and optional AI-powered analysis.

Foxhound tracks down what matters across large, messy document collections without sending anything to the cloud (unless you choose to).

Key Features

Feature                 Description
Multi-source ingestion  Emails, diaries, meeting notes, PDFs, Word docs
Semantic search         Find conceptually similar content, not just keyword matches
Two-stage scoring       Cosine similarity for retrieval, LLM triage for relevance
Privacy-first           Pseudonymisation before any cloud API call
Cost controls           Confirmation prompts and hard limits before any spend
Local-first             Everything runs on your machine; cloud is optional
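
To make the privacy-first idea concrete, here is a minimal sketch of pseudonymisation before a cloud call. It is not Foxhound's actual implementation: the function names `pseudonymise` and `restore`, the placeholder format, and the simple email pattern are all assumptions chosen for illustration. The idea is that names and addresses are swapped for stable placeholders locally, only the masked text leaves the machine, and the mapping needed to restore the originals stays on disk or in memory.

```python
import re

# Deliberately simple email pattern; an assumption for this sketch only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymise(text, names):
    """Mask email addresses and known names with stable placeholders.

    Returns the masked text plus a placeholder -> original mapping,
    which is meant to stay on the local machine.
    """
    forward = {}  # original value -> placeholder
    counters = {"PERSON": 0, "EMAIL": 0}

    def token_for(kind, value):
        # Reuse the same placeholder for repeated values.
        if value not in forward:
            counters[kind] += 1
            forward[value] = f"[{kind}_{counters[kind]}]"
        return forward[value]

    # Mask emails first so names inside addresses are not half-replaced.
    masked = EMAIL_RE.sub(lambda m: token_for("EMAIL", m.group(0)), text)

    # Longest names first so "Alice Smith" is matched before "Alice".
    for name in sorted(names, key=len, reverse=True):
        pattern = re.compile(r"\b" + re.escape(name) + r"\b")
        masked = pattern.sub(lambda m, n=name: token_for("PERSON", n), masked)

    reverse = {token: original for original, token in forward.items()}
    return masked, reverse


def restore(text, reverse):
    """Put the original values back into a response, locally."""
    for token, original in reverse.items():
        text = text.replace(token, original)
    return text


if __name__ == "__main__":
    masked, mapping = pseudonymise(
        "Alice Smith emailed bob@example.com; Alice agreed.",
        names=["Alice", "Alice Smith"],
    )
    print(masked)
    print(restore(masked, mapping))
```

A real pipeline would likely need a broader detector than a fixed name list (for example phone numbers or organisation names), but the shape stays the same: mask locally, send the masked text, restore locally.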

Pipeline Overview

Quick Start

# Configure sources
cp config.example.yaml config.yaml
 
# Deduplicate email threads
uv run python dedup.py
 
# Embed all sources into ChromaDB
uv run python ingest.py
 
# Explore your corpus
uv run python explore.py
 
# Semantic search
uv run python query.py --semantic "project concerns" --top-k 100
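
The semantic query above pairs with the two-stage scoring described under Key Features. The sketch below shows one way that flow could look in plain Python; it is an illustration under stated assumptions, not the code inside `query.py`. The helper names `retrieve` and `triage`, the toy two-dimensional vectors, and the `is_relevant` callable (standing in for an LLM triage call) are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def retrieve(query_vec, corpus, top_k):
    """Stage 1: rank (doc_id, vector) pairs by similarity, keep the top_k."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in corpus]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def triage(candidates, is_relevant):
    """Stage 2: keep only candidates the relevance judge accepts.

    `is_relevant` is any callable taking a doc_id; in a real pipeline
    it would wrap an LLM call, which is assumed here, not shown.
    """
    return [(doc_id, score) for doc_id, score in candidates if is_relevant(doc_id)]


if __name__ == "__main__":
    # Toy embeddings; real ones would come from an embedding model.
    corpus = [
        ("email-1", [1.0, 0.0]),
        ("diary-7", [0.6, 0.8]),
        ("notes-3", [0.0, 1.0]),
    ]
    candidates = retrieve([1.0, 0.0], corpus, top_k=2)
    kept = triage(candidates, is_relevant=lambda doc_id: doc_id != "diary-7")
    print(candidates)
    print(kept)
```

Splitting the work this way keeps the cheap similarity pass local and limits the more expensive triage step to the `top_k` candidates, which is also where a spend limit would naturally apply.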

In This Section

  • Getting Started - Install and configure Foxhound
  • Use Cases - Real-world scenarios and workflows
  • Guides - Step-by-step how-to guides
  • Concepts - Architecture and design decisions
  • Reference - CLI commands and configuration options
  • FAQ - Common questions about data integrity, privacy, and search

MIT 2026 © Docs Hub