# Workplace Disputes
Gather email trails, meeting notes, and diary entries to build a chronological evidence pack for HR or legal review.
## The Problem
Workplace disputes often involve scattered evidence across email threads, meeting notes, and personal records spanning months or years. Manually assembling a timeline is time-consuming and risks missing critical communications.
## Workflow
### 1. Configure Sources
Set up `config.yaml` to point at the relevant source directories:
```yaml
deduplication:
  source_paths:
    - "~/exported-emails/work"

ingestion:
  sources:
    - type: diary
      path: "~/diary/work-journal"
      entry_separator: "## "
    - type: meeting_note
      path: "~/meeting-notes/team"
    - type: document
      path: "~/documents/hr-correspondence"
```

### 2. Ingest Everything
```bash
uv run python dedup.py
uv run python ingest.py --reset
```

### 3. Multi-Search Strategy
Run several searches with different phrasings to maximise recall:
```bash
mkdir -p evidence
uv run python query.py \
  --semantic "performance review concerns feedback" \
  --top-k 200 --export-json evidence/reviews.json
uv run python query.py \
  --semantic "meeting conduct behaviour complaint" \
  --top-k 200 --export-json evidence/conduct.json
uv run python query.py \
  --semantic "workload pressure stress unreasonable" \
  --top-k 200 --export-json evidence/workload.json
```

### 4. Merge and Deduplicate
```bash
uv run python merge.py evidence/*.json --output evidence/merged.json
```

### 5. Filter by Specific People
Narrow down to communications involving specific individuals:
```bash
uv run python query.py \
  --sender "[email protected],[email protected]" \
  --date-range 2024-01-01 2025-01-01 \
  --export-json evidence/key-people.json
```

### 6. Triage Separately (recommended)
Run triage as a standalone step so results are saved to disk. This lets you re-run deep analysis at different relevance thresholds without re-triaging.
```bash
# Recommended: gemini-flash for triage (cheapest, fastest)
uv run python analyze.py evidence/merged.json \
  --triage \
  --model gemini-flash \
  --truncate 500 \
  --concurrency 5 \
  --context "Identify incidents, dates, and communications relevant to a workplace grievance" \
  --output evidence/triaged.json \
  --dry-run
```
```bash
# Free, private, slow (local Mistral 7B):
uv run python analyze.py evidence/merged.json \
  --triage \
  --local \
  --context "Identify incidents, dates, and communications relevant to a workplace grievance" \
  --output evidence/triaged.json
```

- Use `gemini-flash` for triage: it is the cheapest and fastest API model.
- `--truncate 500` caps document bodies at 500 characters (enough for relevance scoring, much faster).
- `--concurrency 5` sends five batches in parallel.
- Checkpoints save after every wave; if interrupted, re-run the same command to resume.
- Use `--retry-failed` to re-triage only the failed batches from a previous run.
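The batching, concurrency, and checkpoint behaviour described above can be sketched as follows. This is a conceptual illustration, not the tool's actual internals: `score_batch` stands in for the model call, and every name here is an assumption:

```python
import json
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def score_batch(batch: list[dict]) -> list[int]:
    # Stand-in for the LLM relevance call: truncate each body to 500 chars
    # and return a 0-10 score per document.
    return [min(10, len(doc.get("body", "")[:500]) // 50) for doc in batch]

def triage(docs: list[dict], checkpoint: Path, batch_size: int = 10,
           concurrency: int = 5) -> dict[str, list[int]]:
    """Score documents in parallel batches, checkpointing after every wave."""
    done: dict[str, list[int]] = (
        json.loads(checkpoint.read_text()) if checkpoint.exists() else {}
    )
    batches = [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        for wave_start in range(0, len(batches), concurrency):
            # One wave = up to `concurrency` batches, skipping finished ones.
            wave = {str(i): batches[i]
                    for i in range(wave_start, min(wave_start + concurrency, len(batches)))
                    if str(i) not in done}
            for idx, scores in zip(wave, pool.map(score_batch, wave.values())):
                done[idx] = scores
            checkpoint.write_text(json.dumps(done))  # resume point
    return done
```

Because finished batch indices are skipped on the next run, an interrupted job resumes where it left off rather than re-spending API calls.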
### 7. Deep Analysis (on triaged results)
Use `--deep-only` to skip triage and run deep analysis on already-triaged data.
Start with `--min-relevance 5` to cast a wide net, then tighten if needed:
```bash
uv run python analyze.py evidence/triaged.json \
  --deep-only \
  --min-relevance 5 \
  --context "Identify incidents, dates, and communications relevant to a workplace grievance" \
  --model deepseek \
  --dry-run
```

Review the cost estimate, then run without `--dry-run` to proceed.
If the output is too noisy, re-run at `--min-relevance 7` (no re-triage needed).
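Tightening the threshold without re-triaging works because the scores are already saved to disk; conceptually it is just a filter over the triage file. A minimal sketch, where the `relevance` field name is an assumption about the saved schema:

```python
import json
from pathlib import Path

def filter_by_relevance(triaged_path: str, min_relevance: int) -> list[dict]:
    """Keep only records scored at or above the threshold."""
    records = json.loads(Path(triaged_path).read_text())
    return [r for r in records if r.get("relevance", 0) >= min_relevance]
```

Raising the threshold from 5 to 7 simply shrinks this list; no model calls are involved.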
### 8. Export for Review
```bash
uv run python export.py analysis_output.md --clipboard
```

## Tips
- Always triage separately with `--triage --output` so results are saved to disk
- Start with `--min-relevance 5` to avoid missing borderline evidence
- Use `--date-range` to focus on the relevant period
- The pseudonymisation layer automatically protects names when using cloud models
- For maximum privacy, use `--local` to keep everything on your machine
- Use `gemini-flash` for triage (cheapest, fastest) and `deepseek` for deep analysis (best reasoning)
- Use `--truncate 500` and `--concurrency 5` for fast triage
- Use `--retry-failed` to re-triage failed batches without re-running everything
- Export to CSV with `--export results.csv` for spreadsheet review
- Triage checkpoints save progress; re-run to resume if interrupted
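If the built-in CSV export doesn't fit your review workflow, the JSON exports flatten to a spreadsheet-friendly CSV in a few lines. A sketch, assuming each export is a JSON list of flat records; column names are taken from whatever fields the records contain:

```python
import csv
import json
from pathlib import Path

def json_to_csv(json_path: str, csv_path: str) -> None:
    """Flatten a JSON export into a CSV with one row per record."""
    records = json.loads(Path(json_path).read_text())
    # Union of all field names, so records with differing keys still fit.
    fields = sorted({key for record in records for key in record})
    with open(csv_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(records)  # missing fields are left blank
```

This gives reviewers a familiar spreadsheet view while the JSON remains the source of truth.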