# Evidence Hierarchy
A ranking of the kinds of evidence that can back a due diligence (DD) claim, strongest to weakest. The job of the DD is to map every important claim to its level on this ladder, then focus attention on the widest gaps between what is claimed and what the evidence shows.
## The Hierarchy
1. **Production system running at a paying customer with measurable outcomes** — the gold standard. Outcome is on the buyer's P&L, not in a slide.
2. **Pilot system with before/after data and controlled baselines** — real, but not priced. Test whether the pilot ever converted to expansion.
3. **Internal demo on real customer data** — useful, but often cherry-picked. Ask to see the failure cases.
4. **Internal demo on synthetic data** — proves the narrative, not the product.
5. **Published academic paper by the team** — signals technical depth, not commercial reality.
6. **Deck claim with no supporting artefact** — the floor. Treat as hypothesis, not fact.
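One way to make the ladder usable in a working DD file is to encode it as an ordered enum and sort claims by it. This is a minimal sketch, not a prescribed tool: the enum member names, the `weakest_first` helper, and the sample claims are all hypothetical.

```python
from enum import IntEnum


class EvidenceLevel(IntEnum):
    """Rungs of the ladder above; a lower number means stronger evidence."""
    PRODUCTION_PAYING_CUSTOMER = 1
    PILOT_WITH_BASELINES = 2
    DEMO_REAL_DATA = 3
    DEMO_SYNTHETIC_DATA = 4
    ACADEMIC_PAPER = 5
    DECK_CLAIM_ONLY = 6


# Hypothetical claims, each mapped to the strongest evidence found for it.
claims = {
    "Cuts invoice processing time": EvidenceLevel.PILOT_WITH_BASELINES,
    "Agent handles exceptions autonomously": EvidenceLevel.DECK_CLAIM_ONLY,
    "Model generalises across customers": EvidenceLevel.DEMO_SYNTHETIC_DATA,
}


def weakest_first(claims):
    """Return (claim, level) pairs ordered from weakest evidence to strongest."""
    return sorted(claims.items(), key=lambda item: item[1], reverse=True)


for text, level in weakest_first(claims):
    print(f"{level.value}  {level.name:<28}{text}")
```

Sorting weakest-first puts the claims that most need follow-up at the top of the list.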
## Evidence-status labels
For any claim in a DD write-up:
- **SUPPORTED** — multiple independent data points confirm.
- **PARTIAL** — some evidence, gaps remain.
- **CLAIMED** — company states it; no independent verification.
- **UNSUPPORTED** — contradicted by evidence or wholly unsubstantiated.
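The four labels can be assigned with a small rule, sketched below. The thresholds are assumptions layered on the definitions above: "multiple" is read as two or more independent data points, a single independent data point is treated as PARTIAL, and any contradiction overrides everything else. The function and parameter names are hypothetical.

```python
from enum import Enum


class Label(Enum):
    SUPPORTED = "SUPPORTED"
    PARTIAL = "PARTIAL"
    CLAIMED = "CLAIMED"
    UNSUPPORTED = "UNSUPPORTED"


def label_claim(independent_confirmations: int,
                company_stated: bool,
                contradicted: bool = False) -> Label:
    """Assign an evidence-status label under the assumed thresholds."""
    if contradicted:
        # Contradicting evidence overrides any confirmations (assumption).
        return Label.UNSUPPORTED
    if independent_confirmations >= 2:
        # "Multiple" is read as two or more (assumption).
        return Label.SUPPORTED
    if independent_confirmations == 1:
        return Label.PARTIAL
    if company_stated:
        # The company says so, but nothing independent backs it.
        return Label.CLAIMED
    # Nothing stated and nothing verified: wholly unsubstantiated.
    return Label.UNSUPPORTED


print(label_claim(independent_confirmations=0, company_stated=True).value)
```

A real write-up would likely weigh the quality of each data point as well as the count; the count-only rule here is a simplification.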
## The narrative-evidence gap
Every company has a vision architecture. The DD maps what is actually running against what is only planned. **Gaps matter most around the claimed key differentiator.** If the one thing the company says makes it defensible is also its least-evidenced capability, that is the reddest flag.
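The red-flag check can be sketched as a comparison between the key differentiator's evidence level and the weakest level across all claims. This assumes levels are recorded as integers from 1 (production at a paying customer) to 6 (deck claim only), and it treats a tie for weakest as a flag. The `Claim` type, the `reddest_flag` helper, and the sample claims are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Claim:
    text: str
    level: int  # 1 = production at paying customer ... 6 = deck claim only
    key_differentiator: bool = False


def reddest_flag(claims: list[Claim]) -> Optional[Claim]:
    """Return a key-differentiator claim that is among the least evidenced, else None."""
    if not claims:
        return None
    weakest_level = max(c.level for c in claims)
    for claim in claims:
        if claim.key_differentiator and claim.level == weakest_level:
            return claim
    return None


# Hypothetical DD findings for illustration only.
findings = [
    Claim("Integrates with customer ERP", level=1),
    Claim("Cuts processing time", level=2),
    Claim("Proprietary self-improving agent loop", level=6, key_differentiator=True),
]

flag = reddest_flag(findings)
if flag is not None:
    print(f"Red flag: key differentiator '{flag.text}' sits at level {flag.level}")
```

If the differentiator is well evidenced and the weak spots are peripheral claims, the function returns `None` and attention can move to the next-widest gap.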
Related: [[Technical DD Framework]], [[What Must Be True]], [[AI Agent Vertical SaaS DD MOC]]
---
Tags: #dd #investing #firstprinciple