# Evidence Hierarchy

A ranking of the kinds of evidence that can back a DD claim, strongest to weakest. The job of the DD is to map every important claim to its level on this ladder, then focus attention on the widest gaps.

## The Hierarchy

1. **Production system running at a paying customer with measurable outcomes**: the gold standard. The outcome is on the buyer's P&L, not in a slide.
2. **Pilot system with before/after data and controlled baselines**: real, but not priced. Test whether the pilot ever converted to expansion.
3. **Internal demo on real customer data**: useful, but often cherry-picked. Ask to see the failure cases.
4. **Internal demo on synthetic data**: proves the narrative, not the product.
5. **Published academic paper by the team**: signals technical depth, not commercial reality.
6. **Deck claim with no supporting artefact**: the floor. Treat as hypothesis, not fact.

## Evidence-status labels

For any claim in a DD write-up:

- **SUPPORTED**: multiple independent data points confirm it.
- **PARTIAL**: some evidence exists, but gaps remain.
- **CLAIMED**: the company states it; no independent verification.
- **UNSUPPORTED**: contradicted by evidence or wholly unsubstantiated.

## The narrative-evidence gap

Every company has a vision architecture. The DD maps what's actually running vs. what's planned.

**Gaps matter most around the claimed key differentiator.** If the single thing the company says makes them defensible is the single least-evidenced capability, that is the reddest flag.

Related: [[Technical DD Framework]], [[What Must Be True]], [[AI Agent Vertical SaaS DD MOC]]

---

Tags: #dd #investing #firstprinciple