# AI Agent Vertical SaaS DD MOC
*A working map for due diligence on vertical, AI-native agent SaaS companies. Category-level, not company-specific. Reads as a checklist against known competitive archetypes: horizontal incumbents that moved to outcome-based pricing with published benchmarks (Fin-archetype) and enterprise platform-backed multi-domain players (Haptik-archetype).*
> [!tldr]- TL;DR — five things to hold in your head
> 1. **The category is repricing to outcomes.** [[Fin-archetype]] — a horizontal incumbent publishing resolution rates, hallucination rates, uptime, and offering an outcome-based per-resolution price with a dollar performance guarantee — has set the bar. A vertical challenger priced per seat or per inbound will look expensive the moment the incumbent's resolution rate crosses the cross-over point. See §6, §11.
> 2. **Benchmarks beat narratives.** Whoever publishes measurable model metrics (resolution rate, hallucination rate, deflection, uptime) owns the investor and enterprise procurement conversation. An unpublished accuracy claim is a flag, not a moat. See [[Evidence Hierarchy]], §8.
> 3. **Certifications are a moat, not paperwork.** SOC 2, ISO 27001 are table stakes. **ISO 42001 (AI management) and AIUC-1 (adversarial testing) are the new category-defining certifications** that gate AI governance-literate enterprise procurement. Absence against a certified competitor is not neutral — it is disqualifying in regulated verticals. See §10.
> 4. **The moat is the vertical data structure, not the model.** Custom retrieval and reranker models (the [[Fin-archetype]]'s `fin-cx-retrieval` / `fin-cx-reranker` pattern) trained on domain-specific data outperform frontier general-purpose models on vertical tasks. A vertical challenger's real IP is commerce-signal context engineering, the evaluation corpus, and the playbook / procedure library — not the LLM behind it. See §9, [[The Age of Vertical Models]], [[Data Flywheel]].
> 5. **Consultancy-to-platform is the dominant failure mode; deployment hours / customer is the single most diagnostic metric.** See [[Consultancy-to-Platform Transition]], [[Deployment Velocity]]. Everything else is secondary until this curve bends.
---
## How to use this MOC
Walk a deal top-to-bottom. Each section ends with a **Test** block: the one or two questions that, if answered specifically, resolve the section. If a question can't be answered with a number, an artefact, or a named customer, the section is still CLAIMED not SUPPORTED ([[Evidence Hierarchy]]).
---
## 1. What the category actually is
A vertical AI agent SaaS converts **domain data + agent orchestration → a measurable P&L outcome**. All three elements must be present: a domain, a decision surface, an outcome metric. If any is fuzzy, the company is selling a tool, not work. Tools trade at software multiples; work trades at services-replacement multiples ([[AI Eats Services Not Software]]).
**Test.** Name the P&L line the product moves, and the average magnitude of that move per customer cohort.
---
## 2. Directional arrows — where the category is going
Five independent arrows, all pointing the same way:
- **Category:** fragmented point tools → unified data layer → copilot → vertical agent → autonomous workflow.
- **Pricing:** seats → usage → outcomes → fractional employee.
- **Interface:** dashboards → chat → agents-embedded-in-surface.
- **Model:** frontier → fine-tuned frontier → domain-specific SLM + custom retrieval / reranker.
- **Buyer:** IT budget → line-of-business P&L.
Companies whose pitch sits two steps ahead of the state of the art are early. Less than one step ahead is irrelevant. See [[The directional arrows of inevitable progress]], [[The Age of Vertical Models]], [[AI Agents Stack]].
```mermaid
graph LR
classDef now fill:#fef3c7,stroke:#92400e,color:#111
classDef next fill:#dbeafe,stroke:#1e3a8a,color:#111
classDef end_ fill:#dcfce7,stroke:#14532d,color:#111
subgraph Category
C1[Point tools] --> C2[Unified data layer] --> C3[Horizontal copilot] --> C4[Vertical agent] --> C5[Autonomous workflow]
end
subgraph Pricing
P1[Seats] --> P2[Usage] --> P3[Outcomes] --> P4[Fractional employee]
end
subgraph Interface
I1[Dashboards] --> I2[Chat] --> I3[Agents-in-surface]
end
subgraph Model
M1[Frontier API] --> M2[Fine-tuned frontier] --> M3[Domain SLM + custom retrieval/reranker]
end
subgraph Buyer
B1[IT budget] --> B2[LoB budget] --> B3[P&L impact budget]
end
class C1,P1,I1,M1,B1 now
class C3,P2,I2,M2,B2 next
class C5,P4,I3,M3,B3 end_
```
---
## 3. The AI-native gradient
Place the company on: **AI-enabled → AI-assisted → AI-native → autonomous**. A company that says "AI-native" but ships "AI-enabled" has a narrative-evidence gap — usually the single most important finding in a DD. See [[Technical DD Framework]].
---
## 4. Business model — what is actually being sold
Decompose revenue into three logics and chart each separately:
- **Software licence** (access).
- **Usage** (per message / inbound / session).
- **Outcome** (per resolution, per recovered cart, per deflection).
Blended numbers hide the story. A company claiming "SaaS" whose revenue is 50%+ messaging pass-through (e.g. WhatsApp / BSP margin) has a gross-margin structure closer to telecom than software. Model each stream's margin independently.
### 4.1 The functional landscape — where a commerce AI company actually plays
Any commerce AI product sits in one or more of five functional boxes. The box determines the metric, the buyer, and the benchmark bar:
| Tool category | Primary job | Metrics the buyer actually cares about |
|---|---|---|
| **AI Support Agents** | Resolve customer issues end-to-end | Resolution rate, cost per resolution, CSAT |
| **Email & Lifecycle AI** | Retention and repeat purchase | Conversion rate, LTV |
| **Search & Personalisation AI** | Product discovery inside session | Conversion rate, AOV |
| **Content & Creative AI** | Production efficiency | Time to launch, content velocity |
| **Analytics & Optimisation AI** | Decision quality | Funnel conversion, churn |
A challenger that claims to span three boxes without evidence is a platform-story-before-platform problem — see [[Consultancy-to-Platform Transition]]. A challenger with real depth in one box and shallow expansion into a second is the healthy pattern.
**Test.** Show the revenue split across these five boxes today, and the roadmap that moves each one to a defensible second box.
**Test.** Show cohort gross margin by revenue stream for the last four quarters.
---
## 5. Unit economics — the four diagnostic numbers
| # | Metric | What it tells you |
|---|---|---|
| 1 | **Deployment hours / customer, per cohort** | The platform test. Must fall with each cohort or there is no learning curve. [[Wrights Law]]. |
| 2 | **Net revenue retention by cohort** | Expansion truth vs. new-logo dressing. >110% SMB / >120% mid-market is a healthy floor. |
| 3 | **Blended gross margin with inference broken out** | Hides the services drag and the provider-repricing exposure. |
| 4 | **Inference cost as % of revenue at the top decile customer** | Where margin blowup lives. Stress-test under a 3× provider reprice and a 50% open-source substitution. |
**Test.** Can the team produce these four numbers in one screen without prep?
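A minimal sketch of the metric-4 stress test, with placeholder revenue and inference figures (nothing below is a vendor number; the 50% cost discount on open-weight inference is an assumption):
```python
def inference_pct_of_revenue(revenue, inference_cost, reprice=1.0,
                             open_weight_share=0.0, open_weight_discount=0.5):
    """Inference cost as % of revenue under a provider reprice and partial
    open-weight substitution. All parameters are placeholder assumptions."""
    provider = inference_cost * (1 - open_weight_share) * reprice
    open_weight = inference_cost * open_weight_share * open_weight_discount
    return (provider + open_weight) / revenue

# Illustrative top-decile customer: $10K MRR, $2K/mo inference today.
base = inference_pct_of_revenue(10_000, 2_000)                    # 20%
repriced = inference_pct_of_revenue(10_000, 2_000, reprice=3.0)   # 60%
mitigated = inference_pct_of_revenue(10_000, 2_000, reprice=3.0,
                                     open_weight_share=0.5)       # 35%
```
If the 3× reprice pushes the top-decile customer past roughly half of revenue and substitution cannot pull it back, the margin plan in §7 is fragile.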
### 5.1 ROI decomposition — what actually compounds
The return on an AI CX deployment decomposes into three drivers, each compounding over time. Industry observation: ~41% first-year return, ~87% year two, 124%+ by year three, as knowledge-base optimisation and model updates compound.
```mermaid
graph TD
ROI[AI CX ROI] --> DCS[Direct cost savings]
ROI --> RP[Revenue protection]
ROI --> OS[Operational scale]
DCS --> DCS1["AI cost/resolution: $0.99–$2.00<br/>Human ticket cost: $6–$12"]
DCS --> DCS2["At 67% resolution × 50K/mo<br/>→ $2M+/yr savings vs fully human"]
RP --> RP1["2.4× likelier to remain loyal<br/>when problems resolved quickly"]
RP --> RP2["15% churn decrease<br/>with faster answers"]
OS --> OS1["Volume spikes absorbed<br/>without staffing changes"]
OS --> OS2["10-person team ~$350K →<br/>redirect 70% vol to AI → ~$120K"]
Y1["Year 1: ~41% return"] -.-> ROI
Y2["Year 2: ~87% return"] -.-> ROI
Y3["Year 3+: 124%+ return"] -.-> ROI
```
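The direct-cost-savings leg of the diagram is plain arithmetic; a sketch that reproduces the figures quoted above (category benchmarks, not vendor commitments):
```python
def annual_direct_savings(monthly_volume, resolution_rate,
                          ai_cost_per_resolution, human_cost_per_ticket):
    """Annual saving on the tickets AI resolves instead of humans,
    using the category benchmark figures from the diagram above."""
    resolved_per_year = monthly_volume * 12 * resolution_rate
    return resolved_per_year * (human_cost_per_ticket - ai_cost_per_resolution)

# 50K inbounds/mo at 67% resolution, $0.99/res AI vs $6 human (conservative end):
print(f"${annual_direct_savings(50_000, 0.67, 0.99, 6.00):,.0f}/yr")  # ≈ $2.0M
```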
This matters for the DD because a challenger's pitch that frames ROI as "support automation" only captures one of the three drivers. The strongest vertical AI pitches model all three, and the benchmark the buyer has already internalised is the Fin-archetype's resolution-rate × savings curve.
**Test.** Ask the team to decompose their customer ROI into these three drivers by cohort. If they can only quantify direct cost savings, the retention and scale case is storytelling, not evidence.
---
## 6. Pricing architecture — outcome pricing is now the reference
### 6.1 The five pricing models in market
```mermaid
graph TD
A[AI agent pricing models] --> B[Outcome-based<br/>per resolution]
A --> C[Per conversation]
A --> D[Per session / interaction]
A --> E[Custom enterprise contract]
A --> F[Platform fee + usage]
B --> B1["Fin — $0.99/res<br/>Zendesk AI — $1.50–2.00/res<br/>Gorgias — $0.60–1.27/res"]
C --> C1["Salesforce Agentforce — $2.00/conv<br/>(requires Service Cloud $175+/user/mo)"]
C --> C2["Ada — ~$1.00–3.50/interaction<br/>(custom, ~$30K+/yr min)"]
D --> D1["Freshdesk Freddy — $0.10/session"]
E --> E1["Sierra — custom<br/>Decagon — custom"]
F --> F1["Decagon — $50K/yr platform fee<br/>+ per-conv charges"]
classDef outcome fill:#dcfce7,stroke:#14532d
classDef conv fill:#fef3c7,stroke:#92400e
classDef custom fill:#fee2e2,stroke:#991b1b
class B,B1 outcome
class C,C1,C2,D,D1 conv
class E,E1,F,F1 custom
```
*Data: public vendor pricing pages and Intercom's 2026 comparison article; figures are category benchmarks for DD orientation, not commitments.*
**The [[Fin-archetype]] reference point.** Fin by Intercom has publicly anchored the category at **$0.99 per resolution** and has shipped a **$1M performance guarantee** (refund if average resolution rate falls below a threshold). Every vertical challenger's pricing deck is now read against this anchor.
### 6.2 The cross-over math to hold in your head
At monthly volume `V` inbounds, incumbent resolution rate `R`, vertical's session price `p`:
- Fin-archetype cost ≈ `V × R × $0.99`.
- Usage-priced challenger cost ≈ `V × p`.
- The cross-over happens at `R = p / 0.99`. If a challenger charges $20 per 1,000 inbounds (= $0.02/inbound), the cross-over is ~2% resolution rate — meaning the challenger only looks cheaper while incumbents resolve badly in the challenger's segment. **The moment the incumbent's resolution rate in the challenger's language / channel / vertical crosses that threshold, the challenger looks expensive.**
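The same math as a runnable sketch, for re-testing against any challenger price point (the $0.99 anchor is the Fin-archetype figure from §6.1):
```python
def crossover_rate(challenger_price_per_inbound,
                   incumbent_price_per_resolution=0.99):
    """Incumbent resolution rate R at which V × R × $0.99 equals V × p.
    Above this R, the usage-priced challenger is the more expensive option."""
    return challenger_price_per_inbound / incumbent_price_per_resolution

print(f"{crossover_rate(0.02):.1%}")   # $20 per 1,000 inbounds → ~2.0%
print(f"{crossover_rate(0.50):.1%}")   # $0.50 per inbound → ~50.5%
```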
### 6.3 TCO — what pricing tables miss
Published per-unit rates are roughly half the real cost story. Four factors drive the gap:
```mermaid
graph LR
S[Sticker / per-unit rate] --> T[True TCO]
P[Platform / seat fees<br/>$2K–4.5K/mo at 20 agents<br/>before any resolutions] --> T
E["Billable event definition<br/>per-resolution vs per-conversation<br/>(40% of spend wasted at 60% res rate)"] --> T
I["Implementation & ongoing mgmt<br/>$50K–150K Agentforce<br/>3–7 mo Sierra deploys<br/>vs <1 hr Fin self-serve"] --> T
H[Helpdesk dependency<br/>AI-only vendors need separate helpdesk<br/>$55–175+/agent/mo] --> T
```
What a challenger must answer: the published rate at a realistic resolution rate, plus platform, implementation, and helpdesk-integration costs, per customer tier. Anything short of that is directionally misleading.
**Test.** Re-run the pricing comparison under the incumbent's published global average resolution rate *and* at the rate a top-decile customer sees, *and* across the full TCO stack (platform + implementation + helpdesk). Which customer archetypes still favour the challenger's model at all three?
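A TCO sketch for running that test; every input below is an illustrative placeholder to be swapped for the vendor's actual tier numbers:
```python
def monthly_tco(inbounds, resolution_rate, price_per_resolution,
                platform_fee=0.0, implementation_amortised=0.0,
                helpdesk_fee_per_agent=0.0, agents=0):
    """Full-stack monthly TCO: usage spend plus the three factors pricing
    tables miss (platform fees, implementation, helpdesk dependency)."""
    usage = inbounds * resolution_rate * price_per_resolution
    return (usage + platform_fee + implementation_amortised
            + helpdesk_fee_per_agent * agents)

# Illustrative: 50K inbounds/mo, 20 agents on a $99/agent helpdesk,
# $60K implementation amortised over 24 months.
for rate in (0.67, 0.84):   # published global average vs top-decile rate
    cost = monthly_tco(50_000, rate, 0.99,
                       implementation_amortised=60_000 / 24,
                       helpdesk_fee_per_agent=99, agents=20)
    print(f"{rate:.0%}: ${cost:,.0f}/mo")
```
Re-run the same function with per-conversation billing (price × all inbounds, regardless of resolution) to surface the billable-event gap the diagram flags.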
---
## 7. Revenue composition and margin structure
Three separate gross-margin realities typically coexist in one P&L:
- **Subscription** (80%+ when real).
- **AI sessions / per-resolution** (50–65% is the current inference-era band; improves with scale, open-weight substitution, and hosting optimisation).
- **Messaging pass-through** (WhatsApp / SMS / carrier) — structurally 10–20% and exposed to the platform vendor's wholesale price moves.
A company whose biggest revenue line is messaging pass-through is not a software company by margin, even if it is by surface. This is a [[Consultancy-to-Platform Transition]] cousin: substitute "messaging pass-through" for "services" and most of the same logic applies.
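To see how far a blended headline can drift from software margins, a toy decomposition (mix weights are placeholders; the margin bands are the ones above):
```python
# (revenue share, gross margin) per stream: placeholder mix, bands from above.
streams = {
    "subscription":          (0.30, 0.80),
    "ai_sessions":           (0.30, 0.60),
    "messaging_passthrough": (0.40, 0.15),
}
blended = sum(share * margin for share, margin in streams.values())
print(f"blended GM ≈ {blended:.0%}")   # 48%: telecom-shaped, whatever the deck says
```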
**Test.** What is the three-year plan to shift the revenue mix off the lowest-margin line, and what does the plan assume about that line's price sensitivity?
---
## 8. Benchmarking and evaluation — the new proof standard
Post-Fin, the table stakes for credibility are:
- **Resolution rate** (published, customer-averaged, segment-split).
- **Hallucination rate** (published, measured across a disclosed conversation volume).
- **Uptime** (published, with disclosed monitoring).
- **Customer count and conversation volume** (published).
- **An evaluation dashboard in production**, customer-accessible, showing AI-resolved vs. escalated vs. stuck, topic mix, and sentiment.
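What "customer-accessible and segment-split" means concretely: any published resolution-rate benchmark should be reproducible from something as small as this sketch (field names are illustrative):
```python
from collections import defaultdict

def resolution_by_segment(records):
    """records: dicts like {"segment": "es-MX/whatsapp", "resolved": True}.
    A publishable benchmark is this, run over a disclosed conversation
    volume and split by language / channel / vertical."""
    totals, resolved = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r["segment"]] += 1
        resolved[r["segment"]] += bool(r["resolved"])
    return {seg: resolved[seg] / totals[seg] for seg in totals}
```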
### 8.1 Resolution-rate reference band
Resolution rate is the single most important variable in the ROI calculation. Published market benchmarks give the reference band against which any challenger claim must be read:
| Segment | Resolution rate |
|---|---|
| Industry average at initial deployment | 40–60% |
| Industry average after 6–12 months of optimisation | 60%+ |
| Fin-archetype global average across 7,000+ customers | 67% (improving ~1%/month) |
| Fin-archetype top-decile performers | 80–84% (up to 93% in individual implementations) |
| E-commerce customers on Fin-archetype | regularly 70–84% |
Two implications for DD:
- A challenger claiming "90%+ accuracy" without disclosing what counts as a resolution is claiming a different metric, not a better one. Accuracy on a selected use-case slice ≠ customer-averaged resolution rate on the full conversation mix.
- The incumbent's curve is improving ~1%/month. A challenger whose roadmap does not close the gap faster than that is losing ground in real time, even if today's snapshot looks favourable.
### 8.2 Why publishing is a product act
The [[Fin-archetype]] published `fin-cx-retrieval` and `fin-cx-reranker` model papers. A challenger who claims superior vertical accuracy but has not published a benchmark is trading on assertion. This is the single area where publishing is a product act, not a marketing one — because it moves the artefact from CLAIMED to SUPPORTED in the investor's and the enterprise buyer's head simultaneously.
**Test.** Is the accuracy claim testable on a shared evaluation set today? If not, what is the shortest path to publishing one?
**Test.** At the incumbent's 1%/month improvement cadence, at what date does their resolution rate in our primary language / channel cross the threshold where our pricing stops looking cheaper?
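A back-of-envelope sketch for the second test, assuming the ~1 pt/month cadence from §8.1 holds linearly in the target segment (it may not; that is why the answer should be re-checked quarterly):
```python
from math import ceil

def months_to_crossover(incumbent_rate_today, crossover_threshold,
                        monthly_gain=0.01):
    """Months until the incumbent's segment resolution rate crosses the
    §6.2 pricing threshold, at an assumed linear improvement cadence."""
    if incumbent_rate_today >= crossover_threshold:
        return 0
    gap = crossover_threshold - incumbent_rate_today
    return ceil(round(gap / monthly_gain, 9))   # round() guards float noise

# E.g. incumbent at 55% in our primary language today, threshold at 70%:
print(months_to_crossover(0.55, 0.70))   # 15 months of runway
```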
Related: [[Evidence Hierarchy]], [[What Must Be True]].
---
## 9. Technical architecture — what to probe
A modern vertical AI agent system has, at minimum:
1. **Channel normalisation** across whichever surfaces the vertical uses.
2. **Identity resolution** across those channels to a unified profile.
3. **Context enrichment** from domain-specific systems of record (the actual vertical moat lives here).
4. **Intent classification and dynamic tool/prompt selection** per turn (narrows hallucination scope).
5. **A multi-step workflow library** — the [[Fin-archetype]] calls these **Procedures**; commerce / field / legal variants call them playbooks. Whatever the name, *modular, use-case-specific workflows that modify the pipeline* are the stickiest product artefact.
6. **Tool-call guardrails** validating LLM outputs against allowed API responses (prevents over-threshold refunds, invalid discount codes, fabricated actions).
7. **Output guardrails** — source-grounding and brand/style check before send.
8. **Multi-provider model routing** (OpenAI / Anthropic / Google, increasingly with fine-tuned open-weight for specific tasks).
9. **Vector store / retrieval layer** with domain-specific signal enrichment. A "graph of memory" framing is usually a vector store with enriched per-entity context — that is a legitimate investment, but the naming should not overclaim a graph architecture.
10. **A continuous evaluation pipeline** wired to real customer data and a production dashboard.
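As a concrete instance of layer 6, a tool-call guardrail for a refund action might look like the sketch below (illustrative names, not any vendor's implementation):
```python
def guarded_refund(llm_proposed_amount: float, order: dict,
                   max_ratio: float = 1.0) -> float:
    """Validate an LLM-proposed refund against the system of record
    before execution, instead of trusting generated output."""
    allowed = order["paid_total"] * max_ratio
    if llm_proposed_amount > allowed:
        # Over-threshold: block the tool call and escalate to a human.
        raise PermissionError(
            f"refund {llm_proposed_amount:.2f} exceeds allowed {allowed:.2f}")
    return llm_proposed_amount   # safe to pass to the refund API
```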
The **custom retrieval + reranker model pattern** (the Fin-archetype's `fin-cx-retrieval` / `fin-cx-reranker`) is the most durable piece of model IP in this category. A challenger without at least a fine-tuning roadmap at this layer is betting that the commodity retrieval layer stays sufficient. It won't.
**Test.** Which of these ten layers does the company actually own vs. assemble, and where is their published evaluation harness?
### 9.1 High-impact vs low-impact — the capability gradient
Five dimensions separate the vertical AI tools that move a customer's P&L from the ones that demo well and plateau:
| Capability | Low-impact tool | High-impact tool |
|---|---|---|
| **Scope** | Single-task automation | End-to-end workflows |
| **Data access** | Static content only (help centre, FAQ) | Real-time system data (orders, inventory, customer state) |
| **Actionability** | Generates responses only | Takes actions (refund, exchange, reorder, cancel) |
| **Control** | Black-box behaviour | Configurable, testable, procedure-level control |
| **ROI shape** | Incremental | Structural — changes the org chart, not the headcount |
The Fin-archetype sits in the right column on all five rows. A challenger that sits in the right column on only *one or two* is not yet a vertical agent; it is a vertical chatbot. The difference is whether the product moves a P&L line or just a CSAT line.
**Test.** For each of the five rows, produce one artefact (screenshot, customer example, configuration doc) showing the company sits on the right side of the gradient. Gaps indicate the wedge that a better-funded competitor can exploit.
Related: [[AI Agents Stack]], [[Agent Skills as Codified Domain Expertise]], [[Foundational Models MOC]].
---
## 10. Certifications, security, and AI governance — moat, not paperwork
The certification stack is now a five-dimension competitive surface. Each row has a best-in-class standard that the [[Fin-archetype]] has already cleared; any challenger posture should be benchmarked against it:
| Dimension | What to evaluate | Best-in-class standard |
|---|---|---|
| **Foundational certifications** | SOC 2 Type II, ISO 27001, ISO 27701, HIPAA | All four, with HIPAA available without enterprise-only pricing |
| **AI governance** | ISO 42001 (AI management system), AIUC-1 (adversarial red-team testing) | Both certifications, with quarterly re-evaluation under AIUC-1 |
| **Hallucination control** | RAG architecture, proprietary retrieval / reranker models, validation / grounding layer | Purpose-built retrieval + reranker models, sub-0.5% hallucination rate, source attribution on every answer |
| **Data handling** | Encryption, LLM data retention posture, data residency, PII controls | AES-256 at rest, TLS 1.2+ in transit, zero retention at third-party LLM, regional hosting option |
| **Operational transparency** | Simulations, deterministic workflows, audit trails, quality scoring | Pre-deployment testing, procedure-level control, 100% conversation logging, AI-powered QA |
```mermaid
graph TD
S[Security & AI Governance Stack] --> F[Foundational certifications]
S --> G[AI governance]
S --> H[Hallucination control]
S --> D[Data handling]
S --> O[Operational transparency]
F --> F1[SOC 2 II · ISO 27001 · ISO 27701 · HIPAA]
G --> G1[ISO 42001 · AIUC-1 quarterly]
H --> H1[Custom retrieval + reranker · <0.5% hallucination · source attribution]
D --> D1[AES-256 · TLS 1.2+ · zero-retention LLM · regional hosting]
O --> O1[Sims · deterministic flows · audit trails · AI QA]
classDef base fill:#e0e7ff,stroke:#3730a3
classDef ai fill:#fef3c7,stroke:#92400e
classDef ops fill:#dcfce7,stroke:#14532d
class F,F1 base
class G,G1,H,H1 ai
class D,D1,O,O1 ops
```
**Regional data-protection regimes.** GDPR globally, plus the growing set of regional data-protection laws and sectoral frameworks in any target enterprise geography. Cross-border data-processing posture (engineering team in one jurisdiction, customer data in another) is a live enterprise-procurement blocker.
A [[Fin-archetype]] with ISO 42001 + AIUC-1 will win the AI-governance-literate enterprise buyer over a technically superior product that lacks them. A [[Haptik-archetype]] enterprise-backed competitor typically arrives with a full SOC 2 / ISO 27001 / GDPR stack on day one because of its parent. A vertical challenger's certification posture must at minimum match the regulated-geo bar.
**Test.** Walk the five-row table above. For each row, which standard is held, which is audit-ready, which is planned, and which is absent? Which enterprise deals have been lost because of the gaps, and which would flip after the top two are closed?
---
## 11. Key IP and defensibility
Most moats in this category are one of six (see also [[7 Powers]], [[AI era Defensibility]], [[Defensibility Principles MOC]]):
| Moat | What it actually looks like | Typical test |
|---|---|---|
| **Proprietary context data** | Vertical-specific signals (purchase intent, behavioural, dialect, source, return policy, product graph) enriching retrieval at inference time. | Is the data portable to the customer (weak) or company-owned (strong)? See [[Data Moat]]. |
| **Evaluation corpus** | Labelled, domain-specific, customer-matched eval set that gates model updates. Under-rated; often the most durable single asset. | Who owns it and does it grow with usage? |
| **Custom retrieval / reranker models** | Fine-tuned on proprietary data; gap to frontier general-purpose is structural. The Fin-archetype's `fin-cx-*` pattern. | Published paper or internal benchmarks vs. frontier? |
| **Workflow / playbook / procedure library** | Modular, marketplace-able, partner-buildable use-case modules that modify the pipeline per customer or per task. | Marketplace GTM maturity; % of customers using ≥3 modules. |
| **Integration depth** | Source-of-truth-grade integrations with the vertical's 2–3 dominant systems of record. | Depth × breadth; are integrations bidirectional and source-of-truth? See [[Switching Cost Design]]. |
| **Ecosystem programme status** | Platform-vendor designations (global channel partners, global storefront partners, regional storefront partners). Real but not durable alone. | Tier, duration, commercial terms. |
Patents rarely drive defensibility in vertical AI SaaS. Trade secrets (training data curation, eval harness, deployment playbooks) and [[Switching Cost Design]] do. See [[A short note on IP]], [[IP Strategy for Deep Tech Startups]].
**Test.** If a well-funded horizontal competitor decided to win this vertical in 18 months, which of these six does the company still own on day 540?
---
## 12. Competitive landscape — four axes and two named archetypes
Every vertical AI challenger faces attack from four directions. Weakness on one is survivable; on three is terminal.
### 12.1 The [[Fin-archetype]] — horizontal incumbent with published benchmarks
Properties to internalise because any vertical challenger will be compared to it:
- **Outcome pricing with performance guarantee** ($0.99/resolution + $1M refund guarantee).
- **Published metrics at scale** — 67% avg resolution, 80–84% top-decile, ~0.01% hallucination across 1M+ conversations/week, 99.97% uptime, 7,000+ customers.
- **Native multilingual LLM generation across 45+ languages** (no translation layers).
- **Voice channel** as an AI-delivered capability, not an integration story.
- **Published model papers** (custom retrieval + reranker).
- **Full AI-governance certification stack** (SOC 2 II, ISO 27001/27701/27018/42001, HIPAA, AIUC-1).
- **Self-managed by CX teams** — no engineering required to configure procedures or update behaviour.
- **Procedures** — multi-step workflows with backend integrations (Shopify / Stripe / Salesforce / Linear) and same-day updates by non-engineers.
**Attack surface on a vertical challenger.** The moment Fin ships native quality in the challenger's primary language / channel / region at the Fin-archetype resolution rate, the challenger's unpublished accuracy claim and usage-based pricing both become liabilities simultaneously.
**Defence.** Vertical data depth (context signals the horizontal doesn't have), faster CX-team iteration inside the vertical's workflows, ecosystem integration depth into vertical-specific systems of record, regional regulatory posture, and — critically — publishing benchmarks before the horizontal's language expansion lands.
### 12.2 The [[Haptik-archetype]] — enterprise-backed, multi-domain, platform-channel-led
Properties to internalise:
- **Strategic parent distribution** (Haptik → Reliance Jio, 2019). Translates into: balance-sheet endurance, channel-vendor co-sell, regional regulatory posture pre-baked.
- **Multi-domain** (telecom, banking, retail, government) — breadth at the cost of vertical depth. Does not publish model architecture.
- **Enterprise custom contracts**, not public price lists. Wins on procurement comfort, not price sheet.
- **Channel-native** — deep WhatsApp Business Platform expertise, messaging-channel infrastructure.
- **Enterprise certification pre-stacked** — SOC 2, ISO 27001, GDPR from day one via parent.
- **Not SMB-focused.** Little exposure in the SMB / mid-market tier; a gap a vertical challenger can own.
**Attack surface on a vertical challenger.** Once a parent-backed multi-domain player decides the challenger's geography or vertical is material, distribution + certification + channel infrastructure arrive simultaneously at a procurement depth the challenger can't match.
**Defence.** Own the tier the incumbent doesn't serve (SMB / mid-market), own the vertical the incumbent doesn't specialise (e-commerce vs. telecom vs. banking), and move up-tier inside the vertical before the incumbent moves down-tier into the segment.
### 12.3 Horizontal AI incursion risk and open-source substitution
Beyond the two named archetypes:
- **Frontier model providers** and horizontal agent platforms ship "good enough" vertical variants at zero incremental cost. Defence is vertical eval data, workflow depth, and regulatory fit they can't replicate cheaply. See [[Incumbent Bundling Risk]].
- **Open-source** (open-weight models, open agent frameworks) compresses the commodity layer. If the company's IP is in the commodity layer, this is a problem; if above, this is a tailwind.
### 12.4 Local champion vs. global challenger
For regional plays (see [[Emerging Market SaaS DD]]): pick one explicitly. Ambiguity on this fork is a flag. Local champion defends via language, payment rails, regulation, and ecosystem density ([[Interoperability of Messaging]]). Global challenger uses the region as beachhead.
**Test.** Write down the three tactical moves the company would make in the 90 days after the incumbent announces native-language support in the challenger's strongest market.
---
## 13. Go-to-market
Three patterns and their trade-offs:
- **Direct SMB / mid-market** — fast feedback, slow scale.
- **Platform-native** (built on top of a storefront / ticketing platform) — fast scale, exposes the company to [[Incumbent Bundling Risk]] on the host.
- **Channel / partner** (agencies, integrators, resellers) — scale without hiring; margin leak and brand dilution risk.
Two specific levers to probe:
- **Self-managed configuration by the customer's CX team.** The Fin-archetype's explicit product thesis: *no engineering required to configure, update, or iterate*. If the challenger's product still needs a solutions engineer for material changes, deployment hours per customer will not fall, and §5.1 is broken.
- **Imagination-gap closing artefacts** (case studies, ROI calculators, before/after metrics). Vertical AI companies plateau at a "chat toy" ceiling when the gap isn't closed. See [[AI Eats Services Not Software]].
**Test.** Can the CX team of an existing customer ship a new workflow without talking to the vendor? What percentage of month-on-month change is customer-authored vs. vendor-authored?
Related: [[AI-first GTM]].
---
## 14. Team assessment
Minimum viable founding coverage:
- **Technical / AI lead** with production ML experience (not notebook demos). Absence of a Head of AI / ML while claiming "vertical AI" is inconsistent — an AI-product lead who integrates models is not an AI research lead who trains and evaluates them.
- **Head of Engineering** — if the CEO covers engineering leadership, this is a single-point-of-failure flag.
- **Domain operator or commercial lead** who has sold into this buyer before.
- **Enterprise security / infosec lead** — can be fractional at SMB scale; becomes non-negotiable the moment SOC 2 / ISO 42001 audits go live.
See [[Key-Person Risk in Deep Tech]].
**Test.** Which named person owns each of model quality, eval harness, security, and pricing? If the answer for any is "the CEO", flag it.
---
## 15. Risk register — what to score
Critical (deal-breaking if unresolved):
- Unsupported core AI claim (published benchmarks missing, eval harness in slideware).
- Horizontal incumbent with native coverage in the challenger's primary language / channel at >60% resolution.
- Platform-dependency concentration (one channel or one marketplace >40% revenue).
- Anchor-customer concentration at seed / A (>40% from one account).
- Cross-border data-processing posture that blocks the target enterprise geography.
High (valuation-material):
- Services / messaging pass-through revenue dressed as SaaS in headline margins.
- Expansion motion unproven at cohort level.
- Outcome-pricing transition scheduled but not shipped (re-pricing risk to top-decile customers).
- Head of Engineering / Head of AI roles open for more than 9 months.
See [[Technical DD Framework]], [[Incumbent Bundling Risk]].
---
## 16. What must be true
Seven conditions for the investment to make sense at venture scale:
1. The outcome is measurable on the buyer's P&L and attributable.
2. The economic buyer feels the pain acutely enough to bypass procurement delay.
3. The data structure compounds — each deployment improves the next.
4. The vertical data + workflow depth is genuinely out of a horizontal incumbent's economic reach on an 18-month horizon.
5. The certification stack reaches parity with the [[Fin-archetype]] before the AI-governance-literate enterprise buyer arrives.
6. Pricing can migrate toward outcomes before the incumbent's resolution-rate cross-over arrives in the target segment.
7. Unit economics scale without converting into a messaging-pass-through or a services business.
Each becomes a de-risking milestone; capital stages against them. See [[What Must Be True]], [[Investing/Investment Thesis Structure]].
---
## 17. Dispositive questions — shortlist
One question per axis, designed to resolve the biggest unknown.
- **Outcome.** Name one customer whose P&L moved by >10% after deployment; show the before/after and the attribution method.
- **Benchmarking.** What is the shortest path to publishing a shared-test-set comparison against the [[Fin-archetype]] in the challenger's strongest language / vertical?
- **Pricing transition.** On the day the incumbent's resolution rate in our primary language crosses 70%, what is our per-unit economics and how does the outcome-tier launch schedule change?
- **Certification posture.** Which of SOC 2 II, ISO 27001, ISO 42001, AIUC-1 are held / audit-ready / ungated on cash, and which enterprise deals have we lost because of the gap?
- **Data posture.** Where is customer data processed, under which jurisdictions' laws, and what is our answer when an enterprise procurement team in the target regulated geography asks?
- **Engineering leadership.** Who owns model quality, and what does the quarterly model-release changelog look like?
- **Platform dependency.** If the dominant messaging / commerce platform re-prices, changes API access, or launches its own competing capability in 12 months, what is our 90-day response?
- **Head-of-customer.** Can an existing customer's CX lead ship a new workflow today without vendor help? What share of changes are customer-authored?
---
## 18. Deal structure fit
Match structure to evidence level. See [[Venture Building Manifesto]], [[Investment Thesis Plays]]:
- **Supported outcome + supported economics + supported moat** → equity at market.
- **Supported outcome + immature economics** → equity with milestone tranching or convertible.
- **Supported vision + immature outcome** → convertible with warrants, or co-build through a studio.
- **Supported founder + immature product** → studio co-build under [[Venture Building Operation Principles]].
The cleanest no is a fast no. Pass with specificity: the one or two artefacts you would need to re-engage.
---
## Atomic notes to promote as this matures
- [[Fin-archetype]] (to be written — horizontal incumbent with published benchmarks + outcome pricing + ISO 42001)
- [[Haptik-archetype]] (to be written — enterprise-backed, multi-domain, channel-led)
- [[#6. Pricing architecture — outcome pricing is now the reference|Resolution-rate cross-over math]]
- [[#8. Benchmarking and evaluation — the new proof standard|Benchmark publishing as a product act]]
- [[#9. Technical architecture — what to probe|Custom retrieval + reranker as category model IP]]
- [[#10. Certifications, security, and AI governance — moat, not paperwork|ISO 42001 and AIUC-1 as AI-era certifications]]
---
## Related notes in vault
**DD discipline.** [[Technical DD Framework]] · [[Technical DD - Claude Project Instructions]] · [[Evidence Hierarchy]] · [[What Must Be True]] · [[ESG Due Diligence Frameworks]] · [[Four Ds of Investing]]
**Defensibility & moats.** [[Defensibility Principles MOC]] · [[Technical Moat Assessment Framework]] · [[AI era Defensibility]] · [[7 Powers]] · [[Incumbent Bundling Risk]] · [[Switching Cost Design]] · [[Data Moat]] · [[Data Flywheel]] · [[Azraq Data Flywheel]]
**AI-native strategy.** [[The Age of Vertical Models]] · [[AI Agents Stack]] · [[AI Agents Landscape - Snapshot of progress and potential]] · [[AI Eats Services Not Software]] · [[AI Disruption Risk Is Not Uniform - Thoma Bravo]] · [[Agent Skills as Codified Domain Expertise]] · [[Foundational Models MOC]]
**Business model & unit economics.** [[Consultancy-to-Platform Transition]] · [[3 Hard Truths of Deep Tech Commercialization]] · [[Industrial AI Unit Economics]] · [[Deployment Velocity]] · [[Wrights Law]] · [[Land-and-Expand in Enterprise AI]] · [[Outcome-Based Pricing]] · [[Derivative Platforms]]
**Go-to-market.** [[AI-first GTM]] · [[Interoperability of Messaging]] · [[Conversational Commerce]] · [[Emerging Market SaaS DD]]
**First principles.** [[First Principles and Mental Models MoC]] · [[The directional arrows of inevitable progress]] · [[Bottleneck Business]] · [[10x different over 10x better]] · [[domain specific sense-making]] · [[A short note on IP]] · [[IP Strategy for Deep Tech Startups]]
**Team & venture frame.** [[Key-Person Risk in Deep Tech]] · [[Venture Building Manifesto]] · [[Venture Building Operation Principles]] · [[Investment Thesis Plays]] · [[Investing/Investment Thesis Structure]] · [[Investing/Investing Principles]] · [[Investing/Investing System MoC]]
---
Tags: #investing #dd #verticalAI #AIstrategy #businessmodel #firstprinciple #systems #kp #wip