# Data Flywheel

The most claimed and least verified moat in AI. A real flywheel means **usage produces data, data improves the model, the better model attracts more usage, and the loop runs without human intervention**. Most "data flywheels" are marketing.

## The four tests

1. **Capture**: does every real customer interaction actually produce labelled, usable training data? Or does it get thrown away after the conversation?
2. **Pipeline**: is the captured data cleaned, versioned, and fed back into model updates on a measurable cadence (weekly, monthly)?
3. **Lift**: does model performance actually improve over time, and is there a dashboard showing the improvement per domain / per customer?
4. **Ownership**: does the customer own the data (portable, which weakens the flywheel) or does the company own it (proprietary, which strengthens the flywheel)? The contract terms matter.

If the answer to any of the four is "in progress" or "we have plans," the flywheel is hypothetical.

## The proprietary evaluation set corollary

An underrated and more defensible cousin: the **proprietary evaluation set**. Whoever first assembles a domain-specific, labelled evaluation corpus owns the correctness benchmark in that domain. Model updates must pass it to ship. Competitors have to rebuild it from zero.

## Related patterns

- [[Azraq Data Flywheel]]: infrastructure-risk vertical application.
- [[The Age of Vertical Models]]: why domain-specific training beats frontier on specific tasks.
- [[AI era Defensibility]]: where flywheels sit in a layered defence.

Related: [[Data Moat]], [[AI Agent Vertical SaaS DD MOC]]

---

Tags: #AIstrategy #defensibility #systems #investing