An ASIC (Application-Specific Integrated Circuit) is a chip designed for one specific task rather than general-purpose computing.
Unlike CPUs (general-purpose) or GPUs (graphics-focused but flexible), ASICs are hardwired for a single application. This makes them much more efficient for that specific task but useless for anything else.
In AI, ASICs target specific workloads like training (Google's TPU) or [[Inference]] (Groq, Cerebras). For their target workload they can be 10-100x more efficient than GPUs in performance per watt and per dollar, because they strip out the generality a GPU has to carry.
The economics are brutal. Developing an ASIC costs $100M-$500M+ and takes years. You're betting that workload patterns stay stable and your volume justifies the fixed cost. This is why most AI ASICs fail. Google's TPU survives because of Google's scale. AWS's Trainium and Inferentia survive because of AWS's scale.
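To make the bet concrete: the development cost (NRE) only pays back if the per-chip savings over GPUs, multiplied by deployed volume, clears the fixed cost. A minimal back-of-envelope sketch, with every number an illustrative assumption rather than a real vendor figure:

```python
# Break-even volume for a custom AI ASIC vs. just buying GPUs.
# Every number below is an illustrative assumption, not a real figure.

def breakeven_volume(nre_cost, gpu_unit_cost, asic_unit_cost, perf_ratio):
    """Chips you must deploy before the ASIC program beats buying GPUs.

    perf_ratio: how many GPU-equivalents one ASIC replaces on its target workload.
    """
    savings_per_asic = gpu_unit_cost * perf_ratio - asic_unit_cost
    if savings_per_asic <= 0:
        return float("inf")  # the ASIC never pays back its development cost
    return nre_cost / savings_per_asic

# Assumed: $300M NRE, $25k per GPU, $8k per ASIC,
# one ASIC doing the work of two GPUs on the target workload.
print(f"Break-even at ~{breakeven_volume(300e6, 25e3, 8e3, 2.0):,.0f} chips")
```

Even under generous assumptions the break-even sits in the thousands of chips, and this ignores the compiler, software, and team costs that run for the life of the program, which is why only captive, predictable demand makes the bet sane.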
The [[Nvidia-Groq - Inference Disaggregation Play]] shows where ASIC economics work: Groq's [[SRAM]]-based architecture for low-latency decode runs on fundamentally different economics than a GPU. Nvidia acquires it to complete their stack, not as their primary platform.
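The SRAM argument is, at bottom, a memory-bandwidth argument: at small batch sizes, autoregressive decode is bound by how fast the weights can be streamed, so per-user tokens/sec is roughly memory bandwidth divided by bytes read per token. A rough sketch, assuming approximate public bandwidth figures and an arbitrary 70B model, and ignoring that fitting a model in SRAM means partitioning it across many chips:

```python
# Roofline-style ceiling for autoregressive decode at batch size 1:
# tokens/sec is roughly effective memory bandwidth / bytes read per token.
# Bandwidth numbers are approximate public figures; the model is an arbitrary example.

def decode_ceiling_tokens_per_s(params_billions, bytes_per_param, bandwidth_tb_s):
    bytes_per_token = params_billions * 1e9 * bytes_per_param  # stream all weights once per token
    return (bandwidth_tb_s * 1e12) / bytes_per_token

MODEL_B = 70      # 70B-parameter model (example)
BYTES = 1         # 8-bit weights

gpu_hbm = decode_ceiling_tokens_per_s(MODEL_B, BYTES, 3.35)   # ~HBM3 bandwidth on a current GPU
sram_asic = decode_ceiling_tokens_per_s(MODEL_B, BYTES, 80.0) # ~on-chip SRAM bandwidth Groq cites

print(f"HBM GPU decode ceiling:   ~{gpu_hbm:.0f} tokens/s per user")
print(f"SRAM ASIC decode ceiling: ~{sram_asic:.0f} tokens/s per user")
```

A GPU can claw some of this back by batching many users together, but for single-stream latency the bandwidth gap is the whole story, which is why the two architectures end up with such different economics.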
Meta's and Microsoft's ASIC efforts, and most others, will likely get canceled. They can't match Nvidia's integration, networking, and ecosystem advantages. Only hyperscalers with captive demand at massive scale can justify custom silicon.
---
#deeptech #firstprinciple
Related: [[TPU]] | [[SRAM]] | [[Inference]] | [[Nvidia-Groq - Inference Disaggregation Play]]