**Tempo** is Grafana Labs' open source distributed tracing backend, designed for cost-efficient, massively scalable trace storage using object storage as the only backend. It is the traces component of Grafana's LGTM stack (Loki, Grafana, Tempo, Mimir).

---

### First Principle: Traces should be retained as long as metrics, at object storage prices rather than SSD prices.

[[Jaeger]] and Zipkin typically use Cassandra or Elasticsearch for trace storage, which is expensive at scale. Tempo stores traces directly in object storage ([[MinIO]], [[Ceph]], S3, GCS), dramatically reducing the cost per trace. The tradeoff is that Tempo is optimised for trace retrieval by ID, not full-text search across traces.

---

### Key Considerations

- **Object Storage Only**: Tempo requires no database; traces go directly to object storage in Parquet format. This makes it cheap to retain months of traces.
- **TraceQL**: Tempo's native query language allows searching for traces by span attributes, duration, error status, and structural relationships between spans.
- **[[OpenTelemetry]] Native**: Tempo accepts traces via OTLP (the [[OpenTelemetry]] Protocol). It also accepts Jaeger and Zipkin formats for migration.
- **Service Graph**: Tempo generates service graph metrics (request rates, error rates, and latencies between services) and exports them as [[Prometheus]] metrics.
- **[[Grafana]] Integration**: Trace IDs in [[Loki]] log lines can link directly to the full trace in Tempo (via derived fields on the Loki data source), and from a trace you can jump to the [[Prometheus]] metrics for that service.
- **vs [[Jaeger]]**: Tempo is cheaper to operate at scale and better integrated with Grafana. Jaeger has a richer standalone UI and more mature adaptive sampling.

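
The two access patterns described above, lookup by trace ID and TraceQL search, can be sketched as plain HTTP requests against Tempo's query API. This is a minimal illustration and not a client library: the base address `http://localhost:3200`, the helper names, and the `checkout` service name are all assumptions chosen for the example, so adjust them to your deployment. The code only builds the URLs and makes no network calls.

```python
from urllib.parse import quote, urlencode

# Assumption: Tempo's HTTP API is reachable at this address.
TEMPO_URL = "http://localhost:3200"


def trace_by_id_url(trace_id: str, base_url: str = TEMPO_URL) -> str:
    """Build the URL that fetches a single trace by its ID."""
    return f"{base_url}/api/traces/{quote(trace_id, safe='')}"


def traceql_search_url(query: str, limit: int = 20, base_url: str = TEMPO_URL) -> str:
    """Build the URL for a TraceQL search; the query travels in the `q` parameter."""
    params = urlencode({"q": query, "limit": limit}, quote_via=quote)
    return f"{base_url}/api/search?{params}"


# Hypothetical query: error spans from a service called "checkout"
# that took longer than 500 ms.
SLOW_CHECKOUT_ERRORS = (
    '{ resource.service.name = "checkout" && status = error && duration > 500ms }'
)

if __name__ == "__main__":
    # The trace ID below is a made-up placeholder.
    print(trace_by_id_url("2f3e0cee77ae5dc9c17ade3689eb2e54"))
    print(traceql_search_url(SLOW_CHECKOUT_ERRORS))
```

Either URL can be passed to any HTTP client (for example `curl`). In practice most users run the same TraceQL query from Grafana's Explore view instead of calling the API directly.
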

---

### How It Fits

```
Services → [[OpenTelemetry]] Collector → Tempo → Object storage ([[MinIO]] / [[Ceph]])
                                               → [[Grafana]] (TraceQL queries, trace waterfall)
                                               → [[Loki]] ↔ Tempo (log-to-trace correlation)
```

[[Jaeger]] | [[OpenTelemetry]] | [[Grafana]] | [[Loki]] | [[MinIO]] | [[Open Source Hyperscaler MoC]]