Distributed Tracing

Implement distributed tracing with Jaeger and Tempo to track requests across microservices and identify performance bottlenecks. Use when debugging microservices, analyzing request flows, or implementing observability for distributed systems.

Published by @Seth Hobson

Implement distributed tracing with Jaeger and Tempo for request flow visibility across microservices.

Purpose

Track requests across distributed systems to understand latency, dependencies, and failure points.

When to Use

  • Debug latency issues
  • Understand service dependencies
  • Identify bottlenecks
  • Trace error propagation
  • Analyze request paths

Detailed patterns and worked examples

Detailed pattern documentation lives in references/details.md. Read that file when the overview in this document is not enough for the task at hand.

Best Practices

  1. Sample appropriately (typically 1-10% of traces in production)
  2. Add meaningful span attributes (for example user_id, request_id)
  3. Propagate trace context across every service boundary
  4. Record exceptions on the span where they occur
  5. Use consistent naming for operations
  6. Monitor tracing overhead (aim for under 1% CPU impact)
  7. Set up alerts for trace errors
  8. Use baggage to carry shared context across services
  9. Use span events for important milestones
  10. Document instrumentation standards
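
Practices 1 and 3 meet in the `traceparent` header defined by W3C Trace Context, the default propagation format in OpenTelemetry: the sampling decision made at the edge travels downstream as a flag next to the trace ID. The sketch below is a minimal, standard-library-only illustration of that idea. The function names (`should_sample`, `build_traceparent`, `parse_traceparent`) are hypothetical, and the ratio check only approximates what SDK samplers do; in a real service the OpenTelemetry propagators and samplers would normally handle this for you.

```python
import re

# W3C Trace Context layout: version "00", 32-hex trace ID,
# 16-hex parent span ID, 2-hex flags (bit 0 = sampled).
_TRACEPARENT_RE = re.compile(r"00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


def should_sample(trace_id: int, ratio: float) -> bool:
    """Deterministic head sampling keyed on the trace ID.

    Every service that sees the same trace ID reaches the same decision,
    so a trace is kept whole or dropped whole.
    """
    low_64_bits = trace_id & ((1 << 64) - 1)
    return low_64_bits < int(ratio * (1 << 64))


def build_traceparent(trace_id: int, span_id: int, sampled: bool) -> str:
    """Format the header value to attach to an outgoing request."""
    flags = 0x01 if sampled else 0x00
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(header: str):
    """Return (trace_id, span_id, sampled), or None if the header is invalid."""
    match = _TRACEPARENT_RE.fullmatch(header.strip().lower())
    if match is None:
        return None
    trace_hex, span_hex, flags_hex = match.groups()
    trace_id, span_id = int(trace_hex, 16), int(span_hex, 16)
    if trace_id == 0 or span_id == 0:  # all-zero IDs are invalid
        return None
    return trace_id, span_id, bool(int(flags_hex, 16) & 0x01)
```

A caller would compute `should_sample(trace_id, 0.05)` once at the entry point, then pass the result through `build_traceparent` so that downstream services honor the same decision instead of sampling again.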

Integration with Logging

Correlated Logs

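import logging
from opentelemetry import trace

logger = logging.getLogger(__name__)

def process_request():
    # Look up the span that is active in the current context.
    span = trace.get_current_span()
    trace_id = span.get_span_context().trace_id

    # Render the 128-bit trace ID as 32 lowercase hex characters, the same
    # form used in the W3C traceparent header. The field only shows up in
    # the output if the log formatter references %(trace_id)s.
    logger.info(
        "Processing request",
        extra={"trace_id": format(trace_id, '032x')}
    )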
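
Passing `extra=` on every call is easy to forget. One alternative, sketched here under assumptions, is a `logging.Filter` that stamps the IDs on every record so the formatter can always reference them. `TraceContextFilter`, `configure_logging`, and the logger name `traced_app` are hypothetical names for illustration. The import is guarded only so the sketch also runs where `opentelemetry-api` is not installed, in which case the IDs fall back to zeros.

```python
import logging

try:  # guarded so the sketch still runs without opentelemetry-api installed
    from opentelemetry import trace
except ImportError:
    trace = None


class TraceContextFilter(logging.Filter):
    """Hypothetical helper: adds trace_id and span_id to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        trace_id, span_id = 0, 0
        if trace is not None:
            ctx = trace.get_current_span().get_span_context()
            if ctx.is_valid:  # False when no span is active
                trace_id, span_id = ctx.trace_id, ctx.span_id
        record.trace_id = format(trace_id, "032x")
        record.span_id = format(span_id, "016x")
        return True  # never drop the record, only enrich it


def configure_logging() -> logging.Logger:
    """Attach the filter and a formatter that prints both IDs."""
    handler = logging.StreamHandler()
    handler.addFilter(TraceContextFilter())
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s "
        "trace_id=%(trace_id)s span_id=%(span_id)s %(message)s"
    ))
    logger = logging.getLogger("traced_app")  # hypothetical logger name
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

With this in place, a plain `configure_logging().info("Processing request")` emits the IDs of whichever span is active. Searching the logs for a trace ID copied from Jaeger or Tempo then returns every line written during that request.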

Troubleshooting

No traces appearing:

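  • Check that the exporter points at the correct collector endpoint
  • Verify network connectivity between the application and the collector
  • Check the sampling configuration (a rate of 0 drops every trace)
  • Review application logs for exporter errors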

High latency overhead:

  • Reduce the sampling rate
  • Use a batch span processor so spans are exported off the request path
  • Check exporter configuration (timeouts, batch size, queue limits)
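
The first three "no traces" checks can be partly automated. The sketch below is a rough diagnostic, not a definitive tool: `collector_reachable` and `sampling_warnings` are hypothetical helpers. The first only confirms that a TCP connection opens, not that the collector accepts OTLP. The second reads the standard `OTEL_*` environment variables and assumes the application is configured through them rather than in code.

```python
import socket
from urllib.parse import urlparse


def collector_reachable(endpoint: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port opens."""
    parsed = urlparse(endpoint if "://" in endpoint else f"//{endpoint}")
    host = parsed.hostname or "localhost"
    port = parsed.port or 4317  # 4317 is the conventional OTLP/gRPC port
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def sampling_warnings(env: dict) -> list:
    """Flag standard OTEL_* settings that would suppress traces."""
    warnings = []
    if env.get("OTEL_SDK_DISABLED", "").lower() == "true":
        warnings.append("OTEL_SDK_DISABLED=true: the SDK is switched off")
    sampler = env.get("OTEL_TRACES_SAMPLER", "")
    if sampler in ("always_off", "parentbased_always_off"):
        warnings.append(f"OTEL_TRACES_SAMPLER={sampler}: no root spans are sampled")
    if sampler.endswith("traceidratio"):
        try:
            ratio = float(env.get("OTEL_TRACES_SAMPLER_ARG", "1.0"))
        except ValueError:
            warnings.append("OTEL_TRACES_SAMPLER_ARG is not a number")
        else:
            if ratio <= 0.0:
                warnings.append("OTEL_TRACES_SAMPLER_ARG<=0: no root spans are sampled")
    return warnings
```

A typical invocation would pass `os.environ.get("OTEL_EXPORTER_OTLP_ENDPOINT", "http://localhost:4317")` to `collector_reachable` and `dict(os.environ)` to `sampling_warnings`, run from the same host or container as the application so the network path matches.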

Related Skills

  • prometheus-configuration - For metrics
  • grafana-dashboards - For visualization
  • slo-implementation - For latency SLOs

Bundled with this artifact

2 files

Reference files that ship alongside this artifact. Agents pull these in only when the task needs them.
