Observability Stack

Comprehensive monitoring, logging, and tracing


Observability provides visibility into your systems through metrics, logs, and traces, enabling proactive issue detection and faster troubleshooting.

Observability by plan

Capability           XS     S         M
Basic monitoring     Yes    Yes       Yes
Prometheus metrics   Yes    Yes       Yes
Grafana dashboards   Basic  Standard  Full
Alertmanager         Basic  Standard  Full
Log aggregation      -      Basic     Full (Loki)
Distributed tracing  -      -         Full (Tempo)
SLO/SLI monitoring   -      -         Yes
Anomaly detection    -      -         Yes

Full observability stack (M Plan)

Prometheus

Industry-standard metrics collection and storage:

  • Application metrics
  • Infrastructure metrics
  • Kubernetes metrics
  • Custom metrics
  • Long-term storage
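
To make the "custom metrics" item concrete, here is a minimal sketch of rendering a counter in the Prometheus text exposition format that a scrape target serves. It uses only the Python standard library. The metric name, label, and helper functions are hypothetical illustrations, and a real service would more likely use an official client library than build the text by hand.

```python
def escape_label_value(value: str) -> str:
    """Escape backslash, double quote, and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_counter(name, help_text, samples):
    """Render one counter family as Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    help_text is assumed to be a single line without backslashes.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label_value(val)}"'
                for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical application metric: orders processed, split by status.
    print(render_counter(
        "orders_processed_total",
        "Orders processed.",
        [({"status": "ok"}, 3), ({"status": "failed"}, 1)],
    ))
```

Serving this text over HTTP, conventionally at a /metrics path, is what lets Prometheus scrape an application alongside infrastructure and Kubernetes targets.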

Grafana

Visualization and dashboards:

  • Pre-built dashboards
  • Custom dashboard creation
  • Team-specific views
  • Mobile-friendly interfaces
  • Embedded dashboards

Alertmanager

Intelligent alert routing:

  • Severity-based routing
  • Escalation policies
  • On-call integration (PagerDuty, Opsgenie)
  • Alert grouping and deduplication
  • Silence and inhibition rules
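
The routing, grouping, and deduplication ideas above can be illustrated with a small conceptual model. This is a sketch of the concepts only, not Alertmanager's implementation or configuration format. The receiver names and the choice of grouping labels are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical receiver names; in practice routing is defined in
# Alertmanager's own configuration, not in application code.
SEVERITY_RECEIVERS = {"critical": "pagerduty", "warning": "team-chat"}
DEFAULT_RECEIVER = "email"


def route(alert):
    """Pick a receiver from the alert's severity label."""
    severity = alert["labels"].get("severity")
    return SEVERITY_RECEIVERS.get(severity, DEFAULT_RECEIVER)


def group_alerts(alerts, group_by=("alertname", "cluster")):
    """Group alerts per receiver and group_by labels, dropping duplicates.

    Two alerts with identical label sets are treated as the same alert,
    so only one copy is kept in its group.
    """
    groups = defaultdict(dict)
    for alert in alerts:
        labels = alert["labels"]
        key = (route(alert),) + tuple(labels.get(name, "") for name in group_by)
        fingerprint = tuple(sorted(labels.items()))
        groups[key][fingerprint] = alert
    return {key: list(members.values()) for key, members in groups.items()}


if __name__ == "__main__":
    crash = {"alertname": "PodCrashLooping", "cluster": "prod",
             "pod": "api-1", "severity": "critical"}
    alerts = [{"labels": crash}, {"labels": dict(crash)},
              {"labels": {**crash, "pod": "api-2"}}]
    for key, members in group_alerts(alerts).items():
        print(key, len(members))
```

Grouping means one notification can cover every crashing pod in a cluster instead of paging once per pod.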

Loki

Log aggregation and querying:

  • Centralized log collection
  • Label-based organization
  • LogQL for powerful queries
  • Integration with Grafana
  • Cost-effective storage
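
As an illustration of label-based organization and LogQL, the sketch below builds a simple query (a label selector plus a line filter) and the corresponding URL for Loki's query_range HTTP endpoint. The base URL, label names, and search string are assumptions for the example. Label values are assumed to contain no double quotes or backslashes.

```python
from urllib.parse import urlencode


def build_logql(labels, contains):
    """Build a LogQL query: a stream selector plus a line-contains filter."""
    pairs = ", ".join(f'{key}="{val}"' for key, val in sorted(labels.items()))
    return "{" + pairs + "}" + f' |= "{contains}"'


def build_query_range_url(base_url, labels, contains, start_ns, end_ns, limit=100):
    """Build a URL for Loki's /loki/api/v1/query_range endpoint.

    start_ns and end_ns are Unix timestamps in nanoseconds.
    """
    params = urlencode({
        "query": build_logql(labels, contains),
        "start": start_ns,
        "end": end_ns,
        "limit": limit,
    })
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


if __name__ == "__main__":
    # Hypothetical Loki address and labels.
    print(build_logql({"app": "checkout", "namespace": "prod"}, "timeout"))
    print(build_query_range_url(
        "http://loki.example.internal:3100",
        {"app": "checkout", "namespace": "prod"},
        "timeout",
        start_ns=1_700_000_000_000_000_000,
        end_ns=1_700_000_600_000_000_000,
    ))
```

The same LogQL string can be pasted into a Grafana Explore panel that uses Loki as its data source.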

Tempo

Distributed tracing:

  • Request flow visualization
  • Latency analysis
  • Error tracking
  • Service dependency mapping
  • Trace to logs correlation
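
Service dependency mapping can be understood as deriving caller-to-callee edges from parent and child spans. The sketch below shows that idea on simplified span dictionaries. The field names (span_id, parent_id, service) are assumptions for illustration and do not reflect Tempo's data model or API.

```python
from collections import Counter


def service_edges(spans):
    """Count cross-service calls implied by parent/child span relationships."""
    by_id = {span["span_id"]: span for span in spans}
    edges = Counter()
    for span in spans:
        parent = by_id.get(span.get("parent_id"))
        # Only count edges that cross a service boundary.
        if parent is not None and parent["service"] != span["service"]:
            edges[(parent["service"], span["service"])] += 1
    return edges


if __name__ == "__main__":
    # Hypothetical trace: gateway calls checkout, which calls payments.
    trace = [
        {"span_id": "a", "parent_id": None, "service": "gateway"},
        {"span_id": "b", "parent_id": "a", "service": "checkout"},
        {"span_id": "c", "parent_id": "b", "service": "payments"},
        {"span_id": "d", "parent_id": "b", "service": "checkout"},
    ]
    for (caller, callee), count in service_edges(trace).items():
        print(f"{caller} -> {callee}: {count}")
```

Aggregating these edges across many traces yields the service graph that a tracing backend can display.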

What we deliver

Dashboards

  • Application performance
  • Infrastructure health
  • Kubernetes cluster status
  • Cost monitoring
  • Business metrics

Alerts

  • High CPU/memory usage
  • Pod restart loops
  • Error rate spikes
  • Latency increases
  • Disk space warnings
  • Certificate expiration

SLO/SLI monitoring (M Plan)

  • Availability tracking
  • Latency percentiles
  • Error budgets
  • Burn rate alerts
  • SLO dashboards
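
The arithmetic behind error budgets and burn rates is short enough to sketch. This example uses a request-based availability SLO. The 99.9% target and the request counts are illustrative assumptions, not values from any plan.

```python
def error_budget(slo_target):
    """Fraction of requests allowed to fail, e.g. 0.001 for a 99.9% SLO."""
    return 1.0 - slo_target


def burn_rate(failed, total, slo_target):
    """Observed error rate divided by the error budget.

    1.0 means the budget is being spent exactly at the allowed pace;
    values above 1.0 mean it will run out before the SLO window ends.
    """
    if total == 0:
        return 0.0
    return (failed / total) / error_budget(slo_target)


def budget_remaining(failed, total, slo_target):
    """Fraction of the error budget still unspent (negative if overspent)."""
    allowed_failures = total * error_budget(slo_target)
    if allowed_failures == 0:
        return 0.0
    return 1.0 - failed / allowed_failures


if __name__ == "__main__":
    # Hypothetical window: 1,000,000 requests, 500 failures, 99.9% target.
    print(burn_rate(500, 1_000_000, 0.999))
    print(budget_remaining(500, 1_000_000, 0.999))
```

A burn rate alert fires when this ratio stays above a chosen threshold for a chosen window, which catches fast budget consumption without alerting on every brief error spike.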

Implementation

  1. Assessment — Understand your monitoring needs
  2. Design — Plan metrics, dashboards, alerts
  3. Deploy — Install and configure stack
  4. Instrument — Add application metrics
  5. Alert — Configure intelligent alerting
  6. Iterate — Continuous improvement

Available in

  • XS Plan — Basic monitoring with Prometheus/Grafana
  • S Plan — Standard observability with logging
  • M Plan — Full observability stack with tracing