Monitoring and Logs#

Audience: Operations Administrators
Prerequisites: Kleidia deployed
Outcome: Understand monitoring and log management

Monitoring Overview#

Kleidia provides multiple monitoring points for system health, performance, and security.

Health Monitoring#

Application Health Endpoints#

Backend Health#

# Health check endpoint
curl https://kleidia.example.com/api/health

# Response:
{
  "status": "ok",
  "version": "2.2.0",
  "database": "connected",
  "vault": "connected",
  "timestamp": "2025-01-15T10:30:00Z"
}
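
For scripted checks, the response above can be validated field by field. The following is a minimal sketch, assuming the field names and healthy values shown in the sample response; `check_health_json` is a hypothetical helper, not part of Kleidia.

```shell
# Hypothetical helper: reads a /api/health JSON body on stdin and succeeds
# only if status, database, and vault all report the healthy values
# shown in the sample response.
check_health_json() {
  body=$(cat)
  for pair in '"status": *"ok"' '"database": *"connected"' '"vault": *"connected"'; do
    if ! printf '%s\n' "$body" | grep -Eq "$pair"; then
      echo "unhealthy: expected $pair" >&2
      return 1
    fi
  done
  echo "healthy"
}
```

Usage: `curl -fsS https://kleidia.example.com/api/health | check_health_json`. The pattern matching is deliberately simple; a JSON-aware tool would be more robust if one is available on the host.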

Component Health Checks#

# These endpoints are under /api/admin; if they return 401 or 403, add
#   -H "Authorization: Bearer <admin-token>"

# Database health
curl https://kleidia.example.com/api/admin/system/database

# Vault health
curl https://kleidia.example.com/api/admin/system/vault

# System health (all components)
curl https://kleidia.example.com/api/admin/system/health

Kubernetes Health#

# Pod health
kubectl get pods -n kleidia

# Service health
kubectl get services -n kleidia

# Resource health
kubectl top pods -n kleidia
kubectl top nodes
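
To spot unhealthy pods without reading the full listing, the pod output can be filtered on its STATUS column. This is a sketch that assumes the default `kubectl get pods` column layout (NAME, READY, STATUS, RESTARTS, AGE); `not_running_pods` is a hypothetical helper name.

```shell
# Reads `kubectl get pods --no-headers` output on stdin and prints the name
# and status of every pod whose STATUS is neither Running nor Completed.
# Exits non-zero if any such pod is found.
not_running_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3; bad = 1 }
       END { exit bad + 0 }'
}
```

Usage: `kubectl get pods -n kleidia --no-headers | not_running_pods`.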

Log Management#

Log Locations#

Application Logs#

  • Backend: Kubernetes pod logs
  • Frontend: Kubernetes pod logs
  • Database: PostgreSQL pod logs
  • OpenBao: OpenBao pod logs

Accessing Logs#

# Backend logs
kubectl logs -f deployment/kleidia-services-backend -n kleidia

# Frontend logs
kubectl logs -f deployment/kleidia-services-frontend -n kleidia

# Database logs
kubectl logs -f kleidia-data-postgres-cluster-0 -n kleidia

# OpenBao logs
kubectl logs -f kleidia-platform-openbao-0 -n kleidia

Log Levels#

  • INFO: General informational messages
  • WARN: Warning messages
  • ERROR: Error messages

Note: DEBUG-level logs are available for troubleshooting but are not typically needed in normal operation.

Log Filtering#

# Filter by level
kubectl logs deployment/kleidia-services-backend -n kleidia | grep -i error

# Filter by time
kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h

# Filter by component
kubectl logs deployment/kleidia-services-backend -n kleidia | grep -i vault

# Filter by user
kubectl logs deployment/kleidia-services-backend -n kleidia | grep "user_id=123"
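
A per-level summary is often a quicker first look than individual greps. The sketch below assumes each log line carries its level as an upper-case word (INFO, WARN, ERROR, DEBUG), possibly wrapped in brackets or prefixed with a key; the actual Kleidia log format is not documented here, so adjust the matching as needed. `count_log_levels` is a hypothetical helper.

```shell
# Reads log lines on stdin and prints one count per level.
# Assumption: the level appears as an upper-case token somewhere in the line.
count_log_levels() {
  awk '
    {
      for (i = 1; i <= NF; i++) {
        level = $i
        gsub(/[^A-Z]/, "", level)   # strip brackets, colons, key= prefixes
        if (level == "INFO" || level == "WARN" || level == "ERROR" || level == "DEBUG") {
          count[level]++
          break                     # count each line at most once
        }
      }
    }
    END {
      n = split("ERROR WARN INFO DEBUG", order, " ")
      for (j = 1; j <= n; j++) printf "%s %d\n", order[j], count[order[j]]
    }
  '
}
```

Usage: `kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h | count_log_levels`.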

Audit Logging#

Audit Log Access#

# Via web interface
# Navigate to Admin → Audit Logs

# Via API
curl https://kleidia.example.com/api/admin/audit \
  -H "Authorization: Bearer <admin-token>"

# Filter by date range
curl "https://kleidia.example.com/api/admin/audit?start=2025-01-01&end=2025-01-31" \
  -H "Authorization: Bearer <admin-token>"

Audit Log Types#

  • Authentication: Login, logout, failed attempts
  • Device Operations: Registration, PIN/PUK changes, certificate operations
  • Administrative: User management, policy changes
  • Security Events: Permission denials, suspicious activity

Performance Monitoring#

Resource Metrics#

# CPU and memory usage
kubectl top pods -n kleidia

# Node resources
kubectl top nodes

# Disk usage (run on the node)
df -h

# Docker disk usage (only on nodes that use Docker as the container runtime)
docker system df

Application Metrics#

  • Response Times: Monitor API response times
  • Error Rates: Track error rates over time
  • Request Rates: Monitor request volume
  • Database Performance: Track query performance

Alerting#

Key Metrics to Monitor#

  1. Pod Status: All pods should be in the Running state
  2. Resource Usage: CPU and memory should stay below their configured limits
  3. Error Rates: Watch for increases over the normal baseline
  4. Certificate Expiration: Certificates should be renewed well before they expire
  5. Disk Space: Disk usage should be below 85%
  6. Database Connections: Connection pool usage should stay below the configured maximum
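
The disk space threshold in the list above can be checked mechanically. This is a sketch that parses POSIX-format `df -P` output; `disk_over_threshold` is a hypothetical helper, and mount points containing spaces are not handled.

```shell
# Reads `df -P` output on stdin and prints each mount point whose usage is
# at or above the given percentage (default 85). Exits non-zero if any
# file system is over the limit.
disk_over_threshold() {
  limit=${1:-85}
  awk -v limit="$limit" '
    NR > 1 {
      use = $5
      sub(/%/, "", use)
      if (use + 0 >= limit + 0) { print $6, use "%"; over = 1 }
    }
    END { exit over + 0 }
  '
}
```

Usage: `df -P | disk_over_threshold 85`.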

Setting Up Alerts#

Kleidia doesn’t include built-in alerting, but you can add it with external tooling:

  1. Kubernetes monitoring: Prometheus, Grafana
  2. External monitoring: Nagios, Zabbix, Datadog
  3. Log aggregation: ELK stack, Splunk
  4. Custom scripts: Poll the health endpoints and alert on failures
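
As one illustration of the custom-script option, the sketch below maps a health check result to the exit codes used by Nagios-style plugins (0 = OK, 2 = CRITICAL, 3 = UNKNOWN). The function name `kleidia_health_status` and the output messages are hypothetical; the only Kleidia-specific assumption is that a healthy response contains `"status": "ok"`.

```shell
# Hypothetical Nagios-style check: takes the HTTP status code and the body
# returned by /api/health and maps them to plugin exit codes.
kleidia_health_status() {
  http_code=$1
  body=$2
  case "$http_code" in
    200)
      if printf '%s' "$body" | grep -Eq '"status": *"ok"'; then
        echo "OK - Kleidia reports status ok"; return 0
      fi
      echo "CRITICAL - HTTP 200 but status is not ok"; return 2 ;;
    000|'')
      # curl reports 000 when no HTTP response was received
      echo "UNKNOWN - no response from health endpoint"; return 3 ;;
    *)
      echo "CRITICAL - health endpoint returned HTTP $http_code"; return 2 ;;
  esac
}

# Example wiring (not executed here):
#   code=$(curl -s -o /tmp/kleidia-health.json -w '%{http_code}' --max-time 10 \
#     https://kleidia.example.com/api/health)
#   kleidia_health_status "$code" "$(cat /tmp/kleidia-health.json)"
```

Keeping the decision logic separate from the `curl` call makes it possible to test the function with canned responses before connecting it to a scheduler or monitoring agent.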

Log Retention#

Database Logs#

  • Audit Logs: Stored in PostgreSQL
  • Retention: Configurable (default: 90 days)
  • Archival: Export before cleanup

Application Logs#

  • Kubernetes Logs: Managed by Kubernetes on each node
  • Retention: Configurable via log rotation (container logs are removed when the pod is deleted)
  • Archival: Export important logs before they are rotated out

Vault Audit Logs#

  • Location: Vault audit storage
  • Retention: Configurable
  • Archival: Vault snapshot includes audit logs

Log Analysis#

Common Patterns#

High Error Rates#

# Count errors in last hour
kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h | \
  grep -i error | wc -l

# Count identical error lines, most frequent first
kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h | \
  grep -i error | sort | uniq -c | sort -rn
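
If each log line starts with a timestamp, identical messages logged at different times will not group together. A possible workaround is to strip the timestamp before counting. The sketch below assumes an ISO-8601 timestamp at the start of each line, which is not confirmed for Kleidia's log format; `group_errors` is a hypothetical helper.

```shell
# Reads log lines on stdin, keeps those mentioning "error", strips a leading
# ISO-8601 timestamp (assumed format), and prints each distinct message with
# its count, most frequent first.
group_errors() {
  grep -i 'error' |
    sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2}[T ][0-9:.]+(Z|[+-][0-9:]+)?[[:space:]]*//' |
    sort | uniq -c | sort -rn
}
```

Usage: `kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h | group_errors`.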

Slow Queries#

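# Check PostgreSQL slow queries (requires the pg_stat_statements extension)
# On PostgreSQL 13 and later, the timing columns are named
# total_exec_time and mean_exec_time instead of total_time and mean_time
kubectl exec -it kleidia-data-postgres-cluster-0 -n kleidia -- \
  psql -U yubiuser -d kleidia -c "
    SELECT query, calls, total_time, mean_time
    FROM pg_stat_statements
    ORDER BY mean_time DESC
    LIMIT 10;
  "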

Failed Authentications#

# Check failed login attempts
curl "https://kleidia.example.com/api/admin/audit?action=login&status=failed" \
  -H "Authorization: Bearer <admin-token>"

Best Practices#

  • ✅ Monitor health endpoints regularly
  • ✅ Set up automated health checks
  • ✅ Review logs daily
  • ✅ Archive important logs
  • ✅ Monitor resource usage
  • ✅ Set up alerting for critical metrics
  • ✅ Review audit logs weekly
  • ✅ Keep log retention policies current