Monitoring and Logs#

Audience: Operations Administrators
Prerequisites: Kleidia deployed
Outcome: Understand monitoring and log management

Monitoring Overview#

Kleidia exposes several monitoring points for system health, performance, and security: HTTP health endpoints, Kubernetes status and resource metrics, application logs, and audit logs.

Health Monitoring#

Application Health Endpoints#

Backend Health#

# Health check endpoint
curl https://kleidia.example.com/api/health

# Response:
{
  "status": "ok",
  "version": "2.2.0",
  "database": "connected",
  "vault": "connected",
  "timestamp": "2025-01-15T10:30:00Z"
}
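
A health response is only useful if something checks its fields. The following is a minimal sketch, assuming the response shape shown above; `health_ok` is a hypothetical helper name, and it uses `grep` rather than a JSON parser so it runs on a bare shell.

```shell
#!/bin/sh
# health_ok: reads a health JSON document on stdin and succeeds only if
# status is "ok" and both database and vault report "connected".
# Assumption: the field names and values match the example response above.
health_ok() {
    body=$(cat)
    for pair in '"status": *"ok"' '"database": *"connected"' '"vault": *"connected"'; do
        # Each pattern tolerates optional spaces after the colon.
        printf '%s' "$body" | grep -Eq "$pair" || return 1
    done
    return 0
}

# Usage (not executed here):
#   curl -fsS https://kleidia.example.com/api/health | health_ok && echo healthy
```

Matching on the quoted value means a value such as "disconnected" is not mistaken for "connected".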

Component Health Checks#

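# The /api/admin endpoints may require an admin token, as the audit API
# examples in this guide do. If a request is rejected, add:
#   -H "Authorization: Bearer <admin-token>"

# Database health
curl https://kleidia.example.com/api/admin/system/database

# Vault health
curl https://kleidia.example.com/api/admin/system/vault

# System health (all components)
curl https://kleidia.example.com/api/admin/system/health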

Kubernetes Health#

# Pod health
kubectl get pods -n kleidia

# Service health
kubectl get services -n kleidia

# Resource usage (requires the Kubernetes Metrics Server)
kubectl top pods -n kleidia
kubectl top nodes
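
Scanning the pod list by eye does not scale to automated checks. As a rough sketch, the pod listing can be reduced to only the pods that need attention, assuming the default `kubectl get pods` column order (NAME, READY, STATUS, ...); `unhealthy_pods` is a hypothetical helper name.

```shell
#!/bin/sh
# unhealthy_pods: reads `kubectl get pods --no-headers` output on stdin and
# prints "<pod> <reason>" for each pod that is not healthy.
# A pod is reported when its STATUS is neither Running nor Completed, or
# when it is Running but not all of its containers are ready (e.g. 0/1).
unhealthy_pods() {
    awk '{
        split($2, ready, "/")
        if ($3 == "Completed") next
        if ($3 != "Running") print $1, $3
        else if (ready[1] != ready[2]) print $1, "NotReady(" $2 ")"
    }'
}

# Usage (not executed here):
#   kubectl get pods -n kleidia --no-headers | unhealthy_pods
```

An empty result means every pod is Running and fully ready.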

Log Management#

Log Locations#

Application Logs#

Backend: Kubernetes pod logs
Frontend: Kubernetes pod logs
Database: PostgreSQL pod logs
OpenBao: OpenBao pod logs

Accessing Logs#

# Backend logs
kubectl logs -f deployment/kleidia-services-backend -n kleidia

# Frontend logs
kubectl logs -f deployment/kleidia-services-frontend -n kleidia

# Database logs
kubectl logs -f kleidia-data-postgres-cluster-0 -n kleidia

# OpenBao logs
kubectl logs -f kleidia-platform-openbao-0 -n kleidia

Log Levels#

INFO: Routine operational events
WARN: Unexpected conditions that do not stop the operation
ERROR: Failed operations that need attention

Note: DEBUG-level logs are available for troubleshooting but are not typically needed in normal operation.

Log Filtering#

# Note: `kubectl logs deployment/<name>` reads from a single pod of the deployment.

# Filter by level (matches any line containing "error", case-insensitive)
kubectl logs deployment/kleidia-services-backend -n kleidia | grep -i error

# Filter by time (last hour)
kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h

# Filter by component
kubectl logs deployment/kleidia-services-backend -n kleidia | grep -i vault

# Filter by user
kubectl logs deployment/kleidia-services-backend -n kleidia | grep "user_id=123"
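
The filters above show matching lines; a per-level count gives a quicker overview. This is a sketch only: the backend's log line format is not documented here, so it assumes each line carries its level as an uppercase token such as `INFO`, `[ERROR]`, or `level=WARN`. `count_levels` is a hypothetical helper name.

```shell
#!/bin/sh
# count_levels: reads log lines on stdin and prints how many lines carry
# each level, in the form "INFO=<n> WARN=<n> ERROR=<n>".
# Assumption: the level appears as an uppercase token, possibly wrapped in
# punctuation or prefixed with lowercase text (e.g. "[ERROR]", "level=WARN").
count_levels() {
    awk '{
        for (i = 1; i <= NF; i++) {
            tok = $i
            gsub(/[^A-Z]/, "", tok)      # keep only uppercase letters
            if (tok == "INFO" || tok == "WARN" || tok == "ERROR") {
                n[tok]++
                next                     # count each line once
            }
        }
    }
    END { printf "INFO=%d WARN=%d ERROR=%d\n", n["INFO"], n["WARN"], n["ERROR"] }'
}

# Usage (not executed here):
#   kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h | count_levels
```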

Audit Logging#

Audit Log Access#

# Via web interface
# Navigate to Admin → Audit Logs

# Via API
curl https://kleidia.example.com/api/admin/audit \
  -H "Authorization: Bearer <admin-token>"

# Filter by date range
curl "https://kleidia.example.com/api/admin/audit?start=2025-01-01&end=2025-01-31" \
  -H "Authorization: Bearer <admin-token>"

Audit Log Types#

Authentication: Login, logout, failed attempts
Device Operations: Registration, PIN/PUK changes, certificate operations
Administrative: User management, policy changes
Security Events: Permission denials, suspicious activity

Performance Monitoring#

Resource Metrics#

# CPU and memory usage
kubectl top pods -n kleidia

# Node resources
kubectl top nodes

# Disk usage (run on the node)
df -h

# Docker disk usage (only on nodes that use Docker as the container runtime)
docker system df

Application Metrics#

Response Times: Monitor API response times
Error Rates: Track error rates over time
Request Rates: Monitor request volume
Database Performance: Track query performance

Alerting#

Key Metrics to Monitor#

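Pod Status: All pods should be Running
Resource Usage: CPU and memory should stay below their configured limits
Error Rates: Error rates should stay near their normal baseline; alert on sustained increases
Certificate Expiration: Alert early enough before expiry to leave time for renewal
Disk Space: Disk usage should be below 85%
Database Connections: Connection pool usage should stay below the configured maximum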
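
The disk space threshold is the easiest of these to check mechanically. The sketch below flags file systems at or above a limit, defaulting to the 85% named above; it assumes the POSIX output format of `df -P`, and `disks_over` is a hypothetical helper name.

```shell
#!/bin/sh
# disks_over [LIMIT]: reads `df -P` output on stdin and prints
# "<mount point> <usage>%" for each file system whose usage is at or above
# LIMIT percent (default 85).
disks_over() {
    limit=${1:-85}
    awk -v limit="$limit" 'NR > 1 {
        use = $5
        sub(/%/, "", use)                 # "90%" -> "90"
        if (use + 0 >= limit + 0) print $6, use "%"
    }'
}

# Usage (not executed here):
#   df -P | disks_over 85
```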

Setting Up Alerts#

Kleidia doesn’t include built-in alerting, but you can add it with:

Kubernetes monitoring: Prometheus, Grafana
External monitoring: Nagios, Zabbix, Datadog
Log aggregation: ELK stack, Splunk
Custom scripts: Poll the health endpoints and alert on failures
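
For the custom-script option, one common refinement is to alert only after several consecutive failures, so a single slow response does not page anyone. The following is a minimal sketch of that logic; the helper names, the history file path, and the threshold of 3 are all illustrative assumptions, not part of Kleidia.

```shell
#!/bin/sh
# trailing_failures: reads one check result per line ("ok" or "fail"),
# oldest first, and prints how many failures occur in a row at the end.
trailing_failures() {
    awk '{ if ($1 == "fail") run++; else run = 0 } END { print run + 0 }'
}

# should_alert THRESHOLD: succeeds when the history on stdin ends with at
# least THRESHOLD consecutive failures.
should_alert() {
    [ "$(trailing_failures)" -ge "$1" ]
}

# Usage (not executed here; the file path and interval are hypothetical):
#   while true; do
#       if curl -fsS --max-time 5 https://kleidia.example.com/api/health >/dev/null
#       then echo ok; else echo fail; fi >> /tmp/kleidia-health.log
#       tail -n 10 /tmp/kleidia-health.log | should_alert 3 &&
#           echo "ALERT: Kleidia health check failing"
#       sleep 60
#   done
```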

Log Retention#

Database Logs#

Audit Logs: Stored in PostgreSQL
Retention: Configurable (default: 90 days)
Archival: Export audit logs before retention cleanup removes them

Application Logs#

Kubernetes Logs: Managed by Kubernetes
Retention: Configurable via log rotation
Archival: Export important logs before rotation removes them

Vault Audit Logs#

Location: Vault audit storage
Retention: Configurable
Archival: Vault snapshot includes audit logs

Log Analysis#

Common Patterns#

High Error Rates#

# Count error lines in the last hour
kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h | \
  grep -ic error

# Group identical error lines, most frequent first
kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h | \
  grep -i error | sort | uniq -c | sort -rn
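
Grouping with `uniq -c` only merges byte-identical lines, so timestamps and other varying numbers keep otherwise identical errors apart. As a sketch, the lines can be normalized first. This assumes each line starts with an ISO 8601 UTC timestamp, which is not confirmed for Kleidia's log format; `group_errors` is a hypothetical helper name.

```shell
#!/bin/sh
# group_errors: reads log lines on stdin, keeps lines containing "error"
# (case-insensitive), strips a leading timestamp, replaces every run of
# digits with "N", and prints the resulting message shapes by frequency.
# Assumption: timestamps look like 2025-01-15T10:30:00Z at the line start.
group_errors() {
    grep -i 'error' |
        sed -E 's/^[0-9]{4}-[0-9]{2}-[0-9]{2}[T ][0-9:.]+Z?[[:space:]]*//; s/[0-9]+/N/g' |
        sort | uniq -c | sort -rn
}

# Usage (not executed here):
#   kubectl logs deployment/kleidia-services-backend -n kleidia --since=1h | group_errors
```

Replacing digit runs is deliberately crude: it also merges lines that differ only in IDs or durations, which is usually what you want for a first look.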

Slow Queries#

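# Check PostgreSQL slow queries (requires the pg_stat_statements extension).
# On PostgreSQL 13 and later, the columns are named total_exec_time and
# mean_exec_time instead of total_time and mean_time.
kubectl exec -it kleidia-data-postgres-cluster-0 -n kleidia -- \
  psql -U yubiuser -d kleidia -c "
    SELECT query, calls, total_time, mean_time
    FROM pg_stat_statements
    ORDER BY mean_time DESC
    LIMIT 10;
  "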

Failed Authentications#

# Check failed login attempts
curl "https://kleidia.example.com/api/admin/audit?action=login&status=failed" \
  -H "Authorization: Bearer <admin-token>"

Best Practices#

✅ Monitor health endpoints regularly
✅ Set up automated health checks
✅ Review logs daily
✅ Archive important logs
✅ Monitor resource usage
✅ Set up alerting for critical metrics
✅ Review audit logs weekly
✅ Keep log retention policies current
