Troubleshooting Guide

Database Connection Issues

PostgreSQL Connection Problems

Cannot Connect to Database

# Check if SSH tunnel is active
netstat -an | grep :5433  # For development
netstat -an | grep :5434  # For production

# If no tunnel, create one
ssh -L 5433:localhost:5432 user@test-server
ssh -L 5434:localhost:5432 user@prod-server

# Test connection
psql "host=localhost port=5433 user=username dbname=database_name"
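
The tunnel check above can also be scripted. The following is a minimal sketch, assuming the local ports 5433 and 5434 from the tunnel commands; the helper names `is_port_open` and `report_tunnels` are illustrative and not part of any existing tooling.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Local ports taken from the tunnel commands above; the labels are illustrative.
TUNNELS = {"test-server": 5433, "prod-server": 5434}


def report_tunnels(host: str = "localhost") -> dict:
    """Map each tunnel label to True/False depending on whether its local port accepts connections."""
    return {name: is_port_open(host, port) for name, port in TUNNELS.items()}


for name, is_up in report_tunnels().items():
    print(f"{name}: {'tunnel up' if is_up else 'no tunnel'}")
```

A successful connection only shows that something is listening on the local port. It does not prove that PostgreSQL is reachable at the far end of the tunnel, so the psql test is still worth running.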

Connection Timeout

# Check SSH connection
ssh user@test-server

# Check PostgreSQL service status
sudo systemctl status postgresql

# Check firewall rules
sudo ufw status

Permission Denied

# Check user permissions
psql -c "\du" # List database users and roles

# Grant permissions (SQL statements; run them in a psql session connected to database_name)
GRANT ALL PRIVILEGES ON DATABASE database_name TO username;
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO username;

Django Development Issues

Migration Problems

Migration Conflicts

# Check migration status
python manage.py showmigrations

# Reset migrations (development only: migrating to zero drops the app's tables and their data)
python manage.py migrate app_name zero
rm -rf app_name/migrations/
python manage.py makemigrations app_name
python manage.py migrate

# Merge conflicting migrations
python manage.py makemigrations --merge
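
A merge is needed when an app's migration graph has more than one leaf, that is, more than one migration that nothing else depends on. The sketch below illustrates the idea on a hand-written dependency map; the migration names and the `find_leaf_migrations` helper are hypothetical, and Django performs its own detection internally.

```python
def find_leaf_migrations(dependencies: dict[str, list[str]]) -> list[str]:
    """Return migrations that no other migration depends on, sorted by name."""
    depended_on = {dep for deps in dependencies.values() for dep in deps}
    return sorted(name for name in dependencies if name not in depended_on)


# Hypothetical graph: two branches each added a migration on top of 0002.
graph = {
    "0001_initial": [],
    "0002_add_email": ["0001_initial"],
    "0003_add_phone": ["0002_add_email"],
    "0003_add_address": ["0002_add_email"],
}

leaves = find_leaf_migrations(graph)
if len(leaves) > 1:
    print("Conflict, candidates for --merge:", leaves)
```

With a single leaf there is nothing to merge, and `makemigrations --merge` reports that no conflicts were detected.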

Database Schema Out of Sync

# Check current schema
python manage.py dbshell
\dt  # List tables (typed at the database shell prompt)

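# Force schema sync (be careful): --fake-initial marks an app's initial migration
# as applied when its tables already exist, without checking that the columns match
python manage.py migrate --fake-initial
python manage.py migrate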

Static Files Issues

Static Files Not Loading

# Collect static files
python manage.py collectstatic --noinput

# Check static files configuration
python manage.py findstatic filename.css

# Debug static files in development (settings.py; runserver serves static files only when DEBUG is True)
DEBUG = True
STATICFILES_DIRS = [BASE_DIR / 'static']

DigitalOcean Spaces Issues

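# Test Spaces connection
import boto3
client = boto3.client(
    's3',
    region_name='nyc3',  # region of the Space, as in endpoint_url
    endpoint_url='https://nyc3.digitaloceanspaces.com',
    aws_access_key_id='your_key',
    aws_secret_access_key='your_secret'
)
# list_buckets() returns a dict; print only the Space names
print([bucket['Name'] for bucket in client.list_buckets()['Buckets']])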

Frontend Development Issues

React/NextJS Problems

Build Failures

# Clear node modules and reinstall
rm -rf node_modules package-lock.json
npm install

# Clear Next.js cache
rm -rf .next
npm run build

# Check for TypeScript errors (assumes a type-check script in package.json; otherwise try: npx tsc --noEmit)
npm run type-check

Module Resolution Issues

# Check tsconfig.json paths
{
  "compilerOptions": {
    "baseUrl": ".",
    "paths": {
      "@/*": ["./src/*"]
    }
  }
}

# Restart TypeScript server in VS Code
Ctrl+Shift+P -> "TypeScript: Restart TS Server"

Styling Issues

CSS Not Applying

# Check for CSS conflicts
# Use browser dev tools to inspect elements

# Verify Tailwind CSS setup (assumes a build:css script in package.json)
npm run build:css

# Check for naming conflicts
# Use CSS modules or styled-components for scoping

DevOps and Deployment Issues

Docker Problems

Container Build Failures

# Rebuild without the layer cache to rule out stale layers
docker build --no-cache -t app-name .

# Debug build process (print the full output of every step)
docker build --progress=plain -t app-name .

# Keep large files out of the build context
echo "node_modules" >> .dockerignore
echo "*.log" >> .dockerignore

Container Runtime Issues

# Check container logs
docker logs container-name

# Execute commands in container
docker exec -it container-name bash

# Check resource usage
docker stats container-name

Kubernetes Deployment Issues

Pod Failures

# Check pod status
kubectl get pods
kubectl describe pod pod-name

# Check pod logs
kubectl logs pod-name
kubectl logs -f pod-name  # Follow logs

# Debug pod issues
kubectl exec -it pod-name -- bash

Service Connection Issues

# Check service configuration
kubectl get services
kubectl describe service service-name

# Test service connectivity (nslookup is typed inside the temporary pod's shell)
kubectl run test-pod --image=busybox --rm -it --restart=Never -- sh
nslookup service-name

Performance Issues

Database Performance

Slow Queries

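-- Log every statement that runs longer than 1000 ms
ALTER SYSTEM SET log_min_duration_statement = 1000;
SELECT pg_reload_conf();

-- Check slow queries (requires the pg_stat_statements extension;
-- on PostgreSQL 12 and earlier the column is named mean_time)
SELECT query, mean_exec_time, calls
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;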
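
The pg_stat_statements column mean_time was renamed to mean_exec_time in PostgreSQL 13, so a script that runs against servers of different versions has to pick the right name. The following is a small sketch of that choice; the function name `slow_query_sql` is hypothetical, and the version number is assumed to come from `SHOW server_version_num;`.

```python
def slow_query_sql(server_version_num: int, limit: int = 10) -> str:
    """Build the pg_stat_statements query using the column name that matches the server version."""
    # server_version_num is 130000 for PostgreSQL 13.0, 120011 for 12.11, and so on.
    column = "mean_exec_time" if server_version_num >= 130000 else "mean_time"
    return (
        f"SELECT query, {column}, calls "
        f"FROM pg_stat_statements "
        f"ORDER BY {column} DESC "
        f"LIMIT {int(limit)};"
    )


print(slow_query_sql(150004))
print(slow_query_sql(120011))
```

The resulting string can be passed to any PostgreSQL client; the sketch deliberately leaves the connection handling out.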

Connection Pool Issues

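# Inspect executed queries (connection.queries is populated only when DEBUG = True)
from django.db import connection
print(connection.queries)  # Queries run on this connection so far
print(len(connection.queries))  # Query count

# Configure the connection pool (Django 5.1+ with psycopg 3 and its pool extra);
# the 'pool' dict is passed to psycopg_pool.ConnectionPool
DATABASES = {
    'default': {
        'OPTIONS': {
            'pool': {
                'timeout': 20,        # seconds to wait for a free connection
                'max_lifetime': 300,  # seconds before a connection is replaced
            },
        }
    }
}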
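
A long query list is easier to read once repeated statements are grouped. The sketch below counts identical SQL strings in data shaped like `django.db.connection.queries` (a list of dicts with "sql" and "time" keys); the helper name `duplicate_queries` and the sample rows are illustrative.

```python
from collections import Counter


def duplicate_queries(queries: list[dict], min_count: int = 2) -> list[tuple[str, int]]:
    """Return (sql, count) pairs for statements executed at least min_count times, most frequent first."""
    counts = Counter(q["sql"] for q in queries)
    return [(sql, n) for sql, n in counts.most_common() if n >= min_count]


# Sample data in the shape of django.db.connection.queries.
sample = [
    {"sql": "SELECT * FROM auth_user WHERE id = 1", "time": "0.002"},
    {"sql": "SELECT * FROM blog_post", "time": "0.010"},
    {"sql": "SELECT * FROM auth_user WHERE id = 1", "time": "0.002"},
]

for sql, n in duplicate_queries(sample):
    print(f"{n}x {sql}")
```

Only exact repeats are grouped here. Queries that differ just in their parameter values, as in a typical N+1 pattern, would need the SQL normalized first.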

Application Performance

High Memory Usage

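# Monitor memory usage
htop
ps aux --sort=-%mem | head

# Python memory profiling
pip install memory-profiler

# In script.py, decorate the function to profile
# (python -m memory_profiler injects @profile, so no import is needed)
@profile
def my_function():
    pass

# Run the script under the profiler
python -m memory_profiler script.py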
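
When installing a package is not an option, the standard library's tracemalloc module gives a rough peak figure for a single call. This is a minimal sketch; `peak_memory_bytes` and `build_list` are hypothetical names, and the number covers only allocations made through Python's allocator.

```python
import tracemalloc


def peak_memory_bytes(func, *args, **kwargs):
    """Run func and return (result, peak bytes allocated by Python while it ran)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_list(n):
    # Hypothetical workload: allocate a list of n integers.
    return list(range(n))


result, peak = peak_memory_bytes(build_list, 100_000)
print(f"peak: {peak / 1024:.0f} KiB")
```

Tracing slows the code down, so this is meant for a one-off measurement rather than for permanent instrumentation.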

High CPU Usage

# Check process CPU usage (top -p expects a comma-separated PID list)
top -p "$(pgrep -d, python)"

# Profile Python code
python -m cProfile -o profile.stats script.py
python -c "import pstats; pstats.Stats('profile.stats').sort_stats('time').print_stats(10)"
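
The same profile can be collected for a single function call instead of a whole script. The following sketch wraps cProfile and pstats from the standard library; `profile_call` and `busy_work` are hypothetical names used only for illustration.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top: int = 10, **kwargs) -> str:
    """Run func under cProfile and return the top entries by internal time as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args, **kwargs)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("time").print_stats(top)
    return buffer.getvalue()


def busy_work(n: int = 50_000) -> int:
    # Hypothetical CPU-bound workload used only for the demonstration.
    return sum(i * i for i in range(n))


report = profile_call(busy_work)
print(report)
```

Sorting by "time" ranks functions by the time spent in their own code; sorting by "cumulative" instead would include time spent in the functions they call.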

Security Issues

SSL/TLS Problems

Certificate Issues

# Check certificate validity (-servername sends SNI; </dev/null ends the session after the handshake)
openssl s_client -connect domain.com:443 -servername domain.com </dev/null

# Check certificate expiration
openssl x509 -in certificate.crt -noout -enddate

# Test SSL configuration
curl -I https://domain.com
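
To turn an expiry date into "days remaining", the timestamp printed by openssl (for example after "Not After :" or "notAfter=") can be parsed with the standard library. This is a minimal sketch; the function name `days_until_expiry` and the dates are illustrative, not taken from a real certificate.

```python
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate's notAfter timestamp (negative if already expired)."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    return (expires - now).total_seconds() / 86400


# Illustrative values in the "Mon DD HH:MM:SS YYYY GMT" format that openssl prints.
now = datetime(2030, 1, 1, tzinfo=timezone.utc)
print(days_until_expiry("Jan 31 00:00:00 2030 GMT", now))  # 30.0
```

In a real check, `now` would be `datetime.now(timezone.utc)`, and a result below some threshold (say 14 days) would trigger a renewal.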

Authentication Problems

JWT Token Issues

# Debug JWT tokens
import jwt
token = "your_jwt_token"
try:
    payload = jwt.decode(token, 'secret', algorithms=['HS256'])
    print(payload)
except jwt.ExpiredSignatureError:
    print("Token expired")
except jwt.InvalidTokenError:
    print("Invalid token")
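
When the signing secret is not at hand, the claims can still be inspected, because the payload of a JWT is only base64url-encoded JSON. The sketch below uses the standard library alone; `decode_jwt_payload_unverified` is a hypothetical helper, and its output must never be trusted for authorization since the signature is not checked.

```python
import base64
import json


def decode_jwt_payload_unverified(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying the signature (inspection only)."""
    try:
        _header, payload_b64, _signature = token.split(".")
    except ValueError:
        raise ValueError("expected three dot-separated segments") from None
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def _b64(obj) -> str:
    # Encode a dict the way JWT segments are encoded: JSON, base64url, no padding.
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


# Hypothetical unsigned token built only for the demonstration.
token = ".".join([_b64({"alg": "HS256", "typ": "JWT"}), _b64({"sub": "42", "exp": 1700000000}), "signature"])
claims = decode_jwt_payload_unverified(token)
print(claims)
```

Comparing the "exp" claim with the current Unix time shows whether an ExpiredSignatureError is expected before the secret is even involved.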

Monitoring and Logging

Sentry Issues

Missing Error Reports

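# Test Sentry configuration (sentry_sdk.init() must already have run)
import sentry_sdk
sentry_sdk.capture_message("Test message from troubleshooting")

# Check Sentry DSN (Hub is deprecated in sentry-sdk 2.x; use sentry_sdk.get_client().dsn there)
print(sentry_sdk.Hub.current.client.dsn)

# Verify environment (_tags and _contexts are private attributes and may change between SDK versions)
with sentry_sdk.configure_scope() as scope:
    print(scope._tags)
    print(scope._contexts)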

Log Analysis

Finding Specific Errors

# Search in application logs
grep -r "ERROR" /var/log/app/
tail -f /var/log/app/error.log | grep "specific_error"

# Search in system logs
journalctl -u nginx -f
journalctl --since="2 hours ago" -p err
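
Once grep has shown that errors exist, grouping them by message shows which one dominates. The following sketch assumes a log line layout of date, time, level, and message separated by spaces; that layout, the sample lines, and the `count_errors` helper are assumptions to adapt to the real log format.

```python
import re
from collections import Counter

# Assumed line layout: "<date> <time> <LEVEL> <message>"; adjust the pattern to the real log format.
LINE_RE = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def count_errors(lines, level: str = "ERROR") -> Counter:
    """Count occurrences of each distinct message logged at the given level."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match and match["level"] == level:
            counts[match["message"]] += 1
    return counts


sample = [
    "2024-01-01 10:00:00 ERROR Database connection refused",
    "2024-01-01 10:00:05 INFO Request served",
    "2024-01-01 10:00:09 ERROR Database connection refused",
    "2024-01-01 10:01:00 ERROR Timeout calling payment API",
]

for message, n in count_errors(sample).most_common():
    print(n, message)
```

Because `count_errors` accepts any iterable of lines, an open file object can be passed in directly without loading the whole log into memory.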

Network Issues

Connectivity Problems

API Connection Issues

# Test API connectivity
curl -I https://api.example.com/health
curl -v https://api.example.com/endpoint

# Check DNS resolution
nslookup api.example.com
dig api.example.com

# Test with different DNS
nslookup api.example.com 8.8.8.8

Firewall Issues

# Check open ports
netstat -tulpn | grep :8000
lsof -i :8000

# Test port connectivity
telnet hostname 80
nc -zv hostname 80-90

Environment-Specific Issues

Development Environment

Port Conflicts

# Find process using port
lsof -i :8000
netstat -tulpn | grep :8000

# Kill process using port (try a plain kill first; -9 gives the process no chance to clean up)
kill $(lsof -ti:8000)
kill -9 $(lsof -ti:8000)

# Use different port
python manage.py runserver 8001
npm start -- --port 3001
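
Instead of guessing at a free port, the operating system can be asked for one by binding to port 0. This is a minimal sketch using the standard library; `find_free_port` is a hypothetical helper name.

```python
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0, then release it."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


port = find_free_port()
print(f"python manage.py runserver {port}")
```

The port is released before the function returns, so another process could claim it in the short window before the server starts. For local development that race is rarely a problem.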

Production Environment

Service Restart Issues

# Check service status
systemctl status nginx
systemctl status gunicorn
systemctl status celery

# Restart services
sudo systemctl restart nginx
sudo systemctl restart gunicorn

# Check service logs
journalctl -u nginx -f
journalctl -u gunicorn --since="1 hour ago"

Quick Recovery Commands

Emergency Procedures

Database Recovery

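# Restore from backup (port 5434 is the production tunnel; double-check the target)
# pg_restore reads pg_dump archives in custom, directory, or tar format
pg_restore -h localhost -p 5434 -U username -d database_name backup.dump
# Plain-text SQL dumps are restored with psql instead
psql -h localhost -p 5434 -U username -d database_name -f backup.sql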

# Reset development database
dropdb -h localhost -p 5433 azmx_dev
createdb -h localhost -p 5433 azmx_dev
python manage.py migrate
python manage.py loaddata fixtures/initial_data.json

Service Recovery

# Quick service restart
sudo systemctl restart nginx gunicorn celery redis

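# Clear caches (flushall deletes every key in every Redis database,
# including queued Celery tasks if Redis is the broker)
redis-cli flushall
python manage.py clear_cache  # not built into Django; assumes a custom or third-party command (e.g. django-extensions)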

# Rebuild and deploy
git pull origin main
pip install -r requirements.txt
npm install
npm run build
python manage.py collectstatic --noinput
sudo systemctl restart gunicorn

Getting Help

Internal Resources

  • Slack: #dev-help channel
  • Documentation: This dev-docs repository
  • Team Leads: Mohammed Mahgoub (Frontend), Abdullah Rizk (Backend)

External Resources

  • Django Documentation: https://docs.djangoproject.com/
  • React Documentation: https://react.dev/
  • PostgreSQL Documentation: https://www.postgresql.org/docs/
  • DigitalOcean Documentation: https://docs.digitalocean.com/

Escalation Process

  1. Self-troubleshooting: Use this guide and documentation
  2. Team Help: Ask in Slack #dev-help
  3. Lead Consultation: Contact relevant tech lead
  4. CPTO Escalation: For critical system issues

[This troubleshooting guide will be expanded based on common issues encountered by the team]