Chord Monitoring Quick Reference

This is a quick reference for chord monitoring commands. For detailed information, see Chord Management and Monitoring.

Quick Commands

🔍 Check Status

# Quick interactive check and fix
python fix_chords.py

# Show current chord status
python -m rhesis.backend.tasks.execution.chord_monitor status

# Check for stuck chords (>1 hour)
python -m rhesis.backend.tasks.execution.chord_monitor check --max-hours 1

🔧 Fix Issues

# Dry run - see what would be revoked
python -m rhesis.backend.tasks.execution.chord_monitor revoke --max-hours 1 --dry-run

# Actually revoke stuck chords
python -m rhesis.backend.tasks.execution.chord_monitor revoke --max-hours 1

# Emergency: purge all tasks (dangerous!)
python -m rhesis.backend.tasks.execution.chord_monitor clean --force

🔍 Inspect Specific Chord

# Get details about a specific chord
python -m rhesis.backend.tasks.execution.chord_monitor inspect <chord-id>

# Get verbose details with subtasks
python -m rhesis.backend.tasks.execution.chord_monitor inspect <chord-id> --verbose

Common Workflows

Daily Health Check

python fix_chords.py

When Tests are Stuck

# 1. Check status
python -m rhesis.backend.tasks.execution.chord_monitor status

# 2. Look for stuck chords
python -m rhesis.backend.tasks.execution.chord_monitor check --max-hours 0.5

# 3. Revoke if needed
python -m rhesis.backend.tasks.execution.chord_monitor revoke --max-hours 0.5
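The three steps above can be chained in a small script. This is a minimal sketch, not part of the rhesis codebase: the helper names (`chord_monitor_cmd`, `run`, `unstick`) are hypothetical, it assumes `check` exits with 1 when it finds stuck chords (as the Return Codes list suggests), and it adds a `--dry-run` preview before the real revoke.

```python
import subprocess
import sys
from typing import Callable, List

MODULE = "rhesis.backend.tasks.execution.chord_monitor"


def chord_monitor_cmd(*args: str) -> List[str]:
    """Build the argv for one chord_monitor invocation."""
    return [sys.executable, "-m", MODULE, *args]


def run(argv: List[str]) -> int:
    """Run a command, let its output through, and return its exit code."""
    return subprocess.run(argv).returncode


def unstick(max_hours: float = 0.5,
            runner: Callable[[List[str]], int] = run) -> bool:
    """Return True if a revoke was issued, False otherwise."""
    hours = str(max_hours)

    # 1. Show current status (informational only).
    runner(chord_monitor_cmd("status"))

    # 2. Assumption: exit code 1 from `check` means stuck chords were found.
    if runner(chord_monitor_cmd("check", "--max-hours", hours)) != 1:
        return False

    # 3. Preview what would be revoked, then revoke for real.
    runner(chord_monitor_cmd("revoke", "--max-hours", hours, "--dry-run"))
    runner(chord_monitor_cmd("revoke", "--max-hours", hours))
    return True
```

Passing the runner as a parameter keeps the escalation logic testable without a live Celery setup; calling `unstick(0.5)` with the default runner executes the real commands.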

Emergency Recovery

# 1. Stop workers
pkill -f celery

# 2. Clean all tasks
python -m rhesis.backend.tasks.execution.chord_monitor clean --force

# 3. Restart workers
celery -A rhesis.backend.worker.app worker --loglevel=INFO &

# 4. Verify
python fix_chords.py

Log Monitoring

# Watch for chord issues
tail -f celery_worker.log | grep -E "(chord_unlock|MaxRetries|ERROR)"

# Count chord_unlock retry entries (a rough indicator of stuck chords)
grep "chord_unlock.*retry" celery_worker.log | wc -l
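Where the counts need to feed a script rather than a terminal, the same two patterns can be applied in Python. This is a sketch under assumptions: the function names `scan` and `scan_file` are hypothetical, and the only thing taken from the source is the pair of grep patterns and the `celery_worker.log` filename.

```python
import re
from typing import Iterable, Tuple

# Same patterns as the grep commands; matching is case-sensitive, like grep.
ISSUE = re.compile(r"chord_unlock|MaxRetries|ERROR")
RETRY = re.compile(r"chord_unlock.*retry")


def scan(lines: Iterable[str]) -> Tuple[int, int]:
    """Return (lines matching the issue pattern, chord_unlock retry lines)."""
    issues = retries = 0
    for line in lines:
        if ISSUE.search(line):
            issues += 1
        if RETRY.search(line):
            retries += 1
    return issues, retries


def scan_file(path: str = "celery_worker.log") -> Tuple[int, int]:
    """Scan a whole log file; undecodable bytes are replaced, not fatal."""
    with open(path, encoding="utf-8", errors="replace") as log:
        return scan(log)
```

A retry count is only a proxy: one chord that retries many times produces many lines, so treat the number as a trend signal rather than a count of distinct stuck chords.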

Return Codes

  • 0: Success / No issues
  • 1: Issues found / Errors
  • 130: Cancelled by user
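These codes make the commands usable from cron jobs or CI. A minimal sketch of how a wrapper might interpret them follows; the names `EXIT_MEANINGS`, `describe_exit`, and `should_alert` are hypothetical, and treating a user cancel (130) as non-alerting is a design assumption rather than documented behavior.

```python
# Meanings copied from the Return Codes list above.
EXIT_MEANINGS = {
    0: "Success / No issues",
    1: "Issues found / Errors",
    130: "Cancelled by user",
}


def describe_exit(code: int) -> str:
    """Translate a chord_monitor exit code into a readable message."""
    return EXIT_MEANINGS.get(code, f"Unexpected return code {code}")


def should_alert(code: int) -> bool:
    """Alert on anything other than success or a deliberate cancel."""
    return code not in (0, 130)
```

In a wrapper, the code would typically come from `subprocess.run([...]).returncode` after invoking one of the commands above.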

Command Options

Option          Description
--max-hours N   Consider chords stuck after N hours
--dry-run       Show what would be done
--json          JSON output
--verbose       Detailed information
--force         Required for destructive operations

Files

  • fix_chords.py - Quick interactive script
  • src/rhesis/backend/tasks/execution/chord_monitor.py - Full monitoring suite
  • celery_worker.log - Worker logs
  • src/rhesis/backend/worker.py - Configuration