Engineering On Call

Overview

To ensure system reliability and timely response to urgent issues, we’ve established an on-call rotation for the engineering team. This process defines responsibilities and expectations for engineers assigned to on-call duties, promoting accountability, learning, and continuous improvement.

🔄 Rotation Details

  • 📅 Duration: Each on-call shift lasts 1 week
  • 👥 Participants:
    • Primary On-Call Engineer: Main point of contact for any alerts or support issues during the rotation.
    • Secondary On-Call Engineer: Backup support for the primary. The primary may delegate tasks to the secondary if needed (e.g., during time off or when workloads overlap).
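
As an illustration of the rotation above, here is a minimal sketch that generates weekly primary/secondary pairings from a roster. The roster names, the start date, and the rule that the secondary is the next person in the roster are assumptions made for the example; this page does not specify how pairings are chosen.

```python
from datetime import date, timedelta


def build_rotation(engineers, start, weeks):
    """Return one (week_start, primary, secondary) tuple per 1-week shift."""
    if len(engineers) < 2:
        raise ValueError("need at least two engineers for primary and secondary")
    schedule = []
    for i in range(weeks):
        week_start = start + timedelta(weeks=i)
        primary = engineers[i % len(engineers)]
        # Assumption: the secondary is simply the next person in the roster.
        secondary = engineers[(i + 1) % len(engineers)]
        schedule.append((week_start, primary, secondary))
    return schedule


roster = ["alice", "bob", "carol"]  # hypothetical names
schedule = build_rotation(roster, date(2025, 1, 6), weeks=4)
for week_start, primary, secondary in schedule:
    print(week_start.isoformat(), "primary:", primary, "secondary:", secondary)
```

With this pairing rule, each week's secondary becomes the following week's primary, which would give them a week of shadowing before taking over.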

📝 Responsibilities

1️⃣ Primary Engineer

  • Actively monitor and respond to:
    • User reports: support-arc-twilio tickets
    • System alerts: ntfy-alerts
  • Coordinate with relevant teams to resolve high-priority issues within established SLAs.
  • Keep stakeholders informed as needed.
  • Document actions taken and any notable findings.
  • Share a summary of the week’s activities and learnings by end of day (EOD) on the Monday after the on-call week, covering:
    • Summary of handled tickets: volumes, themes, patterns
    • 💡 Learnings and observations: what went well, what was confusing, what was unexpected
    • 🛠️ Fixes or follow-ups: e.g., PRs, eng handbook updates
    • 🚀 Proactive work suggestions: how to prevent similar issues, automation ideas, tooling gaps
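
The weekly summary described above could be captured in a small structure like the following sketch. The class name, field names, and example values are hypothetical, not an established template; the due date is computed as the first Monday after the last day of the on-call week.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class WeeklySummary:
    engineer: str
    week_end: date  # last day of the on-call week
    tickets: list = field(default_factory=list)      # volumes, themes, patterns
    learnings: list = field(default_factory=list)    # what went well, confusing, unexpected
    follow_ups: list = field(default_factory=list)   # PRs, eng handbook updates
    suggestions: list = field(default_factory=list)  # prevention, automation, tooling gaps

    def due_date(self):
        """Return the first Monday strictly after the on-call week ends."""
        days_ahead = (7 - self.week_end.weekday()) % 7 or 7
        return self.week_end + timedelta(days=days_ahead)

    def missing_sections(self):
        """Return the names of summary sections that are still empty."""
        sections = ("tickets", "learnings", "follow_ups", "suggestions")
        return [name for name in sections if not getattr(self, name)]


summary = WeeklySummary(
    engineer="alice",  # hypothetical name
    week_end=date(2025, 1, 12),
    tickets=["example ticket theme"],  # placeholder entry
)
print("Due:", summary.due_date().isoformat())
print("Still to fill in:", summary.missing_sections())
```

Checking `missing_sections()` before sharing is one way to make sure all four parts of the summary are covered.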

2️⃣ Secondary Engineer

  • Stay informed and available during the week in case backup is required.
  • Assist with ticket resolution or alerts if the primary delegates a task.
  • Optionally shadow the process for ramp-up or cross-training.

📋 Ticket Management

📘 Feedback - Responder Guide