- Monitoring Datadog across different environments for various application and server stacks.
- Restarting and rebooting servers and services.
- Manually blocking live streams.
- Checking streams for errors, buffering, and similar issues, and restarting them when needed.
- Escalating issues that need intervention to the relevant resolving team.
- Setting up custom checks for specific areas (error and rebuffer percentage spikes, live sports watch parties, etc.).
- Managing production issues for the DStv Online and BoxOffice services.
- Working closely with all teams to understand new developments and requirements, and to implement new solutions, features, and integrations.
- Updating the Mission Control group while managing escalations.
- Maintaining communication with stakeholders while managing escalations.
- Initiating war rooms for high-priority issues.
- Completing technical root cause analysis documents for high-priority issues.
- Updating, following up on, and resolving incident tickets.
- Capacity planning with third parties for high-profile events.