About the Role

What You Will Do

Operational Support & Incident Management
Provide second-level support for production systems and critical business applications.
Investigate, troubleshoot, and resolve incidents and performance issues.
Perform root cause analysis (RCA) and document findings in a structured manner.

Monitoring, Observability & Automation
Design, implement, and maintain monitoring dashboards.
Improve alert quality and reduce noise through effective threshold and metric design.
Analyze logs, metrics, and system behavior to proactively detect anomalies.
Automate operational processes using Ansible and scripting.

What You Bring

Technical Skills

Operational Mindset & Collaboration
Proven experience in Site Reliability Engineering, DevOps, or second-level production support.
Effective communication skills and the ability to work with cross-functional teams.