About the Role
Responsibilities:
- Monitor and Maintain Systems: Ensure the availability, performance, and reliability of our production environment by monitoring system health and responding to incidents.
- Automation: Develop and implement automation tools to reduce manual intervention and improve system efficiency.
- Collaboration: Work closely with development teams to design and implement scalable and reliable systems.
- Performance Tuning: Analyze system metrics to identify performance bottlenecks and optimize the affected components.
- Incident Management: Lead incident response efforts, conduct root cause analysis, and implement preventive measures.
- Documentation: Create and maintain comprehensive documentation for system architecture, processes, and procedures.
- Capacity Planning: Conduct capacity planning and en...