Site Reliability Engineer (Winnipeg)

📍 Location
Winnipeg
⏰ Job Type
Full-time
📅 Posted
May 30, 2026

About the Role

L1 Site Reliability Engineer responsible for monitoring, triaging, and executing standard operational tasks across enterprise applications
Supports Kubernetes, APIs, WAF, databases, API gateways (Gloo, Apigee), Kafka, and multi-cloud environments (AWS/Azure/GCP)
First line of defense for incident detection, troubleshooting, and escalation using runbooks and automation

Key Responsibilities

Monitoring & Infrastructure

Monitor systems using Grafana, Datadog, Splunk, Prometheus, and AIOps tools
Detect anomalies and follow alert workflows for resolution or escalation
Validate Kubernetes issues using monitoring dashboards and logs

Runbook Execution

Follow predefined runbooks for incident resolution
Restart services, validate system health, and escalate when procedures fail
Ensure adherence to operational standards
Perform initial incident triage and severity classification
Collect logs, metrics, and system data for analysis
Communicate clearly with stakeholders and ...
