About the Role
Senior Site Reliability Engineer – AI Operations
TechInsights is building the reliability and AI operations foundation for its next chapter: an AI-first intelligence platform that runs the most demanding semiconductor intelligence workflows in the world. We’re looking for a Senior Site Reliability Engineer who owns that foundation.
As a senior individual contributor at the technical leadership tier, you will own strategic reliability initiatives end‑to‑end: setting technical direction, defining SLOs and error budgets across our production platform, designing reliability patterns for AI agent pipelines, and enabling our development and AI Engineering teams to build and ship with confidence.
Platform Reliability & AI Operations
Own SLOs, SLIs, and error budgets for all production services; drive error budget discipline across engineering.
Design reliability patterns for AI agent pipelines: LLM observability, tool‑use tracking, failure detection, and g...