About the Role
Join NVIDIA as a Senior Software Engineer in AI Inference and apply your expertise to improve LLM serving technologies, partnering with customers to get the best performance from their AI deployments.
This role combines hands-on engineering with direct customer interaction. You will design and run benchmarking campaigns on Kubernetes and Slurm, tune vLLM for performance, and document and communicate your findings as actionable insights.
Key Responsibilities:
• Partner with customers on LLM serving projects
• Conduct benchmarking across GPU clusters
• Optimize vLLM deployments for performance and efficiency
• Create tools and workflows to boost team effectiveness
• Clearly communicate technical assessments and solutions
Requirements:
• Bachelor’s degree or higher in Computer Science, or equivalent
• 5+ years of experience with complex software systems
• Experience with LLM inference, especially vLLM
• Proficient in container orchestration (Kubernetes)...