Engineering Manager, Deep Learning Inference
Job Description:
• Lead, mentor, and scale a high-performing engineering team focused on deep learning inference and GPU-accelerated software
• Drive the strategy, roadmap, and execution of NVIDIA’s inference frameworks engineering
• Partner with internal compiler, libraries, and research teams to deliver end-to-end optimized inference pipelines
• Oversee performance tuning, profiling, and optimization of large-scale models
• Guide engineers in adopting best practices for CUDA, Triton, CUTLASS, and multi-GPU communications
• Represent the team in roadmap and planning discussions
• Foster a culture of technical excellence, open collaboration, and continuous innovation

Requirements:
• MS, PhD, or equivalent experience in Computer Science, Electrical/Computer Engineering, or a related field
• 6+ years of software development experience
• 3+ years in technical leadership or engineering management
• Strong background in C/C++ software design and development
• Proficiency in Python is a plus
• Hands-on experience with GPU programming (CUDA, Triton, CUTLASS)
• Proven record of deploying or optimizing deep learning models in production environments
• Experience leading teams using Agile or collaborative software development practices

Benefits:
• Health insurance
• Comprehensive benefits package