[Remote] Software Engineer, Inference - Multi Modal
Note: This is a remote job open to candidates in the USA.

OpenAI is an AI research and deployment company dedicated to ensuring that general-purpose artificial intelligence benefits all of humanity. They are looking for a software engineer to help serve OpenAI's multimodal models at scale, focusing on building reliable infrastructure for real-time audio and image processing.

Responsibilities
• Design and implement inference infrastructure for large-scale multimodal models
• Optimize systems for high-throughput, low-latency delivery of image and audio inputs and outputs
• Enable experimental research workflows to transition into reliable production services
• Collaborate closely with researchers, infra teams, and product engineers to deploy state-of-the-art capabilities
• Contribute to system-level improvements including GPU utilization, tensor parallelism, and hardware abstraction layers

Skills
• Experience building and scaling inference systems for LLMs or multimodal models
• Experience with GPU-based ML workloads and an understanding of the performance dynamics of large models, especially with complex data like images or audio
• Enjoyment of experimental, fast-evolving work and close collaboration with research
• Comfort with systems that span networking, distributed compute, and high-throughput data handling
• Familiarity with inference tooling like vLLM, TensorRT-LLM, or custom model-parallel systems
• End-to-end ownership of problems and enthusiasm for operating in ambiguous, fast-moving spaces
• Experience working with image generation or audio synthesis models in production
• Exposure to distributed ML training or system-efficient model design

Company Overview
• OpenAI is an AI research and deployment company that develops advanced AI models, including ChatGPT. It is a sub-organization of the OpenAI Foundation. It was founded in 2015 and is headquartered in San Francisco, California, USA, with a workforce of 201-500 employees.

Company H1B Sponsorship
• OpenAI has a track record of offering H1B sponsorships, with 1 in 2025, 1 in 2024, 1 in 2023, 18 in 2022, 10 in 2021, and 6 in 2020. Please note that this does not guarantee sponsorship for this specific role.