Senior Large Language Model (LLM) Operations Engineer

Remote Full-time
Job Description:
• Architect and spearhead the development of cutting-edge, scalable AI infrastructure, including novel human-in-the-loop (HITL) paradigms, ensuring our systems learn effectively from feedback.
• Lead the technical design and implementation of core MLOps components and systems for our LLMs (including CI/CD, monitoring, and automated feedback loops), ensuring robustness, scalability, and adherence to software engineering best practices.
• Define and shape solutions for complex automation and deployment challenges, enabling the strategic application of our cutting-edge AI.
• Drive technical alignment and integration with AI Data Science and Software Engineering teams, ensuring the seamless transition of AI solutions from research into production environments and influencing architectural standards.
• Define and establish standards for the rigorous validation, monitoring, and lifecycle management of AI products, ensuring continuous accuracy improvement and reliability in production.
• Define, champion, and drive adoption of best practices for MLOps, including model/data versioning, experiment tracking, and reproducibility within the AI/ML domain; actively mentor others.
• Identify, champion, and integrate state-of-the-art MLOps technologies and frameworks, driving innovation and maintaining our technical edge in AI deployment.
• Provide expert guidance on applying safeguards and protections (HIPAA, privacy laws) to our model deployment and data handling pipelines; champion and uphold the highest compliance, quality, and security standards.

Requirements:
• 3+ years of professional experience in an MLOps, DevOps, or Software Engineering role with a focus on machine learning systems.
• MSc/BSc graduate in engineering, computer science, or a relevant field, with extensive equivalent experience. A PhD is a plus.
• Deep, hands-on expertise in Python and proficiency in modern software development practices.
• Hands-on experience with a major cloud platform (AWS, GCP, or Azure).
• Strong experience with containerization and orchestration technologies (Docker, Kubernetes).
• Proven experience building and maintaining CI/CD pipelines for complex applications (e.g., GitHub Actions, Jenkins), particularly those that include data and model versioning.
• A proven track record of technical leadership and high-impact contributions in building and scaling production machine learning systems.
• Proven ability to independently define, architect, and lead solutions for complex, ambiguous infrastructure problems, clearly articulating business value.
• Demonstrated ability to lead the decomposition of large-scale systems and guide teams in delivering incremental solutions.
• Track record of designing sustainable, reusable, and high-quality code and influencing team/organizational standards.
• Exceptional written, verbal, and presentation skills; ability to influence stakeholders at all levels.
• Recognized technical leader and proactive, strategic thinker who takes end-to-end ownership.
• Generous, Curious, and Humble.

Benefits:
• N-Power Medicine (NPM) offers equity at hire
• Discretionary annual bonus which may be available based on Company performance
• Eligible for company benefits