Senior DevOps / MLOps Engineer
About the position

As the leader in animal health, Zoetis is looking to recruit a Senior DevOps/MLOps Engineer into its world-class Veterinary Medicine Research and Development (VMRD) organization to operationalize AI/ML, scientific modeling, and digital twin workloads. You’ll build secure, scalable platforms and data pipelines across cloud and on‑prem/HPC, partnering closely with biologists and data scientists to translate scientific questions into reliable production systems.

Responsibilities

• Build end‑to‑end DevOps/MLOps foundations: CI/CD for code/data/models, containerization/orchestration, artifact/registry management, and secure configuration.
• Design and operate data engineering pipelines (batch/streaming) with data quality checks, lineage, schema contracts, and governance across lake/warehouse environments.
• Productionize scientific and digital twin workflows into services/APIs and lightweight UIs with reproducibility, versioning, auditability, and compliant deployment.
• Implement scalable training/inference (batch/real‑time) with observability, SLIs/SLOs, runbooks, incident response, and automated rollback strategies.
• Run distributed/HPC jobs (including GPU) and optimize storage, throughput, and cost across on‑prem and cloud; collaborate with scientists on experiment design, data/compute needs, and validation.

Requirements

• PhD in a quantitative field (computer science, ML, computational biology, applied math) or MS/BS with equivalent senior engineer level experience working in a scientific domain.
• 6+ years building production systems; strong software engineering fundamentals.
• Expert in Python.
• Strong experience with a query language such as SQL, MapReduce, and/or Cypher.
• Proficiency in one of: C++, Go, Rust, Java, or Scala.
• Docker, Kubernetes, CI/CD (e.g., GitHub Actions), secure artifact/container registries.
• Data pipeline orchestration (e.g., Databricks, Dagster, Kedro); streaming (Kafka or Redis); data modeling with SQL/NoSQL/graph.
• MLOps: experiment tracking and model versioning (e.g., MLflow), model serving and monitoring.
• Cloud (AWS/Azure/GCP) and on‑prem/HPC (e.g., Slurm) experience.
• Experience on multidisciplinary projects and teams, including scientists and software engineers, with excellent communication with scientific stakeholders.

Nice-to-haves

• APIs and scientific apps: FastAPI; minimal UIs (Streamlit/React); scientific computing (NumPy, Pandas, SciPy).
• DevOps/IaC: Terraform; GitOps (Argo CD/Flux); Helm/Kustomize; Docker/Kubernetes; secure registries and config.
• Data engineering: dbt and feature stores; Parquet/Delta; schema/lineage with Avro/Protobuf, OpenLineage, Great Expectations.
• Observability/SRE: Prometheus/Grafana; ELK/OpenSearch; OpenTelemetry; SLIs/SLOs and performance profiling/optimization.
• Distributed compute and resilience: Dask, Ray, Spark; HPC/Slurm; GPU scheduling; service mesh (Istio/Linkerd), API gateways, ingress; encryption/secrets/KMS, audit trails, backup/restore, DR.

Benefits

• We offer a competitive and comprehensive benefits package, which includes healthcare, dental coverage, and retirement savings benefits along with paid holidays, vacation and disability insurance.