Mid-Level Computer Vision Engineer
ABOUT SPORTFX
• SportFX is a high-performance, fully remote startup democratizing elite sports analytics through AI, serving everyone from youth leagues to professional franchises with the same world-class technology.
• We combine deep sports expertise with cutting-edge computer vision and ML to deliver real-time video analysis, personalized coaching insights, and data-driven performance optimization.
• Culture: extreme ownership, radical transparency, and collaborative excellence. We operate with high-trust autonomy where smart people solve hard problems together: no egos, no politics, just results.

ABOUT THE ROLE: MID-LEVEL COMPUTER VISION ENGINEER
You'll be the execution engine for our CV team, taking architectural decisions from the Senior CV Engineer and turning them into production-ready models through systematic experimentation, dataset work, and rigorous evaluation.
This is a hands-on role focused on model training, data quality, and iterative improvement. While the Senior CV Engineer sets technical direction, you'll own the implementation details that take models from 90% to 98% accuracy. You'll run experiments, clean datasets, fine-tune models for edge cases, and build the augmentation pipelines that make our systems robust to real-world video conditions.

What You'll Own
• Model Training & Fine-Tuning: Run training experiments for pose estimation, object detection, and temporal segmentation models. Fine-tune on challenging scenarios: extreme camera angles, poor lighting, heavy motion blur, and partial occlusion from equipment or other players.
• Dataset Management: Curate and clean training datasets across all sports. Generate annotations using Label Studio, validate label quality, and identify systematic errors. Build dataset analysis tools to understand where models fail.
• Data Augmentation Pipelines: Implement augmentation strategies that simulate real-world conditions: motion blur, lighting variation, synthetic occlusion, and background clutter. Integrate synthetic data from Blender into training pipelines.
• Systematic Evaluation: Compute comprehensive metrics across test sets. Generate detailed performance reports identifying failure modes by sport, scenario, and video quality. Propose targeted experiments to fix systematic issues.
• Temporal Smoothing & Post-Processing: Implement filtering and interpolation to improve tracking consistency and pose stability without degrading latency. Handle outlier detection and trajectory correction.
• Experiment Infrastructure: Build tools that make experimentation faster: training scripts, evaluation pipelines, visualization tools, and dataset splitting strategies. Enable rapid iteration cycles.

Your Environment
• Tech Stack: PyTorch, YOLO family, MediaPipe or similar pose frameworks, OpenCV, Label Studio for annotation, pandas for analysis, W&B for experiment tracking
• Pipeline: Work within the established CV architecture set by the Senior CV Engineer, with a focus on execution and systematic improvement
• Team: Report to the Senior CV Engineer, collaborate with the Blender Engineer on synthetic data, and coordinate with DevOps on deployment and infrastructure
• Workflow: Experiment-driven development with clear metrics, weekly progress reviews with the Senior CV Engineer, and iterative improvement cycles with quantitative validation
• Location: Global (4+ hour overlap with US Central Time required)

SKILLS & EXPERIENCE
Must-Have
• 2-4 years of hands-on CV/ML experience with production models
• Strong PyTorch fundamentals: you can debug training issues, tune hyperparameters, and implement custom losses
• Solid understanding of modern CV architectures (YOLO, transformers, CNNs)
• Experience with object detection or pose estimation: you've trained models and shipped improvements
• Data augmentation expertise with libraries like albumentations or imgaug
• Strong Python skills with clean, testable code
• Ability to run systematic experiments and analyze results rigorously

Nice-to-Have
• Experience with tracking algorithms (ByteTrack, DeepSORT, OC-SORT)
• Background in sports or understanding of athletic movement
• Familiarity with model optimization (ONNX, TensorRT)
• Experience with annotation tools and labeling workflows (Label Studio, CVAT)
• Understanding of signal processing for temporal filtering
• Prior work with small object detection or motion blur handling
• Some understanding of 3D computer vision concepts

What Sets You Apart
You're methodical and detail-oriented. You don't just run experiments randomly; you form hypotheses, test them systematically, and analyze results to guide next steps. You understand that improving accuracy from 90% to 95% requires careful dataset work and targeted experimentation, not just trying different architectures. You take ownership of problems end-to-end and follow through until they're actually fixed in production.

Job Types: Full-time, Part-time, Contract
Pay: $75,000.00 - $120,000.00 per year
Benefits:
• Paid time off

Application Question(s):
• Describe a real training issue you've encountered when fine-tuning an object detection or pose model (e.g., YOLO, HRNet, MediaPipe, RT-DETR). What were the symptoms, how did you diagnose the root cause, and what specific steps did you take to fix it?
• Give an example of how you improved model performance through dataset cleaning or a custom augmentation pipeline. Explain the before/after metrics and why your changes worked.
• Pick one of your past CV projects and walk us through how you evaluated it. Which metrics did you use (mAP, OKS, IDF1, tracking metrics), what failure modes did you identify, and how did you propose fixing them?
• Share links to 1–3 of your past computer vision projects (GitHub, portfolio, video demos, Kaggle notebooks, papers, repos, etc.)

Work Location: Remote