Looking for Expert NLP/ML Engineer for Language Translation Model Training (Indic Languages)

Remote Full-time
Project Description:
I am looking to hire an experienced NLP/ML engineer to train high-quality machine translation models for Indic languages. The goal is to develop single language-pair models, such as:
● English → Telugu
● English → Hindi
(and additional language pairs, if needed)

You may choose the most suitable model architecture based on your expertise (e.g., mBART, mT5, NLLB fine-tuning, Transformer variants, etc.), as long as the final models deliver strong translation quality.

Dataset:
You can use the AI4Bharat datasets, including:
● Samanantar
● BPCC
● Other open Indic parallel corpora

Scope of Work:
The freelancer will be responsible for:

1. Data Handling
● Cleaning, filtering, and preprocessing datasets
● Sentence alignment (if needed)
● Tokenization and vocabulary preparation (SentencePiece/BPE/etc.)

2. Model Training
● Selecting an appropriate model architecture
● Training single language-pair translation models
● Implementing best practices for training efficiency (FP16, gradient accumulation, etc.)
● Hyperparameter tuning
● Checkpoint management and monitoring

3. Evaluation
● Compute BLEU, SacreBLEU, and other relevant metrics
● Provide side-by-side qualitative translation samples
● Benchmarking against baseline models

4. Delivery
● Final trained model weights
● Inference scripts (Python) for quick testing
● Instructions for running and continuing training
● Documentation of preprocessing and training pipeline
● Optional: Dockerfile or virtual environment setup

Requirements:
The ideal candidate should have:
● Strong experience in NLP, Transformers, and neural MT models
● Prior work with Indic languages (big plus)
● Experience with training libraries such as PyTorch, Hugging Face Transformers, Fairseq, OpenNMT, or similar
● Ability to handle large-scale training and dataset preprocessing
● Familiarity with SentencePiece, tokenization strategies, and MT evaluation metrics
● Ability to deliver clean, well-documented code

Additional Notes:
● Compute resources can be discussed (I can provide compute, or you can use yours).
● More language pairs may be added later as separate follow-up projects.
● Quality of translation is the highest priority.