Senior Python Developer – LlamaIndex / RAG Pipeline Engineer
We are seeking a highly skilled Senior Python Developer with expertise in LlamaIndex and RAG pipeline engineering. The ideal candidate will be responsible for designing and implementing efficient data processing pipelines, optimizing workflows, and ensuring seamless integration of LlamaIndex solutions. Strong problem-solving skills and the ability to work collaboratively in a fast-paced environment are essential. If you're passionate about leveraging Python to build scalable solutions, we want to hear from you!

Required Skills & Experience:
- 3+ years Python development with production deployments
- Hands-on experience with LlamaIndex (not just LangChain)
- Vector database implementation (Qdrant, Milvus, or pgvector)
- Document parsing pipelines (Apache Tika, Docling, PyMuPDF, or Unstructured)
- Local LLM inference with llama.cpp or vLLM
- Experience with open-source models (Llama, Mistral, Gemma, or similar)
- GGUF model formats and quantization (Q4, Q8)
- Sentence-transformers or HuggingFace embedding models
- Multi-tenant application architecture
- Async Python (asyncio)

Strong Plus:
- Gemma model family experience
- LoRA/QLoRA fine-tuning
- Custom LlamaIndex NodeParser or BaseReader implementations
- HNSW/IVF index tuning
- Chunking strategy optimization (semantic chunking experience)
- Docker/Kubernetes deployment

Not Required:
- OpenAI API experience (we run fully self-hosted)
- Frontend development