About the job
Join Nagarro as a Senior Staff Engineer specializing in Python and LLMs!
Are you ready to take your career to the next level? At Nagarro, we are looking for a talented engineer with at least 7.5 years of experience to design and optimize applications powered by cutting-edge Large Language Models (LLMs). Your expertise in Python backend development and LLM engineering will be pivotal in delivering innovative solutions.
Key Responsibilities:
- Design, implement, and enhance LLM-powered applications using leading proprietary and open-source models.
- Develop advanced prompt engineering strategies and structured output pipelines.
- Build Retrieval-Augmented Generation (RAG) pipelines with hybrid search and custom retrieval strategies.
- Contribute to the development of multi-agent systems and autonomous AI workflows.
- Fine-tune and serve foundation models using LoRA/QLoRA and modern inference engines.
- Deploy and scale LLM workloads on GPU/TPU-based systems.
- Integrate multimodal models across text, images, audio, and video.
- Establish evaluation pipelines for hallucination detection, factual-accuracy checks, and quality scoring.
- Implement safety policies and moderation guardrails for AI systems.
- Build robust backend systems using FastAPI, microservices, and event-driven architectures.
- Optimize backend performance and reliability.
- Develop ingestion pipelines for document processing and semantic indexing.
- Deploy AI systems on cloud platforms and manage Kubernetes inference clusters.
- Establish CI/CD processes, automated testing, and model versioning.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or a related field.
- At least 7.5 years of hands-on experience in LLM engineering and Python backend development.
- Expertise in LLM application frameworks, prompt engineering, and FastAPI.
- Proven track record in building and deploying applications using advanced LLMs (e.g., GPT-4/5, Claude, Gemini).
- Strong experience with RAG pipelines, embeddings, prompt engineering, and multi-agent systems.
- Deep knowledge of model fine-tuning techniques like LoRA, QLoRA, PEFT, and adapters.
- Familiarity with vector databases such as Pinecone, Weaviate, Milvus, and FAISS.
- Strong understanding of cloud platforms (AWS, GCP, Azure) and container orchestration with Kubernetes.
- Excellent communication, collaboration, and problem-solving skills.

