Rishabh Singh

AI Engineer

Gurugram · 1+ yrs exp · 83 · Excellent

About

AI Engineer with 1+ year of experience building end-to-end GenAI systems, including RAG pipelines, LLM fine-tuning, and multi-agent workflows. Strong in backend engineering and scalable architectures, with hands-on experience deploying AI applications on cloud platforms, focusing on performance optimization, cost efficiency, and production reliability.

Skills & Expertise (38)

Python · Advanced · 8.4/10

LoRA · Advanced · 8.4/10

QLoRA · Advanced · 8.4/10

RAG Systems · Advanced · 8.2/10

Hugging Face, AWS, Bedrock, EC2, S3, Docker, GitHub Actions, Helm, Kafka, RabbitMQ, MySQL, Transformers, vLLM, Langfuse, Label Studio, Git, GitHub, Prompt Engineering, C/C++, JavaScript, SQL, Agentic Workflows, CrewAI, LangChain, NLP, Semantic Search, Redis, FastAPI, Flask, REST APIs, GraphQL, Microservices, PostgreSQL, Elasticsearch

Work Experience

Software Engineer

Simpplr

Aug 2025 - Present

Owned end-to-end development of GenAI workflows, from data ingestion, retrieval, and model inference through deployment and monitoring, improving system reliability and production readiness.

Optimized enterprise search and RAG pipelines using hybrid retrieval (BM25 + vector search) and open-source LLMs, reducing query latency by 30–40% and lowering inference costs while improving semantic relevance across large-scale enterprise datasets.

Benchmarked embedding and retrieval strategies with an MTEB-inspired evaluation framework, using automated evaluation pipelines to improve model selection, chunking strategies, and retrieval quality.

Fine-tuned LLMs using LoRA/QLoRA (Unsloth) and served them via vLLM on AWS (Bedrock, EC2, S3), replacing proprietary APIs and reducing costs by 50%+ while improving latency and throughput in production.

Built multi-agent GenAI systems using FastAPI and CrewAI, enabling task decomposition, planning, and tool orchestration; integrated memory, external tools (Slack, Redis), and Kafka-based pipelines for scalable execution.

Implemented CI/CD pipelines using GitHub Actions and Helm-based deployments, enabling automated build-test-deploy workflows; integrated Langfuse for observability, tracing, and performance monitoring.

Collaborated with cross-functional teams to translate business requirements into scalable AI solutions, contributing to production-grade system design and delivery.

Junior Software Engineering Intern

EPAM Systems

Jan 2025 - Jun 2025

Developed and scaled backend microservices using Python (FastAPI, Flask) and SQL/NoSQL databases, improving service reliability and modularity across multiple applications.

Designed and implemented RESTful and GraphQL APIs within a microservices architecture, applying system design principles to enable efficient and scalable service communication.

Built and deployed RAG-based GenAI pipelines using vector search and optimized chunking/embedding strategies; containerized services with Docker to ensure consistent, production-ready deployments.

AI & ML Technology Trainee

NVIDIA via Global Infoventures

Dec 2023 - Jun 2024

Trained and deployed ML models on NVIDIA DGX A100 GPUs, achieving up to 50% faster training cycles through GPU utilization optimizations.

Built end-to-end ML pipelines covering data preprocessing, model training, evaluation, and deployment on GPU infrastructure.