Hello, I'm
Rohan Patil
Building AI Systems
That Scale
AI/ML Engineer with experience at Perplexity and Amazon, building production-grade LLM pipelines, RAG systems, and distributed ML infrastructure for real-world high-scale environments.
How I build systems
Selected Projects
Adaptive RAG Chatbot
Built an adaptive Retrieval-Augmented Generation (RAG) system that improves LLM reliability by dynamically selecting retrieval strategies based on query complexity. The system balances latency and accuracy by routing simple queries through lightweight retrieval while applying deeper contextual search and re-ranking for complex queries.
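The routing idea can be illustrated with a minimal sketch. Everything here is an assumption for illustration, not the project's actual code: the cue-word list, the `complexity_score` heuristic, the `RetrievalPlan` fields, and the threshold and `top_k` values are all hypothetical. A production router might use a learned classifier instead of a hand-written heuristic.

```python
from dataclasses import dataclass

# Hypothetical cue words hinting at a comparative or multi-step question.
COMPLEX_CUES = {"compare", "why", "versus", "vs", "tradeoff", "explain", "difference"}


@dataclass
class RetrievalPlan:
    strategy: str  # "light" or "deep"
    top_k: int     # number of passages to fetch
    rerank: bool   # whether to run a re-ranking pass


def complexity_score(query: str) -> float:
    """Crude heuristic in [0, 1]: longer queries and cue words score higher."""
    tokens = query.lower().replace("?", " ").split()
    length_part = min(len(tokens) / 20.0, 1.0)
    cue_part = 1.0 if any(t in COMPLEX_CUES for t in tokens) else 0.0
    return 0.5 * length_part + 0.5 * cue_part


def plan_retrieval(query: str, threshold: float = 0.5) -> RetrievalPlan:
    """Send simple queries down the cheap path and complex ones down the deep path."""
    if complexity_score(query) >= threshold:
        return RetrievalPlan("deep", top_k=20, rerank=True)
    return RetrievalPlan("light", top_k=3, rerank=False)


if __name__ == "__main__":
    print(plan_retrieval("capital of France?"))
    print(plan_retrieval("Compare FAISS and Redis for hybrid retrieval latency"))
```

Keeping the decision in a small, pure function makes the latency and accuracy tradeoff easy to tune: only the threshold and the two plans change.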
LENS - AI Image Intelligence
Developed a real-time multimodal AI system that processes images and generates contextual outputs across multiple modes including storytelling, humor, and analytical reasoning. The system leverages GPT-4o Vision with streaming responses to deliver an interactive and engaging user experience.
Second Brain - Knowledge Graph
Built an AI-powered system that converts raw notes and text into structured knowledge graphs. The system extracts entities and relationships using LLMs and visualizes them as an interactive force-directed graph, enabling intuitive exploration of complex information.
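As a rough sketch of the step between extraction and visualization, the snippet below turns (subject, relation, object) triples into the nodes-and-links shape that force-directed graph libraries commonly accept. The triple format, the `triples_to_graph` helper, and the sample data are assumptions for illustration; the LLM extraction step itself is not shown.

```python
import json


def triples_to_graph(triples):
    """Convert (subject, relation, object) triples into a nodes/links dict."""
    nodes = {}   # dicts keep insertion order, so nodes appear in first-seen order
    links = []
    for subj, rel, obj in triples:
        for name in (subj, obj):
            nodes.setdefault(name, {"id": name})
        links.append({"source": subj, "target": obj, "label": rel})
    return {"nodes": list(nodes.values()), "links": links}


if __name__ == "__main__":
    # Hypothetical triples, standing in for what an LLM extraction step might return.
    sample = [
        ("Ada", "wrote", "Notes"),
        ("Ada", "knows", "Bob"),
    ]
    print(json.dumps(triples_to_graph(sample), indent=2))
```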
Human Activity Recognition System
Built a lightweight human activity recognition system using 2D pose keypoints instead of raw video, enabling efficient sequence modeling with high accuracy. The system leverages temporal patterns using LSTM networks while significantly reducing computational overhead compared to RGB-based approaches.
Driver Drowsiness Detection System
Developed a real-time driver drowsiness detection system that monitors eye states using computer vision and deep learning. The system processes webcam input, detects facial regions, and classifies eye states to trigger alerts when fatigue is detected.
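One common way to turn per-frame eye-state predictions into an alert is to require a run of consecutive "closed" frames, so ordinary blinks do not trigger it. The sketch below assumes that approach; the `DrowsinessMonitor` class and its threshold are hypothetical, and the per-frame classifier is represented only by the boolean it would produce.

```python
class DrowsinessMonitor:
    """Raise an alert after N consecutive frames with eyes classified as closed."""

    def __init__(self, closed_frames_to_alert: int = 15):
        # Assumed threshold; in practice it depends on camera frame rate.
        self.closed_frames_to_alert = closed_frames_to_alert
        self._closed_streak = 0

    def update(self, eyes_closed: bool) -> bool:
        """Feed one frame's classification; return True when an alert should fire."""
        if eyes_closed:
            self._closed_streak += 1
        else:
            self._closed_streak = 0  # any open-eye frame resets the streak
        return self._closed_streak >= self.closed_frames_to_alert


if __name__ == "__main__":
    monitor = DrowsinessMonitor(closed_frames_to_alert=3)
    frames = [True, True, False, True, True, True]
    print([monitor.update(f) for f in frames])
```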
Location Intelligence & Clustering System
Built a location intelligence system that analyzes geospatial and venue data to identify optimal neighborhoods. The system applies clustering techniques to group similar areas and provide insights for decision-making based on data patterns.
Experience

AI/ML Engineer - Perplexity
June 2024 - Present · San Francisco, CA
- Architected RAG pipelines integrating vector search + web indexing.
- Built FAISS + Redis hybrid retrieval improving recall/precision tradeoff.
- Optimized Triton GPU inference → +25% throughput.
- Designed LLM routing (on-device + cloud) for sub-second latency.
- Improved factual consistency via ranking + citation pipelines.
- Built evaluation systems tracking latency, accuracy, UX metrics.
- Led 0→1 agentic AI features → +18% engagement.

AI/ML Engineer - Amazon
Oct 2019 - June 2023 · India
- Built batch + streaming pipelines using AWS, Spark, Kafka.
- Designed feature systems → +30% faster data access.
- Prevented training-serving skew in real-time ML systems.
- Built Kafka + Spark streaming pipelines for low-latency updates.
- Orchestrated ML workflows with Airflow + SageMaker.
- Built drift detection + monitoring datasets.
- Reduced infra cost by ~15% via optimization.