About me
Hello! I am Romit. I love building data-driven tools and products. I have (Current year - 2018)
years of work experience in arranging matrices.
A bit about my experience
- ML engineer at Sarvam.ai, [February 2025-Present]
  - Currently working on pre-training vision encoders for sovereign vision language models.
  - Led post-training quantization and inference optimization of Sarvam-M, a 24B LLM (see Section 4 in the blog).
  - Built a highly fault-tolerant, scalable central store for pre-training datasets using WebDatasets.
  - Worked with TensorRT LLM, vLLM, WebDatasets, and K8s.
- AI engineer at Meraki Labs, [December 2023-January 2025]
  - Pre-trained small audio language models. These models aligned audio and text tokens into the same space, enabling TTS, ASR, and text-to-voice continuation from a single model.
  - Spent time understanding the audio domain and how its modeling works.
  - Developed custom GPU kernels in Triton to accelerate inference and researched inference bottlenecks in the audio domain.
  - Worked with Triton, CUDA C, and PyTorch.
- Senior Data Scientist at epifi, [November 2020-October 2023]
  - Built customer-support call categorization pipelines using Whisper, Mistral 7B, and fine-tuned BERT.
  - Trained and deployed real-time and batch fraud-detection models using computer vision and tree-based methods.
  - Built infrastructure for real-time and batch feature engineering and optimized model deployment.
  - Worked with TensorFlow, Docker, PySpark, Airflow, the EFK stack, and the usual AWS stack.
- Technical Lead at Postman, [January 2018-November 2020]
  - Founding member of the data team; built an analytics platform supporting over 1,000 production dashboards.
  - Architected data testing frameworks and data warehouse table design to power business analytics.
  - Worked with SQL, Redshift, Looker, dbt, and Power BI.
- I was on the team that built the project Panoptic.
Places where I have talked about my work experience
- Gave a talk on efficient inference at the AI Inference Meetup with NVIDIA.
- Gave a demo on audio language models at the AI Tinkerers Bengaluru chapter.
- Gave a workshop at the Fifth Elephant conference on writing efficient Triton GPU kernels.
- Spoke in the 2022 AWS ML Fridays series on fraud detection using Amazon SageMaker.
A bit about what I love doing
- Tech I love working with - Python, Triton, PyTorch, SQL, Docker, FastAPI.
- Areas I am interested in:
  - Systems around LLMs
  - Applied deep learning, especially computer vision
  - Human-in-the-loop decisions
- Reading books, playing cricket, and petting cats.
- Obsessed with calendars, to-dos, tracking every little thing, and shiny productivity apps.