Aayan Yadav

BTech Student
IIT Roorkee

About Me

I am a senior at the Mehta Family School of Data Science and Artificial Intelligence at IIT Roorkee. My current interests include 3D Representations, Reconstruction & SLAM. My aim is to develop data-efficient models that understand the physical world.

Most recently, I interned at AuraML, where I worked on text-to-3D scene generation. In the past, I had the privilege of working with Prof. Justin Johnson and Dr. Karan Desai on Benchmarking Object Detectors with COCO: A New Path Forward, where we refined the annotations of the MS COCO dataset. I have also worked with Prof. Sanjeev Kumar on adversarial finetuning of contrastive models.

I am actively looking for opportunities in the field of 3D computer vision. I am interested in full-time research roles and PhD positions starting Fall 2026. If your work aligns with my interests, please reach out!

News
  • [April 2025]: Joining AuraML as a Research Intern!
  • [July 2024]: COCO-ReM is accepted to ECCV 2024!
  • [December 2023]: Reached finals of Smart India Hackathon 2023!
  • [October 2022]: Joining IIT Roorkee as a bachelor's student.

Publications

Benchmarking Object Detectors with COCO: A New Path Forward

Shweta Singh*, Aayan Yadav*, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai

StegaVision: Enhancing Steganography with Attention Mechanism (Student Abstract)

Abhinav Kumar, Pratham Singla, Aayan Yadav

Provenance Detection for AI-Generated Images: Combining Perceptual Hashing, Homomorphic Encryption, and AI Detection Models

Shree Singhi, Aayan Yadav, Aayush Gupta, Shariar Ebrahimi, Parisa Hassanizadeh

Projects

SLAFCoM: A Study on Loss Functions for Adversarial Finetuning of Contrastive Models

Introduced a Clean Consistency Term in the loss function and experimented with different weights and learning rates to improve adversarial finetuning of contrastive models.

GitHub

Sirius

An agentic RAG system using state-of-the-art techniques such as AdaRAG, PlanRAG, HyDE, SPLADE, MetRAG, and RRF.

GitHub

MedMatcher

Template matching of similar documents in a medical dataset. Fine-tuned a LayoutLMv3 model on a custom medical document dataset using weighted cross-entropy loss and mini-batch gradient descent.

GitHub

Image Captioning Model

Built an image captioning model using transfer learning on the Flickr8k dataset. We fine-tuned a combination of a pretrained InceptionV3 and an LSTM with regularization.

GitHub

Blogs

Dismantling Disentanglement in VAEs

In this blog post, I give a brief introduction to variational autoencoders and then explain how we can achieve disentanglement in the latent space. It is an explanation of this paper.

Read more

Activation Functions

This is a beginner's introduction to activation functions. It was my first-ever blog post, written for the Blogathon organised by DSG IITR!

Read more