Sanyam Mehta

Hi, I'm Sanyam! I'm a graduate student in the Robotics program at the University of Michigan, diving deep into the worlds of computer vision, generative AI, and robotic perception. My goal is to build intelligent systems that can see, understand, and interact with the world in meaningful ways.

Before coming to Michigan, I was a Computer Vision & ML Engineer at Gather AI, a CMU spin-off, where I got to apply these ideas to solve real-world automation challenges in massive warehouses.

Email  /  GitHub  /  LinkedIn


🔬 Research

My current research with Dr. Andrew Owens explores the surprising, emergent capabilities of large generative models.

Point Prompting: Counterfactual Tracking with Video Diffusion Models

Recent advances in video generation have produced powerful diffusion models capable of generating high-quality, temporally coherent videos. We ask whether space-time tracking capabilities emerge automatically within these generators, as a consequence of the close connection between synthesizing and estimating motion. We propose a simple but effective way to elicit point tracking capabilities in off-the-shelf image-conditioned video diffusion models. We simply place a colored marker in the first frame, then guide the model to propagate the marker across frames, following the underlying video’s motion. To ensure the marker remains visible despite the model’s natural priors, we use the unedited video's initial frame as a negative prompt. We evaluate our method on the TAP-Vid benchmark using several video diffusion models. We find that it outperforms prior zero-shot methods, often obtaining performance that is competitive with specialized self-supervised models, despite the fact that it does not require any additional training.


💼 Experience

Computer Vision & ML Engineer @ Gather AI (2021-2024)

For over three years, I worked at a fast-paced, incredibly cool startup where we built the world's first fully autonomous drone-based inventory management system. I was fortunate to be mentored directly by the founders: Dr. Daniel Maturana, a computer vision guru and the inventor of VoxNet, and Dr. Sankalp Arora, who taught me the art of thinking slowly and critically. I got to:

  • ✅Develop a neural network for 3D occupancy inference, helping classify over 1.2 million pallet locations with high accuracy.
  • ✅Analyze and compare monocular SLAM systems to improve drone navigation in complex, dynamic warehouses.
  • ✅Engineer scalable data processing systems and lead cross-cultural data annotation efforts.

Graduate Student Instructor @ University of Michigan (2024-Present)

I've also discovered a love for teaching as a Graduate Student Instructor (GSI) for ROB 550: Robotics Systems Laboratory. It's incredibly rewarding to:

  • 🎓Lead lab sessions and help students tackle hands-on projects, from programming a robotic arm to implementing particle-filter SLAM.
  • 🎓Mentor students and share the passion for robotics that my own mentors instilled in me.

❤️ Why Robotics?

For People: My time at Gather AI opened my eyes to the real-world impact of robotics. I learned that in 2021, slips, trips, and falls were responsible for nearly 38% of accidental deaths in warehouses. Seeing our autonomous drones collect data from dangerous, hard-to-reach places—keeping people safely on the ground—showed me that robotics isn't just about efficiency; it's about protecting lives.
For the Planet: I was heartbroken to learn about the devastation wildfires have caused in California, destroying 20% of the world's giant sequoias. I imagine a future where advanced robots can assist park rangers in protecting these natural wonders. This is the kind of work I dream of contributing to.
For the Fun of It: And sometimes, the reason is simple: robots are just cool. I love the challenge of building things, and the idea of creating intelligent machines that can help people is what gets me excited to learn and create every single day.