Available · Summer 2026 · Industry research

Building world models that move atoms.

I'm Rugved Katole, a PhD candidate at The Ohio State University researching vision-language-action systems and world foundation models for general-purpose robotics. I focus on closing the data gap between simulation and reality.

Explore research → Get in touch

4+ Years in robotics
9 Peer-reviewed publications
$100K Research grant led
5.9× Inference speedup

/ about

From mechanical hands to general-purpose minds.

Trained as a mechanical engineer at BITS Pilani and now a PhD candidate in Computer Science at The Ohio State University, I work at the intersection of foundation models and embodied AI. Previously, at IIT Bombay's TIH and ARMS Lab, I led deployments in autonomous navigation and multi-agent planning.

I care about systems that work outside the lab — robots that handle 40° slips on vineyard slopes, UAV swarms that map fields without choreography, and policies trained from scarce, real-world data.

More about me →

World Foundation Models

Augmenting video data and synthesizing trajectories for sample-efficient policy learning.

Vision-Language-Action

Closing the loop between perception, language, and embodied control across manipulation tasks.

Sim-to-Real Robotics

Photoreal Omniverse digital twins, ROS 2 deployments, and field-tested multi-agent systems.

/ selected work

Research that ships into the real world.

Six projects spanning foundation models, multi-agent autonomy, and embodied AI. Each connects to a deployed system, a published paper, or both.

World Foundation Models × VLA

Augment real-world video datasets for vision-language-action training. Synthesize counterfactual trajectories to make robotic policies sample-efficient.

data min trajectories

gain few-shot

World Models · VLA · Foundation Models

Accelerative Synthetic Data

Diffusion-based filtering and early-exit pipelines that detect inauthentic synthetic videos 9× faster, saving 75% of generation compute.

9× faster filter

75% compute saved

Diffusion · PyTorch · Optimization

Wildlife Digital Twin

Photoreal NVIDIA Omniverse simulation of generative animal behaviors and herd dynamics. Drone algorithms validated before field deployment.

∞ sim hours

↓ field cost

Omniverse · Sim2Real · Conservation

Decentralized AV Intersection

Communication-free, deadlock-free intersection coordination using graph theory and road-marking intent detection across 255 scenarios.

255 scenarios

0 deadlocks

Autonomy · Graph · Planning

Multi-Agent UAV Swarm

CNN + multi-agent RL for heterogeneous UAV crop scouting. Cuts scouting need by 60% and labor cost by 4.8×, and lifts farmer profit by 36%.

60% less scouting

36% profit ↑

MARL · UAV · Agriculture

Priority Patrol Planning

Distributed online patrol with finite-time visit guarantees, balancing priority and non-priority site coverage. Sim-to-real validated.

O(n) scalable

✓ sim2real

Multi-Robot · Coverage · Algorithms

View all 12 projects →

/ publications

Recent peer-reviewed work.

Selected publications across world models, edge computing for conservation, and multi-agent autonomy. Google Scholar →

See all publications →

/ collaborate

Let's build the next embodied agent.

I'm open to industry research roles, collaborations on world models & VLA, speaking, and consulting. Best reached over email or LinkedIn.

Send a message → Download CV