🧠 Questions and Answers
Q: 🚁 Teleoperating drones is hard—how did you collect enough data?
We didn't directly teleoperate drones. Instead, we leveraged the Universal Manipulation Interface (UMI) — a lightweight handheld gripper that lets humans record diverse manipulation demonstrations without using any robot hardware. This decouples data collection from specific embodiments, making it possible to gather large, in-the-wild datasets quickly while keeping the action space consistent across robots.
Q: 🤖 Isn't this just UMI-on-Legs but with a drone controller?
Not quite. UMI-on-Air introduces ✨ embodiment-aware guidance ✨, where the low-level controller (like MPC for drones) provides gradient feedback during the diffusion sampling process. This two-way communication lets the policy adapt its trajectories to the dynamics of the embodiment in real time — something even the UR10e arm benefits from. So while UMI-on-Legs made UMI policies mobile, UMI-on-Air makes them embodiment-adaptive.
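To make the guidance idea concrete, here is a minimal toy sketch in Python. It is not the EADP implementation from the paper. Everything in it is an illustrative assumption: `toy_denoiser` stands in for a trained diffusion policy, a squared-acceleration penalty stands in for a real MPC tracking cost, and the update rule is a simplified annealed loop rather than a faithful DDPM/DDIM sampler. It only shows the general pattern of nudging each sampling step down the gradient of a controller-side cost.

```python
import numpy as np


def second_difference_matrix(num_waypoints):
    """Rows compute x[t] - 2*x[t+1] + x[t+2], a discrete acceleration."""
    D = np.zeros((num_waypoints - 2, num_waypoints))
    for t in range(num_waypoints - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]
    return D


def tracking_cost(traj, D):
    """Toy stand-in for a controller tracking cost: 0.5 * sum of squared accelerations."""
    accel = D @ traj
    return 0.5 * float(np.sum(accel ** 2))


def tracking_cost_grad(traj, D):
    """Analytical gradient of tracking_cost with respect to the waypoints."""
    return D.T @ (D @ traj)


def toy_denoiser(noisy_traj, demo):
    """Hypothetical stand-in for a learned policy: always predicts the demo trajectory."""
    return demo


def sample(demo, guidance_scale, num_steps=50, step_size=0.3, seed=0):
    """Simplified annealed sampling loop with optional cost-gradient guidance."""
    rng = np.random.default_rng(seed)
    D = second_difference_matrix(len(demo))
    traj = rng.standard_normal(demo.shape)  # start from pure noise
    for k in reversed(range(num_steps)):
        predicted_clean = toy_denoiser(traj, demo)
        grad = tracking_cost_grad(traj, D)
        noise_scale = 0.05 * k / num_steps  # anneals to zero at the last step
        traj = (
            traj
            + step_size * (predicted_clean - traj)  # move toward the policy's prediction
            - guidance_scale * grad                 # move toward lower controller cost
            + noise_scale * rng.standard_normal(demo.shape)
        )
    return traj


def make_zigzag_demo(num_waypoints=16):
    """A jerky 2D demonstration that an agile arm could follow but a drone could not."""
    t = np.arange(num_waypoints)
    x = np.linspace(0.0, 1.0, num_waypoints)
    y = 0.2 * (-1.0) ** t
    return np.stack([x, y], axis=1)


demo = make_zigzag_demo()
D = second_difference_matrix(len(demo))
unguided = sample(demo, guidance_scale=0.0)
guided = sample(demo, guidance_scale=0.04)
print("unguided cost:", tracking_cost(unguided, D))
print("guided cost:  ", tracking_cost(guided, D))
```

In this toy setup, the unguided sample reproduces the jerky demonstration, while the guided sample trades some fidelity for a trajectory with a lower acceleration cost. A real system would replace the acceleration penalty with the gradient of the embodiment's own controller cost.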
Q: ⚙️ What can't the Embodiment-Aware Diffusion Policy (EADP) do?
EADP currently relies on analytical gradients from model-based controllers (e.g., MPC), which limits its use to systems with known dynamics. However, recent progress in learning-based control — especially reinforcement learning for legged and whole-body systems (e.g., UMI-on-Legs, OmniH2O) — suggests an exciting path forward 🚀. Future versions could use learned or RL-based controllers to provide the same kind of guidance, making EADP more general and scalable.
Q: 🤔 What does "UMI-ability" mean?
"UMI-ability" measures how well a robot can execute trajectories produced by a UMI-trained policy. Fixed-base arms are highly UMI-able (they can closely follow human demonstrations), while drones and legged robots face dynamics and control constraints that make them less UMI-able. UMI-on-Air helps bridge this gap by steering trajectories toward what is feasible for each embodiment.
The term actually came up when Prof. Guanya Shi was describing why some embodiments "just listen better" to UMI policies than others. It stuck, and later Huy Ha decided to formalize it in the paper as a measurable notion.