Blog post: deepmind.google/discover/blo...
Tech report: storage.googleapis.com/deepmind-med...
Thrilled to share the launch of Gemini Robotics 1.5! This is a major step for generalist robots, thanks to a new motion transfer mechanism allowing zero-shot skill transfer between embodiments. I’m incredibly proud of our team's key contributions to this effort—a project I was honored to co-lead.
We're very excited to introduce TAPNext: a model that sets a new state of the art for Tracking Any Point in videos by formulating the task as Next Token Prediction. For more, see: tap-next.github.io
Happy to share our new paper on better diffusions with scoring rules!
Check it out at arxiv.org/abs/2502.02483
A common question nowadays: Which is better, diffusion or flow matching? 🤔
Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.
Joint work with Sivaramakrishnan Swaminathan, Rajkumar Vasudeva Raju, J. Swaroop Guntupalli, Wolfgang Lehrach, Joseph Ortiz, Antoine Dedieu, Miguel Lázaro-Gredilla and Kevin Murphy #DiffusionModels #ReinforcementLearning #Robotics #Control 4/4
The disadvantage of MPC is that searching over action trajectories can be slow, so we train another diffusion model (on offline data) that acts as a proposal distribution over action trajectories, then use a simple "sample, score, and rank" (SSR) optimizer. 3/4
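For readers who want the "sample, score, and rank" idea in concrete form, here is a minimal sketch. It is not the D-MPC implementation: the proposal, dynamics, and reward below (`toy_propose`, `toy_rollout`, `toy_reward`) are hypothetical toy stand-ins for the learned diffusion models, and `HORIZON` and `TARGET` are illustrative values only.

```python
import random

def sample_score_rank(propose, rollout, reward, state, num_samples=64, rng=None):
    """Pick the best of num_samples proposed action sequences.

    propose(state, rng) -> action sequence   (stand-in for the proposal model)
    rollout(state, actions) -> state sequence (stand-in for the world model)
    reward(states) -> float                   (task objective, swappable at test time)
    """
    if rng is None:
        rng = random.Random(0)
    # Sample: draw candidate action trajectories from the proposal.
    candidates = [propose(state, rng) for _ in range(num_samples)]
    # Score: predict each trajectory's outcome and evaluate the reward.
    scores = [reward(rollout(state, actions)) for actions in candidates]
    # Rank: sort by score (highest first) and keep the top candidate.
    ranked = sorted(zip(scores, range(num_samples)), reverse=True)
    best_score, best_index = ranked[0]
    return candidates[best_index], best_score

# Hypothetical toy problem: a 1-D point whose actions are displacements.
HORIZON = 5
TARGET = 1.4

def toy_propose(state, rng):
    # Assumption: uniform noise replaces the learned action proposal.
    return [rng.uniform(-0.5, 0.5) for _ in range(HORIZON)]

def toy_rollout(state, actions):
    # Assumption: exact additive dynamics replace the learned world model.
    states = []
    for action in actions:
        state = state + action
        states.append(state)
    return states

def toy_reward(states):
    # Higher is better: end the trajectory as close to TARGET as possible.
    return -abs(states[-1] - TARGET)

best_actions, best_score = sample_score_rank(
    toy_propose, toy_rollout, toy_reward,
    state=0.0, num_samples=256, rng=random.Random(0),
)
```

Because the reward is only an argument to the optimizer, swapping in a different `toy_reward` changes the behavior without retraining anything, which mirrors the "new reward functions on the fly" point in the thread.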
The advantages of MPC (over policy learning) are that it can be trained on suboptimal reward-free data, and can then be used to optimize new reward functions on the fly. Below we ask a 2D walker agent to achieve different target heights - for 1.4m it has to repeatedly jump! 2/4
Hello world! Excited to (re)share from X our new paper on "Diffusion Model Predictive Control" (D-MPC). Key idea: leverage diffusion models to learn a trajectory-level (not just single-step) world model to mitigate compounding errors when doing rollouts. arxiv.org/abs/2410.05364 🧵 1/4