
Stannis Zhou

@stanniszhou

Research Scientist at Google DeepMind · stanniszhou.github.io

1,188 Followers · 219 Following · 7 Posts · Joined 19.11.2024

Latest posts by Stannis Zhou @stanniszhou

Gemini Robotics 1.5 brings AI agents into the physical world We’re powering an era of physical agents — enabling robots to perceive, plan, think, use tools and act to better solve complex multi-step tasks.

Blog post: deepmind.google/discover/blo...

Tech report: storage.googleapis.com/deepmind-med...

26.09.2025 15:00 👍 0 🔁 0 💬 0 📌 0

Thrilled to share the launch of Gemini Robotics 1.5! This is a major step for generalist robots, thanks to a new motion transfer mechanism allowing zero-shot skill transfer between embodiments. I’m incredibly proud of our team's key contributions to this effort—a project I was honored to co-lead.

26.09.2025 15:00 👍 1 🔁 0 💬 1 📌 0

We're very excited to introduce TAPNext: a model that sets a new state of the art for Tracking Any Point in videos by formulating the task as Next Token Prediction. For more, see: tap-next.github.io

09.04.2025 14:04 👍 23 🔁 9 💬 1 📌 0

Happy to share our new paper on training better diffusion models with scoring rules!

Check it out at arxiv.org/abs/2502.02483

06.02.2025 05:24 👍 5 🔁 1 💬 1 📌 0

A common question nowadays: Which is better, diffusion or flow matching? 🤔

Our answer: They’re two sides of the same coin. We wrote a blog post to show how diffusion models and Gaussian flow matching are equivalent. That’s great: It means you can use them interchangeably.

02.12.2024 18:45 👍 255 🔁 58 💬 6 📌 7
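The equivalence can be sketched as follows (notation mine; a hedged summary of the standard correspondence, not necessarily the blog post's exact derivation). Both frameworks build the same family of Gaussian interpolations between data and noise:

```latex
% Diffusion forward process / Gaussian probability path:
z_t = \alpha_t x + \sigma_t \epsilon, \qquad \epsilon \sim \mathcal{N}(0, I).
% Flow matching with the linear (rectified-flow) path is the special case
% \alpha_t = 1 - t,\ \sigma_t = t, trained to regress the velocity
u_t(z) = \mathbb{E}[\epsilon - x \mid z_t = z].
% For this schedule the velocity and the score are affinely related:
u_t(z) = -\frac{z + t\,\nabla_z \log p_t(z)}{1 - t},
```

so the flow-matching ODE is the diffusion model's probability-flow ODE under a reparameterization of time and schedule, which is why samplers and training recipes can be moved between the two views.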

Joint work with Sivaramakrishnan Swaminathan, Rajkumar Vasudeva Raju, J. Swaroop Guntupalli, Wolfgang Lehrach, Joseph Ortiz, Antoine Dedieu, Miguel Lázaro-Gredilla and Kevin Murphy #DiffusionModels #ReinforcementLearning #Robotics #Control 4/4

23.11.2024 04:33 👍 0 🔁 0 💬 0 📌 0

The disadvantage of MPC is that searching over action trajectories can be slow, so we train another diffusion model (on offline data) that acts as a proposal distribution over action trajectories, then use a simple "sample, score, and rank" (SSR) optimizer. 3/4

23.11.2024 04:33 👍 0 🔁 0 💬 1 📌 0
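A minimal sketch of the "sample, score, and rank" idea, with toy Gaussian stand-ins for the learned diffusion proposal and dynamics model (all function names and dynamics here are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_action_trajectories(n, horizon, action_dim):
    """Stand-in for the diffusion action proposal: n candidate trajectories."""
    return rng.normal(size=(n, horizon, action_dim))

def rollout_dynamics(state, actions):
    """Stand-in for the learned multi-step dynamics model: toy linear dynamics."""
    states = [state]
    for a in actions:
        states.append(states[-1] + 0.1 * a)
    return np.stack(states[1:])

def score(states, target):
    """Higher is better: negative distance of the final state to a target."""
    return -np.linalg.norm(states[-1] - target)

def ssr_plan(state, target, n_samples=256, horizon=16, action_dim=2):
    """Sample candidates, score their predicted rollouts, return the best."""
    candidates = sample_action_trajectories(n_samples, horizon, action_dim)
    scores = [score(rollout_dynamics(state, acts), target) for acts in candidates]
    return candidates[int(np.argmax(scores))]

state = np.zeros(2)
target = np.ones(2)
best = ssr_plan(state, target)
```

In the paper's setting the proposal and dynamics are diffusion models and only the first action of the chosen trajectory is executed before replanning, MPC-style.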

The advantages of MPC (over policy learning) are that it can be trained on suboptimal reward-free data, and can then be used to optimize new reward functions on the fly. Below we ask a 2d walker agent to achieve different target heights - for 1.4m it has to repeatedly jump! 2/4

23.11.2024 04:33 👍 0 🔁 0 💬 1 📌 0
Diffusion Model Predictive Control We propose Diffusion Model Predictive Control (D-MPC), a novel MPC approach that learns a multi-step action proposal and a multi-step dynamics model, both using diffusion models, and combines them for...

Hello world! Excited to (re)share from X our new paper on "Diffusion Model Predictive Control" (D-MPC). Key idea: leverage diffusion models to learn a trajectory-level (not just single-step) world model to mitigate compounding errors when doing rollouts. arxiv.org/abs/2410.05364 🧵 1/4

23.11.2024 04:33 👍 48 🔁 3 💬 1 📌 0
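A toy illustration (my construction, not from the paper) of why single-step rollouts compound errors while a trajectory-level model does not: when a one-step model's prediction is fed back into itself, each step inherits all previous errors, whereas a model that predicts the whole trajectory jointly carries only per-step error.

```python
import numpy as np

rng = np.random.default_rng(1)
horizon, eps = 50, 0.05

# True dynamics: a simple linear system s_{t+1} = A s_t.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
s0 = np.array([1.0, 0.0])

true_traj = [s0]
for _ in range(horizon):
    true_traj.append(A @ true_traj[-1])
true_traj = np.stack(true_traj)

# Single-step model: a small error each step, fed back into the rollout.
s = s0.copy()
single_err = []
for t in range(horizon):
    s = A @ s + eps * rng.normal(size=2)  # model error re-enters the state
    single_err.append(np.linalg.norm(s - true_traj[t + 1]))

# Trajectory-level model: predicts all steps jointly, so each step carries
# only its own error, not its predecessors'.
joint_err = [np.linalg.norm(eps * rng.normal(size=2)) for _ in range(horizon)]
```

Averaged over the horizon, the recursive single-step error grows well beyond the flat per-step error of the joint prediction, which is the failure mode D-MPC's trajectory-level world model is meant to mitigate.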