
@pablovelagomez

44 Followers · 5 Following · 139 Posts · Joined 03.02.2025

Latest posts by @pablovelagomez

Sharing in Rerun - from web to native viewer: Learn how Rerun's URL-based architecture and open-source design make sharing Physical AI visualizations effortless, from command line to web viewer to native apps.

All possible thanks to the hard work the Rerun team is doing to make sharing easy and accessible! You can see more here

rerun.io/blog/url-sh...

10.10.2025 20:25 👍 0 🔁 0 💬 0 📌 0
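As the post above describes, sharing boils down to handing the web viewer the URL of a hosted .rrd recording. A minimal sketch, assuming the viewer accepts the recording location as a `url` query parameter; the hosted path below is made up for illustration:

```python
from urllib.parse import urlencode

def share_link(rrd_url: str) -> str:
    """Build an app.rerun.io link that opens the given hosted .rrd recording."""
    # Assumption: the web viewer reads the recording location from ?url=...
    return "https://app.rerun.io/?" + urlencode({"url": rrd_url})

# Hypothetical hosted recording, for illustration only:
print(share_link("https://example.com/recordings/hands.rrd"))
```

Anyone with that link then loads the same visualization in the browser, with no local setup required.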
GitHub - rerun-io/annotation-example (branch: dbt)

@rerundotio viewer example here - <app.rerun.io/version/0.2...>

Code found here - <github.com/rerun-io/an...>

10.10.2025 20:25 👍 0 🔁 0 💬 1 📌 0

Consistent progress over time really compounds, and I'm excited by how fast things are improving.

10.10.2025 20:25 👍 0 🔁 0 💬 1 📌 0

The full-body model really excels in exo views and is worth using when you can get a good view of the upper body, while the hands-only model works great given a good bounding box from projecting 3D exo keypoints into egocentric views.

10.10.2025 20:25 👍 0 🔁 0 💬 1 📌 0
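The bounding-box step mentioned above (projecting triangulated 3D exo keypoints into the egocentric camera and taking their 2D extent) can be sketched with a standard pinhole model. A hypothetical illustration, not the author's code:

```python
import numpy as np

def hand_bbox_from_exo_keypoints(
    pts3d_world: np.ndarray,   # (N, 3) keypoints triangulated from exo views
    R: np.ndarray,             # (3, 3) world-to-ego-camera rotation
    t: np.ndarray,             # (3,)  world-to-ego-camera translation
    K: np.ndarray,             # (3, 3) ego camera intrinsics
    pad: float = 20.0,         # pixel padding around the projected points
) -> tuple[float, float, float, float]:
    """Project 3D keypoints into the egocentric view, return (x0, y0, x1, y1)."""
    cam = pts3d_world @ R.T + t           # world -> camera coordinates
    uv = cam @ K.T                        # pinhole projection (homogeneous)
    uv = uv[:, :2] / uv[:, 2:3]           # perspective divide
    x0, y0 = uv.min(axis=0) - pad
    x1, y1 = uv.max(axis=0) + pad
    return float(x0), float(y0), float(x1), float(y1)
```

The padded box can then be fed straight to the hands-only estimator as its crop region.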

I've made lots of improvements to the calibration code and ended up merging the full-body estimator with the hands-only one. Also FINALLY got ego synced and working in the full pipeline.

10.10.2025 20:25 👍 0 🔁 0 💬 1 📌 0
[Video]

From 8 -> 5 -> 4 exocentric cameras, all visualized with @rerundotio. I'm dropping the number of cameras used and collecting my own data to make sure I'm not overfitting to open-source datasets.

10.10.2025 20:25 👍 0 🔁 0 💬 1 📌 0

and the code that made it can be found here - <github.com/rerun-io/an...>

29.09.2025 13:00 👍 0 🔁 0 💬 0 📌 0

View it directly in the @rerundotio webviewer here (I promise it's worth it) - <app.rerun.io/version/0.2...>

29.09.2025 13:00 👍 0 🔁 0 💬 1 📌 0

Still, I'm quite happy with how it's going so far. Currently, I have a reasonable set of datasets to validate, a performant baseline, and an annotation app to correct inaccurate predictions.

From here, the focus will be more on the egocentric side!

29.09.2025 13:00 👍 0 🔁 0 💬 1 📌 0

3. Interacting hands cause lots of issues, and the pipeline is very fragile when there's no clear delineation between the hands

29.09.2025 13:00 👍 0 🔁 0 💬 1 📌 0

Really happy with how it looks so far, but this is far from ideal.

1. Not even close to real time: this 30-second, 8-view sequence took nearly 5 minutes to process on my 5090 GPU
2. 8 views is WAY too many to be scalable; I'm convinced this can be done with far fewer (2 exo + 1 stereo ego)

29.09.2025 13:00 👍 0 🔁 0 💬 1 📌 0

3. Per View 2D keypoint estimation
4. Hand Pose Optimization

At the end of it all, I have a pipeline that takes synchronized videos as input and outputs fully tracked per-view 2D keypoints, bounding boxes, 3D keypoints, and MANO joint angles + hand shape!

29.09.2025 13:00 👍 0 🔁 0 💬 1 📌 0

I want to emphasize that these are not the ground-truth values provided by the wonderful HOCap dataset, but rather from my pipeline that was written from the ground up!

For context, it consists of 4 parts:

1. Exo/Ego camera estimation
2. Hand Shape Calibration

29.09.2025 13:00 👍 0 🔁 0 💬 1 📌 0
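Taken together, the two posts above describe a four-stage chain. A placeholder skeleton of that structure, with every function hypothetical and the real models (camera estimation, WiLoR-style keypoints, MANO fitting) stubbed out:

```python
def estimate_cameras(videos: list[str]) -> dict:
    """1. Exo/ego camera estimation (intrinsics + extrinsics per view)."""
    return {"intrinsics": None, "extrinsics": None}

def calibrate_hand_shape(videos: list[str], cameras: dict) -> dict:
    """2. Hand shape calibration (per-subject MANO shape parameters)."""
    return {"betas": None}

def estimate_keypoints_2d(videos: list[str]) -> dict:
    """3. Per-view 2D keypoint estimation (also yields bounding boxes)."""
    return {"keypoints": None, "bboxes": None}

def optimize_hand_pose(kps2d: dict, cameras: dict, shape: dict) -> dict:
    """4. Hand pose optimization (3D keypoints + MANO joint angles)."""
    return {"keypoints_3d": None, "mano_pose": None}

def run_pipeline(videos: list[str]) -> dict:
    """Synchronized videos in; tracked 2D/3D keypoints and MANO params out."""
    cameras = estimate_cameras(videos)
    shape = calibrate_hand_shape(videos, cameras)
    kps2d = estimate_keypoints_2d(videos)
    pose = optimize_hand_pose(kps2d, cameras, shape)
    return {"cameras": cameras, "shape": shape,
            "keypoints_2d": kps2d, "pose": pose}
```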
[Video]

It's finally done, I've finished ripping out my full-body pipeline and replaced it with a hands-only version. Critical to make it work in a lot more scenarios! I've visualized the final predictions with @rerundotio!

29.09.2025 13:00 👍 2 🔁 0 💬 1 📌 0

Code is here for you to follow along! <github.com/rerun-io/an...>

19.09.2025 17:00 👍 0 🔁 0 💬 0 📌 0

This tight integration between visuals, predictions, and data is crucial to ensure your data is precisely what you expect it to be.

19.09.2025 17:00 👍 0 🔁 0 💬 1 📌 0

The next step involves leveraging Rerun's recent updates, particularly the multisink support. Changes are saved directly to a file in .rrd format, easily extractable since the underlying representation is PyArrow. This can be converted to Pandas, Polars, or DuckDB.

19.09.2025 17:00 👍 0 🔁 0 💬 1 📌 0

Networks will occasionally make mistakes, so having the ability to correct them manually is crucial. This is a significant step towards robust and powerful hand tracking, which will provide excellent training data for robot dexterous manipulation.

19.09.2025 17:00 👍 0 🔁 0 💬 1 📌 0

The only input required is a zip file containing two or more multiview MP4 files. I handle everything else automatically. This application works with both egocentric (first-person) and exocentric (third-person) videos.

19.09.2025 17:00 👍 0 🔁 0 💬 1 📌 0

The combination of Rerun's callback system and Gradio integration enables a highly customizable and powerful labeling app. It supports multiple views, 2D and 3D, and maintains time synchronization!

19.09.2025 17:00 👍 0 🔁 0 💬 1 📌 0
[Video]

If you're not labeling your own data, you're NGMI. I take this seriously, so I finished building the first version of my hand-tracking annotation app using rerun.io and gradio.app

19.09.2025 17:00 👍 0 🔁 0 💬 1 📌 0

Find the current code here <github.com/rerun-io/an...>

15.09.2025 17:01 👍 0 🔁 0 💬 0 📌 0

The complexity of this is really starting to stack up, and I hope in the longer term to have the compute + data to build a fully end-to-end network!
x.com/pablovelago...

15.09.2025 17:01 👍 1 🔁 0 💬 1 📌 0

Upload multiview video zip -> calibrate cameras (VGGT + MoGe) -> perform 2D point estimation (WiLoR)

Now I need to add reactivity for every frame and timestamp to address any failures in the network!

15.09.2025 17:01 👍 0 🔁 0 💬 1 📌 0
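The upload step above can be sketched with the standard library. The filenames and the two-or-more rule here are assumptions; the real app would hand the extracted views on to calibration and 2D estimation:

```python
import tempfile
import zipfile
from pathlib import Path

def extract_views(zip_path: str) -> list[Path]:
    """Unpack a zip of synchronized multiview MP4s and return the video paths.

    Stand-in for the labeling app's upload step; calibration and per-view
    keypoint estimation would then run over the returned videos.
    """
    out_dir = Path(tempfile.mkdtemp(prefix="views_"))
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    videos = sorted(out_dir.rglob("*.mp4"))
    if len(videos) < 2:
        raise ValueError("need two or more multiview MP4 files")
    return videos
```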

Every off-the-shelf annotation solution I've tried doesn't provide nearly enough flexibility, so it was a no-brainer to build my own with rerun and gradio.

So far, I have the bare-bones implementation:

15.09.2025 17:01 👍 0 🔁 0 💬 1 📌 0

Currently, the hand detection module isn't temporally consistent enough, which leads to cascading errors throughout the pipeline. So I want to train a new detector that better suits my needs. The problem is, I want a scalable way to generate labels.

15.09.2025 17:01 👍 0 🔁 0 💬 1 📌 0
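One cheap stopgap for this kind of temporal inconsistency, while a better detector is being trained, is to exponentially smooth per-hand bounding boxes across frames so single-frame jitter doesn't cascade downstream. A sketch of that idea, not the author's approach:

```python
def smooth_boxes(boxes, alpha=0.6):
    """Exponential moving average over per-frame (x0, y0, x1, y1) boxes.

    alpha close to 1.0 trusts the new detection; lower values smooth harder.
    """
    smoothed, prev = [], None
    for box in boxes:
        if prev is None:
            prev = box  # first frame: nothing to smooth against
        else:
            prev = tuple(alpha * b + (1 - alpha) * p
                         for b, p in zip(box, prev))
        smoothed.append(prev)
    return smoothed
```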
[Video]

Unfortunately, I've identified some serious issues with the hand tracker that necessitate manual intervention. So I decided the next best course of action is to build a labeling app with @rerun.io and @gradio-hf.bsky.social, as I transition from my full-body solution to a hands-only solution.

15.09.2025 17:01 👍 0 🔁 0 💬 1 📌 0
GitHub - rerun-io/pi0-lerobot (branch: hand-kinematic-fitting)

Take a look at the code here - <github.com/rerun-io/pi...>

29.08.2025 16:43 👍 0 🔁 0 💬 0 📌 0

I'm also hoping to train my own hand tracker with @roboflow.

That's the next goal (as well as opening up the multiview pipeline! Sorry it's taking so long, it needs a little more polish).

29.08.2025 16:43 👍 1 🔁 0 💬 1 📌 0

I've been super impressed by how well it performs on videos of all kinds! This model has no concept of video and runs image to image, yet it still performs incredibly well. With some basic tracking added from something like @skalskip92's github.com/roboflow/trackers, this could go super far.

29.08.2025 16:43 👍 0 🔁 0 💬 1 📌 0
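The basic tracking mentioned here can be as simple as greedily matching detections between consecutive frames by IoU; the roboflow/trackers library provides proper trackers, and this sketch only illustrates the idea:

```python
def iou(a, b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x1 - x0) * max(0.0, y1 - y0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_detections(prev_boxes, new_boxes, threshold=0.3):
    """Greedily match each new box to the best unmatched previous box.

    Returns {new_index: prev_index}; unmatched new boxes start fresh tracks.
    """
    matches, used = {}, set()
    for j, nb in enumerate(new_boxes):
        best_i, best = None, threshold
        for i, pb in enumerate(prev_boxes):
            if i in used:
                continue
            score = iou(pb, nb)
            if score > best:
                best_i, best = i, score
        if best_i is not None:
            matches[j] = best_i
            used.add(best_i)
    return matches
```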