Telling your students about research before the ImageNet moment
@tarekbouamer
Research Specialist @ATRC, 3D Computer Vision, Machine Learning & Robotics. Previously ICG @TU_Graz, Paris-Sud & CentraleSupélec. Looking for innovative research opportunities in AI, robotics, and 3D vision.
Introducing StereoSpace -- our new end-to-end method for turning photos into stereo images without explicit geometry or depth maps. This makes it especially robust to thin structures and transparencies. Try the demo below.
ACE-SLAM: Scene Coordinate Regression for Neural Implicit Real-Time SLAM
Ignacio Alzugaray, @marwantaher.bsky.social, @ajdavison.bsky.social
tl;dr: in title; ACE+SLAM
arxiv.org/abs/2512.14032
SAM 3D: 3Dfy Anything in Images
SAM 3D Team et al.
tl;dr: in title. 8-stage training, dataset, human labeling. Don't stop at the tl;dr; read the whole paper.
arxiv.org/abs/2511.16624
Last week we launched IMC2025-Ongoing on
@kaggle.com
The dataset is exactly the one from IMC2025, but the competition runs for a full year, making it better suited as a persistent academic leaderboard.
kaggle.com/competitions...
1/2
MASt3R-Fusion: Integrating Feed-Forward Visual Model with IMU, GNSS for High-Functionality SLAM
Yuxuan Zhou, Xingxing Li, Shengyu Li, Zhuohao Yan, Chunxi Xia, Shaoquan Feng
tl;dr: MASt3R-SLAM+IMU+GNSS
arxiv.org/abs/2509.20757
LongSplat: Robust Unposed 3D Gaussian Splatting for Casual Long Videos
web linjohnss.github.io/longsplat/
code github.com/NVlabs/LongS...
MapAnything, a simple, end-to-end trained transformer model that directly regresses the factored metric 3D geometry of a scene given various types of inputs (images, calibration, poses, or depth).
code: github.com/facebookrese...
web: map-anything.github.io
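As a rough illustration of the "factored" metric geometry mentioned above, here is a small sketch of how such factors could compose into a metric point map. The factor names and shapes (per-pixel unit rays, up-to-scale depth, camera-to-world pose, global metric scale) are my assumptions for illustration, not MapAnything's actual interface:

```python
import numpy as np

def compose_pointmap(rays, depth, pose, scale):
    """Compose a hypothetical factored geometry output into world-space points.

    rays:  (H, W, 3) per-pixel unit ray directions in the camera frame
    depth: (H, W)    up-to-scale depth along each ray
    pose:  (4, 4)    camera-to-world transform
    scale: float     global metric scale factor
    """
    pts_cam = rays * depth[..., None] * scale      # camera-frame metric points
    R, t = pose[:3, :3], pose[:3, 3]
    return pts_cam @ R.T + t                       # world-frame points, (H, W, 3)
```

The appeal of a factored output is that each factor can be supplied as input instead (known calibration, poses, or depth), which matches the post's "various types of inputs" framing.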
3D and 4D World Modeling: A Survey
tl;dr: in title
arxiv.org/abs/2509.07996
OmniMap: A General Mapping Framework Integrating Optics, Geometry, and Semantics
Yinan Deng, Yufeng Yue, Jianyu Dou, Jingyu Zhao, Jiahui Wang, Yujie Tang, Yi Yang, Mengyin Fu
tl;dr: optics, geometry, and semantics->3DGS-Voxel hybrid representation
arxiv.org/abs/2509.07500
Faster VGGT with Block-Sparse Global Attention
Chung-Shien Brian Wang, Christian Schmidt, Jens Piekenbrinck, Bastian Leibe
tl;dr: block-sparse attention replaces global attention
another work to improve scalability of VGGT
arxiv.org/abs/2509.07120
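To illustrate the idea in the tl;dr, here is a toy block-sparse attention in NumPy: scores are computed densely, but each query block is only allowed to attend to its top-k key blocks. This is a sketch of the general technique only; the paper's actual block selection and kernels will differ:

```python
import numpy as np

def block_sparse_attention(Q, K, V, block=2, keep=2):
    """Toy block-sparse attention: each query block attends only to the
    `keep` key blocks with the highest mean similarity, instead of all
    key blocks as in global attention. Q, K, V: (n, d), n divisible by block."""
    n, d = Q.shape
    nb = n // block
    scores = Q @ K.T / np.sqrt(d)                                  # dense (n, n)
    blk = scores.reshape(nb, block, nb, block).mean(axis=(1, 3))   # (nb, nb) block scores
    mask = np.full((n, n), -np.inf)
    topk = np.argsort(blk, axis=1)[:, -keep:]                      # kept key blocks per query block
    for i in range(nb):
        for j in topk[i]:
            mask[i * block:(i + 1) * block, j * block:(j + 1) * block] = 0.0
    logits = scores + mask
    w = np.exp(logits - logits.max(axis=1, keepdims=True))         # masked softmax
    w /= w.sum(axis=1, keepdims=True)
    return w @ V
```

A real implementation would skip the masked blocks entirely rather than computing dense scores first; that skipping is where the speedup over global attention comes from.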
Life is hard without the fast internet we're used to.
I could not even join my Google Meet this morning.
CausNVS: Autoregressive Multi-view Diffusion for Flexible 3D Novel View Synthesis
Xin Kong, Daniel Watson, Yannick Strümpler, @miniemeyer.bsky.social, Federico Tombari
tl;dr: a framewise attention layer with causal masking on top of a pretrained 2D diffusion backbone
arxiv.org/abs/2509.06579
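The framewise causal masking from the tl;dr can be sketched as an additive attention mask in which every token of frame t may attend to frames 0..t but never to future frames (a toy construction, not the paper's code):

```python
import numpy as np

def framewise_causal_mask(n_frames, tokens_per_frame):
    """Additive attention mask: 0.0 where attention is allowed, -inf where
    it is blocked. Tokens attend freely within their own and earlier frames."""
    f = np.repeat(np.arange(n_frames), tokens_per_frame)  # frame id of each token
    allowed = f[:, None] >= f[None, :]                    # query frame >= key frame
    return np.where(allowed, 0.0, -np.inf)
```

Adding such a mask to the attention logits of a pretrained 2D diffusion backbone is one way to make multi-view generation autoregressive across frames.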
Stages of the eclipse, captured by a friend.
#mooneclipse
Lunar eclipse over Abu Dhabi tonight at 22:20
Apply for the AITHYRA-CeMM International PhD Program!
15-20 fully funded PhD fellowships available in Vienna, AT
in AI/ML and Life Sciences
Deadline for applications:
10 September 2025 apply.cemm.at
Franca official code and pretrained models are up on github and pytorch hub! github.com/valeoai/franca
Eager to learn how it will be used.
Reconstruct, Inpaint, Finetune: Dynamic Novel-view Synthesis from Monocular Videos
Kaihua Chen, @tarashakhurana.bsky.social, Deva Ramanan
tl;dr: in title; fine-tune CogVideoX->train 2D video-inpainter
arxiv.org/abs/2507.12646
St4RTrack: Simultaneous 4D Reconstruction and Tracking in the World
tl;dr: feed-forward; reconstructs and tracks dynamic video content; DUSt3R-like pointmaps for a pair of frames captured at different moments (1/2)
www.liruilong.cn/prope/
Want to explore universal visual features? Check out our interactive demo of concepts learned from our #ICML2025 paper "Universal Sparse Autoencoders: Interpretable Cross-Model Concept Alignment".
Come see our poster at 4pm on Tuesday in East Exhibition hall A-B, E-1208!
Mind the Gap: Aligning Vision Foundation Models to Image Feature Matching
Yuhan Liu, Jingwen Fu, Yang Wu, Kangyi Wu, Pengna Li, Jiayi Wu, Sanping Zhou, Jingmin Xin
tl;dr: Stable Diffusion+attention-based prompt in LoFTR-type framework
no eval. on IMC
arxiv.org/abs/2507.10318
Leveraging Automatic CAD Annotations for Supervised Learning in 3D Scene Understanding
Yuchen Rao, Stefan Ainetter, Sinisa Stekovic, @vincentlepetit.bsky.social , Friedrich Fraundorfer
tl;dr: in title
arxiv.org/abs/2504.13580
A Guide to Structureless Visual Localization
Vojtech Panek, Qunjie Zhou, Yaqing Ding, Sérgio Agostinho, Zuzana Kukelova, @sattlertorsten.bsky.social, @lealtaixe.bsky.social
tl;dr: RoMa > MASt3R outdoors with a 5pt solver; indoors MASt3R is king. M3Dv2 depth comparable to MASt3R
arxiv.org/abs/2504.17636
Never miss a beat in science again!
Scholar Inbox is your personal assistant for staying up to date with your literature. It includes visual summaries, collections, search, and a conference planner.
Check out our white paper: arxiv.org/abs/2504.08385
#OpenScience #AI #RecommenderSystems
Super excited to share Visual Chronicles! Huge kudos to @boyangdeng.bsky.social on his fantastic internship work with us at Google DeepMind. It was one of the coolest and most fun projects I've ever been a part of!
Tell us what trends we discovered surprise you: boyangdeng.com/visual-chron...
#EidMubarak to all my friends and colleagues celebrating!
May these blessed days bring joy, peace, and prosperity to you and your families.
(1/3) Happy to share LUDVIG: Learning-free Uplifting of 2D Visual features to Gaussian Splatting scenes, that uplifts visual features from models such as DINOv2 (left) & CLIP (mid) to 3DGS scenes. Joint work w. @dlarlus.bsky.social @jmairal.bsky.social
Webpage & code: juliettemarrie.github.io/ludvig
Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
Jianing Yang, Alexander Sax, Kevin J. Liang, Mikael Henaff, Hao Tang, Ang Cao, Joyce Chai, Franziska Meier, Matt Feiszli
tl;dr: one more multiview-transformer decoder on top of the DUSt3R encoder, optimized for scale (1000+ images in one pass)
arxiv.org/abs/2501.13928