Grey heron, I would say
I've put together a short list of opportunities for early-career academics willing to come to Europe: www.cvernade.com/miscellaneou...
This mostly covers France and Germany for now, but I'm willing to extend it. I build on @ellis.eu resources and my own knowledge of these systems.
T-Rex scaling laws?
The way I see it: “the market” roughly assumed that scaling is all you need, so big AI actors would need to buy more and more GPUs to get better LLMs. DeepSeek is proof that you can still get better LLMs without necessarily buying more GPUs. Does that make sense?
Yesterday I went to Twitter and scrolled through 36h of my feed to check whether I missed something. I did not.
Even @francois.fleuret.org and @giffmana.ai seem to be less active? Guys, you should bring your skeets to this place!
I had a great time at the @neurreps.bsky.social workshop today! Thanks again to the organisers of this awesome event!
"Does Equivariance Matter at Scale?"
Spoiler alert: yes, it does!
I guess @johannbrehmer.bsky.social did not have to convince the @neurreps.bsky.social audience!
Super interesting work building scaling laws for equivariant foundation models!
Super inspiring talk by Eero Simoncelli building the bridge between inductive biases in diffusion models and how the human brain processes images!
Surprisingly clear given its high “mathematics density” :)
It's today at @neuripsconf.bsky.social, in the East building, poster #1500!
Whether you're into pose estimation or just looking for a new application to test your fancy geometric deep learning method, come chat with us!
Very inspiring talk by Fei-Fei Li yesterday at #NeurIPS2024 on visual intelligence!
Very interesting first invited talk at the intersection of cognitive science and AI by @alisongopnik.bsky.social! 🤩
✨ Joint work with the amazing @vletzelter.bsky.social, Nermin Samet, Renaud Marlet, Matthieu Cord, Patrick Pérez and Eduardo Valle ✨
If you're at #NeurIPS2024 next week, come meet us at poster session 3 on Thu 12 Dec, 11 a.m.!
Or at our oral presentation during the @neur_reps workshop on Saturday 14th!
Paper: arxiv.org/abs/2312.06386
Github: github.com/cedricrommel...
We train it with the resilient winner-takes-all loss, which allows the model to optimally quantize the space without requiring many heads.
In the end, our model works as a conditional density estimator, taking the shape of a mixture of Dirac deltas.
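For intuition, here is a minimal NumPy sketch of a winner-takes-all style loss over K hypotheses. This is my own toy illustration with made-up names, not the ManiPose code; the “resilient” variant is approximated here by a small epsilon term that keeps non-winning heads training.

```python
import numpy as np

def wta_loss(hypotheses, target, eps=0.05):
    """Winner-takes-all loss over K hypotheses for one sample.

    hypotheses: (K, D) outputs of K prediction heads; target: (D,).
    Only the best head is fully penalised; the small eps-weighted
    term keeps the other heads alive (a rough stand-in for the
    'resilient' variant, which the paper handles more carefully).
    """
    errors = ((hypotheses - target) ** 2).mean(axis=1)  # per-head error
    return errors.min() + eps * errors.mean()

rng = np.random.default_rng(0)
K, D = 8, 51                # e.g. 17 joints x 3 coordinates
hyps = rng.normal(size=(K, D))
tgt = rng.normal(size=D)
loss = wta_loss(hyps, tgt)  # scalar; gradients would flow mostly to the winner
```

Over training, this kind of objective pushes the K heads to spread out over the plausible output space, which is why few heads can suffice to quantize it.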
- Limb lengths and directions are disentangled to constrain predicted poses to an estimated manifold.
- A multi-head subnetwork is used to predict different possible rotations for each joint, together with their corresponding likelihoods.
- Both are then merged into predicted poses.
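The merging step can be sketched as simple forward kinematics along the skeleton (hypothetical function and variable names; the actual ManiPose merging may differ):

```python
import numpy as np

def compose_pose(root, lengths, directions, parents):
    """Merge disentangled limb lengths and unit directions into 3D joints.

    root: (3,) root joint position; lengths: (J,) predicted bone lengths;
    directions: (J, 3) predicted direction vectors; parents[j] is the
    index of joint j's parent in the growing joint list.
    """
    joints = [root]
    for j in range(len(lengths)):
        d = directions[j] / np.linalg.norm(directions[j])   # enforce unit norm
        joints.append(joints[parents[j]] + lengths[j] * d)  # child = parent + L * dir
    return np.stack(joints)

# toy 3-bone chain: joint 0 -> 1 -> 2 -> 3
pose = compose_pose(np.zeros(3),
                    np.array([1.0, 0.5, 0.5]),
                    np.array([[0, 0, 1], [0, 1, 0], [0, 0, 2.0]]),
                    parents=[0, 1, 2])
```

Because each joint sits at exactly `lengths[j]` from its parent, bone lengths are consistent by construction, no matter what directions the heads predict: that is the manifold constraint.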
In fact, we prove the *only* way of reconciling consistency with accurate predictions is to output multiple 3D poses for each 2D input.
We hence propose ManiPose, a manifold-constrained multi-hypothesis deep network that better handles depth ambiguity.
Previous approaches constrain poses to an estimated manifold by disentangling limb lengths and directions. But they lag behind unconstrained models in terms of joint position error (MPJPE).
In our work, we prove this is unavoidable because of points 1 and 2.
There are 3 main reasons for this:
1. Existing training losses and evaluation metrics (MPJPE) are blind to such inconsistencies;
2. Many possible 3D poses can map to the same 2D input ;
3. Pose sequences cannot occupy the whole space: they lie on a smooth manifold because of limb rigidity.
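Point 2 is easy to verify with a toy pinhole camera model (my own illustration): any 3D point can slide along the ray through the camera centre without changing its 2D projection.

```python
import numpy as np

def project(point, f=1.0):
    """Pinhole projection of a 3D point (camera at the origin)."""
    x, y, z = point
    return np.array([f * x / z, f * y / z])

near = np.array([0.5, 0.2, 2.0])  # a joint 2 m from the camera
far = near * 2.5                  # same ray, 2.5x the depth
# two different 3D points, yet exactly the same 2D keypoint
same = np.allclose(project(near), project(far))
```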
While standard approaches directly map 2D coordinates to 3D, prior works noticed that predicted poses' limbs could shrink and stretch during a movement.
In our work, we prove these are not isolated cases and that these methods always predict *inconsistent* 3D pose sequences.
Many intelligent systems, like autonomous cars and smart/VR glasses, need to understand human movements and poses.
This can be achieved with a single camera by detecting human keypoints on a video, then lifting them into a 3D pose.
Inferring 3D human poses from video is highly ill-posed because of depth ambiguity.
Our work ManiPose, accepted at #NeurIPS2024, gets one step closer to solving this by leveraging prior knowledge about pose topology and cool multiple-choice learning techniques.