Nima Nooshiri (@nimanzik)

The Illustrated DeepSeek-R1

Spent the weekend reading the paper and sorting through the intuitions. Here's a visual guide and the main intuitions to understand the model and the process that created it.

newsletter.languagemodels.co/p/the-illust...

27.01.2025 20:22 👍 76 🔁 23 💬 1 📌 4

Table Representation Learning Workshop (TRL Workshop)

At NeurIPS, IMHO:
- openreview.net/forum?id=LN5...
- openreview.net/forum?id=WH5...
- openreview.net/forum?id=Vbm...
- openreview.net/forum?id=FmN...
Before (self-plug):
- arxiv.org/abs/2402.16785
- arxiv.org/abs/2207.01848
Beyond:
- arxiv.org/abs/2410.18164
- table-representation-learning.github.io

30.12.2024 21:02 👍 26 🔁 2 💬 0 📌 0

Improved TableReport:
◼ tighter layout
◼ support any script (any alphabet حب माया) in the plots
◼ robust to outliers

It works without dependencies, in any html-based environment (Jupyter notebooks, @vscode.dev, a simple web page...)

Check it out on skrub-data.org
4/5

27.11.2024 20:46 👍 1 🔁 2 💬 1 📌 0

A high-level summary diagram taken from the slides linked below. It shows the interplay of two main components: a probabilistic model and a decision maker or planner.

Probabilistic predictions of an underfitting polynomial classifier on a noisy XOR task and the corresponding under-confident calibration curve.

Probabilistic predictions of an overfitting polynomial classifier and the resulting overconfident calibration curve on the same noisy XOR problem.

Simulation study to show the relative lack of stability of hyperparameter tuning when using hard metrics such as Accuracy or soft yet not probabilistic metrics such as ROC AUC compared to a strictly proper scoring rule such as the log-loss.

I recently shared some of my reflections on how to use probabilistic classifiers for optimal decision-making under uncertainty at @pydataparis.bsky.social 2024.

Here is the recording of the presentation:

www.youtube.com/watch?v=-gYn...

27.11.2024 14:17 👍 49 🔁 19 💬 1 📌 1

We're always updating the pydata & scipy project starter pack:
go.bsky.app/6HkrMcp

Hello @scikit-learn.bsky.social, @networkx.bsky.social, @scipyconf.bsky.social

22.11.2024 17:46 👍 53 🔁 20 💬 6 📌 1

Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI With the growing attention and investment in recent AI approaches such as large language models, the narrative that the larger the AI system the more valuable, powerful and interesting it is is increa...

Me: writes a paper (with @sashamtl.bsky.social and @meredithmeredith.bsky.social) on how hype and the choice of goals and benchmarks lead to an arms race in AI arxiv.org/abs/2409.14160

Them: here's my proposal to solve the problem by a better AI algorithm.

🤨 The problem is social: it's norm setting.

05.11.2024 13:29 👍 46 🔁 12 💬 3 📌 0

I guess you are looking for this GeoGinger 👉🏻 @geoginger.bsky.social

16.11.2024 23:40 👍 1 🔁 0 💬 1 📌 0

Optuna: a hyperparameter optimization framework Scikit-learn allows you to perform hyperparameter search but a lot of it happens in memory. Sometimes you want to have a storage layer for these hyperparamet...

Will be going live in a bit to talk about Optuna. Feel free to join live at lunch if you have any questions you'd like me to answer live.

15.11.2024 09:21 👍 5 🔁 1 💬 0 📌 0

Map of “volcanoes that changed the world”

Probably not the upbeat, diverting story most of us are looking for these days, but an interesting read nevertheless.
And hey, what’s another future cataclysm among friends?

“The next massive volcano eruption will cause climate chaos — and we are unprepared”
⚒️🧪🌋
www.nature.com/articles/d41...

13.11.2024 16:27 👍 43 🔁 14 💬 9 📌 3

boom

13.11.2024 16:19 👍 81 🔁 7 💬 2 📌 0
