The Illustrated DeepSeek-R1
Spent the weekend reading the paper and sorting through the intuitions. Here's a visual guide and the main intuitions to understand the model and the process that created it.
newsletter.languagemodels.co/p/the-illust...
27.01.2025 20:22
Table Representation Learning Workshop
TRL Workshop ---
At NeurIPS, IMHO:
- openreview.net/forum?id=LN5...
- openreview.net/forum?id=WH5...
- openreview.net/forum?id=Vbm...
- openreview.net/forum?id=FmN...
Before (self-plug):
- arxiv.org/abs/2402.16785
- arxiv.org/abs/2207.01848
Beyond:
- arxiv.org/abs/2410.18164
- table-representation-learning.github.io
30.12.2024 21:02
Improved TableReport:
- tighter layout
- support any script (any alphabet حب माया) in the plots
- robust to outliers
It works without dependencies, in any HTML-based environment (Jupyter notebooks, @vscode.dev, a simple web page...)
Check it out on skrub-data.org
4/5
27.11.2024 20:46
A high-level summary diagram taken from the slides linked below. It shows the interplay of two main components: a probabilistic model and a decision maker or planner.
Probabilistic predictions of an underfitting polynomial classifier on a noisy XOR task and the corresponding under-confident calibration curve.
Probabilistic predictions of an overfitting polynomial classifier and the resulting overconfident calibration curve on the same noisy XOR problem.
Simulation study showing that hyperparameter tuning is less stable with hard metrics such as accuracy, or with soft but non-probabilistic metrics such as ROC AUC, than with a strictly proper scoring rule such as the log-loss.
I recently shared some of my reflections on how to use probabilistic classifiers for optimal decision-making under uncertainty at @pydataparis.bsky.social 2024.
Here is the recording of the presentation:
www.youtube.com/watch?v=-gYn...
27.11.2024 14:17
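To make the contrast between accuracy, ROC AUC and the log-loss concrete, here is a minimal sketch in plain Python. The toy data and the helper names (`sharpen`, `calibration_curve`, and so on) are made up for illustration and are not taken from the talk. It builds a calibrated set of probabilities, then an overconfident copy that keeps the same ranking and the same side of the 0.5 threshold.

```python
import math

def accuracy(y_true, p, threshold=0.5):
    """Fraction of correct hard decisions at the given threshold."""
    return sum((pi >= threshold) == bool(yi) for yi, pi in zip(y_true, p)) / len(y_true)

def roc_auc(y_true, p):
    """Probability that a random positive is ranked above a random negative."""
    pos = [pi for yi, pi in zip(y_true, p) if yi == 1]
    neg = [pi for yi, pi in zip(y_true, p) if yi == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))

def log_loss(y_true, p, eps=1e-15):
    """Mean negative log-likelihood, a strictly proper scoring rule."""
    total = 0.0
    for yi, pi in zip(y_true, p):
        pi = min(max(pi, eps), 1 - eps)  # avoid log(0)
        total -= yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
    return total / len(y_true)

def calibration_curve(y_true, p, n_bins=5):
    """(mean predicted probability, observed positive fraction) per non-empty bin."""
    bins = [[] for _ in range(n_bins)]
    for yi, pi in zip(y_true, p):
        bins[min(int(pi * n_bins), n_bins - 1)].append((yi, pi))
    return [(sum(pi for _, pi in b) / len(b), sum(yi for yi, _ in b) / len(b))
            for b in bins if b]

def sharpen(p, power=4.0):
    """Hypothetical monotone transform that pushes probabilities toward 0 or 1."""
    a, b = p ** power, (1 - p) ** power
    return a / (a + b)

# Toy data (an assumption): 10 samples per probability level, with a positive
# rate equal to the level, so the original predictions are calibrated.
y_true, calibrated = [], []
for level in [0.1, 0.3, 0.7, 0.9]:
    n_pos = round(level * 10)
    y_true += [1] * n_pos + [0] * (10 - n_pos)
    calibrated += [level] * 10

overconfident = [sharpen(pi) for pi in calibrated]

for name, probs in [("calibrated", calibrated), ("overconfident", overconfident)]:
    print(name, accuracy(y_true, probs), roc_auc(y_true, probs), log_loss(y_true, probs))
```

In this sketch the two sets of predictions are indistinguishable by accuracy and ROC AUC, since both only depend on the thresholded decision or on the ranking, while the log-loss penalizes the overconfident copy. The `calibration_curve` helper shows the same thing per bin: observed frequencies sit closer to 0.5 than the overconfident predictions claim.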
We're always updating the pydata & scipy project starter pack:
go.bsky.app/6HkrMcp
Hello @scikit-learn.bsky.social, @networkx.bsky.social, @scipyconf.bsky.social
22.11.2024 17:46
Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI
With the growing attention and investment in recent AI approaches such as large language models, the narrative that the larger the AI system the more valuable, powerful and interesting it is is increa...
Me: writes a paper (with @sashamtl.bsky.social and @meredithmeredith.bsky.social) on how hype and the choice of goals and benchmarks lead to an arms race in AI: arxiv.org/abs/2409.14160
Them: here's my proposal to solve the problem by a better AI algorithm.
🤨 The problem is social: it's norm setting.
05.11.2024 13:29
I guess you are looking for this GeoGinger: @geoginger.bsky.social
16.11.2024 23:40
Map of “volcanoes that changed the world”
Probably not the upbeat, diverting story most of us are looking for these days, but an interesting read nevertheless.
And hey, what’s another future cataclysm among friends?
“The next massive volcano eruption will cause climate chaos – and we are unprepared”
www.nature.com/articles/d41...
13.11.2024 16:27
boom
13.11.2024 16:19