
Daniel Scalena

@danielsc4.it

PhDing @unimib 🇮🇹 & @gronlp.bsky.social 🇳🇱, interpretability et similia danielsc4.it

423
Followers
185
Following
23
Posts
16.11.2024
Joined

Latest posts by Daniel Scalena @danielsc4.it

Indeed, but we also show the other side of the coin: personalized generation and its evaluation remain extremely challenging, and IMO professional human translators are still essential to produce a truly original, publication-ready final work as of today.

05.01.2026 12:41 👍 0 🔁 0 💬 0 📌 0

Want models to translate in the style you actually like?

Our paper is here! See you in Morocco! 🇲🇦

04.01.2026 18:11 👍 5 🔁 1 💬 0 📌 0
EAGER: Entropy-Aware GEneRation for Adaptive Inference-Time Scaling With the rise of reasoning language models and test-time scaling methods as a paradigm for improving model performance, substantial computation is often required to generate multiple candidate sequenc...

Takeaway: EAGer shows we can be MORE efficient & MORE effective by letting models focus compute where it matters most.

📄 Paper: arxiv.org/abs/2510.11170
💻 Code: github.com/DanielSc4/EA...
✨ Huge thanks to my mentors and collaborators @leozotos.bsky.social, E. Fersini, @malvinanissim.bsky.social, A. Üstün

16.10.2025 12:07 👍 2 🔁 0 💬 0 📌 0

Results: Across 3B-20B models, EAGer cuts budget by up to 80%, boosts perf 13% w/o labels & 37% w/ labels on AIME.
As M scales, EAGer consistently:
🚀 Achieves HIGHER Pass@k,
✂️ Uses FEWER tokens than baseline,
🕺 Shifts the Pareto frontier favorably across all tasks.
🧵5/

16.10.2025 12:07 👍 0 🔁 0 💬 1 📌 0
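For context, Pass@k here is presumably the standard unbiased estimator from the code-generation literature: with n samples of which c are correct, Pass@k = 1 - C(n-c, k)/C(n, k). A minimal sketch (function name is mine; the paper may report a variant):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimator: probability that at least one of
    k samples drawn without replacement from n is correct, given
    that c of the n samples are correct."""
    if n - c < k:
        return 1.0  # every size-k subset must contain a correct sample
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 3 correct out of 10 samples, judged on a single draw:
# pass_at_k(10, 3, 1) == 0.3
```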

The fun part: EAGer-adapt reallocates saved budget to "saturating" prompts hitting the M cap, no labels needed! – Training & Verification-Free 🚀

Full EAGer uses labels to catch failing prompts, lowering the threshold to branch or add sequences. Great for verifiable pipelines!
🧵4/

16.10.2025 12:07 👍 0 🔁 0 💬 1 📌 0
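The reallocation idea can be sketched roughly like this (a toy version with hypothetical structure; see the linked repo for the real logic): tokens left unspent by easy prompts become extra budget for prompts that hit the M-sequence cap.

```python
def reallocate_budget(spent, baseline_per_prompt, saturated):
    """Redistribute unspent baseline budget to saturated prompts.

    spent: dict prompt_id -> tokens actually used.
    baseline_per_prompt: fixed token budget each prompt would get
        under uniform parallel sampling.
    saturated: set of prompt_ids that hit the max-sequence cap M
        (the ones that would benefit from further exploration).
    Returns dict prompt_id -> extra token budget.
    """
    # Surplus = tokens that easy prompts left on the table.
    surplus = sum(max(0, baseline_per_prompt - s) for s in spent.values())
    if not saturated:
        return {}
    share = surplus // len(saturated)  # split the surplus evenly
    return {pid: share for pid in saturated}
```

For example, if prompt "a" used 100 of a 400-token baseline while prompt "b" hit the cap, the 300 saved tokens go to "b".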

EAGer works by monitoring token entropy during generation. High-entropy token → branch to explore new paths (reusing the shared prefix). Low-entropy token → continue along a single path.

We cap at M sequences per prompt, saving budget on easy ones without regeneration. Training-free!
🧵3/

16.10.2025 12:07 👍 1 🔁 0 💬 1 📌 0
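A minimal sketch of the entropy-guided branching loop described above (names, threshold, and the top-2 branching rule are my simplifications; the actual implementation is in the linked repo):

```python
import math

def token_entropy(probs):
    """Shannon entropy (nats) of a next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def step(active_paths, next_token_probs, threshold=1.0, max_paths=4):
    """One decoding step of entropy-guided branching.

    active_paths: list of token-id prefixes currently being decoded.
    next_token_probs: for each path, a dict {token_id: prob}.
    A high-entropy step branches into the two most likely tokens
    (reusing the shared prefix); a low-entropy step continues greedily.
    """
    new_paths = []
    for prefix, probs in zip(active_paths, next_token_probs):
        ranked = sorted(probs, key=probs.get, reverse=True)
        h = token_entropy(probs.values())
        if h > threshold and len(active_paths) + len(new_paths) < max_paths:
            # Uncertain: explore the two most likely continuations.
            new_paths.append(prefix + [ranked[0]])
            new_paths.append(prefix + [ranked[1]])
        else:
            # Confident: continue a single path greedily.
            new_paths.append(prefix + [ranked[0]])
    return new_paths[:max_paths]  # cap at M sequences per prompt
```

A 50/50 next-token split (entropy ≈ 0.69 nats) branches, while a 99/1 split continues a single path.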

Why? Reasoning LLMs shine with CoTs, but full parallel sampling (generating multiple paths per prompt) is inefficient 😤.

It wastes compute on redundant, predictable tokens, especially for easy prompts. Hard prompts need more exploration but get the same budget. Enter EAGer 🧠!
🧵2/

16.10.2025 12:07 👍 1 🔁 0 💬 1 📌 0

You can easily save up to 65% of compute while improving performance on reasoning tasks 🤯 👀

Meet EAGer: we show that monitoring token-level uncertainty lets LLMs allocate compute dynamically, spending MORE on hard problems and LESS on easy ones.
🧵👇

16.10.2025 12:07 👍 1 🔁 1 💬 1 📌 0

I'll be attending the NEMI 2025 workshop this Friday and presenting a poster 👇

Happy to chat about cool interpretability stuff there!

20.08.2025 22:42 👍 1 🔁 0 💬 0 📌 0
Steering Large Language Models for Machine Translation Personalization High-quality machine translation systems based on large language models (LLMs) have simplified the production of personalized translations reflecting specific stylistic constraints. However, these sys...

๐Ÿ“ Paper: arxiv.org/abs/2505.16612
๐Ÿ”— Code: github.com/DanielSc4/st...

Thanks to my amazing co-authors:
@gsarti.com, @arianna-bis.bsky.social, Elisabetta Fersini, @malvinanissim.bsky.social
7/7

23.05.2025 12:23 👍 3 🔁 0 💬 0 📌 1

๐Ÿ” Whatโ€™s happening in the model?
We find that SAE steering and multi-shot prompting impact internal representations similarly, suggesting insight from user examples are summarized with extra interpretability potential (look at latents) and better efficiency (no long context) 6/

23.05.2025 12:23 👍 1 🔁 0 💬 1 📌 0

๐ŸŒ Across 7 languages, our SAE-based method matches or outperforms traditional prompting methods! Our method obtains better human-like translations (H) personalization accuracy (P), and maintains translation quality (Comet โ˜„๏ธ @nunonmg.bsky.social) especially for smaller LLMs. 5/

23.05.2025 12:23 👍 1 🔁 0 💬 1 📌 0

💡 We compare prompting (zero- and multi-shot + explanations) and inference-time interventions (ActAdd, ReFT and SAEs).

Following SpARE (@yuzhaouoe.bsky.social @alessiodevoto.bsky.social), we propose ✨ contrastive SAE steering ✨ with mutual information to personalize literary MT by tuning latent features 4/

23.05.2025 12:23 👍 4 🔁 2 💬 1 📌 0
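The contrastive recipe can be sketched roughly as: encode activations with the SAE, pick the latents that separate personalized from literal translations, and amplify those latents at inference before decoding back to the residual stream. A toy version (note: the paper scores latents via mutual information; this sketch substitutes a simpler difference-of-means for illustration, and all names are mine):

```python
def contrastive_latents(pos_acts, neg_acts, top_k=2):
    """Select SAE latents with the largest mean activation gap between
    positive (personalized-style) and negative (literal) examples.
    pos_acts / neg_acts: lists of latent-activation vectors."""
    n_latents = len(pos_acts[0])
    gaps = []
    for j in range(n_latents):
        pos_mean = sum(a[j] for a in pos_acts) / len(pos_acts)
        neg_mean = sum(a[j] for a in neg_acts) / len(neg_acts)
        gaps.append((pos_mean - neg_mean, j))
    # Keep the top_k latents most associated with the target style.
    return [j for _, j in sorted(gaps, reverse=True)[:top_k]]

def steer(latents, selected, strength=2.0):
    """Scale the selected latents before the SAE decoder reconstructs
    the residual stream, nudging generation toward the target style."""
    return [a * strength if j in selected else a
            for j, a in enumerate(latents)]
```

In practice the boosted latent vector would be passed through the SAE decoder and written back into the model's residual stream at the chosen layer.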

📈 But can models recognize and replicate individual translator styles?
✓ Classifiers can identify styles with high accuracy (humans kinda don't)
✓ Multi-shot prompting boosts style a lot
✓ We can detect strong style traces in activations (esp. mid layers) 3/

23.05.2025 12:23 👍 2 🔁 0 💬 1 📌 0

📘 Literary translation isn't just about accuracy, but also about creatively conveying meaning across languages. Yet LLMs prompted for MT are very literal. Prompting & steering to the rescue!

Can we personalize an LLM's MT when few examples are available, without further tuning? 🔍 2/

23.05.2025 12:23 👍 2 🔁 0 💬 1 📌 0

📢 New paper: Applied interpretability 🤝 MT personalization!

We steer LLM generations to mimic human translator styles on literary novels in 7 languages. 📚

SAE steering can beat few-shot prompting, leading to better personalization while maintaining quality.

🧵1/

23.05.2025 12:23 👍 20 🔁 5 💬 2 📌 2

Hellooo 👀

04.12.2024 13:45 👍 1 🔁 0 💬 1 📌 0

Hey hello! 👋

28.11.2024 11:05 👍 1 🔁 0 💬 0 📌 0

Now on 🦋!

21.11.2024 14:51 👍 3 🔁 0 💬 0 📌 0

Hello!

19.11.2024 19:39 👍 1 🔁 0 💬 0 📌 0

👋

19.11.2024 19:24 👍 0 🔁 0 💬 1 📌 0

It was great, I'm starting to get tickets for next year!

17.11.2024 20:30 👍 1 🔁 0 💬 0 📌 0

👀🙋‍♂️

16.11.2024 23:58 👍 1 🔁 0 💬 0 📌 0