Delighted to share "Facts Fade Fast: Evaluating Memorization of Outdated Medical Knowledge in LLMs", accepted to Findings of #EMNLP2025!
With a novel dataset of changed medical knowledge, we discover the alarming presence of obsolete advice in eight popular LLMs.
Paper: arxiv.org/abs/2509.04304 #NLP
06.09.2025 16:29
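The evaluation idea in the post above can be sketched as follows: for each fact whose recommended medical advice changed over time, check whether a model's answer matches the outdated or the current version. This is an illustrative sketch only; the data and the model call are invented placeholders, not the paper's dataset or models.

```python
# Hypothetical sketch: classify model answers against changed medical facts.
# CHANGED_FACTS and model_answer are invented stand-ins for illustration.

CHANGED_FACTS = [
    {"question": "Q1?", "outdated": "old advice", "current": "new advice"},
]

def model_answer(question):
    # Placeholder for querying an LLM; hardcoded for the sketch.
    return "old advice"

def evaluate(facts):
    # Count how often the model reproduces outdated vs. current advice.
    counts = {"outdated": 0, "current": 0, "other": 0}
    for fact in facts:
        ans = model_answer(fact["question"])
        if ans == fact["outdated"]:
            counts["outdated"] += 1
        elif ans == fact["current"]:
            counts["current"] += 1
        else:
            counts["other"] += 1
    return counts

counts = evaluate(CHANGED_FACTS)
```

A real evaluation would of course need fuzzy answer matching rather than exact string comparison.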
[Image: line diagram showing the RAG performance of different base LLM models]
Also happy to share that "On the Influence of Context Size and Model Choice in RAG Systems" was accepted to Findings of #NAACL2025!
We test how the RAG performance on QA tasks changes (and plateaus) with increasing context size across different LLMs and retrievers.
Paper: arxiv.org/abs/2502.14759
23.02.2025 16:46
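The experiment described in the post above can be sketched as a sweep over the number of retrieved passages fed to the generator, recording QA predictions at each context size. The retriever and generator below are stand-ins, not the paper's actual setup.

```python
# Hedged sketch of a context-size sweep for RAG QA. All components are
# illustrative placeholders.

def retrieve(question, k):
    # Stand-in retriever: return the top-k passages for a question.
    corpus = ["passage A", "passage B", "passage C", "passage D"]
    return corpus[:k]

def answer(question, passages):
    # Stand-in generator: a real system would prompt an LLM here.
    return "answer based on " + str(len(passages)) + " passages"

def sweep_context_size(questions, ks):
    # One prediction list per context size k.
    results = {}
    for k in ks:
        results[k] = [answer(q, retrieve(q, k)) for q in questions]
    return results

results = sweep_context_size(["What causes X?"], ks=[1, 2, 4])
```

Plotting accuracy against k would then reveal the plateau effect the post mentions.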
[Image: architecture of the step-by-step fact verification system]
Thrilled to share that "Step-by-Step Fact Verification for Medical Claims with Explainable Reasoning" was accepted to #NAACL2025!
This system iteratively collects new knowledge via generated Q&A pairs, making the verification process more robust and explainable.
Paper: arxiv.org/abs/2502.14765 #NLP
23.02.2025 16:44
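The iterative process described in the post above can be sketched as a loop that repeatedly generates a question about the claim, answers it from evidence, and finally produces a verdict from the collected Q&A trace. All functions here are illustrative stand-ins, not the paper's implementation.

```python
# Hedged sketch of step-by-step claim verification via generated Q&A pairs.

def generate_question(claim, qa_history):
    # Stand-in: a real system would ask an LLM for the next question.
    return "follow-up question #" + str(len(qa_history) + 1)

def answer_question(question):
    # Stand-in: a real system would retrieve evidence and answer.
    return "evidence for " + question

def verify(claim, max_steps=3):
    # Collect Q&A pairs step by step, then derive a verdict.
    qa_pairs = []
    for _ in range(max_steps):
        q = generate_question(claim, qa_pairs)
        qa_pairs.append((q, answer_question(q)))
    # A real system would let an LLM judge claim + qa_pairs; hardcoded here.
    verdict = "SUPPORTED" if qa_pairs else "NOT ENOUGH INFO"
    return verdict, qa_pairs

verdict, trace = verify("Vitamin C cures the common cold")
```

Returning the Q&A trace alongside the verdict is what makes the process explainable: each step documents what was asked and what evidence was found.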
More than 8500 submissions to ACL 2025 (ARR February 2025 cycle)! That is an increase of 3000 submissions compared to ACL 2024. It will be a fun reviewing period.
@aclmeeting.bsky.social #ACL2025 #ACL2025nlp #NLP
16.02.2025 13:19
Most exciting update to encoder-only models in a long time! Love using them for classification tasks where LLMs are overkill. #ModernBERT
21.12.2024 00:29
[Image: The OLMo 2 models sit at the Pareto frontier of training FLOPs vs. average model performance.]
Meet OLMo 2, the best fully open language model to date, including a family of 7B and 13B models trained up to 5T tokens. OLMo 2 outperforms other fully open models and competes with open-weight models like Llama 3.1 8B. As always, we released our data, code, recipes and more.
26.11.2024 20:51
I have started taking screenshots of interesting posts instead, but that gets hard to track after a while.
09.11.2024 10:10
Thank you for the list! I would appreciate being added.
08.11.2024 16:29
Using that "other" NLP is fun when trying to convince your reviewers to increase their scores :))
08.11.2024 16:13
Congratulations! I will definitely read it.
08.11.2024 16:08