Fernando Pérez-García

@fepegar.com

Senior research machine learning engineer at Microsoft Research Health Futures. PhD in Medical Imaging. Open source & open access. Created TorchIO. He/him.

533 Followers · 486 Following · 31 Posts · Joined 14.11.2024

Latest posts by Fernando Pérez-García @fepegar.com

Why do I have to pretend that I'm going to print something in order to save it as a PDF. Why do I have to engage in a little ruse.

23.02.2026 21:43 👍 19292 🔁 2923 💬 344 📌 1

...Maria Teodora Wetscherek, Klaus Maier-Hein, Panagiotis Korfiatis, @valesalvatelli.bsky.social, Javier Alvarez-Valle.

🧵12/12

30.01.2026 15:41 👍 0 🔁 0 💬 0 📌 0

Extra kudos to Tassilo Wald, who led the execution!

Thanks to everyone else involved in the project: @ibrahimethem.bsky.social, Yuan (William) Gao, @sambondtaylor.bsky.social, Harshita Sharma, @maxilse.bsky.social, Cynthia Lo, Olesya Melnichenko, @anton-sc.bsky.social, Noel Codella...

🧵11/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0
microsoft/colipri · Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science.

COLIPRI is very easy to install and use! Just run `pip install colipri` and paste this snippet (from aka.ms/colipri) to get started.

I'm looking forward to seeing what the community will build on top of our model. Get in touch if you have questions or feedback!

🧵10/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0

We used nnSSL (Wald et al., ICCV 2025) for training, TorchIO (@fepegar.com et al., CMPB 2021) for preprocessing and augmentation, nnU-Net (Isensee et al., Nature Methods 2021) for segmentation, and nifti-zarr-py for efficient patch loading during cloud training.

🧵9/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0

COLIPRI is generally superior to concurrent methods across all tasks. This is particularly clear when plugging an MLLM on top of our vision backbone. Our models are particularly strong on clinical metrics, which are the most relevant in practice.

🧵8/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0

To overcome this domain shift, we introduce an Opposite Sentence Loss (OSL), a simple but effective mechanism that complements the contrastive loss and improves our metrics substantially.

🧵7/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0

Medical reports are often very long and most sentences describe what is *not* in the scan. However, for zero-shot classification, users tend to use very short prompts, such as "Lung nodules" and "No lung nodules".

🧵6/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0
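
The short-prompt zero-shot setup described in this post can be sketched as follows. This is a generic CLIP-style illustration, not COLIPRI's actual API: the function names, the toy 3-dimensional embeddings, and the temperature of 0.07 are all assumptions made for the example. In practice the embeddings would come from the image and text encoders.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_probabilities(image_embedding, prompt_embeddings, temperature=0.07):
    """Softmax over temperature-scaled similarities between image and prompts."""
    logits = [
        cosine(image_embedding, embedding) / temperature
        for embedding in prompt_embeddings.values()
    ]
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(logit - largest) for logit in logits]
    total = sum(exps)
    return {name: e / total for name, e in zip(prompt_embeddings, exps)}


# Hypothetical embeddings, for illustration only.
image_embedding = [0.9, 0.1, 0.2]
prompt_embeddings = {
    "Lung nodules": [1.0, 0.0, 0.1],
    "No lung nodules": [-0.8, 0.3, 0.1],
}
probabilities = zero_shot_probabilities(image_embedding, prompt_embeddings)
prediction = max(probabilities, key=probabilities.get)
```

With these toy vectors the image embedding lies closer to the "Lung nodules" prompt, so that prompt receives the higher probability.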

We generate reports during training to ensure that the vision encoder extracts from the image all the information that would be needed for reporting, similar to CapPa (@mtschannen.bsky.social et al., NeurIPS 2023).

🧵5/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0

We resampled the volumes to 2-mm isotropic spacing and used an input size of 160^3. We randomly shuffled and shortened sentences in the reports used for contrastive alignment. We initialised our encoder from CXR-BERT (Boecking, @naotous.bsky.social et al., ECCV 2022).

🧵4/12

30.01.2026 15:41 👍 1 🔁 0 💬 1 📌 0
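
A minimal sketch of the two preprocessing ideas in this post, using only the Python standard library. It is not the project's pipeline: the function names are hypothetical, the rounding rule for the resampled shape is an assumption (imaging libraries may round differently), and "shortening" is interpreted here as keeping a random subset of sentences, which the post does not specify.

```python
import random
import re


def resampled_shape(shape, spacing_mm, target_mm=2.0):
    """Approximate voxel grid size after resampling to isotropic spacing."""
    return tuple(round(n * s / target_mm) for n, s in zip(shape, spacing_mm))


def shuffle_and_shorten(report, max_sentences, rng):
    """Shuffle the sentences of a report and keep at most `max_sentences`."""
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", report.strip()) if s]
    rng.shuffle(sentences)
    return " ".join(sentences[:max_sentences])


# A 512 x 512 x 300 volume with 0.75 x 0.75 x 1.0 mm voxels, resampled to 2 mm.
new_shape = resampled_shape((512, 512, 300), (0.75, 0.75, 1.0))  # (192, 192, 150)

# Made-up report text, for illustration only.
report = (
    "No pleural effusion. Heart size is normal. "
    "A 4 mm nodule is seen in the right upper lobe."
)
augmented = shuffle_and_shorten(report, max_sentences=2, rng=random.Random(0))
```

A volume resampled this way would still need cropping or padding to reach the fixed 160^3 input size mentioned in the post.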

We first pre-train our encoder only on images (no reports) sourced from different datasets, using a 3D MAE (Wald et al., CVPR 2025). This allows us to leverage more training data, as we did for Rᴀᴅ-DINO (@fepegar.com et al., Nature Machine Intelligence 2025).

🧵2/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0
Developing Generalist Foundation Models from a Multimodal Dataset for 3D Computed Tomography Advancements in medical imaging AI, particularly in 3D imaging, have been limited due to the scarcity of comprehensive datasets. We introduce CT-RATE, a public dataset that pairs 3D medical images wit...

There is not a lot of paired 3D medical image–report data out there. A pioneering example is CT-RATE (@iethemhamamci, arxiv.org/abs/2403.17834), with samples from 21k patients, a number much smaller than what we see in the natural imaging domain.

🧵1/12

30.01.2026 15:41 👍 0 🔁 0 💬 1 📌 0

We are excited to release the weights of @msftresearch.bsky.social's COLIPRI, our 3D vision–language encoder for chest CT scans, on @hf.co 🤗

Model: aka.ms/colipri
Demo: aka.ms/colipri-demo
Paper: aka.ms/colipri-paper

Why does COLIPRI matter?

🧵0/12 👇

30.01.2026 15:41 👍 3 🔁 2 💬 1 📌 0

Grading and googling hallucinated citations, as one does nowadays, and now that LLMs have been around for a while, I've discovered new horrors: hallucinated journals are now appearing in Google Scholar with dozens of citations bc so many people are citing these fake things

15.12.2025 20:41 👍 3988 🔁 1274 💬 132 📌 276

This article frames the problem as [AI?] slop. Which is a problem, but not the main one here. This is an issue with authorship norms and practices. A single individual putting their name on hundreds of (workshop) papers they admitted they had little part in.

07.12.2025 21:45 👍 21 🔁 3 💬 3 📌 0
Opening slide for a presentation titled "From medical image interpretation to scientific discovery"

Excited to speak shortly at the Medical Imaging at @euripsconf.bsky.social workshop in Copenhagen, where I'll share some insights from our @msftresearch.bsky.social team's journey "From medical image interpretation to scientific discovery" over the past couple of years.

07.12.2025 09:51 👍 2 🔁 2 💬 0 📌 0

SpaceX is ready for its next Transporter mission! With 140 satellites onboard, this is the largest Transporter mission since Transporter-1 in 2021, which carried 143 satellites. Here's my identification attempt. Launch is scheduled for NET 18:19 UTC.

26.11.2025 17:02 👍 100 🔁 37 💬 28 📌 5
The official home of the Python Programming Language

TLDR; The PSF has made the decision to put our community and our shared diversity, equity, and inclusion values ahead of seeking $1.5M in new revenue. Please read and share. pyfound.blogspot.com/2025/10/NSF-...
🧵

27.10.2025 14:47 👍 6417 🔁 2756 💬 125 📌 452
Data Scaling Laws for Radiology Foundation Models Foundation vision encoders such as CLIP and DINOv2, trained on web-scale data, exhibit strong transfer performance across tasks and datasets. However, medical imaging foundation models remain constrai...

🩻Excited to share our latest preprint: “Data Scaling Laws for Radiology Foundation Models”
Foundation vision encoders like CLIP and DINOv2 have transformed general computer vision, but what happens when we scale them for medical imaging?

📄 Read the full preprint here: arxiv.org/abs/2509.12818

23.09.2025 08:34 👍 5 🔁 2 💬 1 📌 0

bsky.app/profile/fons...

22.09.2025 17:51 👍 2 🔁 0 💬 0 📌 0
Photo with Keir Starmer and copy that reads: RECOGNITION OF A PALESTINIAN STATE IS A HOLLOW GESTURE WITHOUT MEANINGFUL ACTION TO END ISRAEL'S GENOCIDE, APARTHEID & OCCUPATION

Recognition is no doubt significant but it will be a hollow gesture if the UK does not also seek to end Israel's genocide, illegal occupation, and system of apartheid against the Palestinian people.

🧵1/3

22.09.2025 12:04 👍 116 🔁 61 💬 5 📌 3

Yeah man we should really fight back by staying on X

21.09.2025 02:58 👍 17135 🔁 4102 💬 258 📌 346

Having some fun with DINOv3 and PCA! Although I'm not happy my nose has such a low foreground probability :D

21.08.2025 22:14 👍 6 🔁 0 💬 0 📌 0
Email is Easy Everyone knows what an email address is, right?

Email addresses are very simple, and you will score highly in this quiz.

e-mail.wtf

17.08.2025 17:15 👍 278 🔁 129 💬 40 📌 52
16.08.2025 17:31 👍 0 🔁 0 💬 0 📌 0

Introducing DINOv3 🦕🦕🦕

A SotA-enabling vision foundation model, trained with pure self-supervised learning (SSL) at scale.
High quality dense features, combining unprecedented semantic and geometric scene understanding.

Three reasons why this matters👇

14.08.2025 18:50 👍 25 🔁 8 💬 2 📌 2

We’re not sure who needs to hear this, but ‘blueberry’ has two b’s.

08.08.2025 20:38 👍 7184 🔁 927 💬 237 📌 97
title and abstract from https://arxiv.org/pdf/2507.19960

table 1 from https://arxiv.org/pdf/2507.19960

Boiling here at home in Cyprus but I put the finishing touches a couple of days ago on this preprint: What Does 'Human-Centred AI' Mean? doi.org/10.48550/arX...

Wherein I analyse HCAI & demonstrate through 3 triplets my new tripartite definition of AI (Table 1) that properly centres the human. 1/n

29.07.2025 11:52 👍 345 🔁 100 💬 10 📌 42
Researchers value null results, but struggle to publish them Survey finds that fear of reputational harm and a lack of support and publication platforms are among respondents’ key concerns.

Scientists overwhelmingly recognize the value of sharing null results, but rarely publish them in the research literature

go.nature.com/450KElr

23.07.2025 12:22 👍 137 🔁 66 💬 5 📌 29

i love it when people make PRs really easy to review ❤️

feels really bad when there's a PR that's been open for a while, but i know i can't easily do a good job of reviewing it, so it's constantly "later"

23.07.2025 11:32 👍 22 🔁 1 💬 2 📌 0