Nicola Bordin

@nbordin

Research Fellow in Bioinformatics (Proteins + ML + Function) @ CATH University College London | Mountains, Proteins, Food in that order | MSCA Alumn | Replies to emails!

838 Followers
346 Following
49 Posts
Joined 27.09.2023

Latest posts by Nicola Bordin @nbordin

Had a great time at the #ML4NGP Training school in Sevilla! Back to Italy tomorrow (normally it would be back to London but things change!)

P.S. Signed my TT contract with @unipd.bsky.social , starting March 23rd. Exciting!

20.02.2026 19:00 👍 0 🔁 0 💬 0 📌 0
ProFam: Open-Source Protein Family Language Modelling for Fitness Prediction and Design Protein language models have become essential tools for engineering novel functional proteins. The emerging paradigm of family-based language models makes use of homologous sequences to steer protein ...

To advance the family-based modelling approach, we are releasing the entire framework open source:

ProFam Atlas: A curated, large-scale training corpus containing nearly 40 million protein families.
Code & Weights: github.com/alex-hh/prof...
Data: zenodo.org/records/1771...

22.12.2025 14:32 👍 3 🔁 1 💬 0 📌 0

For design, ProFam-1 excels at homology-guided generation. It produces diverse sequences with low sequence identity to natural proteins while preserving predicted structural similarity and conservation patterns of the natural family, even when conditioning on just a single example sequence.

22.12.2025 14:32 👍 2 🔁 1 💬 1 📌 0

Built by CATH, TÜM and NVIDIA, ProFam-1 is our new open-source protein family language model (pfLM) designed to generate functional protein variants and predict fitness using in-context example sequences.

22.12.2025 14:32 👍 11 🔁 5 💬 1 📌 1

I’ll continue working on algorithms, deep learning, and AI-based methods to explore the protein structural and functional landscape.

Starting in early 2026!

Excited to return after 10 years around Europe!

19.12.2025 17:05 👍 5 🔁 0 💬 0 📌 0

Going full circle! 😊

The University of Padua was home for both my Bachelor’s and Master’s degrees.

After 7 amazing years at UCL in the Orengo Group, I’m really happy to share that I’ve won a Tenure-Track Assistant Professorship in Biochemistry at my Alma Mater!

19.12.2025 17:05 👍 12 🔁 1 💬 3 📌 0
AlphaFold Protein Structure Database 2025: a redesigned interface and updated structural coverage Abstract. The AlphaFold Protein Structure Database (AFDB; https://alphafold.ebi.ac.uk), developed by EMBL–EBI and Google DeepMind, provides open access to

From Sameer Velankar & colleagues in @narjournal.bsky.social #NARDatabaseIssue | #AlphaFold #Protein #Structure #Database 2025: a redesigned interface and updated structural coverage | #Bioinformatics #Proteomics #OpenScience #AFDB 🧪🔓 CC/ @ebi.embl.org
⬇️
academic.oup.com/nar/advance-...

24.11.2025 00:56 👍 30 🔁 14 💬 0 📌 0

We are looking for a computational postdoc to work with us on new optimisation algorithms to make #RELION even better. Join our bubbly team at the @mrclmb.bsky.social in Cambridge, UK. 🤗 RTs appreciated.

mrc.tal.net/vx/appcentre...

29.10.2025 09:17 👍 64 🔁 57 💬 2 📌 2

It was lovely to speak at the CATH 30 symposium, celebrating 30 years of the @cathgene3d.bsky.social protein structure classification database. I was presenting recent work on our new generative protein-family language model: preprint coming soon.

18.09.2025 10:32 👍 11 🔁 3 💬 0 📌 0

Packing for our first flight with our kid tomorrow. Wish us luck!

We went from 9kg of checked luggage for 2 months in Thailand to 3 checked suitcases and a pram. Send help!

26.08.2025 16:35 👍 2 🔁 0 💬 0 📌 0

We have a stellar lineup of speakers!

Christine Orengo
Burkhard Rost
Janet Thornton
David Jones
Gonzalo Parra @gonzaparra.bsky.social
Sameer Velankar
Alex Bateman
Maria Martin
Rob Finn
Gerardo Tauriello
Alexey Murzin

22.08.2025 10:45 👍 3 🔁 2 💬 1 📌 0

There will be talks from world leaders in structural bioinformatics on various themes, including pioneering protein language models and key international resources such as PDBe, InterPro, UniProt, MGnify, SWISS-MODEL, FrustraEvo and CATH.

22.08.2025 10:45 👍 1 🔁 2 💬 1 📌 0
Protein Annotations in the age of AI A not-for-profit symposium hosted at UCL - more details about speakers and venue below.

CATH turns 30 years old this year!

We are organising a 1-day symposium on September 16th at UCL, highlighting recent AI-based developments to enhance protein family classifications, annotations and analyses.

www.eventbrite.co.uk/e/protein-an...

22.08.2025 10:45 👍 12 🔁 7 💬 2 📌 0

Thank you David! Officially a guiri!

13.08.2025 12:33 👍 1 🔁 0 💬 0 📌 0

Today I became a British citizen! 🇬🇧

13.08.2025 12:18 👍 5 🔁 0 💬 1 📌 0

#ISMBECCB2025 is over! Back to London tomorrow after a science feast, a talk, and a selfie with John Jumper. Not too bad!

24.07.2025 18:53 👍 5 🔁 0 💬 0 📌 0
Metagenomic-scale analysis of the predicted protein structure universe Protein structure prediction breakthroughs, notably AlphaFold2 and ESMfold, have led to an unprecedented influx of computationally derived structures. The AlphaFold Protein Structure Database now prov...

Today at 2 PM at 3DSIG #ISMBECCB2025, @nbordin.bsky.social presents our joint work on metagenomic-scale clustering and novel domain discovery in predicted structures!
📄 www.biorxiv.org/content/10.1...

Also check out poster:
B-50 lolalign Sensitive structural alignments by Lasse
B-123 BFVD by Rachel

22.07.2025 09:10 👍 36 🔁 8 💬 2 📌 0

Off to Liverpool for #ISMBECCB2025!

Looking forward to some awesome science and friends!

20.07.2025 09:09 👍 2 🔁 0 💬 0 📌 0

Just reverted the video to explain protein folding!

12.07.2025 22:46 👍 0 🔁 0 💬 1 📌 0

"in 2025 we will have flying cars" 😂😂😂

05.07.2025 16:17 👍 399 🔁 91 💬 8 📌 35
03.07.2025 09:33 👍 2 🔁 0 💬 0 📌 0

We've updated our AFESM website to now include biome filtering, allowing exploration of protein structures adapted to specific environments.
🌐 afesm.foldseek.com
Read more about the work in the skeetorial
🦋 bsky.app/profile/mart...
or our preprint
📄 www.biorxiv.org/content/10.1...

15.05.2025 14:03 👍 60 🔁 22 💬 2 📌 0

Pinging @jingiyeo.bsky.social and @martinsteinegger.bsky.social

29.04.2025 14:52 👍 0 🔁 0 💬 1 📌 0

Very good point! It might be worth investigating. We noticed this behaviour also when we clustered TED (over 81M singletons). That analysis was done at the domain level, not the chain level, but the clustering wasn't that strict. Here I focussed more on the downstream from the domain end of things.

29.04.2025 14:52 👍 1 🔁 0 💬 1 📌 0

Amazing effort by @jingiyeo.bsky.social, @yewonhan.bsky.social, Andy Lau, @shaunkandathil.bsky.social, @hbkgenomics.bsky.social, Eli Levy Karin and @cathgene3d.bsky.social !

28.04.2025 11:16 👍 3 🔁 0 💬 0 📌 0

Explore AFESM with our website! You can search your favorite proteins from ESMatlas or AFDB using their identifiers. It's still a work in progress, with many exciting features on the way! Thanks @milot.bsky.social !

28.04.2025 11:16 👍 1 🔁 0 💬 1 📌 0

However, these novel domain combinations comprise only a small fraction (0.3%) of ESM-only clusters. The remainder are mostly low-quality predictions (53%), fragments (16%), known domains with potential unknown extensions (19%), or without identifiable domains (9.3%).

28.04.2025 11:16 👍 0 🔁 0 💬 1 📌 0

@yewonhan.bsky.social identified 11,941 novel multi-domain combinations!
We found membrane-associated domains (e.g., TonB dependent receptor), highlighting domain recombination rather than new folds as a driver of structural innovation in ESMatlas.

28.04.2025 11:16 👍 2 🔁 0 💬 1 📌 0

ESM-only clusters contain ZERO novel folds using the TED workflow. Re-modelling discarded domains (2.3M) with ColabFold revealed 1 novel fold; unlike AFDB’s >7k novel folds, hinting at a saturating fold space or ESMfold limitations.

28.04.2025 11:16 👍 0 🔁 0 💬 1 📌 0

With MGnify environmental labels, we computed the lowest common biomes per structural cluster, revealing protein adaptations unique to specific environments, especially extreme ones like hyperthermal, hypersaline, and glaciers.

28.04.2025 11:16 👍 1 🔁 0 💬 1 📌 0
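
A rough sketch of what "lowest common biome per structural cluster" could look like in code. It assumes each cluster member carries a colon-separated, MGnify-style biome lineage such as root:Environmental:Aquatic:Marine. The function name, cluster IDs and example lineages are hypothetical, and this is not the actual AFESM pipeline code.

```python
from collections.abc import Iterable


def lowest_common_biome(lineages: Iterable[str], sep: str = ":") -> str:
    """Return the deepest biome shared by all lineages.

    Each lineage is assumed to be a root-to-leaf path such as
    "root:Environmental:Aquatic:Marine" (an assumed format).
    The result is the longest common prefix, compared level by level.
    """
    split = [lineage.split(sep) for lineage in lineages]
    if not split:
        raise ValueError("need at least one lineage")

    common = []
    # zip stops at the shortest lineage, so we never read past a leaf.
    for level in zip(*split):
        if len(set(level)) != 1:
            break  # members disagree from this level downwards
        common.append(level[0])
    return sep.join(common)


# Hypothetical clusters: cluster ID -> biome lineages of its members.
clusters = {
    "cluster_A": [
        "root:Environmental:Aquatic:Marine:Hydrothermal vents",
        "root:Environmental:Aquatic:Marine:Oceanic",
    ],
    "cluster_B": [
        "root:Environmental:Aquatic:Marine",
        "root:Host-associated:Human",
    ],
}

for cluster_id, lineages in clusters.items():
    print(cluster_id, lowest_common_biome(lineages))
```

A cluster whose members all come from one environment keeps a deep, specific label, while a cluster spread across unrelated environments collapses towards the root.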