benjamin (@bclavie)

SentenceTransformers Documentation — Sentence Transformers documentation

I know a lot of people are working on making ModernBERT-based embedding models, but in the meantime, if you’d like to play around with it (no better way to learn than practice), it’s plug&play with Sentence Transformers www.sbert.net and we have examples on the repo

22.12.2024 01:11 👍 6 🔁 0 💬 0 📌 0

Hey! As Jeremy replied, this is fully expected: encoder models aren’t expected to produce well-calibrated semantic similarity scores out of the box, because that’s very far from the training task for the base model!

However, they fine-tune really well into embedding models that are good at this. 1/2

22.12.2024 01:10 👍 3 🔁 0 💬 1 📌 0
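For context on where such scores usually come from: a common recipe is to mean-pool the encoder's per-token vectors into one vector per text, then take the cosine similarity of the two pooled vectors. The sketch below shows only that arithmetic. The token vectors are made-up toy values, not outputs of ModernBERT or any real model, and the helper names are hypothetical.

```python
import math

def mean_pool(token_vectors):
    """Average a list of per-token vectors into a single text vector."""
    dim = len(token_vectors[0])
    count = len(token_vectors)
    return [sum(vec[i] for vec in token_vectors) / count for i in range(dim)]

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 2-dimensional "token embeddings" (illustrative values only).
tokens_a = [[1.0, 0.0], [0.0, 1.0]]
tokens_b = [[2.0, 2.0]]

score = cosine_similarity(mean_pool(tokens_a), mean_pool(tokens_b))
```

Nothing in this computation makes the score meaningful by itself: how well it tracks semantic similarity depends on how the vectors were trained, which is why fine-tuning matters.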

I'll get straight to the point.

We trained 2 new models. Like BERT, but modern. ModernBERT.

Not some hypey GenAI thing, but a proper workhorse model, for retrieval, classification, etc. Real practical stuff.

It's much faster, more accurate, longer context, and more useful. 🧵

19.12.2024 16:45 👍 620 🔁 147 💬 19 📌 34

White-on-black text saying "In fact, [MASK]-large’s processing speed is closer to that of [MASK]-base than it is to [MASK]-large's.", with the [MASK] drawn in purple to draw attention

I wonder if some kind of model could fill this in...

11.12.2024 11:50 👍 6 🔁 0 💬 0 📌 0

There was one time my flight from Geneva got cancelled and I got a replacement one from Lyon. Still one of my most surreal experiences.

09.12.2024 09:24 👍 1 🔁 0 💬 0 📌 0

Won't be at NeurIPS but I'll be at ICLR in April, in case you're planning on being there 😄

08.12.2024 11:59 👍 1 🔁 0 💬 0 📌 0

Please do go on about the coffee. Is it a make-you-an-espresso-as-required kind of deal or a big pot? Perhaps a lovingly made 1L chemex?

02.12.2024 00:30 👍 0 🔁 0 💬 1 📌 0

Thank you @bsky.app team for correcting the mistake. Glad to be back!

28.11.2024 20:00 👍 304 🔁 24 💬 39 📌 32

I can understand this yeah. I’m generally open to discussion but I’ve seen enough unsavoury behaviour & DMs in the past couple days to want to dial it down a teensy bit at the moment sadly.

28.11.2024 14:55 👍 1 🔁 0 💬 1 📌 0

Jokes aside, it does make me kinda sad. ML Bluesky has a lot of the vibes of early twitter and interesting discussions, but seeing so many of the death threats posters unbanned while someone was banned for *posting a link to a dataset* is a really bad sign :/

28.11.2024 13:04 👍 12 🔁 0 💬 1 📌 0

GitHub - McGill-NLP/llm2vec: Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'

LLM2Vec is also a nice approach for this -- only difference is you'd FT for classification rather than retrieval at the end github.com/McGill-NLP/l...

28.11.2024 13:03 👍 3 🔁 0 💬 0 📌 0

It’s only hate if it comes from the Champagne region of X, otherwise it’s just sparkling outrage (I think?)

28.11.2024 10:25 👍 13 🔁 0 💬 1 📌 0

people on this platform will take your words out of context, twist, not mention your correction, because they just want to hate on what you work on, and insult you comfortably.

I'll keep posting here about my work but will not be interacting with anyone who wants to bash on my company.

28.11.2024 10:22 👍 116 🔁 4 💬 4 📌 5

i exclusively consent to my tweets being used for training neural networks. if you are not a neural network, stop reading this immediately

28.11.2024 02:59 👍 309 🔁 39 💬 17 📌 6

(ChromaDB is good too, but IMO it's targeting a different/less AI tinkery audience)

28.11.2024 10:06 👍 2 🔁 0 💬 1 📌 0

(they do not employ me, nor pay me in any way, I'm just out there doing unpaid advertising)

28.11.2024 10:06 👍 2 🔁 0 💬 1 📌 0

heartily recommend lancedb for local stuff where you don't want to fuss with things too much -- mostly sane defaults, has reranking and bm25 support so you can do two-step or hybrid search whenever needed, and the disk ANN is plenty for most people.

28.11.2024 10:05 👍 5 🔁 0 💬 1 📌 0
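To illustrate what "hybrid search" can mean in practice: one common way to combine a BM25 ranking with a vector-search ranking is reciprocal rank fusion (RRF). The sketch below is plain Python and does not use or mirror the lancedb API; the function name, document ids, and the constant k=60 are assumptions chosen for illustration.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one list, best first.

    Each document gets 1 / (k + rank) from every list it appears in,
    with rank starting at 1. k is a smoothing constant (assumed here).
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

# Hypothetical results from a keyword (BM25) search and a vector search.
bm25_ranking = ["doc_b", "doc_a", "doc_d"]
vector_ranking = ["doc_a", "doc_c", "doc_b"]

fused = reciprocal_rank_fusion([bm25_ranking, vector_ranking])
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever are kept but ranked lower. A two-step setup would instead take the top results of one retriever and rescore them with a reranker.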

Note: you can still criticise the way the original dataset was built. Nothing's black and white. I understand why people are upset.
None of this implies there isn't something seriously wrong with sending death threats to someone because they *curated an open dataset from an open protocol*.

28.11.2024 06:10 👍 7 🔁 0 💬 0 📌 0

This might sound obvious, but bullying and threatening people doing perfectly legal things because you morally don't agree with them is wrong.

People stifling any serious discussion by doing this, albeit for another set of morals, is actually the exact reason that made a lot of people migrate here.

28.11.2024 05:39 👍 20 🔁 1 💬 1 📌 0

Some days I really like this place, and then there are others in which there's a level of puritanical fervour that permeates a lot of public discourse that I find off-putting. Some of the over the top hateful responses wouldn't be out of place in the Hellsite.

28.11.2024 05:16 👍 55 🔁 6 💬 4 📌 1

We should make sure that only really big companies can afford to pay really big copyright holders to access the data needed to do stuff with AI, and keep everyone else out.

Wouldn’t that be just super?

28.11.2024 05:04 👍 132 🔁 9 💬 6 📌 2

Data gathering on an open platform via an open protocol is only ethical if you're not told about it, silly.

28.11.2024 05:09 👍 11 🔁 0 💬 0 📌 0

It’s been absolutely horrible to watch this. Pure “it’s fine to insult, harass and threaten people as long as you are doing it for the right reason” energy.

At least blocklists help, I guess blocking toxicity on sight is the only way.

28.11.2024 01:36 👍 18 🔁 0 💬 0 📌 0

I'm disheartened by how toxic and violent some responses were here.

There was a mistake, a quick follow-up to mitigate it, and an apology. I worked with Daniel for years and he is one of the people most preoccupied with the ethical implications of AI. Some replies are Reddit-level toxic. We need empathy.

27.11.2024 11:09 👍 333 🔁 37 💬 29 📌 8

fast.ai—Making neural nets uncool again – fast.ai

I'm a TA for the new fast.ai course which starts in less than 30 minutes, and which sold out in <48 hours. It's so cool to see it all coming together

26.11.2024 22:35 👍 11 🔁 2 💬 1 📌 0

OLMo 2 is out 🥳 7B and 13B trained on 5T tokens, and meticulously instruction tuned using the Tulu 3 recipe.

Simply the best fully open models yet.

Really proud of the work & the amazing team at @ai2.bsky.social

26.11.2024 21:12 👍 260 🔁 44 💬 9 📌 2

The fou du metro into 7€ happy hour drink pipeline does take its toll over time

25.11.2024 10:12 👍 1 🔁 0 💬 1 📌 0

Small French towns are really the only way to stay sane in spite of Paris to be fair

25.11.2024 10:11 👍 1 🔁 0 💬 2 📌 0

you guys need a cute mascot, then we can start the posting.

> yet another very useful finding from the kraken, it seems DPO is stronger than we thought

24.11.2024 10:11 👍 5 🔁 0 💬 1 📌 0
