Never managed to read it either. And the same feeling: not much life in there.
oh yes, obviously, i can make this now
who talks about "cleanly"?
Well, 10 years of teaching it… Likely the last time.
I guess Donald Knuth must have thought of that :)
just realized that jupyter is probably dead as a concept. it's all md+scripts now.
more seriously: i still think "computation" is also happening internally (just in a smooth/transient way, not that dissimilar to actual math search prior formal verification)
I'm afraid this is anthropomorphizing. The proof was there all along in future training data.
Nothing to see, just very powerful pattern matching. www-cs-faculty.stanford.edu/~knuth/paper...
actually, yes.
Not sure about the US, but in Europe it started very early on (as early as Q3 2023) with their positioning on safety/alignment and avoiding the mess OpenAI got into at the same time (GDPR blocks, etc.)
(Our next release will actually be personas)
Would also open up the much more interesting question of how to design and tune personas. I'm currently switching to agentic model training, and simulated personas are everywhere: one of the absolute core original seeds.
Oh, I've been part-time there for a while now. Always good to have a platform plan B.
Models should design, models should populate, models should compile.
Sorry to say that I'm slowly becoming anti-handcraft RL environments. Textbook bitter lesson.
An example of how it works in practice: after Golden Gate Claude, you can get Red Baguette.
SAE weights are now available. I think this makes Baguettotron the smallest yet effective model available for mech interp research. huggingface.co/lyraaaa/bagu...
I'm afraid these are mostly ideas now widespread in the core electorate. Collective drift…
By all accounts the most used LLM training technique, the most critical one for synth pipelines, and yet, ThinkingMachine aside, very little committed research in the open.
Frankly the number of unsettled topics on SFT is insane.
He definitely does, but also seems to come more from the data design side: he worked on early ChatGPT persona/behavior with Joanne Jang, and recently on a literary model bundled inside gpt-5.
forever relieved to not have spent the last two years on prompt layer orchestration
Thanks Glyn for supporting our work and its recent global turn! So far the total amount in grants for Common Corpus is still about zero…
we had a hard time…
Last week, we officially presented our (famed) zip drive on the French podcast A la French, which got many people curious. www.youtube.com/watch?v=wirm...
Since Baguettotron is currently buzzing in France, announcing the first official demo on HuggingFace (in arena mode vs. gemma-270m). huggingface.co/spaces/PleIA...
New amazing interpretability/SAE work on Baguettotron! Almost surprised at how much the high-entropy sections are actually connected to analytical features: lyramakesmusic.github.io/bread-slicer/
Sure, but at least in Europe, ambiguity is *very* unhelpful. If that's what we really mean, I think we need better words.
(Bender's latest bizarre anti-Doctorow thread was all about reframing stochastic parrot to be only about harms + open data; except, having been in this space for years, I've hardly ever seen her there)