On top of the obvious cautionary tale, the post mortem on this is very respectable
Claude deleted a production database and all automated snapshots, immediately adopting a passive voice to avoid responsibility. It truly learned from humans!
"When I asked Claude where the database was, the answer was straightforward: it had been deleted."
alexeyondata.substack.com/p/how-i-drop...
In the Statsig experimentation tool we now have an SPRT option, in addition to frequentist and Bayesian.
As problematic as CI interpretation is, at least we have a shared misinterpretation.
Any advice for a layperson's interpretation of SPRT?
#rstats
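Not Statsig's implementation, just the textbook mechanics: a minimal sketch of Wald's SPRT for a conversion rate, in Python for brevity (the rates and error targets below are made up for illustration).

```python
import math

def sprt_bernoulli(xs, p0, p1, alpha=0.05, beta=0.05):
    """Wald's SPRT for H0: p = p0 vs H1: p = p1 on a stream of 0/1 outcomes."""
    upper = math.log((1 - beta) / alpha)  # evidence bar for accepting H1
    lower = math.log(beta / (1 - alpha))  # evidence bar for accepting H0
    llr = 0.0
    for n, x in enumerate(xs, start=1):
        # Add this observation's log-likelihood ratio (H1 vs H0).
        llr += math.log(p1 / p0) if x else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept H1", n
        if llr <= lower:
            return "accept H0", n
    return "continue", len(xs)

# Hypothetical question: did the variant lift conversion from 10% to 12%?
decision, n = sprt_bernoulli([1, 0, 0, 1, 0, 1, 1], p0=0.10, p1=0.12)
```

The layperson gloss I'd offer: keep a running evidence score and stop the moment it clears a pre-registered bar in either direction, so continuous peeking is the design rather than a sin.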
New parable
Junior SWE: what's the secret to getting good LLM code?
Senior SWE: knowing what to look for in code reviews
Junior SWE: what's the secret to knowing what to look for in code reviews?
Senior SWE: writing a lot of code by hand
Old parable
Young one: what's the secret to making smart decisions?
Wise one: good judgment
Young one: what's the secret to gaining good judgment?
Wise one: bad decisions
For this and other exciting adventures in automation:
github.com/anthropics/c...
github.com · anthropics/claude-code · [Bug] VSCode extension incorrectly attributes manual commits to Claude as author (open; labels: area:ide, bug, platform:vscode, git, platform:macos). Bug Description: Committing manually through VSCode marks the commit as authored by Claude (not co-authored, just completely authored by it) while it was actually completely authored by me (the user). Claude did not write the code and wasn't even running. Maybe the VSCode extension caused this?
"AI writes X% of our code"
Are you 100% sure about that?
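If you want to pressure-test an attribution claim on your own repo, a minimal sketch (plain git run from Python; it only tallies the commit author field, which, per the bug above, is whatever the committing tool chose to write):

```python
import subprocess
from collections import Counter

# Count commits per author name in the current repo. Misattribution bugs
# like the one above would show up directly in these totals.
authors = subprocess.run(
    ["git", "log", "--format=%an"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

for name, count in Counter(authors).most_common():
    print(f"{count:6d}  {name}")
```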
I would totally try a hot sauce with the tagline "so spicy it will cURL your hair"
I lol'ed
Sooooo real
How is @spavel.bsky.social's Product Picnic always so damn good?
Fun fact: No Sleep Til Brooklyn was inspired by a toddler on a road trip
ope, just gonna add some typos to make your writing look more human
Or you can skip all that and use the crap answers if you're down with bottom shelf inference.
It's whack-a-mole, but the need for intensive human participation seems to be constant.
Getting high-quality results from a machine always seems to require a tedious manual step. If it's not curating training data or hand labeling, it's writing evals, moderating, and tracking down user-reported failure cases.
Bluesky needs polls
repost if you agree, like if you disagree
Too many requests
@sebastianraschka.com's latest post looks like a hot one!
I'm a cognitive scientist with an interest in epistemic vigilance, and this essay that's been going around gave me pause.
I don't think it's straightforward to apply the concept of epistemic vigilance to interactions with LLMs, as this essay does.
🧵/
sbgeoaiphd.github.io/rotating_the...
We are a few thousand subscribers away from being a bigger print newspaper than the Washington Post.
We are a mere flurry away from humiliating one of the two Big Space Perverts.
Ask not how a Space Pervert can dominate you. Ask how you can dominate a Space Pervert.
Subscribe below.
The rise of AI coding tools isn't a surprise, because they have the potential to eliminate one of the largest software development productivity killers: searching for usage examples.
I'm here for breakfast physics content
A graph, using reports from people over papers, showing numbers of reported kidnappings and activity from Minnesota, California, Florida, Texas, and a combination of all other states. The highest spike is in mid-January, when Minnesota was reporting 500 incidents a day while California was reporting right around 100. Across the whole graph, Minnesota shows at least double the reports of everyone else, with the exception of very early January, when California was almost on pace with what we were seeing here. A small heading up top says: immigration enforcement in Minnesota dwarfed the rest of the nation.
Seeing this visualization has been really impactful for me. There were times when people tried to come at us sideways for claiming this is *different* and massive in a way that we hadn't seen other places. But it really really has been.
oh you mean human-authorship watermarks?
I read this to my wife and she shouted "yes! I learned that the hard way last year in a parking lot!"
I'm going to ask for a refund on our Subie Force Field package.
Not even if we turn on the Forester's X-mode snow button?
New post: Training an Artisanal Language Model's tokenizer
brandonrohrer.com/alms_tokeniz...
The first step in building a small-scale LLM, an Artisanal Language Model, is to create a vocabulary of tokens. Building a model from scratch gives us the rare opportunity to look closely at tokenization.
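For a taste of what that first step involves, a toy byte-pair-encoding-style vocabulary builder (a generic sketch of the common approach, not necessarily the post's exact recipe):

```python
from collections import Counter

def bpe_vocab(text, num_merges):
    """Toy byte-pair encoding: start from characters, repeatedly merge
    the most frequent adjacent pair into a new, longer token."""
    tokens = list(text)
    merges = []
    for _ in range(num_merges):
        pairs = Counter(zip(tokens, tokens[1:]))
        if not pairs:
            break
        (a, b), _count = pairs.most_common(1)[0]
        merges.append(a + b)
        # Rebuild the sequence with the winning pair fused into one token.
        merged, i = [], 0
        while i < len(tokens):
            if i + 1 < len(tokens) and tokens[i] == a and tokens[i + 1] == b:
                merged.append(a + b)
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
    return sorted(set(tokens)), merges

vocab, merges = bpe_vocab("a banana band bandana", num_merges=5)
print(merges)  # first merge is 'an'; later ones depend on tie-breaking
```

Real BPE tokenizers typically operate on bytes with some pre-tokenization, but the merge loop above is the core idea.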
YOU CERTAINLY WILL NOT REGRET CONTINVOUCLY MORGING YOUR CODE
We have an opportunity to benchmark the indexing rate of search engines by noting how soon they turn up matches for βcontinvoucly morgingβ
Boyfriend in bed meme: "I bet he's thinking about continvoucly morging" "I could be continvoucly morging rn"
And she'd be right