"RL is hard" - probably a lion, 2025
Why does it matter? I would let people project whatever identities they want, as long as the work gets done.
And there is probably a strong survivorship bias: the "engineers" like popularizing their work, while the "scientists" do not so much :)
It is quite depressing that there is no existing platform for article clubs, with some cool features like ranking articles by "surprisingness of results" or by "existence of the problem"… Maybe a review system built on top…
I think the author confused "winter" with "bubble"; e.g., the dot-com bubble didn't imply a dot-com winter.
I had something like Bottou 1998 in mind - the first non-convex convergence proof in the online-learning framework.
That's the earliest work I know of on convergence in a non-convex setting :)
Maybe you could cite the first non-convex convergence proof?
I'd guess it was written around the 2010s, though of course it's nearly impossible to pin down the "first" proof.
Isn't it somewhat trivial? As in, if an LLM has answered the question (adequately) within the first 25 tokens, then it doesn't need to search :)
I guess most of the training is done in a regime where l_inf is about the same, or at least "similar", so I would actually expect this method to work well.
But I don't believe it generalizes to arbitrary datasets without seeing the data in advance.
But you should get a slightly longer runtime when running half-batches in sequence;
that is the reason to use the "largest possible batch size": it guarantees 100% utilization of the CUDA/tensor cores.
Good point nonetheless; for many applications, running things in sequence works out.
I dislike that the thing is part of neither the thing nor the other thing.
And the median is the point that minimizes the average of absolute differences to the points drawn...
M-estimators (a generalization of these location estimators) are wonderful; I'm sad that people outside of robust statistics have barely heard of them :(
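A quick NumPy sketch of the claim (my own toy example, not from the original post): minimizing average squared distance over a grid of candidate points recovers the mean, while minimizing average absolute distance recovers the median - and with injected outliers the median barely moves.

```python
import numpy as np

rng = np.random.default_rng(42)
x = rng.normal(size=1001)
x[:50] += 20.0  # inject outliers: the mean shifts, the median barely moves

# Evaluate both losses on a grid of candidate locations c
grid = np.linspace(-5, 5, 2001)
l1 = np.abs(x[:, None] - grid[None, :]).mean(axis=0)      # mean |x - c|
l2 = ((x[:, None] - grid[None, :]) ** 2).mean(axis=0)     # mean (x - c)^2

c_l1 = grid[l1.argmin()]
c_l2 = grid[l2.argmin()]
print(c_l1, np.median(x))  # the L1 minimizer matches the median
print(c_l2, np.mean(x))    # the L2 minimizer matches the mean
```

Huber loss, the classic M-estimator, interpolates between exactly these two penalties.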
A sewing machine is probably the most complex mechanical apparatus that most people encounter in life lol
Hmm, an introduction to stochastic calculus without measure theory? I guess that's the only reason to put quotes - if the reader knows only the Riemann definition, then the stochastic integral does not really make sense, but it is still cute.
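For what it's worth, the Itô integral can at least be *motivated* with Riemann-style sums; the catch (a standard fact, sketched here from memory) is that the limit has to be taken in L², and the left-endpoint choice actually matters because Brownian motion has unbounded variation but finite quadratic variation:

```latex
% Ito integral as an L^2 limit of left-endpoint sums over a partition
% 0 = t_0 < t_1 < \dots < t_n = T:
\int_0^T f(t)\, dW_t
  = \lim_{n \to \infty} \sum_{i=0}^{n-1} f(t_i)\,\bigl(W_{t_{i+1}} - W_{t_i}\bigr)
  \quad \text{(limit in } L^2\text{)},
% while the quadratic variation is finite and nonzero:
\qquad \sum_{i=0}^{n-1} \bigl(W_{t_{i+1}} - W_{t_i}\bigr)^2
  \;\xrightarrow{\;L^2\;}\; T .
```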
So you are effectively killing the same neurons over and over again, unless the gradient norm is extremely large?
Like, you should get qualitatively the same behaviour as a bottleneck, I would guess.
I once had a script called "logging.py", which was also quite fun.
But there are also cool tricks to improve the speed of these methods, which Rahul actually talks about, like the neat connection to Wald's sequential testing.
My conclusion is that you need to derive the RANSAC method anew for each setting; I'm pretty sure the current SOTA for panoramas is MLESAC-ish (Rahul doesn't mention it?), which is just an M-estimator.
Maybe we should write a paper on "how to derive RANSAC for your problem" 🤔
RANSAC always feels to me like something that could easily be improved, but every time I remember that there are like a thousand variations, and I decide it's not worth the effort :)
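For reference, here is what the vanilla method looks like - a minimal RANSAC sketch for 2-D line fitting (my own toy example; function name and parameters are made up, not from Rahul's post):

```python
import numpy as np

def ransac_line(pts, n_iters=200, thresh=0.1, rng=None):
    """Fit y = a*x + b to 2-D points via vanilla RANSAC."""
    rng = rng or np.random.default_rng(0)
    best_inliers, best_model = None, None
    for _ in range(n_iters):
        # Minimal sample: two random points define a candidate line
        i, j = rng.choice(len(pts), size=2, replace=False)
        (x1, y1), (x2, y2) = pts[i], pts[j]
        if x1 == x2:
            continue
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Score by counting points within thresh of the line (vertically)
        resid = np.abs(pts[:, 1] - (a * pts[:, 0] + b))
        inliers = resid < thresh
        if best_inliers is None or inliers.sum() > best_inliers.sum():
            best_inliers, best_model = inliers, (a, b)
    return best_model, best_inliers

# 70% inliers on y = 2x + 1, 30% of the points corrupted
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 100)
y = 2 * x + 1 + rng.normal(scale=0.02, size=100)
y[:30] = rng.uniform(-3, 3, 30)
(a, b), inliers = ransac_line(np.column_stack([x, y]))
print(a, b)  # close to (2, 1) despite 30% outliers
```

MLESAC-style variants replace the hard 0/1 inlier count with a robust (M-estimator) score of the residuals, which is exactly why each setting wants its own derivation.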
There are so many problems with science that can be solved by a decent publishing system, but oh well
I also think the situation is improving: many more people just use preprints for most of their work these days, which might slowly force journals to consider preprints to be real papers.
I'm currently in the process of attempting to publish such a paper, and my idea to make it "novel" is to just add a heuristic on top of the "old method" that improves performance.
It's not super novel, true, but it will satisfy reviewers who desire novelty, I guess.
But - isn't this a problem with the institutions, not with the students? If e.g. Oxford requires 3 letters but "no prior research experience is required", what should the student do?
I'd write a P.S. in the letter saying that you hate unis 'requiring' the recommendations, but still help the student.
This could actually be very useful for quick experimenting with LLMs, by putting the doc into the context, though the scale is somewhat small.
Harry Potter and the Methods of Rationality? :) Some people like it, some don't, but I guess the earlier you try to read it, the better.
So we just need a billionaire to make their own journal :))
A simple way to enforce "paid only for corporations" is to strongly encourage preprints, as was already mentioned in other comments.
Another fun idea: publish not a % of all submitted papers, but a number proportional to the scientists in the field - i.e., a nearly constant number of papers per field. This would significantly slow down publish-or-perish. And, with the right financial incentives, it would push people to produce high-quality (high-risk) research.
The perfect publishing system would pay the authors and reviewers a % of the money earned from the paper + require only companies/academia to pay for access. So far the main issue is the starting capital - you can't go to YC if you're not expecting decent returns.
watch -n 0.1 ...
I only store the papers related to currently running projects/ideas. If it's not related and I don't have time to read it -> 🗑️
"Choosing the right problems to work on is the most useful skill one can have"
Tell that to BPTT (which explodes, but is unbiased!)
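A toy NumPy sketch of the exploding part (my own hypothetical linear RNN, not from the original post): backprop through time multiplies the gradient by the recurrent Jacobian at every step, so when its spectral radius exceeds 1 the gradient norm grows roughly geometrically with sequence length.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
# Random recurrent weight matrix scaled so its spectral radius is ~2 (> 1)
W = rng.normal(scale=2.0 / np.sqrt(n), size=(n, n))

g = np.ones(n)  # gradient arriving at the last time step
norms = []
for _ in range(50):   # backprop through 50 steps: g <- W^T g
    g = W.T @ g
    norms.append(np.linalg.norm(g))

print(norms[-1] / norms[0])  # geometric blow-up over the sequence
```

Truncated BPTT caps this by cutting the product short - which is exactly where the bias comes from.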