Entropy is one of those formulas that many of us learn, swallow whole, and even use regularly without really understanding.
(E.g., where does that “log” come from? Are there other possible formulas?)
Yet there's an intuitive & almost inevitable way to arrive at this expression.
09.12.2024 22:44
👍 545
🔁 128
💬 22
📌 12
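A quick sketch for the entropy post above. It does not reproduce the author's derivation; it only illustrates one common line of reasoning, under the assumption that the "surprise" of an event should add up across independent events. Probabilities multiply and the log turns products into sums, which is where the "log" comes from. The function names `surprise` and `entropy` are illustrative choices, not taken from the post.

```python
import math

def surprise(p: float) -> float:
    """Information content, in bits, of an event with probability p."""
    return -math.log2(p)

def entropy(probs) -> float:
    """Shannon entropy in bits: the expected surprise over a distribution."""
    return sum(p * surprise(p) for p in probs if p > 0)

# Independent events multiply probabilities, so their surprises add.
p, q = 0.5, 0.25
assert math.isclose(surprise(p * q), surprise(p) + surprise(q))

print(entropy([0.5, 0.5]))   # fair coin
print(entropy([0.25] * 4))   # fair four-sided die
```

A fair coin comes out at 1 bit and a fair four-sided die at 2 bits, matching the intuition that each extra yes/no question adds one bit.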
Super interesting (IMHO) example of HNSW limits and approximate behavior. In this test, 2 out of 19167 nodes are not reachable by doing a hnsw_search(their_vector). Why? Because of ambiguity: they are also names, so the search gets stuck in a local minimum (at EF=200).
17.12.2024 11:49
👍 15
🔁 1
💬 1
📌 0
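To make the local-minimum effect in the HNSW post above concrete, here is a toy sketch. It is not the `hnsw_search` implementation from the post: it is a single-layer, best-first search over a hand-built graph of 1D points, and `search_layer`, `points`, and `graph` are hypothetical names made up for illustration. The target node `T` is only linked through `C`, which looks worse than `B` from the entry point, so a narrow search never gets there.

```python
import heapq

def search_layer(points, graph, entry, query, ef):
    """Best-first search that keeps at most `ef` closest nodes found so far."""
    def dist(node):
        return abs(points[node] - query)

    visited = {entry}
    candidates = [(dist(entry), entry)]   # min-heap: closest candidate first
    results = [(-dist(entry), entry)]     # max-heap (negated): worst result first
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:
            break  # the closest candidate is worse than the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            if len(results) < ef or d_nb < -results[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(results, (-d_nb, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the current worst result
    return sorted((-neg_d, n) for neg_d, n in results)

# Toy data: T sits exactly at the query, but is only linked via C.
points = {"A": 0.0, "B": 4.0, "C": 9.0, "T": 5.0}
graph = {"A": ["B", "C"], "B": ["A"], "C": ["A", "T"], "T": ["C"]}

narrow = search_layer(points, graph, entry="A", query=5.0, ef=1)
wide = search_layer(points, graph, entry="A", query=5.0, ef=2)
print(narrow)  # stuck at B, a local minimum
print(wide)    # reaches T through C
```

In this toy graph a larger `ef` is enough to escape. The post reports that on the real dataset EF=200 was still not enough for 2 of the nodes.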
What are important data systems problems, ignored by research?
A blog by and for database architects.
I was recently on a panel on "What are important data systems problems, ignored by research?" with @andypavlo.bsky.social and Allison Lee, moderated by Viktor Leis. Here is the write-up of the discussion: databasearchitects.blogspot.com/2024/12/what...
13.12.2024 07:58
👍 63
🔁 12
💬 2
📌 2
Notes On: Disaggregated OLTP Systems
Aurora, Socrates, PolarDB, and Taurus.
Some folks from the Software Internals Discord and I read the series of disaggregated OLTP papers and met to talk about them. I wrote an informal overview of the papers and a summary of some of the discussion after each paper: transactional.blog/n...
07.12.2024 05:52
👍 32
🔁 6
💬 0
📌 0
While there are countless code examples to learn from, formal models are harder to find 👀
Blog posts exploring modeling approaches are a rare chance to sharpen your skills ❤️
bsky.app/profile/domi...
28.11.2024 16:25
👍 5
🔁 1
💬 0
📌 0
The Porcupine linearizability checker is really cool: github.com/anishathalye...
I love ideas like P-compositionality (arxiv.org/pdf/1504.00204) - it's something nobody else thought of, that seems so obvious in hindsight. A relatively small insight that vastly simplifies a tough problem.
13.11.2024 16:35
👍 5
🔁 1
💬 0
📌 0
New version (v1.9.1) of Geogram, the award-winning geometry processing library, is out!
New in this version:
- much faster (2x speed) large-scale periodic Delaunay triangulation and power diagrams
- Linsolve/GPU: AMGCL + new nlCuda backend goes brrrr!
github.com/BrunoLevy/ge...
26.11.2024 08:22
👍 43
🔁 12
💬 2
📌 0
When trying to compute a dual of a composite problem involving two functions and two linear operators (e.g., Total Variation regularization of inverse problems), it is sometimes useful to consider either of the operators as the dual operator.
28.11.2024 06:00
👍 6
🔁 2
💬 0
📌 0
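One way to read the duality post above, written out as a sketch. The notation is an assumption, not the author's: the composite problem is taken to be $\min_x f(Ax) + g(Dx)$ with $f, g$ convex, proper, and lower semicontinuous, and $A, D$ linear. Replacing either function by its conjugate representation gives two different saddle-point forms, one for each choice of operator paired with the dual variable.

```latex
% Primal problem (assumed form)
\min_x \; f(Ax) + g(Dx)

% Dualize g: D is paired with the dual variable y
\min_x \sup_y \; f(Ax) + \langle Dx, y \rangle - g^*(y)

% Dualize f: A is paired with the dual variable z
\min_x \sup_z \; \langle Ax, z \rangle - f^*(z) + g(Dx)

% Total Variation example: f(u) = \tfrac12 \|u - b\|_2^2, \quad g(v) = \lambda \|v\|_1
f^*(z) = \tfrac12 \|z\|_2^2 + \langle z, b \rangle,
\qquad
g^*(y) = \iota_{\{\|y\|_\infty \le \lambda\}}(y)
```

Here $A$ would be the forward operator of the inverse problem and $D$ a discrete gradient. Which of the two forms is more convenient depends on which of the functions and operators is cheaper to handle.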
Disillusioning the Magic of the fork System Call
How the kernels implement the fork system call
When you first learn about the fork() syscall, it can seem magical. How can a single system call produce two different return values at the same time?!
In my latest article, I demystify the hidden magic of fork and also show how it is implemented in Linux.
blog.codingconfessions.com/p/the-magic-...
27.11.2024 11:36
👍 16
🔁 4
💬 1
📌 0
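A minimal sketch of the "two return values" behavior from the fork post above, using Python's `os.fork` (Unix only). It does not show the kernel implementation the article covers. The helper name `fork_demo` and the pipe-based reporting are illustrative choices: after the call there are two processes, and each one sees its own return value, 0 in the child and the child's PID in the parent.

```python
import os

def fork_demo():
    """Fork once; the child reports what fork() returned to it via a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: fork() returned 0 here.
        os.close(read_fd)
        os.write(write_fd, b"child saw 0")
        os._exit(0)  # exit immediately without running parent cleanup code
    # Parent process: fork() returned the child's PID here.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        message = reader.read()  # returns once the child closes its end
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return pid, message

child_pid, child_message = fork_demo()
print(child_pid > 0, child_message)
```

So the call does not return twice in one process. It returns once in each of two processes that continue from the same point.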
Google Colab
I made a notebook with a few notes on Diffusion Models for a "tutorial" in a project-workshop yesterday. Not really an introduction, but I give some insights that I usually don't see elsewhere. Feel free to reuse.
🔗https://colab.research.google.com/drive/1EyqALXFvgKGsTiFDALGEHH5-WnuGjOKU?usp=sharing
23.11.2024 11:32
👍 33
🔁 7
💬 2
📌 0
Anne Gagneux, Ségolène Martin, @quentinbertrand.bsky.social, Remi Emonet and I wrote a tutorial blog post on flow matching: dl.heeere.com/conditional-... with lots of illustrations and intuition!
We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423
27.11.2024 09:00
👍 355
🔁 102
💬 12
📌 11
Releasing SmolVLM, a small 2-billion-parameter Vision+Language Model (VLM) built for on-device/in-browser inference with images/videos.
Outperforms all models at similar GPU RAM usage and token throughput
Blog post: huggingface.co/blog/smolvlm
26.11.2024 16:58
👍 231
🔁 31
💬 4
📌 1
SmolLM - run, pre-train, fine-tune, evaluate SoTA fully open source LM 🔥
Run with Transformers, MLX, Transformers.js, MLC Web-LLM, Ollama, Candle and more!
Apache 2.0 licensed codebase - go explore now!
25.11.2024 13:17
👍 36
🔁 2
💬 1
📌 0
CS492(D) Diffusion Models and Their Applications (KAIST, Fall 2024)
Minhyuk Sung's course "Diffusion Models and Their Applications" at KAIST is now fully online, including all lectures, slides, and programming assignments: mhsung.github.io/kaist-cs492d...
25.11.2024 14:18
👍 46
🔁 7
💬 2
📌 0
Happy that our AI recommendation technology has been granted a US patent.
24.11.2024 11:02
👍 1
🔁 0
💬 0
📌 0