Tuan Trinh

@tuantx

Data, Machine Learning and Entrepreneurship

95
Followers
1,502
Following
1
Posts
16.11.2024
Joined

Latest posts by Tuan Trinh @tuantx

Entropy is one of those formulas that many of us learn, swallow whole, and even use regularly without really understanding.

(E.g., where does that “log” come from? Are there other possible formulas?)

Yet there's an intuitive & almost inevitable way to arrive at this expression.
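One common way to motivate the "log" (not necessarily the argument used in the linked video) is additivity: if the information of an event with probability p is -log p, then the information of two independent events adds up, and entropy is just the expected information. A minimal sketch, where the function names `surprisal` and `entropy` are illustrative choices:

```python
import math


def surprisal(p: float) -> float:
    """Information content, in bits, of an event with probability p."""
    return -math.log2(p)


def entropy(probs: list[float]) -> float:
    """Shannon entropy in bits: the expected surprisal of a distribution."""
    # Zero-probability outcomes contribute nothing (p * log p -> 0 as p -> 0).
    return sum(p * surprisal(p) for p in probs if p > 0)


if __name__ == "__main__":
    coin = [0.5, 0.5]
    die = [1 / 6] * 6
    # Joint distribution of an independent coin flip and die roll.
    joint = [pc * pd for pc in coin for pd in die]
    print(entropy(coin), entropy(die), entropy(joint))
```

Because probabilities of independent events multiply while the logarithm turns products into sums, the entropy of the joint distribution equals the sum of the two individual entropies.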

09.12.2024 22:44 👍 545 🔁 128 💬 22 📌 12
Super interesting (IMHO) example of HNSW limits and approximate behavior. In this test, 2 out of 19167 nodes are not reachable when doing a hnsw_search(their_vector). Why? Because of ambiguity: they are also names, so the search ends up in a local minimum (at EF=200).
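To illustrate how a graph search can miss a node even when queried with that node's own vector, here is a toy sketch of the single-layer, best-first search used in HNSW-style indexes. Everything in it is an assumption made for illustration: the 1-D points, the hand-built graph, and the `search_layer` helper are hypothetical and are not the implementation from the post.

```python
import heapq

# Hypothetical toy data: node id -> 1-D coordinate.
POINTS = {0: 5.0, 1: 8.0, 2: 2.0, 3: 10.0}
# Hand-built graph: node 3 is only reachable through node 2, which lies
# farther from the query than the dead-end node 1.
GRAPH = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}


def search_layer(graph, points, query, entry, ef):
    """Best-first search keeping at most `ef` results, nearest first."""
    def dist(node):
        return abs(points[node] - query)

    visited = {entry}
    candidates = [(dist(entry), entry)]   # min-heap: closest candidate first
    results = [(-dist(entry), entry)]     # max-heap via negated distances
    while candidates:
        d, node = heapq.heappop(candidates)
        if d > -results[0][0]:
            break  # closest candidate is worse than the worst kept result
        for nb in graph[node]:
            if nb in visited:
                continue
            visited.add(nb)
            d_nb = dist(nb)
            if len(results) < ef or d_nb < -results[0][0]:
                heapq.heappush(candidates, (d_nb, nb))
                heapq.heappush(results, (-d_nb, nb))
                if len(results) > ef:
                    heapq.heappop(results)  # drop the farthest result
    return [n for _, n in sorted((-neg, n) for neg, n in results)]


if __name__ == "__main__":
    query = POINTS[3]  # search with node 3's own vector
    print(search_layer(GRAPH, POINTS, query, entry=0, ef=1))
    print(search_layer(GRAPH, POINTS, query, entry=0, ef=3))
```

With a narrow beam the search commits to the dead end and stops; a wider beam keeps the detour node alive long enough to reach the target. In a real index the same effect can persist at large EF values when many near-duplicate candidates crowd the beam.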

17.12.2024 11:49 👍 15 🔁 1 💬 1 📌 0
Table foundation models for analytics Deep-learning typically does not outperform tree-based models on tabular data. Often this may be explained by the small size of such datasets. For image…

Slides for "Table Foundation Models"

I explain why these models can strongly outperform tree-based models and what the intuitions are, hopefully pointing to ways forward for further improvement.

speakerdeck.com/gaelvaroquau...

15.12.2024 22:43 👍 81 🔁 13 💬 3 📌 2
The Tortoise and the Hare in Alloy If you’ve done your share of leetcode-style interviewing, and you’re above a certain age, you may have been asked during a technical screen to write a program that determines if a linke…

New blog post: surfingcomplexity.blog/2024/11/27/t...
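For readers unfamiliar with the name, and assuming the title refers to Floyd's cycle-detection algorithm, here is a minimal Python sketch of the idea (the linked post models it in Alloy; the `Node` class and `has_cycle` function below are illustrative names, not taken from that post):

```python
class Node:
    """A singly linked list node."""

    def __init__(self, value, next=None):
        self.value = value
        self.next = next


def has_cycle(head):
    """Floyd's tortoise-and-hare check: True if the list loops back on itself."""
    slow = fast = head
    while fast is not None and fast.next is not None:
        slow = slow.next          # tortoise: one step
        fast = fast.next.next     # hare: two steps
        if slow is fast:
            return True           # the hare lapped the tortoise inside a cycle
    return False                  # the hare fell off the end: no cycle


if __name__ == "__main__":
    a, b, c = Node(1), Node(2), Node(3)
    a.next, b.next = b, c
    print(has_cycle(a))   # acyclic list
    c.next = a            # close the loop
    print(has_cycle(a))
```

The check uses constant extra memory, which is why it is preferred over remembering every visited node.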

28.11.2024 06:03 👍 10 🔁 5 💬 1 📌 1
What are important data systems problems, ignored by research? A blog by and for database architects.

I was recently on a panel on "What are important data systems problems, ignored by research?" with @andypavlo.bsky.social and Allison Lee moderated by Viktor Leis - here is the write-up of the discussion databasearchitects.blogspot.com/2024/12/what...

13.12.2024 07:58 👍 63 🔁 12 💬 2 📌 2
Notes On: Disaggregated OLTP Systems Aurora, Socrates, PolarDB, and Taurus.

Some folks from the Software Internals Discord and I read the series of disaggregated OLTP papers and met to talk about them. I wrote an informal overview of the papers and a summary of some of the discussion after each paper: transactional.blog/n...

07.12.2024 05:52 👍 32 🔁 6 💬 0 📌 0

While there are countless code examples to learn from, formal models are harder to find 👀

Blog posts exploring modeling approaches are a rare chance to sharpen your skills ❤️

bsky.app/profile/domi...

28.11.2024 16:25 👍 5 🔁 1 💬 0 📌 0

The Porcupine linearizability checker is really cool: github.com/anishathalye...

I love ideas like P-compositionality (arxiv.org/pdf/1504.00204) - it's something nobody else thought of, that seems so obvious in hindsight. A relatively small insight that vastly simplifies a tough problem.

13.11.2024 16:35 👍 5 🔁 1 💬 0 📌 0

New version (v1.9.1) of Geogram, the award-winning geometry processing library, is out!

New in this version:

- much faster (2x speed) large-scale periodic Delaunay triangulation and power diagrams

- Linsolve/GPU: AMGCL + new nlCuda backend goes brrrr!

github.com/BrunoLevy/ge...

26.11.2024 08:22 👍 43 🔁 12 💬 2 📌 0
When trying to compute a dual of a composite problem involving two functions and two linear operators (e.g., Total Variation regularization of inverse problems), it is sometimes useful to consider either of the operators as the dual operator.

28.11.2024 06:00 👍 6 🔁 2 💬 0 📌 0
Disillusioning the Magic of the fork System Call How the kernels implement the fork system call

When you first learn about the fork() syscall, it can seem magical. How can a single system call produce two different return values at the same time?!

In my latest article, I demystify the hidden magic of fork and also show how it is implemented in Linux.
blog.codingconfessions.com/p/the-magic-...
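A small sketch of the "two return values" effect, using Python's `os.fork()` wrapper on a Unix-like system (this is an illustration, not code from the linked article; `fork_demo` is a hypothetical helper). The call returns once in each process: the parent receives the child's PID, while the child receives 0 and reports that back through a pipe.

```python
import os


def fork_demo():
    """Fork once; return (value seen by parent, value seen by child)."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: fork() returned 0 here.
        os.close(read_fd)
        os.write(write_fd, str(pid).encode())
        os.close(write_fd)
        os._exit(0)  # exit immediately, skipping the parent's cleanup handlers
    # Parent process: fork() returned the child's PID here.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 16).decode())
    os.close(read_fd)
    os.waitpid(pid, 0)  # reap the child so it does not linger as a zombie
    return pid, child_value


if __name__ == "__main__":
    parent_value, child_value = fork_demo()
    print(f"parent saw {parent_value}, child saw {child_value}")
```

There is no single call returning twice in one process; after the fork there are two processes, each resuming from the same point with its own return value.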

27.11.2024 11:36 👍 16 🔁 4 💬 1 📌 0
Release 0.1.3 - Support for `aisuite` and `litellm` · cfahlgren1/observers What's Changed feat: initial packaged version by @davidberenstein1957 in #2 feat: argilla support by @davidberenstein1957 in #3 add datasets example by @cfahlgren1 in #4 Improve quickstart example...

Andrew Ng released "aisuite", so we added it to observers. Start observing your AI models in a lightweight way.

`pip install observers[aisuite] # or observers[litellm]`

Release:
github.com/cfahlgren1/o...

27.11.2024 11:19 👍 8 🔁 2 💬 0 📌 0
Google Colab

I made a notebook with a few notes on Diffusion Models for a "tutorial" in a project-workshop yesterday. Not really an introduction, but I give some insights that I usually don't see elsewhere. Feel free to reuse.
🔗https://colab.research.google.com/drive/1EyqALXFvgKGsTiFDALGEHH5-WnuGjOKU?usp=sharing

23.11.2024 11:32 👍 33 🔁 7 💬 2 📌 0
Anne Gagneux, Ségolène Martin, @quentinbertrand.bsky.social, Remi Emonet and I wrote a tutorial blog post on flow matching: dl.heeere.com/conditional-... with lots of illustrations and intuition!

We got this idea after their cool work on improving Plug and Play with FM: arxiv.org/abs/2410.02423

27.11.2024 09:00 👍 355 🔁 102 💬 12 📌 11
Releasing SmolVLM, a small 2-billion-parameter Vision+Language Model (VLM) built for on-device/in-browser inference with images/videos.

Outperforms all models at similar GPU RAM usage and token throughput

Blog post: huggingface.co/blog/smolvlm

26.11.2024 16:58 👍 231 🔁 31 💬 4 📌 1
My deep learning course at the University of Geneva is available on-line. 1000+ slides, ~20h of screen-casts. Full of examples in PyTorch.

fleuret.org/dlc/

And my "Little Book of Deep Learning" is available as a phone-formatted pdf (nearing 700k downloads!)

fleuret.org/lbdl/

26.11.2024 06:15 👍 1252 🔁 248 💬 46 📌 17
SmolLM - run, pre-train, fine-tune, evaluate SoTA fully open source LM 🔥

Run with Transformers, MLX, Transformers.js, MLC Web-LLM, Ollama, Candle and more!

Apache 2.0 licensed codebase - go explore now!

25.11.2024 13:17 👍 36 🔁 2 💬 1 📌 0
CS492(D) Diffusion Models and Their Applications (KAIST, Fall 2024)

Minhyuk Sung's course "Diffusion Models and Their Applications" at KAIST is now fully online, including all lectures, slides, and programming assignments: mhsung.github.io/kaist-cs492d...

25.11.2024 14:18 👍 46 🔁 7 💬 2 📌 0
Happy that our AI recommendation technology has been granted a US patent.

24.11.2024 11:02 👍 1 🔁 0 💬 0 📌 0