Use AI to turn that experience into action. Build the things you've always wanted to build.
For those of you who have spent decades in the tech industry: AI has leveled the playing field with the young engineers grinding LeetCode.
You bring something they don't: decades of experience in design patterns, system architecture, and infrastructure.
What arxiv.org/abs/2512.24873
LOL. It's always crypto.
Otherwise known as TACO. The countdown has started.
They also estimate that Claude Code's $200 monthly plan, which previously may have cost up to $2,000 to support, may now be costing Anthropic up to $5,000.
Cursor needs its own model now, just to compete!
Being an AI wrapper company is brutal.
Cursor, last valued at $29.3 billion (Series D) in November 2025, is on red alert. Despite surpassing $2 billion in ARR and doubling its revenue in the three months since its last round, the pressure is mounting.
Does this mean I have to buy a Mac mini and install OpenClaw to stay up to date with Chinese Grandpas? I really don't want to. Ugh.
UPDATE: It's real, per Tencent.
Actually, $5,000. Probably not USD.
Is this real? There's a large turnout for an OpenClaw installation offsite in Shenzhen.
A better FlashAttention V3?
vLLM Triton Attention is ~800 lines of Triton, with the same source code running across NVIDIA, AMD, and Intel GPUs. On H100, it matches state-of-the-art attention performance. On MI300, it's ~5.8x faster than earlier implementations.
blog.vllm.ai/2026/03/04/v...
Did the author solve the Riemann Hypothesis or not? You tell me.
"Analysis of the Riemann Zeta Function via Recursive Taylor Expansions"
arxiv.org/abs/2603.05122
BM25 by Arpit Bhayani
"What makes BM25 worth understanding is not just that it works. It is that it works for knowable reasons. Every part of the formula has a clear interpretation. When a result is surprising, you can trace why. When you need to tune for your domain, the parameters give you meaningful handles to turn. The interpretability is genuinely valuable."
arpitbhayani.me/blogs/bm25
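To make those "meaningful handles" concrete, here's a minimal sketch of classic Okapi BM25 (function name and whitespace tokenization are mine, not from the blog post): k1 caps how much repeated term occurrences count, and b controls how strongly scores are normalized by document length.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each doc against the query with classic Okapi BM25."""
    N = len(docs)
    tokenized = [doc.lower().split() for doc in docs]
    avgdl = sum(len(d) for d in tokenized) / N
    # document frequency: how many docs contain each term
    df = Counter()
    for d in tokenized:
        for term in set(d):
            df[term] += 1
    scores = []
    for d in tokenized:
        tf = Counter(d)
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            # rarer terms get higher idf weight
            idf = math.log((N - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # tf saturates via k1; b mixes in length normalization
            s += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(s)
    return scores

scores = bm25_scores("cat", ["the cat sat", "the dog barked", "cat cat cat"])
```

A doc without the query term scores exactly zero, and repeated matches help, but sublinearly, which is exactly the traceability the post is praising.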
A word of wisdom to live by - do not let your luxury possession possess you.
So true.
My thoughts on gpt-5.4 high on Codex CLI
I have no idea if it is better than gpt-5.3-codex or even gpt-5.2, but it devours tokens like a competitive eater at a Las Vegas buffet.
Intel Panther Lake Die Shot
Why does it look like Impressionist painting? BSPDN.
FYI
Speculative Speculative Decoding (SSD)
It's up to 2x faster than the strongest inference engines in the world, but you need H100 or better GPUs.
Paper: arxiv.org/abs/2603.03251
Repo: github.com/tanishqkumar...
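I haven't read the SSD paper, so this isn't their variant, but for context here's a toy sketch of the standard speculative-sampling acceptance rule that schemes like this build on (all names are mine): accept the cheap draft model's token with probability min(1, p_target/p_draft), otherwise resample from the normalized residual. This provably leaves the output distributed exactly as the target model.

```python
import random

random.seed(0)
VOCAB = 4

def sample(dist):
    """Draw one token index from a probability vector."""
    return random.choices(range(VOCAB), weights=dist)[0]

def speculative_step(draft_dist, target_dist):
    """One token of speculative sampling.

    Accept draft token x with prob min(1, p_target(x)/p_draft(x));
    on rejection, resample from the residual max(0, p_t - p_d),
    renormalized. Returns (token, was_accepted).
    """
    x = sample(draft_dist)
    if random.random() < min(1.0, target_dist[x] / draft_dist[x]):
        return x, True
    residual = [max(0.0, t - d) for t, d in zip(target_dist, draft_dist)]
    z = sum(residual)
    return sample([r / z for r in residual]), False
```

The speedup comes from verifying several drafted tokens against the target model in one batched forward pass instead of one pass per token.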
PyTorch's FlexAttention now also supports a FlashAttention-4 backend.
PyTorch auto-generates CuTeDSL score/mask modifications and JIT-instantiates FlashAttention-4 for your custom attention variant.
The result: 1.2× to 3.2× speedups over Triton on compute-bound workloads.
pytorch.org/blog/flexatt...
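For anyone who hasn't used FlexAttention: a score_mod is just a function applied to the pre-softmax logits, and the backend fuses it into the kernel. Here's a pure-NumPy reference of the idea (this is not the FlexAttention API, just the unfused math it compiles):

```python
import numpy as np

def attention_with_score_mod(q, k, v, score_mod):
    """Reference (unfused) attention with a FlexAttention-style hook.

    score_mod receives the logit matrix plus query/key index grids and
    returns modified logits; the real backend fuses this elementwise
    function into the attention kernel instead of materializing scores.
    """
    scores = q @ k.T / np.sqrt(q.shape[-1])
    qi = np.arange(q.shape[0])[:, None]   # query positions, column vector
    ki = np.arange(k.shape[0])[None, :]   # key positions, row vector
    scores = score_mod(scores, qi, ki)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

# a causal mask expressed as a score_mod: block attention to the future
causal = lambda s, qi, ki: np.where(ki <= qi, s, -np.inf)
```

Swapping in ALiBi, sliding windows, or document masks is just a different one-liner for score_mod, which is why auto-generating the fused kernel per variant is the interesting part.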
- Paper: github.com/Dao-AILab/fl...
- Code: github.com/Dao-AILab/fl...
- Blogposts:
together.ai/blog/flashat...
tridao.me/blog/2026/fl...
research.colfax-intl.com/flashattenti...
FlashAttention-4
I hope it is not a pain to work with. It changes the algorithm & pipeline so that softmax & SMEM bandwidth no longer dictate speed. Attn reaches ~1600 TFLOPs, pretty much at matmul speed!
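For context on why softmax is the bottleneck FA-4 attacks: the whole FlashAttention line rests on the online-softmax trick, where you process keys blockwise, keep a running max m and normalizer l, and rescale the accumulator whenever m grows. A single-query NumPy sketch (this shows the baseline trick, not FA-4's new pipeline):

```python
import numpy as np

def online_softmax_attention(q, k, v, block=2):
    """Attention for one query row, computed blockwise over keys.

    Never materializes the full softmax: maintains running max m,
    running normalizer l, and a rescaled output accumulator.
    """
    m, l = -np.inf, 0.0
    acc = np.zeros_like(v[0], dtype=float)
    for start in range(0, len(k), block):
        kb, vb = k[start:start + block], v[start:start + block]
        s = kb @ q / np.sqrt(len(q))          # logits for this key block
        m_new = max(m, s.max())
        scale = np.exp(m - m_new) if np.isfinite(m) else 0.0
        p = np.exp(s - m_new)
        l = l * scale + p.sum()               # rescale old mass, add new
        acc = acc * scale + p @ vb
        m = m_new
    return acc / l
```

The rescaling is exact (up to float rounding), which is what lets the kernel tile over keys without a second pass, and the softmax/exp unit throughput this implies is exactly what FA-4 reworks.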
You can always go to other platforms and browse through 100s or 1,000s of postings and view posts by the original authors.
OpenAI's Symphony
A Linear Board for agents.
github.com/openai/symph...
Teaching LLMs to reason like Bayesians
By training models to mimic optimal probabilistic inference, the researchers improved the models' ability to update their predictions and generalize across new domains.
research.google/blog/teachin...
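"Optimal probabilistic inference" here presumably means Bayes' rule; as a refresher on the target behavior (this is my illustration, not the paper's training setup), a posterior update in exact arithmetic:

```python
from fractions import Fraction

def bayes_update(prior, likelihood, evidence):
    """Posterior over hypotheses after observing evidence:
    P(h | e) proportional to P(e | h) * P(h)."""
    unnorm = {h: prior[h] * likelihood[h][evidence] for h in prior}
    z = sum(unnorm.values())
    return {h: p / z for h, p in unnorm.items()}

# is this coin fair, or biased 3:1 toward heads? observe one "H"
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}
likelihood = {
    "fair":   {"H": Fraction(1, 2), "T": Fraction(1, 2)},
    "biased": {"H": Fraction(3, 4), "T": Fraction(1, 4)},
}
posterior = bayes_update(prior, likelihood, "H")  # fair: 2/5, biased: 3/5
```

The claim is that models trained to reproduce updates like this calibrate better when evidence arrives from unfamiliar domains.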