So cool, thanks. I didn't know there were so many apps using the AT Protocol
Hopefully, we'll soon have Iceberg Materialized Views: github.com/apache/icebe...
They would be the perfect destination for the push queries.
That's awesome. I didn't know that.
I always think of the Iceberg metadata as a two-level, tree-based multidimensional spatial index. How you construct the tree is up to you.
It's just that Iceberg was designed for really huge tables and they decided to split the metadata over multiple files.
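To make the two-level tree concrete, here's a minimal sketch in plain Python (no Iceberg library; all class and field names are illustrative, and real manifests are Avro files with far richer stats): a snapshot points to a manifest list, the manifest list points to manifests, and each manifest lists data files.

```python
from dataclasses import dataclass

@dataclass
class DataFile:
    path: str        # one Parquet data file
    partition: dict  # partition values this file belongs to

@dataclass
class Manifest:
    entries: list    # level 2 of the tree: data files plus their stats

@dataclass
class ManifestList:
    manifests: list  # level 1: one entry per manifest, with summaries

# A snapshot is the root: snapshot -> manifest list -> manifests -> data files.
snapshot = ManifestList(manifests=[
    Manifest(entries=[DataFile("s3://t/data/p=1/f1.parquet", {"p": 1})]),
    Manifest(entries=[DataFile("s3://t/data/p=2/f2.parquet", {"p": 2})]),
])

# Walking the tree touches only small metadata files, never the data itself.
n_files = sum(len(m.entries) for m in snapshot.manifests)
```

Splitting the metadata over multiple files like this is what lets a single table grow to millions of data files without any one metadata file becoming huge.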
I find The Data Warehouse Toolkit and Designing Data-Intensive Applications both incredibly good.
There will be a second edition of DDIA soon!
Too bad, doesn't look like there will be an online recording.
I'd love to hear your take on columnar formats for AI and whether we can evolve Parquet accordingly.
At long last, @chris.blue and I have submitted the final manuscript of Designing Data-Intensive Applications, second edition, to the publisher. There is always more that could be improved but at some point we just have to call it done. Now it goes into production; probably shipping in ~4 months.
I can't wait!
Looks like I'll eventually have to give Omarchy a try as well
Regarding the battery life, do you have some kind of CPU scaling installed?
One bad thing about Arch Linux is that it comes with almost no defaults installed.
I installed power-profiles-daemon and it helped a lot with the battery.
wiki.archlinux.org/title/CPU_fr...
Yeah classic, I should also do a service sometime. But probably something has to happen first ;)
It's a really difficult question, because in the end you need to pay developers to build the software.
I just feel that many of those companies get really greedy and want to raise a lot of money. And once you've raised a lot, you're forced to deliver bigger returns, which aren't possible with OSS.
I think at some point you will need columnar storage to store the list of all table formats.
Uff, that's rough. Did you have the stuff to fix it? Or did you have to do the walk of shame?
As always, great article!
Great work! Really cool stuff.
Sounds really cool! I'll try to make the journey to Nürnberg next week.
Yes, you can remove both. I tested it on Android.
Same here. The design is so awesome.
Also love the "Unix" style interaction which makes it compose so well with other tools.
I tried Aider and Claude Code; their approaches are very similar, but Claude Code feels much more powerful. It's really great at gathering additional context during the process, while Aider only gathers it beforehand.
The only thing that's missing from Claude Code is AI comments: aider.chat/docs/usage/w...
Awesome post, as always!
One thing I realized lately is that authentication should be standardized as part of the Iceberg REST catalog (e.g., via an OIDC endpoint).
Otherwise every vendor has their own authentication scheme, and only their own client knows how to authenticate.
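As a sketch of what standardization would buy (stdlib only; the token URL and credentials are hypothetical): any client could do a plain OAuth2 client-credentials exchange against the catalog's advertised token endpoint and then talk to the REST catalog with a standard Bearer header, with no vendor SDK required.

```python
import json
import urllib.parse
import urllib.request

def fetch_token(token_url: str, client_id: str, client_secret: str) -> str:
    # Standard OAuth2 client-credentials grant against an OIDC token endpoint.
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode()
    req = urllib.request.Request(token_url, data=body)  # data= makes it a POST
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["access_token"]

def catalog_headers(token: str) -> dict:
    # Any Iceberg REST client can send this header; nothing vendor-specific.
    return {"Authorization": f"Bearer {token}"}
```

With a well-known endpoint like that, swapping catalog vendors wouldn't mean swapping client libraries.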
Claude is actually getting pretty good at coding.
Did you find a good AI assistant for vim?
@thorstenball.com is contemplating the same thing: [register spill](registerspill.thorstenball.com/p/how-might-...)
It might be CONTEXT.md
Looking forward!
Sadly, this book is overlooked by too many people. It's a must-read if you're in data.
I wish there was an ebook version.
You're right, I wasn't entirely clear
Well, Iceberg makes this metadata available at a higher level: the manifest-list and manifest files, which means you don't have to read all the Parquet files.
Every commercial data warehouse stores additional metadata like upper & lower bounds, statistics, and distinct counts on top of the actual data files to assist the query optimizer.
Iceberg is an open standard for this kind of metadata and provides speed-ups over plain Parquet.
Well, ideally it would be complete read and write support.