Never used but @papra.app maybe?
Even found a second (minor) bug by coincidence
The sheer speed from the initial bug report to @fjallrs.bsky.social getting a fix fully released is just wild.
Changelog: github.com/fjall-rs/fja...
V2->V3 migration: github.com/fjall-rs/mig...
A strange moment, but I just released 3.0.0.
fjall-rs.github.io/post/fjall-3/
What started as a rewrite of the block format became almost a full rewrite, interrupted by a bachelor's thesis, faulty hardware, and a couple of weeks of sickness, but I think the end product is pretty good.
Initial curve was bulk ingestion
Love it when SQLite just stops working but allocates hundreds of GBs of disk space.
Check out my talk on physical replication and Graft that I delivered at the recent @syncconf.bsky.social in SF. youtu.be/QoKzDyH2MEA?...
Restarting, the drive works again - strange!
So for the last 3 (!!!) days I was repro'ing a weird perf regression where read I/O would randomly spike. First I thought it was the block cache implementation, or compaction. But today I just got an "Input/output error". First I thought XFS died, but nope, even smartctl gets an I/O error now, great.
Well... looks like my Kingston disk just died...?!
This year's winter album... goes to Maria Somerville
youtu.be/k9qEKyQ7jc4
I have prepared a preliminary changelog, too. This all has been a bit of work, you could say.
github.com/fjall-rs/fja...
RC 1 is out! I forgot to post about RC 0, but here we are.
At this point API and disk format are unlikely to further change - now it's all about stabilization for the final release...
crates.io/crates/fjall...
I rewrote the entire write buffer and journal backpressure system to be a more robust queue + messaging system. Also, threads are no longer started and stopped in the background on the fly. There is a single worker thread pool now (adjustable, at least 1 thread), and that's it.
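Not fjall's actual internals, just a minimal sketch of the general pattern: a fixed-size worker pool fed by a message queue, instead of spawning and joining threads on the fly. All names (`WorkerPool`, `Task::Flush`) and the completion counter are my own, purely illustrative:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// A message for the pool; Shutdown lets workers exit cleanly.
enum Task {
    Flush(u64), // e.g. "flush memtable #id" - purely illustrative
    Shutdown,
}

// Fixed-size worker pool: threads are spawned once up front and live
// until shutdown, instead of being started/stopped on the fly.
struct WorkerPool {
    sender: mpsc::Sender<Task>,
    handles: Vec<thread::JoinHandle<()>>,
}

impl WorkerPool {
    fn new(workers: usize, done: Arc<AtomicUsize>) -> Self {
        let workers = workers.max(1); // adjustable, but at least 1
        let (sender, receiver) = mpsc::channel::<Task>();
        // std's mpsc receiver isn't Clone, so share it behind a Mutex.
        // (Holding the lock across recv() serializes dequeueing; fine for a sketch.)
        let receiver = Arc::new(Mutex::new(receiver));

        let handles = (0..workers)
            .map(|_| {
                let rx = Arc::clone(&receiver);
                let done = Arc::clone(&done);
                thread::spawn(move || loop {
                    let task = rx.lock().unwrap().recv();
                    match task {
                        Ok(Task::Flush(_id)) => {
                            // real work would happen here
                            done.fetch_add(1, Ordering::SeqCst);
                        }
                        Ok(Task::Shutdown) | Err(_) => break,
                    }
                })
            })
            .collect();

        Self { sender, handles }
    }

    fn submit(&self, task: Task) {
        self.sender.send(task).expect("pool is shut down");
    }

    fn shutdown(self) {
        // One Shutdown per worker; mpsc is FIFO, so queued work drains first.
        for _ in 0..self.handles.len() {
            let _ = self.sender.send(Task::Shutdown);
        }
        for h in self.handles {
            let _ = h.join();
        }
    }
}

fn main() {
    let done = Arc::new(AtomicUsize::new(0));
    let pool = WorkerPool::new(2, Arc::clone(&done));
    for id in 0..10 {
        pool.submit(Task::Flush(id));
    }
    pool.shutdown();
    assert_eq!(done.load(Ordering::SeqCst), 10);
}
```

The appeal of this shape is that backpressure and ordering live in one queue, and thread lifetime becomes trivial: start N once, drain, join.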
v3 pre.6 is out - probably the penultimate prerelease before going into release candidate(s). At this point, all major architectural reworks are done, and only a few APIs still need changes. Followed by a final cleanup and stabilization phase.
crates.io/crates/fjall...
I didn't know what I was in for
And because less data is written to disk, it frees up available IOPS and increases SSD endurance.
With key-value separation, blobs are written at least twice (journal + blob file). So if a blob is compressible (let's say by 40%), we will still incur a write amp of 1.6x.
Journal compression uses more CPU, but because less data is transferred to the OS and disk, it can actually be faster.
3.0.0 pre.5 is out and brings journal compression support.
Here's 16K JSON blobs being written:
Yesterday or so I ran a benchmark overnight, and v3 scales much better for extremely large (100+ GB) databases.
v3 pre-release 2 is out!
Still not file-format- or API-stable, but most importantly, this release marks lsm-tree being *feature complete*.
crates.io/crates/fjall...
Well, it was a faulty RAM stick everybody
Thanks 😭
So I have now tried btrfs, xfs, and ext4 on two different SSDs, plus updating to Ubuntu 24; and I'm still getting corruptions... at this point I don't really know what to do.
This is not code specific (RocksDB is also affected), yet I don't understand how my OS and all apps just seem to continue to work fine...
Without having numbers (right now), LMDB will probably read absurdly fast, while being not nearly as space- or write-efficient.
Most, if not all, data fits into RAM here. Would be interesting to benchmark 100s of GBs.
But I found redb to be slower (sometimes much slower) than LMDB, pretty much always.
And here's another read-heavy bench with 4K values, 5% sync random updates and 95% Zipfian reads.
redb uses 4x more disk space and writes 6x slower. Interestingly, it's also ~5x slower in point reads: I think redb is not handling large values well; I haven't read too much into its implementation.
Synchronous writes (fsync), I might add.
95% Zipfian reads, 5% random updates on a Kingston PLP SSD with 16K JSON blobs - sled DNF! (OOM)
fjall uses LZ4 compression
constellation runs on a bare-metal server with 4 dedicated CPU cores, 16 GiB of memory, a fast 1 TiB NVMe SSD attached, and a 500 Mbit unmetered network connection.
**slightly slower in reading