Ohh, I'd love Santiago's feedback on this! I was going to wait until I added some more polish, but I think Claude found a few things that could be ported back to the C implementation as well :).
Here’s my bug report, which links the C++ MRE. Curious if others see the same thing: github.com/microsoft/mi....
Has anyone experienced instability issues (a crash during teardown) when using mimalloc on macOS? I don't see it in Rust, but it's showing up in C++, and ASan on the program is clean.
Less than 48 hours later and here is where we stand. It started as a simple experiment, but, honestly, if I were doing something in Rust now, I think I'd use this rather than any of the bindings for the C implementation ;P.
Dear @anthropic.com, my lab builds a lot of OSS for genomics (github.com/COMBINE-lab). While we lack the widespread OSS market of popular NPM packages, pieces of our software are critical in biomedical research. Please consider extending your Claude Max offer to such labs!
The (native) Rust version of wfa2-lib isn't all *safe* Rust, but damn it's fast! Currently, only the Affine2p (specifically with CIGAR) really lags behind the C impl; several others are considerably *faster*.
Claude Code is so cute when it gets excited:
I covered it in my class and linked to it in my slides ;). Maybe the students are looking (it's a grad class).
It is what I imagine a programming language would look like if invented by someone who actively hates programming languages, and programmers.
Spring break in Trump's America:
CMake is, perhaps, the ugliest build system ever. Thank goodness I can have an AI agent interface directly with that nonsense rather than doing it as a person. CMake wasn't meant for human interaction anyway, so perhaps we've come full circle in its lifecycle.
Current performance story:
Sonnet also handles these tasks well; yes. Functionally, I've not noticed too much of a difference with the latest Sonnet. This is specifically true once you get to the phase of actually implementing — Opus may still be a little bit nicer for planning, but executing the phases, they work similarly.
Score-only edit distance is now slightly (~9%) *faster* than the C version. The biggest focus now, where Rust is non-trivially slower, is in affine with CIGAR and dual-affine with CIGAR.
Yes. I started with this hand-written PROJECT.md file (github.com/COMBINE-lab/...), and then used plan mode, and a few iterations, to have Claude come up with the PLAN (github.com/COMBINE-lab/...). Has worked well so far!
Functionally complete at a 2-7% performance penalty depending on the mode. Gonna try a bit more but that last few percent may need something beyond the current frontier models. Something like a @curiouscoding.nl v1.0 model might be needed.
I may play around a bit more tonight, but this is a good checkpoint: github.com/COMBINE-lab/.... All features implemented, unit and integration tests, dedicated aarch64 & avx2 backends, a driver CLI! Still a bit slower than the C version, but we're closing the gap quick!
Just put the kids to bed .... and ... we have all features implemented and the CLI! We have a feature complete WFA2-lib in native Rust.
Now, we start what I anticipate will be the longest of the 13 phases; phase 12, performance optimization. Onward!
Post ultimate Frisbee with the kids & dinner; BiWFA is done, 2-bit nucleotide encoding & lambda-based scoring are done! Next is just the CLI driver and we are "functionally complete". The last 2 stages are performance optimization & polish (docs, API cleaning, etc.).
For all implemented tasks so far, there are both unit & integration tests, and results match the C-baseline :). 2/2
Pre-dinner update (5:16 PM).
Have implemented WFA score-only, CIGAR, end-to-end, gap-linear, gap-affine, dual-gap affine, ends-free & extension as well as important heuristics.
Currently, mostly through BiWFA. Then comes lambda/custom match, packed 2-bit DNA, a binary driver program & perf opts. 1/2
lol, yup, we said the same thing! Sorry I didn't wait for your follow-up skeet :). I agree.
However, I agree that for truly huge, very intertwined projects, this may work less well. That said, I also think that the next generation of models will have better ways to deal with these limitations (or lift them entirely). 3/3
In these cases, the C++ code was checked out locally as a point of reference. Claude is able to break down conversion into discrete stages (and to properly scaffold pieces not yet implemented). It uses intermediate reports to persist across context windows. It's worked well so far. 2/3
I've done several reasonable-scale migrations so far (20-50kloc). Claude Code handled it surprisingly well. For example, I ported SSHash to Rust (sshash-rs: github.com/COMBINE-lab/...) and ported piscem-cpp to Rust (github.com/COMBINE-lab/...). 1/3
This is also why I put a prediction a year out with the next major version release of Opus ;P. I don't think that this will be possible for the largest codebases, even with the 1M token context. However, I suspect the next gen. will substantially improve this capability; it's a core target. 2/2
Right, great point. So two thoughts here. (1) Right now, the only way to get around this is to have a *human* generated hierarchical plan. The model's pretty good if you can modularize the problem and reliably coarse-grain your solutions, but that vision right now requires more human input. (2) 1/2
Trying an experiment. I want to use WFA2-lib in Rust. But, I don't want bindings (I am so *over* having C or C++ deps in my build chain). So, I'm going to make a Rust native port with Claude Code. I started at 2:03PM. I'll provide an update (or an implementation) by the end of the day :).
Rather, a year from now, one might point Opus 5 (or whatever we have then) at the existing C++ codebase and just give it careful instructions to "rewrite this in Rust, maintain perfect semantic compatibility, and ask me about any existing bugs" & go grab a coffee. 2/2