sadly OpenAI has a track record of holding new models back from the API for ~2-4 weeks (for "safety"), so for now you can only use 5.3 inside the Codex CLI
my bet is 4 weeks until general API access
running glm or m2 (frontier open models) is not feasible on most hardware, so if cost is your main concern, use the free models on the letta API
you can self host letta - check out the docker server
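fwiw self-hosting is roughly a one-liner - a minimal sketch, where the `letta/letta` image name, the 8283 port, and the env var passthrough are assumptions based on common docker patterns (check the self-hosting guide in the docs for the exact flags):

```shell
# minimal sketch of running the letta server in docker
# (image name / port / env vars are assumptions - see docs.letta.com for specifics)
docker run \
  -p 8283:8283 \
  -e OPENAI_API_KEY="$OPENAI_API_KEY" \
  letta/letta:latest
```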
tbh if you want to play with super advanced memory systems, you need to use frontier models, which are expensive.
letta api (hosted) serves glm and minimax for free for exactly this reason - to let people see what's possible at no cost
letta moves faster
yep
the "notes" you're describing sound the same as memory blocks in letta. letta is not automatic retrieval / RAG based - archival memory is a separate aux memory system outside the main context engineering layer
try it out! it has the ability to set usage-based overage as well (w/ capped spend). overall it should be much better than direct anthropic API pricing if you're using something like opus
we're also trialing a $200 max plan, since $20/mo doesn't get you that far if you're pushing tokens
ah gotcha - yeah I haven't tested the limits on the regular 20/mo plans for claude / openai in a while, but not surprised claude pro barely lasts for a session
qq are you referring to a letta pro plan? or a pro plan on a different platform?
the free tier supports BYOK now so if you have an existing key from any of the main providers you can connect it
should be able to set LETTA_BASE_URL!
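e.g. a minimal sketch - pointing the client/CLI at a self-hosted server via `LETTA_BASE_URL` (the localhost URL and 8283 port here are assumptions, sub in wherever your server is actually running):

```shell
# hypothetical local address - replace with your own server's URL
export LETTA_BASE_URL="http://localhost:8283"
echo "$LETTA_BASE_URL"
```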
oh LOL even better, though tbh i don't know how well /bg works, so if you try it and have issues let us know
will fix asap! a lot of the jank really just depends on what's in the critical path of our team and active users on our discord / gh
(and i personally haven't used /bg much, but now that we got a complaint about it can fix it asap)
do you have a link to this chainlink thing? having trouble finding it
(we're currently implementing hooks, would love to test w/ it if you've got your hook config to share)
The thing about working at a company that makes tools to build artificially intelligent persistent entities is that it is very weird to talk to intelligent persistent entities
github.com/letta-ai/le...
if you don't mind sharing, what kind of hooks do you use? (useful data for prioritizing what to ship first)
you could use letta code w/ a "call-local-llm" skill?
letta supports local models: docs.letta.com/guides/selfh...
if you're going to try and use local models w/ letta code, be careful though - most of them won't work well (eg if you have the hardware, try something like glm 4.6 air)
just an FYI though - the claude agent SDK wraps the claude CLI binary, which is not open source and is locked to claude models. letta code is an open source version of the same style CLI harness: github.com/letta-ai/let...
i don't know if it would be "better", but it would definitely be simpler.
if you want to use letta as a memory store like you said, you should use the ai memory SDK, which is much more powerful than a simple KV store (it does sleep-time compute / agentic memory management)
letta is actually also the original implementation of memgpt (you can go back through the git history to see the first commit in oct 23, repo was called cpacker/memgpt)
fun fact: the original memgpt repo was a CLI agent
yep totally understand the experimentation / education use case, makes sense - in that case, you probably also do not want to use the claude agents sdk (as in the OP), and are better off, e.g., writing on top of raw llm calls
imo all the fun experimenting these days is happening at the layer *above* the agent loop - eg, at the letta api, claude code sdk layer, or one level higher than that, by customizing skills that are fed into letta code, claude code, etc
fyi letta sdk allows you to modify the compaction prompts, same as claude agents sdk: docs.letta.com/guides/agent...
letta is much more comparable to the openai responses api, or the claude agents sdk (letta is OSS, both of those are not)
i haven't had a chance to take a look at the post closely yet, but if you're using claude agents sdk it probably makes sense to use the letta ai memory sdk, not the main sdk
letta is based on memgpt and is an agent harness (combination of tool execution, state management, and context management inc compaction)
it is not a read/write memory store like sqlite/pinecone/etc, though if you really want to use it that way, you can use the ai memory sdk which wraps it