BYOK other LLMs

Question

Saw you only support Claude for BYOK. Will you integrate OpenRouter and other LLMs? Claude is the most expensive, so it would be great if we could experiment with others.

Edit, another question: do you have some kind of rate limit so that we don't send messages at the same time or too fast when talking to multiple people?

sohaib.dmchamp · Answer

Hey, great question.

It's on the roadmap and we started building it, but through testing we found that in most cases other models work fine and then suddenly don't. For small projects, it's fine if the AI works 9 out of 10 times. But once you're actually running in production, a 10% failure rate is way too much, and that's the failure rate we see with top-tier frontier models from other providers. We're not even talking about the cheaper ones.

Instead, we're training our own model on all the data we have and fine-tuning it. Long-term, that's faster and cheaper for the end user, and it means you don't have to mess about with which model to pick and get stuck there. We just want you to go out and close sales.

You can't just plug and play and expect it to work. They all have their own issues and edge cases that need to be handled. Supporting 100 different models and having them all work reliably is a myth.

We noticed this immediately in the early days when we were still using Cursor. They supported all these different models, but just switching between them would cause weird issues (like a model claiming it added a file when it actually hadn't). Claude was the best there. We recently checked all the other models again and those issues still aren't fixed. It's still a problem.

Also, "cheaper" doesn't actually pan out. The other models don't handle our chat volume and context properly: they choke on the amount of history, FAQs, follow-up logic, tool calls, and reasoning that goes into a single AI response. You end up paying for more retries, more tokens, and worse output. Claude with our prompt caching gives you the lowest real cost per chat, not the lowest sticker price per token. We recently added our own model, which is roughly four times cheaper than Claude and on par performance-wise. We are going to continue optimizing this to make it even better and more affordable as well.

Thank you.
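To make the "real cost per chat vs. sticker price per token" point concrete, here is a rough sketch of the arithmetic. Every number in it is a made-up placeholder, not a real provider price or a measured failure rate, and the names (`ModelPricing`, `cost_per_successful_chat`) are hypothetical. It assumes a failed reply is simply retried until one succeeds, so the expected number of attempts is 1 / success rate, and that cached input tokens are billed at a discounted fraction of the normal input price.

```python
from dataclasses import dataclass


@dataclass
class ModelPricing:
    # All values are hypothetical placeholders for illustration only.
    input_per_mtok: float   # USD per million uncached input tokens
    output_per_mtok: float  # USD per million output tokens
    cached_discount: float  # fraction of input price paid for cached tokens (1.0 = no caching)
    success_rate: float     # probability that one attempt gives a usable reply


def cost_per_attempt(p: ModelPricing, input_tokens: int, output_tokens: int,
                     cached_fraction: float) -> float:
    """Cost of a single attempt, with part of the input served from the prompt cache."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    input_cost = (fresh + cached * p.cached_discount) * p.input_per_mtok / 1_000_000
    output_cost = output_tokens * p.output_per_mtok / 1_000_000
    return input_cost + output_cost


def cost_per_successful_chat(p: ModelPricing, input_tokens: int, output_tokens: int,
                             cached_fraction: float) -> float:
    """Expected cost per usable reply, assuming retries until success (1 / success_rate attempts)."""
    return cost_per_attempt(p, input_tokens, output_tokens, cached_fraction) / p.success_rate


# Hypothetical comparison: higher sticker price with caching vs. lower sticker price without.
pricier_with_cache = ModelPricing(input_per_mtok=3.0, output_per_mtok=15.0,
                                  cached_discount=0.1, success_rate=0.99)
cheaper_no_cache = ModelPricing(input_per_mtok=2.0, output_per_mtok=10.0,
                                cached_discount=1.0, success_rate=0.90)

# A long context (history, FAQs, tool definitions) where most of the input repeats between turns.
a = cost_per_successful_chat(pricier_with_cache, 20_000, 500, cached_fraction=0.9)
b = cost_per_successful_chat(cheaper_no_cache, 20_000, 500, cached_fraction=0.9)
print(f"pricier model with caching: ${a:.4f} per usable reply")
print(f"cheaper model, no caching:  ${b:.4f} per usable reply")
```

With these placeholder inputs, the model with the lower per-token price comes out more expensive per usable reply. The sketch leaves out anything that is harder to quantify, such as worse output quality on the replies that do succeed.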
