Phil (PLUS)
Edited May 28, 2026

Q: BYOK other LLMs

Saw you only support Claude for BYOK. Will you integrate OpenRouter and other LLMs? Claude is the most expensive, so it would be great if we could experiment with others.

Edit, another question: do you have some kind of limits so that we don't send messages at the same time or too fast when talking to multiple people?

Founder Team
sohaib.dmchamp

May 28, 2026

A: Hey, great question.

It's on the roadmap and we started building it, but through testing we found that other models work fine most of the time and then suddenly don't. For small projects, it's fine if the AI works 9 out of 10 times. But once you're actually running in production, a 10% failure rate is way too much, and that's the failure rate we see with top-tier frontier models from other providers. We're not even talking about the cheaper ones.
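To put a number on why 9 out of 10 isn't enough in production, here is a rough sketch of how a per-response failure rate adds up over a whole conversation. It assumes every response fails independently of the others, which is a simplification, and the function name is made up for illustration.

```python
def chance_of_at_least_one_failure(per_response_failure_rate: float, responses: int) -> float:
    """Probability that at least one of `responses` AI replies fails.

    Assumes each reply fails independently, which is a simplification.
    """
    # Probability that every single reply succeeds, subtracted from 1.
    return 1 - (1 - per_response_failure_rate) ** responses


# A 10% per-response failure rate over a 10-message conversation.
print(round(chance_of_at_least_one_failure(0.10, 10), 3))  # 0.651
```

Under that assumption, roughly two out of three 10-message conversations would hit at least one bad response.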

Instead, we're training our own model on all the data we have and fine-tuning it. Long-term, that's faster and cheaper for the end user, and it means you don't have to mess about with which model to pick and get stuck there. We just want you to go out and close sales.

You can't just plug and play and expect it to work. They all have their own issues and edge cases that need to be handled. Supporting 100 different models and having them all work reliably is a myth.

We noticed this immediately in the early days when we were still using Cursor. They supported all these different models, but just switching between them would cause weird issues (like a model claiming it added a file when it actually hadn't). Claude was the best there. We recently checked all the other models again and those issues still aren't fixed. It's still a problem.

Also, "cheaper" doesn't actually pan out. The other models don't handle our chat volume and context properly: they choke on the amount of history, FAQs, follow-up logic, tool calls, and reasoning that goes into a single AI response. You end up paying for more retries, more tokens, and worse output. Claude with our prompt caching gives you the lowest real cost per chat, not the lowest sticker price per token. We recently added our own model, which is roughly four times cheaper than Claude and on par performance-wise. We are going to continue optimizing it to make it even better and more affordable.
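The difference between sticker price per token and real cost per chat can be sketched like this. Every number below is invented purely to show the mechanism (they are not measurements or real prices), the function and parameter names are hypothetical, and the model assumes a failed attempt is simply retried until one succeeds.

```python
def cost_per_successful_chat(
    price_per_1k_tokens: float,
    tokens_per_attempt: int,
    failure_rate: float,
    cached_fraction: float = 0.0,
    cache_discount: float = 0.0,
) -> float:
    """Expected cost of getting one usable response.

    Assumptions (illustrative only): failed attempts are retried until
    one succeeds, attempts fail independently, and cached tokens are
    billed at (1 - cache_discount) of the normal price.
    """
    # Tokens effectively billed at full price after the cache discount.
    billed_tokens = tokens_per_attempt * (1 - cached_fraction * cache_discount)
    cost_per_attempt = price_per_1k_tokens * billed_tokens / 1000
    # Expected number of attempts until the first success.
    expected_attempts = 1 / (1 - failure_rate)
    return cost_per_attempt * expected_attempts


# Invented inputs: a model with a lower sticker price but more failures
# and no caching, versus a pricier model with caching and fewer failures.
low_sticker = cost_per_successful_chat(1.0, 20_000, failure_rate=0.10)
with_caching = cost_per_successful_chat(
    3.0, 20_000, failure_rate=0.01, cached_fraction=0.9, cache_discount=0.9
)
print(low_sticker > with_caching)  # True
```

With these made-up inputs, the model that costs three times more per token still comes out cheaper per completed chat, because most of its context is cached and it rarely needs a retry.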

Thank you.
