Same APIs. No throttle.
Subscription AI services call the same LLM APIs you could call yourself — then quietly throttle your context, hide the token counts, and charge a flat monthly fee. Interlocute uses those same APIs to give you bigger context windows, full observability, and a rich workspace that would otherwise take a dozen third-party libraries to build.
A rich workspace — no assembly required
Everything a power user needs in one app. No stitching together vector databases, prompt management tools, billing dashboards, and observability platforms. It's all here.
Full Context Windows
No silent trimming. Your node sends the context you configure — you decide how much conversation history, how many documents, and how deep the memory goes. The model sees what you intend it to see.
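The idea of an explicit, user-controlled context budget can be sketched as follows. This is an illustrative example, not Interlocute's actual API: the function name, the word-count token proxy, and the failure behaviour are all assumptions made for the sketch.

```python
# Sketch of explicit context assembly: you set the budget, and nothing
# is trimmed silently. An oversized context fails loudly instead.
# (Word counts stand in for tokens; real counting uses the model's tokenizer.)
def compose_context(history, documents, memory, max_tokens):
    parts = memory + documents + history        # long-lived context first
    total = sum(len(p.split()) for p in parts)  # crude token proxy
    if total > max_tokens:
        raise ValueError(f"context is {total} tokens, budget is {max_tokens}")
    return "\n\n".join(parts)

ctx = compose_context(
    history=["user: summarise the Q3 report"],
    documents=["Q3 revenue grew 12% year over year."],
    memory=["User prefers terse answers."],
    max_tokens=4096,
)
```

The point of the sketch is the contract: the window is whatever you configured, and exceeding it is a visible error rather than a quiet truncation.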
Observable Processing
See exactly what happened on every call — token counts (input and output), model used, latency, capability traces, preprocessing steps, and cost. Real telemetry, not a black box.
Configurable Everything
Model, provider, persona, constitution, disclosure mode, capabilities — all live-editable from the dashboard. Change the model mid-conversation. Swap personas between threads. No redeploy.
Threaded Conversations
Run multiple threads in parallel, each with full isolated history. Cross-thread awareness lets the node reference other threads when you need broader reasoning across a workspace.
Documents & Attachments
Upload PDFs, images, and text files directly into conversation context. The AI reads, references, and reasons over your documents — no separate upload portal or file-size guessing.
Prompt Saving & Reuse
Save your best prompts, organise them, and reuse across conversations and nodes. Build a personal library without a separate prompt-management SaaS.
Tagging & Grouping
Tag threads by project, topic, or team. Group related conversations. Power users with hundreds of threads don't need an endless scroll — they need structure.
Artifacts & Outputs
Structured outputs — code blocks, tables, summaries, documents — that you can browse, export, and manage outside the conversation flow. Not buried in chat bubbles.
Long-term Memory
Your node remembers preferences, facts, and decisions across threads and sessions. Context persists — you don't re-explain yourself every time you open a new chat.
Real Cost Accounting
Every request is metered with cost attribution per thread, per node, per project. Set budget limits. Export usage data. Know exactly what you spend — and on what.
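Per-token metering is simple to reason about. Here is a minimal sketch of cost attribution — the model names and per-1K rates are hypothetical placeholders, not Interlocute's actual pricing:

```python
# Hypothetical USD rates per 1,000 tokens: (input, output).
# Illustrative only -- not real provider or Interlocute pricing.
RATES_PER_1K = {
    "model-a": (0.0025, 0.0100),
    "model-b": (0.0030, 0.0150),
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the metered cost of one request in USD."""
    in_rate, out_rate = RATES_PER_1K[model]
    return (input_tokens / 1000) * in_rate + (output_tokens / 1000) * out_rate

# Attribute spend to a thread by summing its requests.
thread_requests = [("model-a", 1200, 300), ("model-a", 800, 450)]
thread_total = sum(request_cost(m, i, o) for m, i, o in thread_requests)
```

Because every request carries its own token counts and model, rolling costs up per thread, per node, or per project is just a grouped sum.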
Governance & Guardrails
Platform contract enforcement, disclosure modes, budget caps, and IAM policies. Every response is governed and auditable. Built for production, not just weekend experiments.
Real-time Streaming
Token-by-token SSE streaming. Responses appear as they're generated — fast, responsive, and with full token attribution visible in the telemetry panel.
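An SSE stream is just a sequence of `data:` events on a long-lived HTTP response. The sketch below parses a captured stream offline; the event shape (`{"token": ...}`) and the `[DONE]` sentinel are illustrative assumptions — real provider payloads differ:

```python
import json

def parse_sse(raw: str):
    """Yield the token text from each `data:` event in an SSE body."""
    for line in raw.splitlines():
        if not line.startswith("data:"):
            continue            # skip blank keep-alives and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break               # end-of-stream sentinel (convention, not spec)
        event = json.loads(payload)
        yield event["token"]

# A captured two-token stream, reassembled as the events arrive.
stream = (
    "data: {\"token\": \"Hel\"}\n\n"
    "data: {\"token\": \"lo\"}\n\n"
    "data: [DONE]\n\n"
)
text = "".join(parse_sse(stream))
```

Because each event is self-describing, the same stream that renders the reply can also feed token attribution into the telemetry panel.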
What subscriptions actually do to your AI
Behind every $20/month chat window is the same OpenAI or Anthropic API you could call directly. The subscription layer adds throttling, hides how your prompts are processed, and economises tokens to protect their margins — not yours.
Silent context trimming
To keep costs down, subscription services quietly truncate your conversation history before sending it to the model. You lose context mid-conversation — and you never know it happened.
Rate-limited and throttled
Hit a few dozen messages and you're told to slow down or wait. The "unlimited" plan has limits — they just aren't on the pricing page.
Zero processing visibility
You can't see how many tokens a response cost, what model actually served it, what system prompt was prepended, or how your context was composed. It's a black box — by design.
Vendor-locked configuration
Want a different system prompt? A custom persona? A different model for a different task? Too bad — you get whatever the vendor decided to ship this week.
Interlocute calls the exact same APIs — but gives you the full context window, complete processing telemetry, and the entire configuration surface that subscriptions deliberately hide. It's what those services could have been, if they weren't optimising for subscription revenue.
Stop renting a chat window. Own your workspace.
Subscription services:
- Throttled context & rate limits
- Zero processing visibility
- Vendor-locked model & config
- Flat fee — pay even when idle
- Minimal chat UI

Building it yourself:
- Infra + vector DB to manage
- Build UI from scratch
- Separate prompt / billing / obs tools
- Manual governance & guardrails
- Weeks of integration work

Interlocute:
- Full context — you control the window
- Token-level telemetry on every call
- You choose model, persona, config
- Pay per token — $0 when idle
- Rich workspace — threads, tags, artifacts
Frequently Asked Questions
The Interlocute App
How is this different from ChatGPT or Claude subscriptions?
What do you mean by 'throttled context'?
Do I need to be a developer to use the app?
What observability do I get that subscriptions don't offer?
Can I switch models mid-conversation?
What does 'no assembly required' mean?
Is my data private?
How does pricing actually work?
Ready for an AI workspace that doesn't throttle you?
Sign up, create a node, and start chatting with full context, full observability, and full control. No credit card required.