The world's fastest open-source chat hub.

Every frontier model. One streaming runtime.

View on GitHub
supported out of the box
Anthropic
OpenAI
Google Gemini
xAI
DeepSeek
Meta
Mistral
OpenRouter
Qwen
Midjourney
Cerebras
NVIDIA
Xiaomi

A runtime built for speed, designed for humans.

01

Frontier-parallel streaming

Every supported model streams through a single SSE runtime with Redis fallbacks: no vendor lock-in, and no dropped tokens on reconnect.
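
For the curious, here's a minimal sketch of the resume path, assuming tokens are buffered in a Redis list per stream (an illustrative schema, not necessarily AgentChat's actual one):

```ts
// Sketch: resumable SSE endpoint. Assumes each stream's tokens sit in a
// Redis list so a reconnecting client can replay what it missed.
import { createServer } from "node:http";
import Redis from "ioredis";

const redis = new Redis();

createServer(async (req, res) => {
  const streamId = new URL(req.url!, "http://localhost").searchParams.get("stream")!;
  // EventSource resends the last seen id on automatic reconnects.
  const lastId = Number(req.headers["last-event-id"] ?? -1);

  res.writeHead(200, {
    "content-type": "text/event-stream",
    "cache-control": "no-cache",
  });

  // Replay everything after the client's last acknowledged token.
  const missed = await redis.lrange(`stream:${streamId}`, lastId + 1, -1);
  missed.forEach((token, i) => {
    res.write(`id: ${lastId + 1 + i}\ndata: ${token}\n\n`);
  });
  // ...then attach to the live pub/sub feed (omitted).
}).listen(3000);
```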

02

Zero-rerender UI

Streamed tokens bypass React state entirely. A requestAnimationFrame painter writes directly to the DOM, keeping the UI at a hard 120fps even on low-end machines.
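
A minimal sketch of that painter pattern (illustrative names, not the actual AgentChat source):

```ts
// Tokens accumulate in a buffer; one requestAnimationFrame callback
// flushes them with a single DOM append per frame. React renders the
// container once and never re-renders per token.
export function createPainter(el: HTMLElement) {
  let buffer = "";
  let scheduled = false;

  const flush = () => {
    el.append(buffer); // Element.append coerces strings to text nodes
    buffer = "";
    scheduled = false;
  };

  return (token: string) => {
    buffer += token;
    if (!scheduled) {
      scheduled = true;
      requestAnimationFrame(flush);
    }
  };
}
```

In React terms, `el` would come from a ref, so the component tree stays untouched while tokens land.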

03

Concurrent DB + LLM

We fire the database write and the LLM inference at the same instant. Cold TTFT lands below 120ms; warm hits under 60ms.
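
The trick is simply not awaiting the write before starting inference. A sketch with hypothetical helpers:

```ts
type TokenStream = AsyncIterable<string>;

// Hypothetical signatures: persistMessage might be a Convex mutation,
// streamCompletion the provider call, paint the DOM writer above.
async function send(
  prompt: string,
  persistMessage: (text: string) => Promise<void>,
  streamCompletion: (text: string) => TokenStream,
  paint: (token: string) => void,
) {
  const persisted = persistMessage(prompt); // fired, deliberately not awaited
  for await (const token of streamCompletion(prompt)) paint(token); // starts in the same tick
  await persisted; // settle the write before declaring the turn done
}
```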

04

Agentic web search

A parallel multi-agent swarm dispatched through FireCrawl. Visualized live, collapsible, with citations threaded through the final response.
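
The fan-out itself is plain Promise machinery. A sketch, with a hypothetical `searchWeb` worker standing in for the Firecrawl call:

```ts
type Hit = { url: string; snippet: string };

// Stand-in for one search agent; not a real Firecrawl signature.
declare function searchWeb(query: string): Promise<Hit[]>;

async function swarm(subQueries: string[]): Promise<Hit[]> {
  // allSettled keeps one failed agent from sinking the whole search.
  const results = await Promise.allSettled(subQueries.map(searchWeb));
  return results.flatMap((r) => (r.status === "fulfilled" ? r.value : []));
}
```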

05

Voice-first input

Whisper v3 Turbo via a custom WebRTC pipe. Hold, speak, send. Under half a second from release to stream.
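
Hold-to-talk is little more than MediaRecorder plus an upload on release. A browser-side sketch (`/api/transcribe` is an illustrative route, not necessarily ours):

```ts
let recorder: MediaRecorder | null = null;
const chunks: Blob[] = [];

// Called on pointer-down: open the mic and start buffering audio.
export async function startHold() {
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
  recorder = new MediaRecorder(mic);
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.start();
}

// Called on pointer-up: stop() flushes the final chunk, then we upload.
export function release(): Promise<Response> {
  return new Promise((resolve) => {
    recorder!.onstop = () => {
      const audio = new Blob(chunks, { type: chunks[0]?.type });
      chunks.length = 0;
      resolve(fetch("/api/transcribe", { method: "POST", body: audio }));
    };
    recorder!.stop();
  });
}
```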

06

Generate anything

Image Studio built on Fal's latest diffusion models. Masonry gallery, shared history, one-click reruns across models.

07

Bring your own keys

BYO keys for any provider, or fall back to the hosted tier. All keys are encrypted at rest and never logged.
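
Encrypted at rest can be as simple as AES-256-GCM via node:crypto. A sketch (the real key would live in your secret store, not in code):

```ts
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

const KEY = randomBytes(32); // illustrative; load from env/KMS in practice

export function sealKey(plaintext: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", KEY, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Persist iv + auth tag + ciphertext together; never log the plaintext.
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

export function openKey(sealed: string): string {
  const raw = Buffer.from(sealed, "base64");
  const decipher = createDecipheriv("aes-256-gcm", KEY, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
}
```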

08

Open source. MIT.

Every line of the runtime is on GitHub. Self-host it, fork it, or deploy it to your own Vercel + Convex + Upstash stack in minutes.

Time to first token, measured honestly.

Informal browser spot check: the same anti-cache prompt, the fastest non-reasoning model in each app, median time to first token. Heuristic, not a lab benchmark; see the write-up for how we tested.

AgentChat (Kimi K2 Instant): 1400 ms
Gemini (fast, non-reasoning): 1710 ms
ChatGPT (Instant): 2450 ms
T3Chat (Kimi K2 Instant): 3500 ms

Approximate in-browser times (April 2026), measured with devtools performance recording; not peer-reviewed. They do not include our own server-side “warm” optimizations; see the write-up for the exact prompt and limits.
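
If you want to reproduce the numbers, the core measurement is a few lines of fetch; the endpoint and payload here are placeholders, not the exact prompt we used:

```ts
// Time from request start to the first streamed chunk (TTFT).
async function ttft(endpoint: string, body: unknown): Promise<number> {
  const t0 = performance.now();
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  const reader = res.body!.getReader();
  await reader.read();   // resolves on the first chunk
  await reader.cancel(); // only the first token matters here
  return performance.now() - t0;
}
```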

Switch chats in milliseconds, not seconds.

Median time from choosing another conversation until the composer is ready again. Desktop Chrome, comparable network.

t3.chat: 30 ms
AgentChat: 40 ms
claude.com: 600 ms
chatgpt.com: 1400 ms

Competitor figures measured on their web apps in April 2026; methodology matches the subtitle above.

Every frontier family, behind a single input.

Switch between labs mid-thread. Keep the context. Keep the history.

Lab | Family | Variants | Context (tokens) | Capabilities
Anthropic | Claude 4.6 | Opus, Sonnet | 200K | reasoning, vision, tools, pdf
OpenAI | GPT-5.2 / OSS | GPT-5.2, OSS 120B, OSS 20B | 128K | reasoning, vision, image-gen, tools
Google DeepMind | Gemini 3 | Flash, Pro | 1M | fast, vision, audio, tools
xAI | Grok 4 | 4.20, 4.1 Fast, Code Fast, 3 Mini | 2M | reasoning, vision, tools
DeepSeek | DeepSeek V3.2 | V3.2 | 164K | reasoning, tools
Meta AI | Llama 3.1 | 405B, 8B Instant | 131K | reasoning, tools
Moonshot AI | Kimi K2 | K2.5, K2 | 262K | reasoning, vision, tools
Z.ai | GLM-5 | GLM-5 | 131K | vision, tools
MiniMax | MiniMax M2.5 | M2.5 | 131K | fast, tools

Your chat hub. Your infrastructure. Your rules.

AgentChat ships under the MIT license, with a one-command self-host recipe. Drop in your provider keys, point it at your own Convex deployment, and you're running in minutes.

We don't gate features, don't phone home, and don't train on your conversations. The public hosted tier exists so you can kick the tires, nothing more.

Stop waiting on tokens.

Open the app and feel the difference.