The world's fastest open-source chat hub.

Every frontier model. One streaming runtime.

View on GitHub
supported out of the box
Anthropic
OpenAI
Google Gemini
xAI
DeepSeek
Meta
Mistral
OpenRouter
Qwen
Midjourney
Cerebras
NVIDIA
Xiaomi

A runtime built for speed, designed for humans.

01

Frontier-parallel streaming

Every supported model streams through a single SSE runtime with Redis fallbacks: no vendor lock-in, and no dropped tokens on reconnect.
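
For the curious, here's a minimal sketch of the resume path, assuming tokens are buffered in a Redis list per stream (an illustrative schema, not necessarily AgentChat's actual one):

```ts
// Sketch: resumable SSE endpoint. Assumes each stream's tokens sit in a
// Redis list so a reconnecting client can replay what it missed.
import { createServer } from "node:http";
import Redis from "ioredis";

const redis = new Redis();

createServer(async (req, res) => {
  const streamId = new URL(req.url!, "http://localhost").searchParams.get("stream")!;
  // EventSource resends the last seen id on automatic reconnects.
  const lastId = Number(req.headers["last-event-id"] ?? -1);

  res.writeHead(200, {
    "content-type": "text/event-stream",
    "cache-control": "no-cache",
  });

  // Replay everything after the client's last acknowledged token.
  const missed = await redis.lrange(`stream:${streamId}`, lastId + 1, -1);
  missed.forEach((token, i) => {
    res.write(`id: ${lastId + 1 + i}\ndata: ${token}\n\n`);
  });
  // ...then attach to the live pub/sub feed (omitted).
}).listen(3000);
```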

02

Zero-rerender UI

Streamed tokens bypass React state entirely. A requestAnimationFrame painter writes directly to the DOM, keeping the UI at a hard 120fps even on low-end machines.
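
A minimal sketch of that painter pattern (illustrative names, not the actual AgentChat source):

```ts
// Tokens accumulate in a buffer; one requestAnimationFrame callback
// flushes them with a single DOM append per frame. React renders the
// container once and never re-renders per token.
export function createPainter(el: HTMLElement) {
  let buffer = "";
  let scheduled = false;

  const flush = () => {
    el.append(buffer); // Element.append coerces strings to text nodes
    buffer = "";
    scheduled = false;
  };

  return (token: string) => {
    buffer += token;
    if (!scheduled) {
      scheduled = true;
      requestAnimationFrame(flush);
    }
  };
}
```

In React terms, `el` would come from a ref, so the component tree stays untouched while tokens land.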

03

Concurrent DB + LLM

We fire the database write and the LLM inference at the same instant. Cold TTFT lands below 120ms; warm hits under 60ms.
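
The trick is simply not awaiting the write before starting inference. A sketch with hypothetical helpers:

```ts
type TokenStream = AsyncIterable<string>;

// Hypothetical signatures: persistMessage might be a Convex mutation,
// streamCompletion the provider call, paint the DOM writer above.
async function send(
  prompt: string,
  persistMessage: (text: string) => Promise<void>,
  streamCompletion: (text: string) => TokenStream,
  paint: (token: string) => void,
) {
  const persisted = persistMessage(prompt); // fired, deliberately not awaited
  for await (const token of streamCompletion(prompt)) paint(token); // starts in the same tick
  await persisted; // settle the write before declaring the turn done
}
```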

04

Agentic web search

A parallel multi-agent swarm dispatched through FireCrawl. Visualized live, collapsible, with citations threaded through the final response.
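
The fan-out itself is plain Promise machinery. A sketch, with a hypothetical `searchWeb` worker standing in for the Firecrawl call:

```ts
type Hit = { url: string; snippet: string };

// Stand-in for one search agent; not a real Firecrawl signature.
declare function searchWeb(query: string): Promise<Hit[]>;

async function swarm(subQueries: string[]): Promise<Hit[]> {
  // allSettled keeps one failed agent from sinking the whole search.
  const results = await Promise.allSettled(subQueries.map(searchWeb));
  return results.flatMap((r) => (r.status === "fulfilled" ? r.value : []));
}
```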

05

Voice-first input

Whisper v3 Turbo via a custom WebRTC pipe. Hold, speak, send. Under half a second from release to stream.
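
Hold-to-talk is little more than MediaRecorder plus an upload on release. A browser-side sketch (`/api/transcribe` is an illustrative route, not necessarily ours):

```ts
let recorder: MediaRecorder | null = null;
const chunks: Blob[] = [];

// Called on pointer-down: open the mic and start buffering audio.
export async function startHold() {
  const mic = await navigator.mediaDevices.getUserMedia({ audio: true });
  recorder = new MediaRecorder(mic);
  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.start();
}

// Called on pointer-up: stop() flushes the final chunk, then we upload.
export function release(): Promise<Response> {
  return new Promise((resolve) => {
    recorder!.onstop = () => {
      const audio = new Blob(chunks, { type: chunks[0]?.type });
      chunks.length = 0;
      resolve(fetch("/api/transcribe", { method: "POST", body: audio }));
    };
    recorder!.stop();
  });
}
```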

06

Generate anything

Image Studio built on Fal's latest diffusion models. Masonry gallery, shared history, one-click reruns across models.

07

Bring your own keys

BYO keys for any provider, or fall back to the hosted tier. All keys are encrypted at rest and never logged.
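
Encrypted at rest can be as simple as AES-256-GCM via node:crypto. A sketch (the real key would live in your secret store, not in code):

```ts
import { randomBytes, createCipheriv, createDecipheriv } from "node:crypto";

const KEY = randomBytes(32); // illustrative; load from env/KMS in practice

export function sealKey(plaintext: string): string {
  const iv = randomBytes(12);
  const cipher = createCipheriv("aes-256-gcm", KEY, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  // Persist iv + auth tag + ciphertext together; never log the plaintext.
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64");
}

export function openKey(sealed: string): string {
  const raw = Buffer.from(sealed, "base64");
  const decipher = createDecipheriv("aes-256-gcm", KEY, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
}
```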

08

Open source. MIT.

Every line of the runtime is on GitHub. Self-host it, fork it, or deploy it to your own Vercel + Convex + Upstash stack in minutes.

Time to first token, measured honestly.

Informal browser spot check: the same anti-cache prompt, the fastest non-reasoning model in each app, median time to first token. Heuristic, not a lab benchmark; see the write-up for how we tested.

AgentChat (Kimi K2 Instant): 1400 ms
Gemini (fast, non-reasoning): 1710 ms
ChatGPT (Instant): 2450 ms
T3Chat (Kimi K2 Instant): 3500 ms

Approximate in-browser times (April 2026), measured with devtools performance recording; not peer-reviewed. They do not include our own server-side “warm” optimizations; see the write-up for the exact prompt and limits.
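
If you want to reproduce the numbers, the core measurement is a few lines of fetch; the endpoint and payload here are placeholders, not the exact prompt we used:

```ts
// Time from request start to the first streamed chunk (TTFT).
async function ttft(endpoint: string, body: unknown): Promise<number> {
  const t0 = performance.now();
  const res = await fetch(endpoint, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify(body),
  });
  const reader = res.body!.getReader();
  await reader.read();   // resolves on the first chunk
  await reader.cancel(); // only the first token matters here
  return performance.now() - t0;
}
```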

Switch chats in milliseconds, not seconds.

Median time from choosing another conversation until the composer is ready again. Desktop Chrome, comparable network.

t3.chat: 30 ms
AgentChat: 40 ms
claude.com: 600 ms
chatgpt.com: 1400 ms

Competitor figures measured on their web apps in April 2026; methodology matches the subtitle above.

Every frontier family, behind a single input.

Switch between labs mid-thread. Keep the context. Keep the history.

Lab | Family | Variants | Context (tokens) | Capabilities
Anthropic | Claude 4.6 | Opus, Sonnet | 200K | reasoning, vision, tools, pdf
OpenAI | GPT-5.2 / OSS | GPT-5.2, OSS 120B, OSS 20B | 128K | reasoning, vision, image-gen, tools
Google DeepMind | Gemini 3 | Flash, Pro | 1M | fast, vision, audio, tools
xAI | Grok 4 | 4.20, 4.1 Fast, Code Fast, 3 Mini | 2M | reasoning, vision, tools
DeepSeek | DeepSeek V3.2 | V3.2 | 164K | reasoning, tools
Meta AI | Llama 3.1 | 405B, 8B Instant | 131K | reasoning, tools
Moonshot AI | Kimi K2 | K2.5, K2 | 262K | reasoning, vision, tools
Z.ai | GLM-5 | GLM-5 | 131K | vision, tools
MiniMax | MiniMax M2.5 | M2.5 | 131K | fast, tools

Your chat hub. Your infrastructure. Your rules.

AgentChat ships under the MIT license, with a one-command self-host recipe. Drop in your provider keys, point it at your own Convex deployment, and you're running in minutes.

We don't gate features, don't phone home, and don't train on your conversations. The public hosted tier exists so you can kick the tires, nothing more.

Stop waiting on tokens.

Open the app and feel the difference.