Kimi K2.6

Kimi K2.6 is Moonshot AI's natively multimodal flagship model, focused on long-horizon coding and design with code. It has a context window of 262.1K tokens and is available through AI Gateway via the moonshotai, fireworks, novita, baseten, and togetherai providers.

Reasoning, Tool Use, Vision (Image), File Input, Implicit Caching

index.ts

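import { streamText } from 'ai'

// A plain model string is routed through AI Gateway.
const result = streamText({
  model: 'moonshotai/kimi-k2.6',
  prompt: 'Why is the sky blue?'
})

// Print the response as it streams in.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk)
}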


Playground

Try out Kimi K2.6 by Moonshot AI. Usage is billed to your team at API rates. Free users (those who haven't made a payment) get $5 of credits every 30 days.


Providers

Route requests across multiple providers. Copy a provider slug to set your preference. Visit the docs for more info. Using a provider means you agree to their terms, listed under Legal.

Provider	Context	Latency	Throughput	Input	Output	Cache	Release Date
Moonshot AI	262K	3.0s	49 tps	$0.95/M	$4.00/M	Read: $0.16/M, Write: —	04/20/2026
Fireworks	262K	0.7s	78 tps	$0.95/M	$4.00/M	Read: $0.16/M, Write: —	04/20/2026
Novita AI	262K	1.7s	73 tps	$0.95/M	$4.00/M	Read: $0.16/M, Write: —	04/20/2026
Baseten	262K	0.2s	114 tps	$0.95/M	$4.00/M	Read: $0.16/M, Write: —	04/20/2026
Together AI	256K	0.5s	162 tps	$1.20/M	$4.50/M	Read: $0.20/M, Write: —	04/20/2026

More models by Moonshot AI

Model	Context	Latency	Throughput	Input	Output	Cache	Release Date
moonshotai/kimi-k2.7-code-highspeed	262K	1.4s	250 tps	$1.90/M	$8.00/M	Read: $0.38/M, Write: —	06/15/2026
moonshotai/kimi-k2.7-code	262K	0.7s	131 tps	$0.74/M	$3.50/M	Read: $0.15/M, Write: —	06/12/2026
moonshotai/kimi-k2.5	262K	0.4s	76 tps	$0.60/M	$3.00/M	Read: $0.10/M, Write: —	01/26/2026
moonshotai/kimi-k2-thinking	262K	0.5s	68 tps	$0.60/M	$2.50/M	Read: $0.15/M, Write: —	11/06/2025
moonshotai/kimi-k2	131K	1.3s	24 tps	$0.57/M	$2.30/M	—	09/05/2025

About Kimi K2.6

Kimi K2.6, released on April 20, 2026, is the natively multimodal successor in the Kimi line. Moonshot AI positions it around three capability areas: long-horizon execution, agentic coding, and design with code.

Long-horizon coding is the headline shift from earlier K2 variants. Kimi K2.6 sustains tool-use chains across thousands of calls in a single session, with reported workloads spanning many hours of continuous execution and iterative optimization across a codebase. The model maintains task state across these extended sessions rather than losing the thread after a few dozen turns.

Native vision input changes how design tasks compose. Kimi K2.6 accepts images directly, so a screenshot or mockup can drive a frontend generation step without a separate vision model in the pipeline. Moonshot AI documents output that includes structured layouts, hero sections, interactive elements, and animations, rather than syntax-level scaffolding alone. Full-stack workflows that pair frontend output with authentication, user interaction, and database operations are part of the documented scope.
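As a rough illustration of the screenshot-to-frontend step described above, the sketch below builds a single user message that pairs an instruction with an image. The type definitions are simplified local stand-ins for the AI SDK's message part types, `buildMockupMessage` is a hypothetical helper, and the image URL is a placeholder; treat the message shape as an assumption to check against the SDK docs.

```typescript
// Simplified local stand-ins for the AI SDK's message part types (assumption:
// the real types are richer; only the fields used here are modeled).
type TextPart = { type: 'text'; text: string }
type ImagePart = { type: 'image'; image: string } // URL or base64 string
type UserMessage = { role: 'user'; content: Array<TextPart | ImagePart> }

// Hypothetical helper: pair a mockup image with a frontend-generation instruction.
function buildMockupMessage(mockupUrl: string, instruction: string): UserMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text: instruction },
      { type: 'image', image: mockupUrl },
    ],
  }
}

const mockupMessage = buildMockupMessage(
  'https://example.com/mockup.png', // placeholder URL
  'Generate a responsive landing page that matches this mockup.',
)
```

The resulting message would be passed as `messages: [mockupMessage]` alongside `model: 'moonshotai/kimi-k2.6'`, so the same request carries both the visual reference and the instruction.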

Access Kimi K2.6 through AI Gateway by setting the model string to moonshotai/kimi-k2.6. AI Gateway routes across the moonshotai, fireworks, novita, baseten, and togetherai providers with automatic failover, and its observability layer tracks token usage and costs across the long sessions this model is built for.
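One way to express a provider preference is to rank the provider slugs by a metric and hand the ranked list to the gateway. The sketch below uses a snapshot of the latency and throughput figures listed on this page (live values will differ), assumes each slug maps to the provider row of the same name, and assumes the gateway accepts an `order` array under `providerOptions.gateway`. `rankProviders` is a hypothetical helper.

```typescript
// Snapshot of the provider figures on this page; live numbers will differ.
type ProviderStats = { slug: string; latencySec: number; throughputTps: number }

const providerStats: ProviderStats[] = [
  { slug: 'moonshotai', latencySec: 3.0, throughputTps: 49 },
  { slug: 'fireworks', latencySec: 0.7, throughputTps: 78 },
  { slug: 'novita', latencySec: 1.7, throughputTps: 73 },
  { slug: 'baseten', latencySec: 0.2, throughputTps: 114 },
  { slug: 'togetherai', latencySec: 0.5, throughputTps: 162 },
]

// Hypothetical helper: lowest latency first, or highest throughput first.
function rankProviders(stats: ProviderStats[], by: 'latency' | 'throughput'): string[] {
  const sorted = [...stats].sort((a, b) =>
    by === 'latency' ? a.latencySec - b.latencySec : b.throughputTps - a.throughputTps,
  )
  return sorted.map((p) => p.slug)
}

const byLatency = rankProviders(providerStats, 'latency')

// Assumed option shape: an ordered list of provider slugs to try.
const providerOptions = { gateway: { order: byLatency } }
```

Ranking by latency suits interactive use, while ranking by throughput may fit long batch-style agent runs better; either list can be passed as `providerOptions` on the request.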

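Kimi K2.6 supports a context window of 262.1K tokens and completions of up to 262.1K tokens per request. Through AI Gateway, most providers charge $0.95 per million input tokens and $4.00 per million output tokens; Together AI lists a 256K context window at $1.20 per million input tokens and $4.50 per million output tokens.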
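To make the per-token pricing concrete, here is a small cost estimate using the $0.95 input, $4.00 output, and $0.16 cache-read rates listed on this page. `estimateCostUsd` is a hypothetical helper, and it assumes cached input tokens are a subset of the input token count and are billed at the cache-read rate instead of the input rate.

```typescript
// List prices from this page, in USD per million tokens.
const PRICE_PER_MILLION = { input: 0.95, output: 4.0, cacheRead: 0.16 }

type Usage = { inputTokens: number; outputTokens: number; cachedInputTokens?: number }

// Hypothetical helper. Assumption: cached tokens are counted inside inputTokens
// and are billed at the cache-read rate rather than the regular input rate.
function estimateCostUsd(usage: Usage): number {
  const cached = usage.cachedInputTokens ?? 0
  const uncached = usage.inputTokens - cached
  const totalMicroUsd =
    uncached * PRICE_PER_MILLION.input +
    cached * PRICE_PER_MILLION.cacheRead +
    usage.outputTokens * PRICE_PER_MILLION.output
  return totalMicroUsd / 1000000
}

// 200K input tokens and 8K output tokens, nothing cached: about $0.222.
const coldCost = estimateCostUsd({ inputTokens: 200000, outputTokens: 8000 })

// Same request with 150K of the input served from cache: about $0.1035.
const warmCost = estimateCostUsd({
  inputTokens: 200000,
  outputTokens: 8000,
  cachedInputTokens: 150000,
})
```

In a session with thousands of tool calls, most of each request's input is repeated context, so the cache-read rate tends to dominate the input side of the bill.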

What To Consider When Choosing a Provider

Configuration: Kimi K2.6 runs the longest agentic sessions in the Kimi family. Plan token budgets around extended tool-use chains and verify your agent harness handles multi-hour execution windows. Vision input is native, so a separate vision model isn't required for design or screenshot tasks.
Zero Data Retention: AI Gateway supports Zero Data Retention for this model via direct gateway requests (BYOK is not included). To configure this, check the documentation.
Authentication: AI Gateway authenticates requests using an API key or OIDC token. You do not need to manage provider credentials directly.
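The configuration advice above, planning token budgets and handling multi-hour windows, can be sketched as a small guard that an agent harness checks before each step. Everything here is hypothetical: the names, the 5M-token ceiling, and the four-hour window are illustrative values, not limits from Moonshot AI or AI Gateway.

```typescript
// Hypothetical budget for one long-running agent session.
type SessionBudget = { maxTokens: number; deadlineMs: number; usedTokens: number }

// Create a budget with a token ceiling and a wall-clock window from nowMs.
function createBudget(maxTokens: number, windowMs: number, nowMs: number = Date.now()): SessionBudget {
  return { maxTokens, deadlineMs: nowMs + windowMs, usedTokens: 0 }
}

// Return a new budget with one call's token usage added.
function recordUsage(budget: SessionBudget, tokens: number): SessionBudget {
  return { ...budget, usedTokens: budget.usedTokens + tokens }
}

// True while both the token ceiling and the deadline still allow another step.
function canContinue(budget: SessionBudget, nowMs: number = Date.now()): boolean {
  return budget.usedTokens < budget.maxTokens && nowMs < budget.deadlineMs
}

const FOUR_HOURS_MS = 4 * 60 * 60 * 1000
const sessionStart = 0 // fixed timestamp so the example is deterministic
let sessionBudget = createBudget(5000000, FOUR_HOURS_MS, sessionStart)
sessionBudget = recordUsage(sessionBudget, 1200000)
```

A harness would call `canContinue` before each model or tool step and `recordUsage` after it, using the token counts reported for that step, then stop or checkpoint when the guard returns false.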

When to Use Kimi K2.6

Best For

Long-horizon coding agents: Sessions that run for hours, accumulate thousands of tool calls, and iterate across a full codebase
Vision-to-frontend pipelines: Frontend generation from screenshots, mockups, or design references without a separate vision step
Full-stack scaffolding: Workflows spanning UI, authentication, user interaction, and database operations from one model
Kimi K2.5 upgrade path: Teams that want stronger long-horizon execution and design output than K2.5 provides

Consider Alternatives When

Explicit reasoning traces: Kimi K2 Thinking emits chain-of-thought output for tasks that reward visible deliberation
Short-horizon throughput: Kimi K2 Turbo runs the K2 MoE without the longer-session emphasis when tasks finish in a few turns
Cost-sensitive deployments: Earlier K2 variants may meet your quality bar at lower cost per token
Text-only pipelines: A text-only K2 variant is a closer fit when no vision step is needed

Conclusion

Kimi K2.6 extends the Kimi line into long-horizon agentic coding and design-with-code workflows with native vision input. For agents that need to run for hours across a codebase, or for pipelines that turn visual references into working frontends, it's the K2 generation built for those sessions.
