Fireworks AI

Software Development

San Mateo, CA 41,926 followers

Run AI faster, more efficiently, and on your own terms

About us

Fireworks is the fastest way to build, tune, and scale AI on open models. Ship production-ready AI in seconds on our globally distributed cloud infrastructure, optimized for your use case. Fireworks powers production workloads at companies like Uber, DoorDash, Notion, and Cursor, delivering 15× faster speed, 4× lower latency, and 4× more concurrency than closed models.

Industry
Software Development
Company size
51-200 employees
Headquarters
San Mateo, CA
Type
Privately Held
Founded
2022
Specialties
LLMs, Generative AI, artificial intelligence, developer tools, software engineering, and inference

Updates

  • The new reality for teams is model routing. Fireworks is being used inside early agent workflows where model selection happens dynamically based on cost and performance. Two perspectives from our partners at Trilogy make this concrete:

    • Adoption → why they moved to Fireworks (cost pressure, rate limits, flexibility)
    • Usage → how it’s integrated into their systems (provider abstraction + orchestration like Open Symphony)
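
As a rough illustration of routing on cost and performance, here is a minimal Python sketch. The model names, prices, and quality scores are hypothetical placeholders (not Fireworks or Trilogy data), and `route` is an illustrative helper, not part of any provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str                  # hypothetical model identifier
    cost_per_1m_tokens: float  # USD, illustrative only
    quality: float             # 0..1 score from your own evals, illustrative only


def route(options, min_quality):
    """Pick the cheapest model that meets the quality floor.

    If no model meets the floor, fall back to the highest-quality one.
    """
    eligible = [o for o in options if o.quality >= min_quality]
    if eligible:
        return min(eligible, key=lambda o: o.cost_per_1m_tokens)
    return max(options, key=lambda o: o.quality)


# Hypothetical catalog; in practice these numbers would come from
# measured evals and the providers' current price lists.
CATALOG = [
    ModelOption("small-open-model", 0.20, 0.70),
    ModelOption("mid-open-model", 0.90, 0.82),
    ModelOption("large-closed-model", 5.00, 0.90),
]

easy_step = route(CATALOG, min_quality=0.65)
hard_step = route(CATALOG, min_quality=0.85)
print(easy_step.name, hard_step.name)
```

A real router would usually also account for rate limits and latency, which the post names as adoption drivers; this sketch keeps only the cost and quality trade-off.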

  • This is a stepping stone toward enabling customers to generate training data from traces, then lean into continuous post-training and own their AI with their own data moat. Kudos to LangChain for the incredible work. Continuing this research will be essential for companies creating their own specialized intelligence.

    View organization page for LangChain

    520,737 followers

    LangChain Labs teamed up with Fireworks AI to answer the question, “How can we cost-effectively mine important signals from every single trace, while maintaining frontier performance?” https://lnkd.in/gHu2PbfQ

    We fine-tuned a Qwen judge model to detect “Perceived Error” from user interactions, with experiments organized around three primary questions:

    1️⃣ Does fine-tuning improve baseline judge quality up to frontier model performance?
    2️⃣ Does a learned judge transfer across datasets?
    3️⃣ Is serving a fine-tuned model cost-effective?

    We found that our fine-tuned model exceeded frontier model performance and runs ~100x cheaper. Read our study to learn how we ran this experiment from data preparation to fine-tuning setup, and see how this will impact our future research on trace understanding.

  • We can't wait to see what people build. With this partnership, we're giving developers access to production-grade open model inference through a single Azure endpoint, with enterprise service-level agreements (SLAs) and zero-setup onboarding. This is available now. Check out Fireworks on Foundry at the link in the comments.

    View organization page for Microsoft for Startups

    149,127 followers

    Microsoft Build 2026 delivered. Here's what it means if you're building an AI startup. From Fireworks AI going GA in Microsoft Foundry to a new family of MAI models, to smarter discovery in Microsoft Marketplace, the thread running through all of it is the same: it's getting easier to build trustworthy AI on a single platform and connect it to the enterprise customers who need it. Microsoft for Startups is also making it simpler to get started with Startup credits and grow your benefits as you build on Azure. Five announcements. One read. Worth your time before your next sprint. https://msft.it/6049vg49D

  • Kimi (Moonshot AI) released K2.7 Code, the latest in their K2 line of coding models, and it's live on Fireworks on Day 0, on serverless and the API. K2.7 Code generates roughly 30% fewer reasoning tokens than K2.6 while improving results on coding evaluations. For teams running long-horizon coding agents, this reduces real cost per task. In multi-turn agent workflows, every reasoning token becomes context for the next steps. Cutting reasoning length leads to smaller contexts, faster loops, and fewer retries across the entire trajectory. K2.7 Code is available now with Standard and Priority serving tiers. A high-throughput Fast path is coming soon. Pricing: $0.95 / 1M input, $4.00 / 1M output, and $0.19 / 1M on cache hits. 256K context window. Full details: https://lnkd.in/ggadMGRz
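
To make the cost-per-task claim concrete, here is a minimal Python sketch that applies the listed K2.7 Code prices to one agent task. The token counts are hypothetical, and the sketch assumes reasoning tokens are billed at the output rate, which the post does not state. It also ignores the knock-on effect of smaller contexts in later turns, so it shows only the direct saving.

```python
# Prices quoted in the post, in USD per 1M tokens.
PRICE_INPUT = 0.95
PRICE_OUTPUT = 4.00
PRICE_CACHED_INPUT = 0.19


def task_cost(uncached_input, cached_input, output):
    """Return the USD cost of one task from its token counts."""
    return (
        uncached_input * PRICE_INPUT
        + cached_input * PRICE_CACHED_INPUT
        + output * PRICE_OUTPUT
    ) / 1_000_000


# Hypothetical token counts for a single long-horizon agent task.
uncached_input, cached_input = 200_000, 800_000
answer_tokens, reasoning_tokens = 40_000, 60_000

# Assumption: reasoning tokens are billed as output tokens.
baseline = task_cost(uncached_input, cached_input, answer_tokens + reasoning_tokens)

# Apply the roughly 30% reduction to reasoning tokens only (integer math).
reduced_reasoning = reasoning_tokens * 70 // 100
reduced = task_cost(uncached_input, cached_input, answer_tokens + reduced_reasoning)

print(f"baseline: ${baseline:.3f}  with fewer reasoning tokens: ${reduced:.3f}")
```

Both figures use the K2.7 Code price list so that only the token count changes; K2.6 pricing is not given in the post.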

  • Qwen 3.7 Plus is now live on Fireworks. The official Qwen 3.7 Plus weights are now hosted and served on Fireworks infrastructure. Your teams get:

    → Strong performance on long-horizon agent workflows with tool use and verification loops
    → Ability to preserve reasoning across multiple turns
    → Flexible thinking / non-thinking modes per request
    → Native multimodal input and prompt caching (80% cheaper cached tokens)
    → OpenAI and Anthropic-compatible APIs

    Serverless pricing: $0.50 / 1M input ($0.10 cached) and $3 / 1M output. Full details: https://lnkd.in/gHwmM5Mu
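
The "80% cheaper cached tokens" figure follows from the two listed input prices. The short Python sketch below checks that arithmetic and shows how the effective input price moves with the cache hit rate. The hit rate is a hypothetical workload parameter, and `blended_input_price` is an illustrative helper, not a Fireworks API.

```python
# Input prices quoted in the post, in USD per 1M tokens.
PRICE_INPUT = 0.50
PRICE_CACHED_INPUT = 0.10

# Discount on cached input tokens relative to uncached ones.
cached_discount = 1 - PRICE_CACHED_INPUT / PRICE_INPUT


def blended_input_price(cache_hit_rate):
    """Effective USD price per 1M input tokens at a given cache hit rate (0..1)."""
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return (1 - cache_hit_rate) * PRICE_INPUT + cache_hit_rate * PRICE_CACHED_INPUT


print(f"cached discount: {cached_discount:.0%}")
print(f"blended input price at a 75% hit rate: ${blended_input_price(0.75):.2f} / 1M")
```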

  • Fireworks AI reposted this

    View organization page for Harvey

    158,174 followers

    We're expanding Legal Agent Bench (LAB) to better evaluate how agents perform on one of the most common functions inside enterprises: contract negotiation. The update adds 500 new tasks spanning contract drafting, review, and negotiation across a wide range of agreement types and negotiation stages. Our goal is simple: measure whether agents can effectively advance a negotiation, recognize risk, and bring humans into the loop when the situation demands it. Read how we're benchmarking contract negotiation and the research directions we're pursuing next: https://lnkd.in/ggWxtsfk

  • MiniMax M3 is now on Fireworks. This open-weight frontier model combines three capabilities that have historically been expensive or fragmented:

    - Native multimodality (text + image + video)
    - Strong agentic and multi-turn coding performance
    - 512K token context

    All at roughly 1/20th the price of comparable closed-source models, with pricing now aligned to the previous M2.7 generation. For engineering and product teams, this changes the economics of building production-grade agents, long-document and repository-scale systems, and multimodal applications. You no longer have to choose between capability and cost at scale. Fireworks is providing Day-0 support with the fastest inference endpoints for the full MiniMax model family, including serverless for quick starts and on-demand deployments for production workloads.

  • Fireworks AI reposted this

    Crazy last week! We went from the energy of Microsoft Build and our announcements in SF straight into NYC TECH WEEK by a16z! Kicked things off with an intimate dinner hosted alongside our friends at turbopuffer, with great enterprise leaders, founders, and deep conversations. Then closed out NYC Tech Week with a rooftop party alongside some amazing partners: Exa, turbopuffer, Composio, Intercom, Vanta. Summer officially feels like it's started. The city was electric, the Knicks pulled off the win that night, and the views were incredible. There's something special about building spaces where founders, engineers, and industry leaders can connect. Always grateful to the partners and communities that make it possible.

  • We’re excited to share that Fireworks AI has been named to Redpoint’s InfraRed 100 list, which recognizes companies building the next generation of infrastructure and AI. We're proud of the inclusion, and more proud of what put us there: a team working through hard infrastructure problems so that companies can run AI in production without giving up control, quality, or cost efficiency. If that's the kind of work you want to do, we're hiring: https://lnkd.in/gWuP3nFb

