WarpSpeed: AI That Changes the Cost of Running Software

New AI capabilities that mark a structural shift in the economics of software.

Published on March 26, 2026

The Problem

Software costs money to build, but it costs far more to run. The global bill for running software on hardware already runs to hundreds of billions of dollars a year and is heading toward trillions. What companies pay cloud providers and what data centers spend on electricity hint at the scale, but even those figures capture only a fraction of the true cost.

At the heart of this cost sits a bottleneck: a large portion of modern computing relies on specialized chips called GPUs to crunch enormous amounts of data, powering everything from social network analysis to recommendation engines to scientific research. But GPUs are only as good as the software instructions, called kernels, that tell them what to do. Writing those instructions at peak efficiency is one of the hardest jobs in computing. It takes years of deep expertise, and very few people in the world can do it well. As a result, while GPU hardware keeps getting more powerful, the software running on it can’t keep up: it is not tuned to the hardware, so running it costs far more than it should. Organizations pay that price every second of every day, in what amounts to one of the largest bills society pays today.

What doubleAI Built

doubleAI created WarpSpeed, an AI system that writes GPU software better than the best human engineers in the world. On the benchmark described below, it cut the cost of running software by 3.6x on average.

To prove it, doubleAI pointed WarpSpeed at cuGraph, NVIDIA’s flagship library for graph analytics on GPUs and one of the hardest types of software to optimize today. cuGraph has been built and refined by top-tier engineers for roughly a decade and is one of the most widely used GPU libraries on the planet. doubleAI chose it deliberately: if WarpSpeed can make cuGraph workloads run 3.6–100x cheaper on GPU hardware, all workloads are next.

WarpSpeed autonomously rewrote every algorithm in the library — and made every single one faster.

The Result

Metric	Result
Average speedup over human experts	3.6x
Algorithms improved	100%
Algorithms more than 2x faster	55%
Algorithms more than 10x faster	18%
GPU architectures tested	3 (NVIDIA A100, L4, A10G)
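
To read the headline speedup as a cost figure, here is a rough sketch. It assumes that running cost is directly proportional to GPU run time, which is a simplifying assumption and not a claim made in this post; the function name `cost_fraction` is illustrative only.

```python
def cost_fraction(speedup):
    """Fraction of the original bill that remains after a given speedup.

    Assumption (not from the post): cost scales linearly with GPU run time.
    """
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return 1.0 / speedup


if __name__ == "__main__":
    # 3.6x is the average reported in the table; 2x and 10x are its thresholds.
    for s in (2.0, 3.6, 10.0):
        remaining = cost_fraction(s)
        print(f"{s}x faster: {remaining:.0%} of the original cost remains")
```

Under that assumption, the 3.6x average corresponds to paying roughly 28% of the original bill for the same work.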

The optimized library, called doubleGraph, is available today as a free, drop-in replacement — users can install it and immediately benefit, with zero changes to their existing code.

Existing AI Systems Are Far From It

Today’s best AI coding tools, including Claude Code, Codex, and Gemini CLI, were tested on the same challenge. They produced buggy, broken code roughly 40% of the time, even when given the library’s own test suite. The failures are often subtle, but code that fails this way cannot replace code in production; at these failure rates, current AI systems are unusable for the task. doubleAI’s technology lets it optimize the code with full correctness guarantees, marking a turning point that allows AI to replace human experts in this arena.

GPU optimization is uniquely hard for existing AI systems because it breaks the three conditions where current AI excels: there is very little training data to learn from, it’s extremely difficult to verify whether a solution is correct, and getting to a good answer requires a long chain of interdependent decisions. Most AI systems today need all three to be easy. WarpSpeed is built to handle all three being hard.

How It Works (In Plain Terms)

doubleAI developed three core innovations:

Smarter search: Instead of trying random improvements and hoping for the best, WarpSpeed systematically explores the space of possible solutions, including the ability to “time travel” back to earlier attempts while retaining lessons from what didn’t work.
Self-checking: WarpSpeed builds its own testing tools tailored to each specific problem, so it can verify that its solutions are correct, even when there’s no answer key to compare against.
Deep specialization: Rather than writing one generic solution, WarpSpeed generates a custom-tailored version for every possible configuration of hardware and workload: 576 specialized implementations in total. No human team could afford to do this.
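
The three ideas above can be illustrated with a toy sketch. This is not WarpSpeed's actual method, which the post does not describe in detail. It swaps in well-known stand-ins: best-first search with a priority queue (so earlier states stay reachable), differential testing against a trusted baseline (for self-checking), and a lookup table of per-target winners (for specialization). Every name here, including `candidate_sum`, `modelled_cost`, and the `gpu_a`/`gpu_b` profiles, is hypothetical, and the cost model is made up.

```python
import heapq
import random


def reference_sum(xs):
    """Trusted baseline that every candidate must agree with."""
    return sum(xs)


def candidate_sum(xs, chunk, handle_tail):
    """Toy 'kernel': sums xs chunk by chunk. Skipping the tail is a planted bug."""
    full = len(xs) - len(xs) % chunk
    total = sum(sum(xs[i:i + chunk]) for i in range(0, full, chunk))
    if handle_tail:
        total += sum(xs[full:])
    return total


def verify(config, trials=50, seed=0):
    """Self-check by differential testing on random inputs (fixed seed)."""
    rng = random.Random(seed)
    chunk, handle_tail = config
    for _ in range(trials):
        xs = [rng.randint(1, 9) for _ in range(rng.randint(0, 40))]
        if candidate_sum(xs, chunk, handle_tail) != reference_sum(xs):
            return False
    return True


def modelled_cost(config, preferred_chunk):
    """Made-up cost model: lower is better; skipping the tail looks cheaper."""
    chunk, handle_tail = config
    return abs(chunk - preferred_chunk) + (1 if handle_tail else 0)


def neighbours(config):
    """Candidate edits reachable from a configuration in one step."""
    chunk, handle_tail = config
    yield (chunk * 2, handle_tail)
    if chunk > 1:
        yield (chunk // 2, handle_tail)
    yield (chunk, not handle_tail)


def search(preferred_chunk, start=(1, True), budget=30):
    """Best-first search; returns ((cost, config), failed_configs)."""
    frontier = [(modelled_cost(start, preferred_chunk), start)]
    seen = {start}
    failed = set()  # record of what did not work (only reported in this toy)
    best = None
    while frontier and budget > 0:
        cost, config = heapq.heappop(frontier)
        budget -= 1
        if not verify(config):
            failed.add(config)
            continue  # dead end: resume from an earlier state on the frontier
        if best is None or cost < best[0]:
            best = (cost, config)
        for nxt in neighbours(config):
            if nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (modelled_cost(nxt, preferred_chunk), nxt))
    return best, failed


# Specialization: one verified winner per hypothetical hardware profile.
PROFILES = {"gpu_a": 8, "gpu_b": 32}
SPECIALIZED = {name: search(pref)[0][1] for name, pref in PROFILES.items()}
```

Here `SPECIALIZED` maps each hypothetical profile to its own verified configuration. The variants that skip the tail score better under the cost model, but the differential test rejects them, so they never become the chosen result.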

Why This Changes Everything

This is not just a technical achievement: it signals a structural shift in the economics of software. Until now, optimizing software for hardware was the exclusive realm of a small number of elite human engineers. Now that AI has entered the arena, we are heading toward a fundamentally new cost structure for running software.

Think of it like the compression revolution of the 1980s and 1990s. Information transmission channels were once a massive bottleneck. When compression was commoditized, the amount of data that could travel over the same lines jumped by 100x, and today’s entire digital reality was shaped by that leap. Hardware is the major bottleneck today. An AI that can squeeze 3.6x to 100x more performance out of the same chips means dramatically more software can run on the same hardware. Software running costs are set to plummet because AI will be optimizing them continuously. And the global scramble for more hardware, the source of much geopolitical turmoil, is about to settle at a new equilibrium.

The Bigger Picture

WarpSpeed is doubleAI’s first demonstration of what the company calls Artificial Expert Intelligence (AEI) — AI that doesn’t just assist human experts but matches and surpasses them in their own domains.

The world’s hardest problems — in drug discovery, chip design, climate science, cybersecurity — are bottlenecked not by computing power, but by the scarcity of true experts. If AI can be made to reliably perform at expert level, that bottleneck breaks open.

GPU optimization is the proving ground. If AEI works here — where the data is scarce, verification is hard, and the human bar is elite — it can work anywhere expertise is the limiting factor.