Blog · Analysis · Last reviewed June 25, 2026

The Efficiency Gain Becomes the Demand Engine

More efficient AI does not automatically mean less AI infrastructure. It can also make model use cheap enough to spread everywhere.

A demand engine is the product, procurement, finance, grid, and workflow machinery that converts a lower unit cost into more total use.

The practical unit is the rebound ledger: per-unit efficiency, total volume, workload mix, peak load, local water and heat, human review, and whether saved capacity is retired or reinvested.

Cheap Compute Is Not Small Compute

The comforting story says that artificial intelligence will get more efficient. Models will use fewer parameters, chips will improve, inference will get cheaper, data centers will optimize cooling, and software will do more with less. Some of that is true. It is also incomplete.

The missing object is the demand engine: the commercial and institutional machinery that spends the gain. It includes usage-based pricing, subscription bundles, product defaults, sales quotas, procurement assumptions, cloud commitments, accelerator depreciation, local power contracts, workflow metrics, and managers rewarded for moving more work through models.

Efficiency is a ratio with a boundary. It can tell us that a model uses less compute per task, less energy per token, less cost per answer, or less hardware for a given benchmark. It does not tell us how many tasks, tokens, answers, agents, products, users, workflows, retries, and high-compute modes the world will ask for once the cost falls.

The denominator is the politics. A provider may report cost per token, energy per answer, compute per benchmark point, power per rack, carbon per megawatt-hour, or water usage effectiveness at a facility. Each metric can be true and still miss total demand. Governance has to ask what numerator is shrinking, what denominator is growing, and whether the saved capacity is retired, shifted, or reinvested into more model use.
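The gap between a shrinking ratio and a growing total can be shown with a few lines of arithmetic. The sketch below is illustrative only: the function name `total_use` and all figures are assumptions chosen to make the point, not measurements from any provider.

```python
def total_use(per_unit: float, volume: float) -> float:
    """Aggregate resource use = per-unit intensity x number of units."""
    return per_unit * volume


# Hypothetical figures, chosen only to illustrate the arithmetic.
before = total_use(per_unit=1.0, volume=1_000)  # 1.0 Wh per task, 1,000 tasks
after = total_use(per_unit=0.5, volume=3_000)   # intensity halves, volume triples

change = after / before - 1
print(f"per-unit change: -50%, total change: {change:+.0%}")
```

Here the per-task figure halves while the total rises by half, because volume tripled. A report that publishes only the first number would be accurate and still miss the second.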

That distinction matters because AI is not only a product category. It is becoming a general-purpose layer for search, writing, coding, tutoring, customer support, medical documentation, legal drafting, image generation, video generation, agentic browsing, software maintenance, internal knowledge work, advertising, surveillance, fraud detection, finance, science, and government service delivery. When the unit price of cognition falls, institutions do not merely save money on the old workload. They invent new workloads.

This is the core governance problem in AI efficiency. A smaller model can expand the system. A cheaper query can invite more queries. A better chip can justify a larger cluster. A faster agent can make continuous automation normal. The demand engine is not outside the efficiency gain. It is often produced by it.

The Old Paradox

William Stanley Jevons gave the modern problem an old name. In The Coal Question, published in 1865, he argued that making coal more economical to use did not simply preserve Britain's coal supply. More efficient steam engines and industrial processes made coal more useful across the economy, helping expand the very system that consumed it.

The point was not that efficiency is fake. It was that efficiency changes behavior. If a resource becomes cheaper to use, more people and firms can use it, more uses become profitable, and the economy can reorganize around the new abundance. Under those conditions, total consumption can rise even while use per unit falls.

That is why Jevons paradox keeps returning whenever a society treats efficiency as a substitute for absolute constraints. A more efficient car can lower the cost of driving. A more efficient light can make lighting more common. A more efficient server can make computation more pervasive. The rebound is not mysterious. It is demand responding to lower effective cost.
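A common textbook way to make this concrete, offered here as a simplified sketch rather than Jevons's own model, assumes that demand has a constant price elasticity and that the whole efficiency gain is passed through as a lower price. Under those assumptions, total resource use changes by the factor gain ** (elasticity - 1). The function name and parameters are hypothetical.

```python
def total_use_multiplier(gain: float, elasticity: float) -> float:
    """Ratio of new to old total resource use after an efficiency gain.

    Assumptions (illustrative, not taken from the essay):
      - demand follows Q = k * p ** (-elasticity)
      - resource per unit and price per unit both fall by the factor `gain`
    Then Q_new = gain ** elasticity * Q_old, and each unit needs 1/gain
    as much resource, so R_new / R_old = gain ** (elasticity - 1).
    """
    if gain <= 0:
        raise ValueError("gain must be positive")
    return gain ** (elasticity - 1)


for eps in (0.5, 1.0, 1.5):
    ratio = total_use_multiplier(gain=4.0, elasticity=eps)
    print(f"elasticity {eps}: total use x{ratio:.2f}")
```

With a fourfold gain, total use halves at an elasticity of 0.5, is unchanged at 1.0, and doubles at 1.5. Real AI demand need not follow a constant-elasticity curve, so this shows the mechanism, not a forecast.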

AI gives the paradox a new surface. The relevant input is not only electricity. It is also accelerator time, inference capacity, model access, cloud budgets, human attention, organizational tolerance for automation, and the permission to run machine judgment through more parts of life.

Current Context

As of June 25, 2026, the rebound question has moved from metaphor into energy planning. The International Energy Agency's 2026 update reported that global data-center electricity demand grew 17% in 2025 and that electricity consumption from AI-focused data centers grew 50%. Its updated central projection has data-center electricity use roughly doubling from about 485 terawatt-hours in 2025 to about 950 terawatt-hours in 2030, while noting major uncertainty around adoption, efficiency, chip supply, financing, and grid bottlenecks.
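For readers who want the projection as an annual rate, the two endpoints quoted above can be converted with a compound-growth formula. This is a back-of-envelope sketch; the function name is an assumption, and the calculation does not reproduce the IEA's scenario method.

```python
def implied_annual_growth(start: float, end: float, years: int) -> float:
    """Constant annual growth rate that takes `start` to `end` in `years`."""
    if start <= 0 or years <= 0:
        raise ValueError("start and years must be positive")
    return (end / start) ** (1 / years) - 1


# Endpoints quoted above: about 485 TWh in 2025, about 950 TWh in 2030.
rate = implied_annual_growth(start=485, end=950, years=5)
print(f"implied growth: {rate:.1%} per year")
```

The endpoints imply growth of roughly 14% per year, somewhat below the 17% reported for 2025.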

The same IEA update shows why demand governance now has to follow finance and supply chains, not only energy meters. It says the largest technology companies' capital expenditure exceeded USD 400 billion in 2025 and was expected to rise another 75% in 2026, while satellite-based tracking showed AI-factory capacity more than tripling in 18 months. It also warns that not every announced project will be built, and that bottlenecks in chips, high-bandwidth memory, transformers, power electronics, grid connections, financing, and social acceptance can change the realized load. That makes project status part of source discipline: announced capacity, financed capacity, interconnection-ready capacity, energized capacity, and utilized capacity are different facts.

The update also makes the physical boundary clearer. The IEA reports that energy per AI task has fallen sharply, but that reasoning, video, and agentic workloads can consume hundreds or thousands of times more electricity per query than simple text generation. It also warns that AI-server power density increased 11-fold from 2020 to 2025 and could rise another fourfold by 2027. In other words, efficiency at one layer can coexist with denser racks, faster load swings, greater heat density, and harder grid integration.

The U.S. evidence points the same way without reducing the problem to a single prediction. The Energy Information Administration's 2026 outlook projects rising server electricity use across the commercial building stock; even in a case with post-2040 server efficiency improvements, continued growth in server installations drives overall consumption higher. DOE's Lawrence Berkeley National Laboratory report remains the main U.S. baseline: data centers used 176 terawatt-hours in 2023, about 4.4% of total U.S. electricity, with a projected 2028 range of roughly 325 to 580 terawatt-hours.

Model-level measurement is also improving. AI Energy Score and related benchmarking work compare inference energy use across tasks, models, and hardware in a more standardized way, including reasoning tasks. That is useful procurement evidence. It is not, by itself, a data-center forecast. A lower score for one model-call category does not answer how many calls are made, how much context is carried, how often agents retry, where the load lands, or whether the savings are used to automate more work.

Grid governance is catching up. FERC's RM26-4 docket treats large loads, generally above 20 megawatts, as an interconnection and cost-allocation question for the interstate transmission system. On June 18, 2026, FERC issued tailored show-cause orders to the six regional grid operators under its jurisdiction, directing them and their transmission owners to justify existing tariff rules or propose changes for data centers, manufacturing facilities, and other large energy users. The fact sheet names transmission-service application and study processes, cost-shifting prevention, co-location, flexible large-load service, and studies for generation serving electrically proximate large loads as reform categories, and it requires 30-day informational reports on generation adequacy for existing and new large loads. NERC's May 2026 large-load guideline treats data centers and other emerging loads as reliability objects that require data collection, validated models, operational coordination, event records, and planning assumptions. The guideline is voluntary and non-binding, which makes tariffs, interconnection agreements, procurement contracts, and operating procedures the places where flexibility and telemetry promises become enforceable. That shift matters: efficiency claims now sit beside grid load, data centers, local infrastructure, and public compute.

The AI Rebound

For this essay, AI rebound means an increase in aggregate model-related resource use caused or enabled by a fall in the effective per-unit cost of using AI. The resource can be electricity, accelerator time, inference budget, data-center capacity, human review time, institutional attention, or the political permission to insert models into new settings. Rebound is not a law of nature. It is the observable result when lower friction meets weak budgets, weak purpose tests, weak local planning, or products designed to turn cheaper inference into more inference. The AI rebound has several practical channels.

First, cheap inference turns occasional use into ambient use. If a model response is expensive, people reserve it for high-value tasks. If it is cheap, every email, ticket, meeting, classroom exercise, shopping comparison, code review, search query, and government form becomes a candidate for model mediation.

Second, cheaper compute enables more runtime reasoning. Products can spend more inference on chain-of-thought-like deliberation, hidden scratchpads, self-checking, search, tool calls, multiple samples, verifier passes, and agent loops. The unit model may be efficient, but the workflow can become compute-hungry because the product now expects the model to plan, inspect, retry, and verify.

Third, efficiency opens marginal markets. A use case that made no economic sense at one cent per action may become plausible at one-tenth of a cent. That is how AI moves from premium assistant to background infrastructure: automated quality checks, low-value content generation, bulk personalization, synthetic survey respondents, always-on classroom helpers, and internal reporting systems that would never have justified expensive model calls.

Fourth, organizations change their standards. Once AI drafting is cheap, one draft becomes five drafts. Once summarization is cheap, every meeting gets summarized. Once coding agents are cheap, more issues are converted into agent tasks. Once synthetic video becomes cheaper, more communications become video. The baseline expands.

Fifth, infrastructure creates its own expectations. A data center built for AI needs utilization. A company that has prepaid for accelerators wants products that consume them. A cloud provider that has secured power contracts wants demand. The physical stack pushes the social stack to find use.

Sixth, cheap model labor shifts review labor. When generation is cheap, more drafts, summaries, code changes, prompts, fraud attempts, and synthetic variants reach humans or downstream systems. The energy rebound has a management counterpart: more model output can create more verification work, more incident triage, and more trust debt unless institutions budget for review instead of only counting generated volume. That connects the infrastructure problem to the token meter, shadow AI, and workslop.

This is why the argument "models are getting more efficient" cannot settle the energy, labor, or governance debate. Efficiency changes the slope. It does not decide the total.
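The second channel, runtime reasoning and agent loops, can be illustrated with a workflow-level tally. In this sketch the agent uses a model that is three times more efficient per call than the single-answer baseline, yet the completed task uses more energy because it makes more calls. The `Step` type, step names, call counts, and watt-hour figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Step:
    name: str
    calls: int              # model calls in this step, including retries
    energy_per_call: float  # hypothetical Wh per call


def workflow_energy(steps: list[Step]) -> float:
    """Energy for the whole workflow, not for a single answer."""
    return sum(s.calls * s.energy_per_call for s in steps)


# Baseline: one call to an older, less efficient model.
single_answer = [Step("answer", calls=1, energy_per_call=0.3)]

# Agent: a model that is 3x more efficient per call, but many more calls.
agent_task = [
    Step("plan", calls=1, energy_per_call=0.1),
    Step("tool calls", calls=6, energy_per_call=0.1),
    Step("retries", calls=3, energy_per_call=0.1),
    Step("verifier passes", calls=2, energy_per_call=0.1),
]

baseline = workflow_energy(single_answer)
agent = workflow_energy(agent_task)
print(f"baseline {baseline:.1f} Wh, agent {agent:.1f} Wh, ratio {agent / baseline:.1f}x")
```

Counting at the level of the completed workflow is the design choice that matters here: the per-call figure improved, and the per-task figure still went up.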

What Counts as Demand

AI demand is not one number. Training demand is episodic and project-shaped. Inference demand is continuous and product-shaped. Agent demand adds retries, tool calls, browsing, file analysis, memory, monitoring, evaluation, and background execution. Infrastructure demand includes power delivery, cooling, water, land, transmission, backup generation, chips, networking, and storage. Institutional demand includes human review, incident response, security monitoring, procurement control, and rework created by cheap generation.

This taxonomy matters because different evidence measures different layers. A model benchmark may report energy per task. A facility may report power usage effectiveness or water usage effectiveness. A utility may see peak megawatts. A regional planner may see transmission upgrades. A workplace may see tokens, retries, review hours, and quality failures. Rebound happens when improvement at one layer reduces friction enough to expand another.

Not every efficiency gain rebounds fully. Some savings can be banked, especially when budgets, procurement rules, safety policies, clean-power limits, model-routing rules, or regulation cap total use. The governance question is empirical: which savings are retired, which are shifted to public-interest work, and which are spent on more automation?

The Rebound Ledger

The practical governance tool is a rebound ledger: an auditable record that ties an efficiency claim to the total demand it is expected to create or prevent. A rebound ledger should name the unit metric, baseline, workload class, expected monthly and annual volume, peak timing, data-center region, water and cooling assumptions, cache policy, agent retry limits, context-retention rules, human review burden, incident-response burden, and the disposition of saved capacity.

The ledger should also name the counterfactual. Was the gain supposed to reduce an existing workload, make the same workload cheaper, move work to a smaller model, defer a capacity purchase, add new users, lengthen context windows, enable background agents, or enter a new market? A rebound claim without a counterfactual becomes a slogan: the institution can point to a smaller unit while avoiding the question of what the smaller unit replaced.

The ledger should separate three kinds of rebound. Direct rebound is more use of the same service because each call is cheaper. Indirect rebound is using the savings to fund larger models, richer media, longer context windows, or new product features. Structural rebound is the redesign of work around cheap model access: default meeting summaries, default agent tickets, default classroom helpers, default surveillance scoring, default generated drafts, default automated review. Structural rebound matters because it can make model demand look like ordinary workflow rather than a policy choice.
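One possible shape for a ledger entry is sketched below. It covers only a subset of the fields named above, and every class name, field name, and figure is an assumption made for illustration rather than a proposed standard.

```python
from dataclasses import dataclass
from enum import Enum


class ReboundKind(Enum):
    DIRECT = "direct"          # more use of the same service
    INDIRECT = "indirect"      # savings fund larger models or new features
    STRUCTURAL = "structural"  # work redesigned around cheap model access


class Disposition(Enum):
    RETIRED = "retired"
    SHIFTED = "shifted"
    REINVESTED = "reinvested"


@dataclass
class LedgerEntry:
    unit_metric: str
    baseline_per_unit: float
    new_per_unit: float
    baseline_monthly_volume: float
    expected_monthly_volume: float
    counterfactual: str
    disposition: Disposition
    rebound_kinds: tuple[ReboundKind, ...] = ()

    def baseline_total(self) -> float:
        return self.baseline_per_unit * self.baseline_monthly_volume

    def expected_total(self) -> float:
        return self.new_per_unit * self.expected_monthly_volume

    def net_change(self) -> float:
        """Positive means aggregate use rose despite the unit gain."""
        return self.expected_total() - self.baseline_total()


# Hypothetical entry: meeting summaries become the default.
entry = LedgerEntry(
    unit_metric="Wh per summary",
    baseline_per_unit=2.0,
    new_per_unit=0.5,
    baseline_monthly_volume=10_000,
    expected_monthly_volume=60_000,
    counterfactual="same workload made cheaper",
    disposition=Disposition.REINVESTED,
    rebound_kinds=(ReboundKind.DIRECT, ReboundKind.STRUCTURAL),
)
print(entry.net_change())
```

In this made-up entry the unit cost falls by three quarters while the monthly total rises by 10,000 Wh, which is the kind of result the ledger exists to make visible next to the stated counterfactual.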

A rebound ledger also prevents category errors. It keeps model energy scores distinct from total product volume, chip efficiency distinct from grid interconnection, annual clean-energy claims distinct from local deliverability, and token spend distinct from review labor. That connects this essay to AI audit trails, AI audits and assurance, AI procurement, and the token meter as budget: the claim is not governed until the institution can show what changed after the unit got cheaper.

What the Energy Numbers Show

The best public numbers point in two directions at once: AI hardware and algorithms are improving, while total infrastructure demand is rising.

Epoch AI tracks long-run AI trends and estimates that pre-training compute efficiency is improving at roughly 3x per year, while AI chip performance per dollar has also improved substantially. Its energy research frames the tension directly: AI hardware and software can become more efficient while aggregate compute demand grows faster.

The International Energy Agency estimated that data centers used about 415 terawatt-hours of electricity in 2024, around 1.5% of global electricity consumption. Its 2025 Energy and AI report projected data-center electricity consumption to more than double to around 945 terawatt-hours by 2030, with AI as a major driver alongside other digital services. Its 2026 update kept the central 2030 projection close to that path, using a 2025 estimate of about 485 terawatt-hours and a 2030 projection of about 950 terawatt-hours.

In the United States, the Department of Energy summarized a Lawrence Berkeley National Laboratory report finding that data centers consumed about 4.4% of total U.S. electricity in 2023. The same report projected that share rising to approximately 6.7% to 12% by 2028, with total data-center electricity usage rising from 176 terawatt-hours in 2023 to an estimated 325 to 580 terawatt-hours by 2028. EIA's 2026 outlook adds a different lens: data-center server electricity use alone accounted for an estimated 7% of commercial-sector electricity consumption in 2025 and grows substantially across its cases.

The 2025 IEA sensitivity cases show why the distinction matters. A high-efficiency case can lower data-center electricity demand relative to a base case, but still leaves demand high because AI and broader digital-service use continue to expand. The balanced lesson is not "efficiency fails." It is that efficiency can bend the curve without deciding whether the curve keeps climbing.

These numbers should be read carefully. They do not prove that every efficiency gain increases total electricity use. They do not prove that AI will overwhelm the global grid. They do show that per-unit improvements are not currently translating into a shrinking infrastructure footprint. The system is improving and expanding at the same time.

That is the policy-relevant fact. A society cannot govern AI energy demand by looking only at watts per token, training efficiency, model compression, or cooling improvements. It has to ask whether those gains are being banked as reduced load, spent on more capability, or converted into new forms of dependency.

Local Impact, Global Myth

Global percentages can make AI infrastructure sound abstract. One and a half percent of world electricity is significant, but it can also sound manageable beside electric vehicles, air conditioning, industrial motors, and broader electrification. That framing is useful, but it hides the local politics.

Data centers do not land evenly on the planet. The IEA notes that almost half of U.S. data-center capacity is concentrated in five regional clusters. Local grids, water systems, land-use boards, transmission queues, rate structures, emergency planning, and community consent carry burdens that do not appear in a global share.

This is where the rebound becomes institutional. Cheaper AI does not only mean more tokens in the abstract. It means more substations, more interconnection requests, more backup generation, more cooling systems, more water debates, more tax incentives, more utility planning, more ratepayer questions, more land-use fights, and more public officials asked to treat private compute demand as civic necessity.

The earlier essay The Data Center Becomes a Civic Machine treated the data center as local infrastructure. The Jevons version explains why that infrastructure keeps asking to grow even when the machines inside it get better. Efficiency lowers friction. Lower friction invites scale. Scale arrives as a zoning meeting, a grid upgrade, a power-purchase agreement, a water permit, or a local promise of jobs.

Cooling makes the same point in another unit. Water usage effectiveness can help compare facility designs, but LBNL's data-center guidance distinguishes direct cooling water from indirect water embedded in electricity generation. Dry cooling, evaporative cooling, reclaimed water, potable water, power draw, and drought exposure are tradeoffs, not interchangeable signs of virtue. A rebound analysis that counts only electricity can miss the water and heat burden shifted onto a place.

That local layer is also a safety issue. A large AI campus can change peak load, backup-power needs, outage planning, fire response, cyber risk, and ratepayer exposure. If the public sees only a promise of efficient chips, it misses the operational bargain being made with a shared grid.

The global myth says the cloud is everywhere. The local fact says it has addresses.

Banking the Gain

The practical test is whether an efficiency gain is banked or spent. A gain is banked when an institution can show that the same useful outcome is delivered with less aggregate resource use, lower peak burden, lower water or heat impact, less review labor, or less dependency than before. A gain is spent when the lower unit cost is converted into more calls, larger contexts, higher-resolution media, more agent loops, more users, more background monitoring, or larger infrastructure commitments.

Banking is not the same thing as lowering the price. A banked gain should leave a visible event: a retired workload, a smaller cluster plan, an enforceable cap, a reduced contracted load, a smaller peak obligation, a cancelled expansion, a lower review burden, or a policy that reserves the saved capacity for a named public purpose.
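The banked-or-spent test can be written down as a simple check. The sketch below uses a deliberately strict reading, which is an assumption of this example and not a rule stated in the essay: nothing tracked may get worse, at least one thing must improve, and there must be a named visible event. The `Snapshot` fields and all figures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    total_energy_mwh: float
    peak_mw: float
    review_hours: float


def classify_gain(before: Snapshot, after: Snapshot,
                  visible_event: Optional[str]) -> str:
    """Return 'banked', 'unverified', or 'not banked'."""
    pairs = [
        (before.total_energy_mwh, after.total_energy_mwh),
        (before.peak_mw, after.peak_mw),
        (before.review_hours, after.review_hours),
    ]
    nothing_worse = all(a <= b for b, a in pairs)
    something_better = any(a < b for b, a in pairs)
    if not (nothing_worse and something_better):
        return "not banked"
    # An improvement with no retired workload, cap, or cancelled
    # expansion behind it is only a claim so far.
    return "banked" if visible_event else "unverified"


before = Snapshot(total_energy_mwh=1000, peak_mw=5, review_hours=200)
smaller = Snapshot(total_energy_mwh=800, peak_mw=4, review_hours=200)
bigger = Snapshot(total_energy_mwh=1400, peak_mw=6, review_hours=350)

print(classify_gain(before, smaller, "retired nightly batch workload"))
print(classify_gain(before, smaller, None))
print(classify_gain(before, bigger, None))
```

An institution could reasonably track more dimensions, such as water, heat, or dependency, or weigh them differently; the point of the sketch is that the test compares aggregates and asks for evidence, not unit prices.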

Banking efficiency requires a control surface. Product teams can route routine work to smaller models, cache stable results, limit context retention, batch non-urgent jobs, stop wasteful agent loops, and refuse low-value generation. Enterprises can set workflow budgets that pair model consumption with quality, rework, security, and review-time metrics rather than treating tokens as progress. Utilities and regulators can distinguish flexible compute from firm load, reward curtailment that is actually deliverable, and require large loads to show whether saved capacity reduces stress or merely enables the next expansion phase.

This is where model-level tools such as AI Energy Score are useful but not sufficient. A procurement team should ask for energy efficiency evidence, but also for expected monthly volume, workload mix, peak timing, data-center location, carbon and water accounting, fallback models, caching policy, agent retry limits, and conditions under which the vendor will reduce total resource use rather than upsell more capability. The governing unit is the workflow, not the benchmark alone.

Banking also requires a public-interest allocation rule. If a public agency, school system, hospital, or utility saves inference cost, the default should not be automatic expansion into every adjacent workflow. The saved capacity can be retired, reserved for accessibility or safety testing, shifted to lower-risk public service, or held back as grid margin. Without an allocation rule, the cheapest path will often be the most automated path.

Efficiency Theater

There is a weak version of AI efficiency politics that should be rejected: the claim that efficiency improvements are themselves proof of sustainability.

A company can announce a more efficient model while increasing total serving volume. It can tout lower energy per query while launching agent products that make many more queries. It can report power usage effectiveness while building larger facilities. It can describe renewable procurement while increasing local peak demand. It can celebrate model compression while pushing AI into workflows that did not previously require computation at all.

This does not mean the efficiency claims are false. It means they are partial. The honest question is: efficiency relative to what total system boundary?

For AI, the boundary has to include training, inference, data-center construction, chip manufacturing, networking, storage, cooling, water, grid upgrades, backup power, model retries, agent loops, downstream automation, and induced use. A narrow metric can still be useful, but it should not be allowed to masquerade as the whole ledger.

The same caution applies to carbon claims. A company can buy renewable energy certificates, sign a power-purchase agreement, or announce matching while still increasing the deliverable power a local grid must provide. The governance question is physical as well as accounting: which load, at which hour, in which balancing area, with which backup plan, and with whose costs?
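The difference between annual and hourly matching is easy to see in a toy example. The sketch below assumes a load and a clean-energy supply measured over the same equal-length periods; the function names and the four-period figures are hypothetical and do not describe any real portfolio.

```python
def annual_match(load: list[float], clean: list[float]) -> float:
    """Share of total load covered by total clean supply, capped at 100%."""
    return min(1.0, sum(clean) / sum(load))


def hourly_match(load: list[float], clean: list[float]) -> float:
    """Share of load covered by clean supply in the same period."""
    if len(load) != len(clean):
        raise ValueError("load and clean must cover the same periods")
    covered = sum(min(l, c) for l, c in zip(load, clean))
    return covered / sum(load)


# Hypothetical day in four periods: flat load, midday-only clean supply (MWh).
load = [100, 100, 100, 100]
clean = [0, 200, 200, 0]

print(f"annual-style match: {annual_match(load, clean):.0%}")
print(f"hourly match:       {hourly_match(load, clean):.0%}")
```

The totals match fully while only half the load is covered in the periods when it actually occurs. The other half still has to be delivered by the local grid, which is the physical question the accounting claim leaves open.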

There is an equal and opposite mistake: treating Jevons paradox as proof that efficiency is pointless. That is also wrong. Efficiency can reduce costs, expand access, lower emissions per task, reduce hardware pressure, make public-interest AI possible, and help institutions do useful work with less waste. The problem is not efficiency. The problem is efficiency without limits, disclosure, allocation politics, and demand governance.

The Governance Standard

A serious AI efficiency regime should govern absolute demand, not only per-unit performance.

Every efficiency claim should carry a rebound memo: the unit metric, baseline, workload mix, expected total volume, location and timing of load, water and cooling tradeoffs, review-labor assumptions, and whether saved capacity is retired, capped, or reinvested in more use. Without that memo, "more efficient" is an engineering fact trying to do policy work it cannot do alone.

First, efficiency claims should report total volume. A model provider should be able to say not only that a task uses fewer resources, but whether aggregate training, inference, and serving demand are rising or falling across comparable workloads.

Second, data-center permitting should include induced demand analysis. A proposal should not be evaluated only as a fixed facility. Regulators should ask what future expansion, power procurement, grid upgrades, water use, and local rate effects are made more likely by the project.

Third, product categories should be treated differently. A medical documentation assistant, a scientific model, a code agent, a synthetic video engine, an ad-personalization system, and a bulk content farm do not have the same public value. Efficiency policy should not treat every saved watt as equally worth reinvesting.

Fourth, providers should disclose enough to distinguish training from inference. Training runs are dramatic, but inference can become continuous. Public debate needs better evidence about how much demand comes from model development, consumer use, enterprise automation, agent loops, synthetic media, and search-like retrieval.

Fifth, local infrastructure costs should be made legible. Communities need to know who pays for transmission, generation, water systems, backup power, tax abatements, emergency services, and stranded infrastructure if demand projections change.

Sixth, public compute should be protected from pure rebound logic. A public compute commons should prioritize research, evaluation, education, accessibility, reproducibility, and public-sector capacity, not merely maximize utilization because the machines exist.

Seventh, safety governance should account for cheaper attempts. When models become cheaper to run, malicious or careless actors can run more trials, generate more variants, test more jailbreaks, automate more scams, and flood more channels. Lower cost changes the risk surface.

Eighth, demand reduction should be allowed to count as innovation. The AI economy rewards more use. Governance should also reward systems that avoid unnecessary model calls, preserve human judgment, use smaller models where adequate, cache responsibly, refuse wasteful automation, and keep some activities outside model mediation.

Ninth, large-load reliability should be part of AI governance. Operators, utilities, and regulators should know how compute loads ramp, curtail, restart, ride through faults, use onsite generation, and interact with regional planning. A data center is not governed only by its average annual energy use.

Tenth, water and heat should be reported with the same seriousness as power. Permits and public summaries should state cooling design, water source, consumption, withdrawal, reuse, discharge, drought plan, and tradeoffs between water savings and electricity use.

Eleventh, clean-energy claims should be deliverability claims. Providers should distinguish annual matching, hourly matching, certificates, physical power delivery, onsite generation, storage, and curtailment. A sustainability claim that cannot be tied to the load's time and place is weak governance.

Twelfth, workplace and product rebound should be measured, not assumed away. Enterprises deploying cheap AI should track token spend, rework, review time, security incidents, output quality, and workflow dependence so model consumption does not become a proxy for productivity.

Thirteenth, high-volume deployments should maintain rebound ledgers. Public agencies, regulated utilities, schools, hospitals, and large enterprises should document whether efficiency gains reduce aggregate demand, increase volume, shift load into new regions, expand review labor, or create new dependency on model-mediated workflows.

Fourteenth, agentic systems should count retries and background work. A product should not claim efficiency by measuring a single answer while excluding hidden planning, tool calls, web requests, verifier passes, failed attempts, memory operations, file analysis, and scheduled jobs. For agent products, the unit is the completed workflow.

Fifteenth, curtailability should be enforceable. If a data center, cloud service, or enterprise workload claims it can shift, pause, shed, or reroute demand during grid stress, that promise should appear in tariffs, interconnection agreements, operating procedures, tests, event logs, and penalties. Flexible load that cannot be called or verified is firm load with better public relations.

Sixteenth, contracts should reserve the right to spend less. AI procurement should avoid minimum-consumption commitments, bundle designs, or renewal terms that turn efficiency gains into mandatory growth. Buyers need audit rights, usage exports, model-routing controls, volume caps, workload deletion paths, and termination rights when a system saves unit cost by expanding total dependency.
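The sixteenth point has a simple arithmetic core. The sketch below shows how a minimum-consumption commitment can absorb an efficiency gain; the function names, prices, and volumes are hypothetical and not drawn from any real contract.

```python
def usage_cost(thousand_calls: float, price_per_thousand: float) -> float:
    """Cost of the calls the buyer actually needs."""
    return thousand_calls * price_per_thousand


def billed_amount(cost: float, minimum_commitment: float) -> float:
    """The buyer pays the larger of actual usage and the committed minimum."""
    return max(cost, minimum_commitment)


PRICE = 2.0            # hypothetical price per thousand calls
COMMITMENT = 10_000.0  # hypothetical minimum annual spend

# After routing and caching, the buyer needs 2,000 thousand calls, not 5,000.
needed = usage_cost(thousand_calls=2_000, price_per_thousand=PRICE)
billed = billed_amount(needed, COMMITMENT)
prepaid_unused = billed - needed

print(f"needed {needed:.0f}, billed {billed:.0f}, prepaid but unused {prepaid_unused:.0f}")
print(f"calls already paid for: {prepaid_unused / PRICE:.0f} thousand")
```

The bill does not fall, and the buyer holds 3,000 thousand calls it has already paid for. That is the mechanism by which a commitment can turn a saving into pressure to find new uses.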

What This Changes

The AI efficiency rebound is a pattern of recursive reality. The model becomes cheaper, so the world asks the model to see more of itself. More activity passes through the model. The model-mediated world then produces new expectations, new markets, new records, new habits, and new training traces. The system gets better at serving a society that has reorganized around being served.

The danger is not only higher electricity demand. It is the quiet normalization of machine intermediation because the marginal cost feels too low to notice. If summarization is cheap, every conversation becomes a record. If generation is cheap, every blank space becomes a content opportunity. If agents are cheap, every task becomes delegable. If surveillance inference is cheap, every signal becomes worth scoring. If persuasion is cheap, every user becomes worth optimizing.

That is why the efficiency debate belongs beside labor transition, synthetic media, institutional design, public compute, and high-control interfaces. Cheap AI is not merely affordable intelligence. It is a permission structure. It tells institutions that they can put models where they previously put friction, waiting, judgment, silence, or human refusal.

The right response is not to oppose efficiency. Waste is not a moral achievement. The right response is to make efficiency answer to purpose. What uses deserve scale? Which uses deserve limits? Who bears the grid cost? Who benefits from the automation? Which forms of attention, care, judgment, and public knowledge should not be converted into endless model calls just because the next call is cheap?

Jevons paradox does not say the future is fixed in advance. It says that a society must decide what to do with abundance before abundance decides what to do with society.

Source Discipline

Efficiency claims need source discipline because the same word can describe very different things. Model efficiency, chip efficiency, data-center power usage effectiveness, water usage effectiveness, training compute, inference price, annual electricity consumption, carbon matching, and deliverable peak power are not interchangeable. A lower price per token is not proof of lower total energy use. A lower watts-per-query estimate is not proof of lower grid impact. A megawatt interconnection request is not the same unit as annual terawatt-hours.
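The unit mismatch in the last sentence is a matter of time and utilization: megawatts measure power, terawatt-hours measure energy over a period. A minimal conversion sketch follows, with the load factor as an explicit assumption; the function name and the 100-megawatt example are hypothetical.

```python
HOURS_PER_YEAR = 8760  # non-leap year


def annual_twh(capacity_mw: float, load_factor: float) -> float:
    """Annual energy in TWh for a load of `capacity_mw` at `load_factor`.

    load_factor is average draw divided by requested capacity (0 to 1).
    """
    if not 0 <= load_factor <= 1:
        raise ValueError("load_factor must be between 0 and 1")
    mwh = capacity_mw * load_factor * HOURS_PER_YEAR
    return mwh / 1_000_000  # 1 TWh = 1,000,000 MWh


# The same hypothetical 100 MW request under two utilization assumptions.
for lf in (0.4, 0.9):
    print(f"100 MW at load factor {lf}: {annual_twh(100, lf):.2f} TWh per year")
```

The same interconnection request yields more than twice the annual energy at the higher load factor, so a megawatt figure cannot be read as a terawatt-hour figure without stating utilization.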

This article treats IEA, DOE, LBNL, EIA, FERC, NERC, and ISO/IEC materials as primary evidence for public energy, forecast, interconnection, reliability, and data-center water-metric claims. It treats Epoch AI as a research source for AI capability, compute, and efficiency trends, and academic rebound literature as conceptual support. Corporate efficiency announcements can be useful, but they should be paired with aggregate usage, workload mix, location, water, emissions, and cost-allocation evidence before being used as sustainability claims.

The source boundary is also explicit. EIA's May 2026 article reports server electricity use within the commercial building stock, not the whole U.S. data-center ledger. IEA projections are scenario-based outlooks, not promises. IEA capital-expenditure and satellite-tracking claims are signals about investment and capacity pipelines, not proof that every announced site will be built or fully utilized. Power usage effectiveness and water usage effectiveness are facility metrics, not full social accounts. Model-level energy scores are useful comparison tools, not substitutes for aggregate usage, siting, reliability, or cost-allocation evidence. FERC and NERC materials show governance questions under active development, not a completed settlement.

As reviewed on June 25, 2026, the IEA 2026 Key Questions on Energy and AI update should be cited separately from the 2025 Energy and AI report because it adds 2025 demand growth, capital-expenditure, AI-factory capacity, power-density, and updated 2030 projection claims. FERC's June 18, 2026, show-cause orders should likewise be distinguished from the earlier RM26-4 ANOPR docket: the show-cause orders are targeted action against regional grid-operator tariffs, while the ANOPR docket is the broader large-load rulemaking record.

The strongest version of the argument is conditional: when efficiency lowers the effective cost of useful AI, total demand can rise unless budgets, regulation, grid limits, safety rules, product design, or public purpose constrain the rebound. The weakest version is treating Jevons paradox as inevitability. Governance exists because the outcome is not automatic.
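The conditional form of the argument can be sketched as a toy calculation. This is an illustration under stated assumptions, not a model taken from the cited sources: it assumes that unit cost falls in proportion to unit energy, that demand follows a constant price elasticity, and that any constraint (budget, grid limit, regulation) acts as a simple cap on volume. The function name and parameters are hypothetical.

```python
from typing import Optional


def total_energy_ratio(
    efficiency_gain: float,
    elasticity: float,
    volume_cap_ratio: Optional[float] = None,
) -> float:
    """Return new total energy divided by old total energy.

    efficiency_gain: fractional drop in energy per unit of work (0.5 = half).
    elasticity: assumed constant price elasticity of demand (positive number).
    volume_cap_ratio: optional ceiling on volume growth, standing in for
        budgets, grid limits, or rules that constrain the rebound.
    """
    if not 0.0 <= efficiency_gain < 1.0:
        raise ValueError("efficiency_gain must be in [0, 1)")
    unit_ratio = 1.0 - efficiency_gain           # energy (and cost) per unit
    volume_ratio = unit_ratio ** (-elasticity)   # demand response to lower cost
    if volume_cap_ratio is not None:
        volume_ratio = min(volume_ratio, volume_cap_ratio)
    return unit_ratio * volume_ratio


# Hypothetical scenarios: energy per unit of work is halved in each.
scenarios = {
    "inelastic demand (0.5)": total_energy_ratio(0.5, 0.5),
    "unit elasticity (1.0)": total_energy_ratio(0.5, 1.0),
    "elastic demand (1.5)": total_energy_ratio(0.5, 1.5),
    "elastic, volume capped at 1.5x": total_energy_ratio(0.5, 1.5, 1.5),
}
for label, ratio in scenarios.items():
    print(f"{label}: total energy x{ratio:.2f}")
```

In this sketch, total use falls when demand is inelastic, stays flat at unit elasticity, and rises when demand is elastic unless a cap binds. The elasticity values are placeholders; the point is only that the outcome depends on the demand response and on constraints, not on the efficiency gain alone.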

Sources

William Stanley Jevons, The Coal Question, 1865, via Yale Energy History.
International Energy Agency, Energy and AI, April 10, 2025, and Executive Summary.
International Energy Agency, Energy and AI: Energy demand from AI, 2025.
International Energy Agency, Key Questions on Energy and AI: Executive summary, 2026.
International Energy Agency, Energy and AI: Energy supply for AI, 2025.
U.S. Department of Energy Office of Electricity, Clean Energy Resources to Meet Data Center Electricity Demand, reviewed June 25, 2026.
U.S. Department of Energy, DOE Releases New Report Evaluating Increase in Electricity Demand from Data Centers, December 20, 2024.
Lawrence Berkeley National Laboratory, 2024 United States Data Center Energy Usage Report, December 19, 2024.
Lawrence Berkeley National Laboratory Center of Expertise for Data Center Efficiency, Water Efficiency, reviewed June 25, 2026.
International Organization for Standardization, ISO/IEC 30134-9:2022, Water usage effectiveness (WUE), 2022.
U.S. Energy Information Administration, Data center server energy use grows across the commercial building stock, May 19, 2026.
Federal Energy Regulatory Commission, Interconnection of Large Loads to the Interstate Transmission System, Docket No. RM26-4-000, reviewed June 25, 2026.
Federal Energy Regulatory Commission, FERC Launches Aggressive Targeted Action to Speed Large Load Integration, June 18, 2026.
Federal Energy Regulatory Commission, Fact Sheet: FERC Takes Action to Supercharge America's Grid for Efficiency, Reliability, and a Bold Energy Future, June 18, 2026.
North American Electric Reliability Corporation, Reliability Guideline: Risk Mitigation for Emerging Large Loads, May 2026.
Epoch AI, Trends in Artificial Intelligence, reviewed June 25, 2026.
Epoch AI, AI Energy Use: Data & Research, reviewed June 25, 2026.
AI Energy Score, standardized AI model inference energy-efficiency benchmark, reviewed June 25, 2026.
Alexandra Sasha Luccioni, Emma Strubell, and Kate Crawford, From Efficiency Gains to Rebound Effects: The Problem of Jevons' Paradox in AI's Polarized Environmental Debate, 2025.
Konstantin F. Pilz, James Sanders, Robi Rahman, and Lennart Heim, Epoch AI, Trends in AI Supercomputers, 2025.
Nature Cities, Digital Jevons paradox in urban data center energy systems, 2025.
