Why AI isn't saving your senior reviewers time (yet).

Written by The Team at qrtr | 27 April 2026

The productivity paradox in professional services, and what has to change before AI delivers on its promise.

Key takeaways

  • The largest 2025 and 2026 surveys all point to the same finding: most firms have adopted AI, but most are not yet seeing measurable productivity gains.
  • PwC's 29th Annual Global CEO Survey found that 56% of companies report no significant financial benefit from AI to date. Only 12% report both cost and revenue gains.
  • A National Bureau of Economic Research study of nearly 6,000 executives found that over 80% of firms saw no impact of AI on productivity or employment over the past three years.
  • Workday's research found that nearly 40% of the time AI saves is being lost again to verification, correction and rework.
  • The pattern is recognisable. Robert Solow made the same observation about computers in 1987. The fix then was the same as the fix now: redesign the work around the tool, do not just hand the tool to the workforce.
  • For professional services firms, the practical implication is that general AI tools cannot solve a verification problem they were not built to solve. A different layer of tooling is needed.

The productivity gains are not where they should be

Three years after ChatGPT arrived, the data on AI's productivity impact is starting to settle. The picture it paints is uncomfortable for any firm that has invested heavily in AI tools and is still waiting for senior reviewer time to come back.

The pattern is consistent across the largest surveys published over the past twelve months. Most firms have adopted AI. Most senior people are using it. And most are seeing little to no measurable productivity gain from it.

For professional services firms, where senior time is the asset and document review is the bottleneck, this matters. This piece walks through what the evidence actually shows, why the gap exists, and what needs to change if AI is going to start delivering on the promise.

What does the latest data say about AI productivity returns?

The numbers tell a consistent story across three of the most-cited studies of the past year.

PwC's 29th Annual Global CEO Survey, released at Davos in January 2026 and based on responses from 4,454 chief executives across 95 countries, found that 56% of companies have seen no significant financial benefit from AI to date. Only 12% (the group PwC calls the "vanguard") report both cost and revenue benefits. The share of CEOs confident in their own company's revenue growth has fallen to 30%, the lowest level in five years, down from 38% in 2025 and 56% in 2022.

A National Bureau of Economic Research working paper released in February 2026, surveying nearly 6,000 senior executives across the United States, United Kingdom, Germany and Australia, reached a sharper conclusion. Although 69% of firms actively use AI, more than 80% reported no impact of AI on either employment or productivity over the past three years. The average executive was using AI for just 1.5 hours per week, and a quarter reported no use at all.

Earlier, in August 2025, MIT's NANDA initiative published The GenAI Divide: State of AI in Business 2025. Drawing on 150 executive interviews, 350 employee surveys and 300 public deployments, the study found that 95% of enterprise generative AI pilots had delivered no measurable impact on profit and loss, despite an estimated 30 to 40 billion dollars of enterprise spending.

These are not fringe findings. They are the largest and most rigorous studies available, and they agree with each other.

Why does this look so familiar? Solow's paradox, again

The pattern is not new.

In 1987, the Nobel laureate Robert Solow observed in the New York Times Book Review that "you can see the computer age everywhere but in the productivity statistics." Investment in computers had grown rapidly through the 1970s and 1980s. The expected productivity gains had not arrived. The observation became known as the Solow productivity paradox.

In February 2026, Apollo Global Management's chief economist Torsten Slok updated the line: "AI is everywhere except in the incoming macroeconomic data." His own analysis confirmed that AI was not yet visible in employment data, productivity figures, or inflation readings, and that outside a handful of large technology companies it was not visible in corporate profit margins either.

Solow's paradox eventually resolved itself. By the late 1990s, productivity gains from computing did show up in the data. But it took roughly fifteen years, and it required something more than the technology itself. It required organisations to redesign how work was done around the new tool. The technology was necessary. It was not sufficient.

The current AI moment may follow the same shape. Whether it does will depend less on the models and more on what gets built around them.

Where is the time saved by AI actually going?

Of all the recent research, the most diagnostic finding for professional services firms comes from Workday's January 2026 study, Beyond Productivity: Measuring the Real Value of AI, which surveyed 3,200 employees and leaders.

The headline numbers look promising. 85% of employees report saving between one and seven hours per week using AI tools. The catch is what happens to those hours. Workday found that nearly 40% of that saved time is immediately lost again to what the report calls rework: correcting errors, verifying outputs, and rewriting content that fails to meet quality requirements. Only 14% of employees report consistently positive net outcomes.
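To make the arithmetic concrete, here is a minimal sketch with illustrative figures (the survey reports a range of one to seven hours and a rework share of nearly 40%; the specific numbers below are assumptions, not Workday's data):

```python
# Illustrative arithmetic only: the inputs are assumptions, not survey data.

def net_hours_saved(gross_hours: float, rework_fraction: float) -> float:
    """Net weekly time saved after rework (verifying, correcting, rewriting)."""
    return gross_hours * (1.0 - rework_fraction)

# A reviewer at the top of the reported 1-7 hour range, with 40% lost to rework:
print(net_hours_saved(gross_hours=5.0, rework_fraction=0.40))  # -> 3.0 hours
```

And that three-hour net figure assumes the rework is evenly distributed. For senior reviewers, who absorb a disproportionate share of the verification load, the net can be lower still.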

The most active AI users carry the biggest verification burden. 77% of those who use AI every day say they review AI-generated work just as carefully as work done by humans, if not more. Workday's president of product, Gerrit Kazmaier, put it directly: "Too many AI tools push the hard questions of trust, accuracy, and repeatability back onto individual users."

For professional services firms, this finding is the centre of the problem. Senior review work is verification work. If AI accelerates the writing of a report but leaves the verification load unchanged or heavier, the senior reviewer has not been freed. They have been moved from one part of the workflow to another, often with less context and lower trust in the underlying material.

Why are professional documents different?

There is a reason general-purpose AI tools struggle to deliver in document-heavy professional environments, and it is not a model quality problem. It is a representation problem.

A technical specification is not flat text. It has sections, subsections, requirements, definitions, and cross-references, each carrying a particular weight. An audit working paper is structured around evidence, methodology, and the conclusions that bind the two. An expert witness report is held together by the chain between assertion, supporting source, and qualification.

When a general AI tool processes these documents as flat text, it loses the information that makes them professionally meaningful. It can summarise. It cannot verify. It cannot reliably tell whether the cited clause actually says what is claimed, whether the executive summary aligns with the body, or whether a conclusion is supported by the evidence in front of it.
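The difference is easier to see as a shape than to describe. Here is a minimal sketch of what a structured representation might look like; the field names and types are illustrative assumptions, not a description of any particular product's schema:

```python
# A hypothetical structured view of a professional document.
# All names here are illustrative assumptions, not any vendor's schema.
from dataclasses import dataclass, field

@dataclass
class Requirement:
    clause_id: str            # e.g. "4.2.1" in a cited standard
    text: str                 # the requirement as written in the source
    cited_by: list[str] = field(default_factory=list)  # sections relying on it

@dataclass
class Claim:
    section: str              # where the assertion appears in the report
    assertion: str            # what the document says
    sources: list[str]        # the clause_ids the assertion rests on

# Flattening both into one string discards exactly the links
# (claim -> clause, clause -> text) that a reviewer needs to verify.
```

Flat text keeps the words and loses the links. Verification lives in the links.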

This shows up across professional contexts in different forms.

In an engineering consultancy, a project director signs off a design report citing standards she has not personally re-read in months. She is relying on the team to have got the citations right. AI has not yet given her a way to check that without re-reading every clause herself.

In an audit firm, a partner reviews a working paper and asks whether the evidence supports the conclusion. AI can summarise the paper. It cannot answer the question.

In a law firm preparing for arbitration, an expert witness report cites standards, methodologies and underlying data. The opposing side will check every reference. The senior partner reviewing the report needs the same level of assurance before sign-off. A general AI tool cannot provide it.

In each case, the work the senior person needs help with is verification, not generation. And verification is where general AI tools currently stop.

What separates the firms getting AI returns from the firms that aren't?

The data offers a clear signal about what differentiates the 12% that PwC calls the vanguard from the 88% still waiting.

PwC found that the firms achieving both cost and revenue gains from AI share something specific. They have built what PwC calls "AI foundations": responsible AI frameworks, integrated technology environments, and clearly defined road maps. These firms are three times more likely to report meaningful financial returns than peers without those foundations.

MIT's research surfaced a related pattern. The 5% of pilots that succeed are not generally the largest or best-funded. They are the ones tied to a specific workflow, with deep integration into how the work is actually done, often delivered through specialised vendors rather than general-purpose tools. According to the MIT study, internal builds succeed at roughly half the rate of focused, externally built tools.

The throughline is consistent. Returns do not come from giving everyone a chat interface. They come from selecting one workflow, redesigning it around AI, and choosing tools that integrate deeply enough to change how the work happens, not just how fast it happens.

What this means for senior review work

For professional services firms, the practical implication is reasonably clear.

Senior review is one of the highest-value, highest-cost activities in the firm. It is also one of the activities most resistant to general AI tools, because it is fundamentally about verification: checking that requirements are met, that citations are accurate, that conclusions follow from evidence, that the document holds together against an external standard.

Generic AI summarisers compress the document. They do not verify it. Generic AI writing assistants accelerate authoring. They do not reduce the review burden, and as Workday's data shows, they often increase it by introducing more material that needs checking.

What is missing is a layer between the language model and the professional document. A layer that understands document structure: requirements, references, evidence, standards. A layer that produces verifiable findings rather than fluent prose. A layer that gives the senior reviewer the evidence to make a faster, more defensible decision rather than another draft to read.
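To make "verifiable findings rather than fluent prose" concrete, here is a minimal sketch of what such a layer's output might look like, building on the structured representation above. Everything here is hypothetical: a shape for the idea, not an existing API.

```python
# A hypothetical "finding" emitted by a verification layer.
# Names and fields are a sketch of the idea, not an existing API.
from dataclasses import dataclass

@dataclass
class Finding:
    claim_section: str    # where in the document the assertion sits
    cited_clause: str     # the clause the assertion claims support from
    clause_text: str      # what the cited clause actually says
    status: str           # "supported" | "contradicted" | "not_found"
    rationale: str        # why the layer reached that status

def review_queue(findings: list[Finding]) -> list[Finding]:
    """Surface only the findings a senior reviewer must rule on."""
    return [f for f in findings if f.status != "supported"]
```

The point of the shape is the rationale and the clause text sitting next to the flag: the reviewer sees the evidence alongside the judgement, instead of a fluent paragraph asserting that everything checks out.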

This is the layer that the productivity data says has not yet been built at scale. It is also where the next wave of returns will come from.

What firms can do now

A few practical steps for firms looking to move from the 88% to the 12%.

Be specific about which workflow you are trying to change. General AI rollouts produce general results. Pick one document type, one review process, one bottleneck. Measure it before and after.

Treat verification as the primary problem, not generation. Most professional document work is reviewed multiple times before sign-off. Tools that reduce verification load will move the needle. Tools that only accelerate first drafts often will not.

Test for transparency, not just output quality. A reviewer needs to see why something has been flagged, with reference to the source. Opaque AI suggestions are not trusted in professional environments, and untrusted suggestions get re-checked manually, which eliminates the time saving.

Choose tools that integrate with how the work actually happens. This usually means tools that operate inside the document environment your authors and reviewers already use, rather than parallel chat interfaces that require switching context.

The firms that get this right over the next twelve to twenty-four months will be the ones that show up in the next round of survey data on the right side of the divide.

A closing observation

The Solow paradox of the 1980s did not resolve because computers got better. It resolved because organisations learned how to redesign work around them. The current AI productivity paradox is likely to follow the same pattern.

For professional services firms, the practical opportunity is to stop waiting for general AI tools to solve a verification problem they were not built to solve, and to start building the document-level tooling that actually addresses it. The senior time you are trying to recover is not on the other side of a better summariser. It is on the other side of better verification.

Qrtr is being built specifically to fill that layer. If you would like to see how it applies to your firm's review work, you can register your interest for early access.

Frequently asked questions

Why is AI not delivering productivity gains in professional services? The most common reason is that AI tools are being deployed without changing the workflow around them. PwC's research shows that companies with strong AI foundations and integrated workflows are three times more likely to report meaningful financial returns. MIT's research finds that focused, deeply integrated tools succeed at roughly twice the rate of broad rollouts.

What is the AI productivity paradox? The AI productivity paradox refers to the gap between widespread AI adoption and the lack of measurable productivity gains in the macroeconomic data. The phrase echoes Robert Solow's 1987 observation about computers and was revived in 2026 by Apollo's chief economist Torsten Slok.

Where does the time AI saves actually go? According to Workday's 2026 research, nearly 40% of the time saved by AI is lost to rework: correcting errors, verifying outputs, and rewriting low-quality content. Only 14% of employees report consistently positive net outcomes from AI use.

What kind of AI tooling is most useful for senior reviewers in professional services? Senior review is verification work. The most useful tooling is the kind that surfaces specific, defensible findings against a known standard or requirement, with transparent reasoning the reviewer can audit, and that operates inside the document environment the team already uses.

How long did it take for the Solow productivity paradox to resolve? Roughly fifteen years. The productivity gains from computing began showing up clearly in the data in the late 1990s, after organisations had redesigned work processes around the technology rather than simply layering it on top.

References

  1. PwC (2026). 29th Annual Global CEO Survey: Leading Through Uncertainty in the Age of AI. PricewaterhouseCoopers International. Available at: https://www.pwc.com/gx/en/ceo-survey/2026/pwc-ceo-survey-2026.pdf
  2. Yotzov, I., Barrero, J. M., Bloom, N., Bunn, P., Davis, S. J., Foster, K. M., Jalca, A., Meyer, B. H., Mizen, P., Navarrete, M. A., Smietanka, P., Thwaites, G. and Wang, B. Z. (2026). Firm Data on AI. NBER Working Paper No. 34836. National Bureau of Economic Research. Available at: https://www.nber.org/papers/w34836
  3. MIT NANDA (2025). The GenAI Divide: State of AI in Business 2025. Massachusetts Institute of Technology.
  4. Workday and Hanover Research (2026). Beyond Productivity: Measuring the Real Value of AI. Workday, Inc. Released 14 January 2026.
  5. Slok, T. (2026). Waiting for the AI J-Curve. The Daily Spark, Apollo Academy, 14 February 2026. Available at: https://www.apolloacademy.com/waiting-for-the-ai-j-curve/
  6. Solow, R. M. (1987). We'd Better Watch Out. New York Times Book Review, 12 July 1987, p. 36.