The Week Finance Got Its Agent Stack
I owe you an update.
If you’ve been reading this newsletter for a while, you know I’ve been saying the same thing about AI agents for two years: the technology is coming, but it’s not here yet. I meant it every time I said it. Too many vendors were slapping the word “agent” on what amounted to a chatbot with a calendar invite. I had a simple test: does the thing run a conditional loop? Can it take a task, evaluate the output, decide what to do next, and keep working until the job is done or it hits a wall and asks for help? Most so-called agents couldn’t pass that bar. So I didn’t cover them.
I’m covering them now. The agents are here. Early stages, yes. Rough edges, sure. But the core capability has crossed a line in the last couple of months that I can’t brush off.
What changed? The models got meaningfully better at reasoning. The current frontier models (Claude Opus 4.5 and 4.6, GPT-5.2 and 5.3, Gemini 3) all ship with chain-of-thought reasoning built in: the ability to decompose a multi-step problem before acting on it. Previous “agents” were auto-complete running in a loop. These actually evaluate context, weigh rules, and decide what to do next. That matters for finance work, where a wrong journal entry isn’t a typo. It’s a misstatement.
The economics shifted too. Inference costs dropped roughly 280x for equivalent model performance since late 2022 (Stanford AI Index, 2025). Multi-step agent loops that would’ve blown through hundreds of dollars per run now pencil out for routine transaction coding. Context windows expanded to 1M tokens, enough to hold an entire chart of accounts and a month of transactions in working memory. And on SWE-bench Verified, the best available proxy for “can this model do reliable multi-step work,” top scores jumped from around 4% in early 2024 to over 80% in February 2026.
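To make the cost claim concrete, here is a back-of-the-envelope sketch. Only the 280x factor comes from the figure cited above. The $300 per-run cost and the 2,000 runs per month are made-up illustrative inputs, not numbers from any vendor, and the calculation assumes the full reduction carries over to an agent workload.

```python
# Only the 280x factor comes from the text; every other number is hypothetical.
COST_REDUCTION_FACTOR = 280


def cost_after_drop(old_cost: float, factor: float = COST_REDUCTION_FACTOR) -> float:
    """Scale an old per-run cost down by the reduction factor."""
    return old_cost / factor


old_cost_per_run = 300.00   # hypothetical: "hundreds of dollars per run"
runs_per_month = 2_000      # hypothetical monthly volume of agent runs

new_cost_per_run = cost_after_drop(old_cost_per_run)
print(f"Per run:   ${old_cost_per_run:,.2f} -> ${new_cost_per_run:,.2f}")
print(f"Per month: ${old_cost_per_run * runs_per_month:,.2f} "
      f"-> ${new_cost_per_run * runs_per_month:,.2f}")
```

Swap in your own per-run cost and volume. The point is the order of magnitude, not the exact dollars.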
That’s the foundation. Here’s what vendors built on top of it this week.
What Actually Shipped
Goldman Sachs + Anthropic went custom. Goldman’s CIO Marco Argenti told CNBC that Claude’s reasoning turned out to be “surprisingly capable” at rules-based accounting tasks beyond coding. The bank now runs Claude agents across trade accounting, compliance checks, transaction reconciliation, and client vetting. Argenti hinted it could reduce dependency on third-party providers. When a bank that size starts replacing vendor contracts with in-house agents, the rest of the industry notices.
Ramp’s Accounting Agent tackles the workflow most finance teams dread: transaction coding. It auto-codes every expense across GL account, department, class, location, and custom fields. It reviews 100% of spend against policy. Low-risk transactions sync automatically with full audit logs. And it learns from corrections, so the model gets sharper with each close cycle. Ramp says teams are seeing a 3x faster monthly close. For context, only 2% of finance teams currently use AI as their primary method for coding and posting transactions (Ramp, Feb 2026). That number is about to move.
Oracle NetSuite’s Autonomous Close is the most ambitious rollout of the bunch. It’s a full suite: an exception management agent that scans data and flags anomalies before you find them, a close management agent that tracks task completion and net income impact in real time, a flux analysis monitor that diagnoses root causes by pulling data across other agents, and AI-powered bank transaction matching. The system even identifies missing transactions (like unpaid commissions) and dynamically accrues them as pro forma entries. That’s the kind of judgment call that used to require a senior accountant with full context of the business.
HPE’s “Alfred” rounds out the picture. Built with Deloitte, it’s scaling from pilot to production across credit, collections, AP/AR, and forecasting. CFO Marie Myers told CFO Dive it’s the centerpiece of her 2026 finance strategy.
How Agents Actually Work (30-Second Version)
If you’ve used Robotic Process Automation (RPA), forget the analogy. RPA follows a fixed recipe and breaks when the recipe changes. An agent reasons through context.
Here’s the practical difference. RPA matches an invoice to a purchase order when the fields align perfectly. An agent handles partial payments, grouped transactions, and mismatched vendor names by evaluating context at each step and deciding what to do next. It flags what it can’t resolve. It logs what it did. It routes exceptions for human review.
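The contrast above can be sketched in a few lines of Python. This is a toy, not anyone's product: the `Invoice` and `PurchaseOrder` types, the 0.6 name-similarity threshold, and the decision labels are all assumptions made for illustration, and a plain string-similarity score stands in for the judgment a model would actually supply.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Invoice:
    vendor: str
    amount_cents: int  # integer cents avoid float rounding issues


@dataclass
class PurchaseOrder:
    vendor: str
    amount_cents: int


def rpa_match(inv: Invoice, po: PurchaseOrder) -> bool:
    """Fixed recipe: match only when every field lines up exactly."""
    return inv.vendor == po.vendor and inv.amount_cents == po.amount_cents


def agent_style_match(inv: Invoice, po: PurchaseOrder, name_threshold: float = 0.6):
    """Return (decision, reason). Escalate anything that can't be resolved."""
    # Hypothetical stand-in for model judgment: fuzzy vendor-name similarity.
    similarity = SequenceMatcher(None, inv.vendor.lower(), po.vendor.lower()).ratio()
    if similarity < name_threshold:
        return ("escalate", f"vendor names differ (similarity {similarity:.2f})")
    if inv.amount_cents == po.amount_cents:
        return ("match", "full payment")
    if 0 < inv.amount_cents < po.amount_cents:
        remaining = (po.amount_cents - inv.amount_cents) / 100
        return ("partial", f"{remaining:.2f} still open on the PO")
    return ("escalate", "invoice amount does not fit the PO")


inv = Invoice("Acme Corp", 40_000)
po = PurchaseOrder("ACME Corporation", 100_000)
print(rpa_match(inv, po))          # exact-match recipe rejects this pair
print(agent_style_match(inv, po))  # tolerant path treats it as a partial payment
```

Notice that every branch returns a reason alongside the decision. That string is the seed of the audit log, and the "escalate" branch is what lands in a human's review queue.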
The architecture pattern across Goldman, Ramp, NetSuite, and HPE is roughly the same: an ingestion layer pulls data from ERPs, bank feeds, and subledgers. A reasoning layer applies rules and learned patterns. An action layer codes transactions, drafts journal entries, and flags exceptions. And an escalation layer determines what gets auto-synced versus what needs a human sign-off.
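As a rough illustration of that four-layer pattern, here is a minimal sketch. None of it reflects how Goldman, Ramp, NetSuite, or HPE actually built their systems. The keyword rules, GL account names, the $500 auto-sync limit, and the 0.9 confidence floor are all hypothetical, and a lookup table stands in for the model in the reasoning layer.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical keyword rules standing in for a model's learned patterns.
RULES: Dict[str, Tuple[str, float]] = {
    "aws": ("6100 Cloud Hosting", 0.95),
    "uber": ("6400 Travel", 0.90),
}


@dataclass
class Transaction:
    description: str
    amount: float
    gl_account: Optional[str] = None
    confidence: float = 0.0
    audit_log: List[str] = field(default_factory=list)


def ingest(raw_rows: List[dict]) -> List[Transaction]:
    """Ingestion layer: normalize rows from an ERP export or bank feed."""
    return [Transaction(r["description"], float(r["amount"])) for r in raw_rows]


def reason(txn: Transaction) -> None:
    """Reasoning layer: propose a GL account and a confidence score."""
    for keyword, (account, confidence) in RULES.items():
        if keyword in txn.description.lower():
            txn.gl_account, txn.confidence = account, confidence
            txn.audit_log.append(f"rule '{keyword}' proposed {account}")
            return
    txn.audit_log.append("no rule matched")


def act(txn: Transaction) -> Optional[dict]:
    """Action layer: draft a journal entry, or flag an exception."""
    if txn.gl_account is None:
        txn.audit_log.append("flagged: could not code transaction")
        return None
    txn.audit_log.append(f"drafted entry to {txn.gl_account}")
    return {"debit": txn.gl_account, "credit": "2000 Accounts Payable",
            "amount": txn.amount}


def escalate(txn: Transaction, entry: Optional[dict],
             auto_sync_limit: float = 500.0, min_confidence: float = 0.9) -> str:
    """Escalation layer: auto-sync only low-risk, high-confidence entries."""
    low_risk = entry is not None and txn.amount <= auto_sync_limit
    decision = ("auto_sync" if low_risk and txn.confidence >= min_confidence
                else "human_review")
    txn.audit_log.append(f"routed to {decision}")
    return decision


def run_pipeline(raw_rows: List[dict]) -> List[Tuple[Transaction, str]]:
    """Run every transaction through all four layers in order."""
    results = []
    for txn in ingest(raw_rows):
        reason(txn)
        entry = act(txn)
        results.append((txn, escalate(txn, entry)))
    return results


rows = [
    {"description": "AWS invoice March", "amount": 120.0},
    {"description": "AWS reserved instances", "amount": 9000.0},
    {"description": "Misc vendor", "amount": 40.0},
]
for txn, decision in run_pipeline(rows):
    print(decision, "|", txn.description, "|", txn.audit_log)
```

The escalation thresholds are the part worth arguing about with your controller. They encode your risk tolerance, and they are where a control framework would plug in.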
The Numbers Worth Tracking
54% of CFOs say AI agent integration is their top digital transformation priority for 2026 (Deloitte CFO Signals, Q4 2025). 87% say AI will be extremely or very important to finance operations this year. The appetite is real.
So is the risk. 86% of mid-market CFOs who’ve deployed agents encountered hallucinated or inaccurate data (Maximor Finance AI Adoption Report, Feb 2026). 67% say human oversight is extremely or very critical. And only 21% of companies have what Deloitte calls “mature” agent governance (Deloitte State of AI in the Enterprise, Jan 2026).
Meanwhile, 27% of CFOs report that 50-75% of their F&A tasks are already managed by agentic AI. Adoption is running ahead of the control frameworks designed to govern it. That gap is the story.
Three Takeaways
The buy/build/DIY spectrum just got real. Goldman can embed Anthropic engineers for six months. Your team probably can’t. Mid-market finance leaders will choose between pre-built agents (Ramp, NetSuite), consulting-led builds (HPE/Deloitte’s Alfred model), or hands-on tools like Anthropic’s Cowork finance plugin. Knowing which lane fits your team, ERP, and risk tolerance is the first decision to make.
Agent governance is the new SOX readiness. 74% of companies plan agentic deployment within two years, but only 21% have mature governance models in place. If you’ve lived through a SOX implementation, you recognize the pattern: the technology arrives before the control framework. CFOs who build audit trails, escalation protocols, and attestation standards now will own the next compliance cycle instead of scrambling to catch up.
Start with the close. Every vendor this week targeted the same workflow cluster: transaction coding, reconciliation, close management. That’s not a coincidence. These are high-volume, rules-heavy, time-pressured processes where agents deliver the fastest ROI and where the control framework is easiest to define. If you’re choosing a pilot, this is it.
The Pro edition walks through setting up your own finance agent stack this quarter, including a buy vs. build vs. DIY decision matrix, step-by-step Cowork finance plugin setup, a prompt pack for reconciliation and close workflows, and the five-point control framework every CFO needs before agents touch live data.


The real test of the agent stack: Does it accelerate CASH conversion — or just automate reporting, close, and recon?
Goldman can fund a lab and carve out expensive green-shade layers. The mid-market has to fund results.
For most operators, the fastest ROI won’t come from AI in the close. It will come from fixing broken invoicing and deploying AI-enabled exception resolution — what we often mislabel as “collections.”
Cash rarely stalls because a customer won’t pay. It stalls because something upstream was wrong.