AI in a validated ERP: validate the controls, not the model

James Neal, Founder, Aperigon

The rules for AI in GMP are still being written. The durable principle is already clear — and Microsoft's ERP platforms, Business Central and the finance and operations apps, now have the controls to enforce it. Here is where we stand.

The question we get asked most often this year is some version of "is the AI validated?" It is the wrong question, and answering it the way most partners do — yes or no, feature by feature — gets regulated companies into trouble. You do not validate a model. You validate the controls around it: who is allowed to act, what a human must review before a record becomes official, and whether the system records every one of those steps in a form an inspector will accept. That holds whichever Microsoft Dynamics ERP you run — Business Central, or the finance and operations apps (Dynamics 365 Finance and Supply Chain Management, still widely called F&O). The distinction is the whole position, and the rest of this piece is why it holds even though the regulations underneath it are still being written.

The rules are being written right now

If you are waiting for the regulatory text to settle before you form a view on AI in your ERP, you will be waiting past the point where the decision matters. Three documents moved in the last twelve months, and two of the three are still drafts.

The draft of EU GMP Annex 22 (Artificial Intelligence) was published in July 2025, and its consultation closed that October. As drafted, it excludes generative AI and large language models from GMP-critical activities altogether. It permits dynamic or generative tools only in non-critical applications, and only where a qualified person evaluates every output and that human review is documented. That is a restrictive starting position, and it is contested: the EMA convened a multi-stakeholder workshop at the end of June 2026 specifically to reassess whether a flat prohibition on generative AI should survive or give way to risk-based guardrails. In other words, the single most-cited line about AI in GMP, "generative AI is banned," is the part of the draft most likely to change.

At the same time, the long-overdue revision of EU GMP Annex 11 — Computerised Systems went out for consultation, also in July 2025. It is not a touch-up. The annex expanded from five pages to nineteen, and for the first time addresses AI and machine-learning systems, cloud infrastructure, and digital service providers head-on, with a heavier emphasis on data integrity and lifecycle management. It is explicitly aligned to FDA's Computer Software Assurance thinking, to GAMP 5, to ICH Q9 on quality risk management, and to ISO 27001. Final text is expected around mid-2026, with a transition period likely running into 2027.

The frame those two sit inside is FDA's final Computer Software Assurance guidance for Production and Quality System Software, issued September 24, 2025. CSA is the most consequential shift in our field in a decade: a risk-based, least-burdensome approach that supersedes the old "validate everything to the same depth" reflex, and that explicitly tells manufacturers to lean on system logs and audit trails as assurance evidence rather than re-documenting what the software already records. FDA and EMA also published a joint set of "Good AI Practice" principles in January 2026. The direction of travel is consistent across the Atlantic, even where the specific text is not yet fixed.

The honest reading: anyone who tells you they can validate your AI workflow "to Annex 22" today is validating to a draft that is being rewritten. We will not anchor a client's build to a prohibition that may not survive its own comment period.

The principle underneath every draft

So we anchor to something more durable. Strip away the version numbers and one rule runs through all of it, unchanged since the first data-integrity guidance and present in every draft of the new ones: an AI output is uncontrolled until a competent human reviews it and the system records who did what.

Read it that way and the AI question stops being exotic. An agent that drafts a purchase invoice is, for validation purposes, doing what a clerk does — creating a record. The same expectations apply that have always applied to any computerised system in a regulated company: attribution of every action to a person or a defined process, a review and approval step before a draft becomes a controlled record, an immutable audit trail, enforced segregation of duties, and a written rationale for the configuration. None of that is new. What is new is that the actor on one side of the control might be a model instead of a person, which makes attribution and the human-review gate more important, not less.

Regulators are already enforcing on exactly this point. Enforcement roundups this year point to what appears to be FDA's first warning letter addressing AI-generated content in CGMP records, and the agency's stated position is the principle stated above almost verbatim — AI output must undergo authorized human review before it becomes a controlled record. The citation did not turn on whether the model was accurate. It turned on whether a human stood between the model and the official record, and whether the system could prove it.

That is the principle we validate to. It will outlast Annex 22's final wording, whichever way the generative-AI question lands.

Both platforms have the bones for it now

For years the honest answer to "can you enforce that in the ERP?" was "partially." Both Microsoft Dynamics ERP lines changed that in their 2026 Release Wave 1 — but they do not implement the controls the same way, and the difference matters more than the marketing suggests.

Record-level attribution — who, or what, touched the record. Business Central surfaces it inline: "Created by" and "Modified by" AI indicators mark, at the level of an individual record, whether a person or an agent made the change. The finance and operations apps capture agent actions in a dedicated activity log — the Copilot for Finance and Operations Agent Activity entity — that administrators can monitor and from which they can cancel an agent's in-flight actions, on top of F&O's long-standing database logging, audit trail, and electronic-signature framework already used for Part 11. One caveat worth naming now: that agent-activity log defaults to ninety days of retention, far short of GxP record-retention expectations, so it has to be extended or exported as part of the build rather than trusted as shipped.

The human-review gate — and here the platforms diverge. In Business Central the gate is on by default: agents draft for approval rather than posting automatically. The Payables Agent reads and matches an invoice and prepares it, but a person approves before anything commits, and a Review Bar surfaces every change for review in context. The finance and operations apps are built to run further on their own. F&O's Payflow Agent monitors payment queues, verifies vendor bank details against master data, executes payment runs, and posts the journal entries — autonomously for in-policy transactions, routing to a human only on exceptions such as a new vendor, a threshold breach, or a data mismatch. The Account Reconciliation Agent matches subledger to ledger and flags exceptions; on the Supply Chain side, agents monitor stock and raise purchase orders with, in Microsoft's own words, minimal human oversight. The gate exists, but it is a policy you configure, not a default you inherit. In a regulated build, that autonomy is precisely the thing you have to bound and prove.
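To make "a policy you configure" concrete, here is a minimal sketch in Python, written purely for illustration. Nothing in it is a Microsoft API: `Payment`, `Policy`, `route`, the vendor and bank values, and the 10,000 limit are all hypothetical. Only the three exception reasons (new vendor, threshold breach, data mismatch) come from the description above. The property worth noticing is that anything not provably in-policy routes to a human.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    AUTO_POST = "auto_post"
    HUMAN_REVIEW = "human_review"


@dataclass(frozen=True)
class Payment:
    vendor_id: str
    amount: float
    bank_account: str


@dataclass(frozen=True)
class Policy:
    # Hypothetical policy boundary; real values would come from the
    # configuration specification, not from this sketch.
    approved_vendors: frozenset
    master_bank_accounts: dict  # vendor_id -> bank account held on master data
    auto_post_limit: float


def route(payment: Payment, policy: Policy) -> tuple:
    """Return (Route, reasons). Any exception reason forces human review."""
    reasons = []
    if payment.vendor_id not in policy.approved_vendors:
        reasons.append("new vendor")
    if payment.amount > policy.auto_post_limit:
        reasons.append("threshold breach")
    if policy.master_bank_accounts.get(payment.vendor_id) != payment.bank_account:
        reasons.append("data mismatch")
    if reasons:
        return (Route.HUMAN_REVIEW, reasons)
    return (Route.AUTO_POST, reasons)


policy = Policy(
    approved_vendors=frozenset({"V001"}),
    master_bank_accounts={"V001": "GB00-1234"},
    auto_post_limit=10_000.0,
)

in_policy = route(Payment("V001", 2_500.0, "GB00-1234"), policy)
over_limit = route(Payment("V001", 25_000.0, "GB00-1234"), policy)
unknown_vendor = route(Payment("V999", 100.0, "XX00-0000"), policy)
```

In a validation protocol, each branch of a function like this corresponds to a test case: one that shows the in-policy path commits, and one per exception that shows the transaction stops at a person.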

Boundary, permission, and governance controls. Both lines sit under the same Microsoft governance fabric: the Copilot Control System and Agent 365, which add agent provisioning policy, agent permission tiers, Conditional Access for agents, an agent-action audit trail, and quarterly agent attestation. Business Central layers on expanded permissions auditing and an agent runtime that respects company, environment, and tenant boundaries. The finance and operations apps bring agent management — monitor and cancel agent actions — on top of their mature role-based security and segregation-of-duties model. In both, an agent cannot do what its configured identity is not permitted to do.

This is genuine progress, and it is why we hold that AI belongs in a validated ERP rather than outside it. But a control that exists is not a control that is validated. An indicator that an agent touched a record is not evidence your audit trail captures it for the workflows you run; a human-review gate being available is not evidence it cannot be bypassed; an autonomy policy being configurable is not evidence its boundaries actually hold. Most partners will turn the agents on and call it modern. Few will write the protocol that proves the gate holds — and that gap is wider on the finance and operations side precisely because the agents do more on their own.

Where Aperigon stands, and what we actually do

Our position is that an AI or agent action is a configuration decision, and it goes through the same methodology as every other configuration decision we make: a user requirement, a documented risk classification, a configuration specification, a test with written acceptance criteria, traceability from requirement to qualification, and an SOP that wraps the whole thing. We validate the controls — attribution, the approval gate, the audit trail, segregation of duties — and we size the protocol to risk under CSA and GAMP 5 rather than testing every agent to the same depth. We validate to the principle, not to a draft, so the build survives the final Annex 22 and Annex 11 without rework.
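Traceability from requirement to qualification can be checked mechanically once the links are recorded. The sketch below is one way to do that, under stated assumptions: the `Requirement` structure, the identifier scheme (UR, CS, TC), and the risk labels are hypothetical and stand in for whatever the validation plan actually defines.

```python
from dataclasses import dataclass, field


@dataclass
class Requirement:
    req_id: str
    risk: str  # hypothetical scale: "high", "medium", "low"
    config_specs: list = field(default_factory=list)
    tests: list = field(default_factory=list)  # (test_id, passed) pairs


def traceability_gaps(requirements):
    """Map each requirement ID to the reasons it is not fully traced."""
    gaps = {}
    for req in requirements:
        problems = []
        if not req.config_specs:
            problems.append("no configuration specification")
        if not req.tests:
            problems.append("no test")
        elif not all(passed for _, passed in req.tests):
            problems.append("failed test")
        if problems:
            gaps[req.req_id] = problems
    return gaps


requirements = [
    Requirement("UR-01", "high", ["CS-01"], [("TC-01", True), ("TC-02", True)]),
    Requirement("UR-02", "high", ["CS-02"], []),
    Requirement("UR-03", "low", [], [("TC-03", False)]),
]

gaps = traceability_gaps(requirements)
```

Here `gaps` flags UR-02 for having no test and UR-03 for having no configuration specification and a failed test, while UR-01 traces cleanly. Sizing to risk would then decide how many tests a high-risk requirement needs compared with a low-risk one.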

Concretely, take an autonomous finance agent — Business Central's Payables Agent, or the finance and operations apps' Payflow Agent. We would classify the workflow by patient-safety, product-quality, and data-integrity risk, and scope the protocol to that classification. Then we would pin down the configuration: in Business Central, that the agent drafts for approval and cannot post without an authorized human; in the finance and operations apps, the exact policy boundaries within which the agent may act on its own, and the proof that every out-of-policy case routes to a named human. We would test both the positive and the negative case — that an in-policy action does what it should, and that no path lets an out-of-policy action commit unreviewed. We would verify that the audit trail captures the agent's involvement and the approver, that records are attributable and time-stamped, and that agent-activity logs are retained for the full required period rather than the platform default. We would test segregation of duties so the approver cannot be the agent's configured identity. We would document the rationale for each decision and the SOP for periodic audit-trail review. And because Microsoft ships a release wave twice a year to both products, we would put the workflow under periodic review, so the next agent update is assessed against the validated baseline rather than discovered at the next inspection.
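Several of those checks reduce to assertions over audit records. The following is a minimal sketch, assuming a simplified record shape: `AuditRecord`, its field names, the `payables-agent` identity, and the 3,650-day required period are hypothetical, and only the ninety-day default comes from the platform description earlier. It does not read any real Business Central or finance and operations table.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    document_id: str
    created_by: str                 # identity that drafted the record
    created_by_agent: bool
    approved_by: Optional[str]      # None if nobody approved
    approved_at: Optional[datetime]
    posted: bool


def control_findings(record, agent_identities, retention_days, required_days):
    """Return the control failures for one agent-drafted record."""
    findings = []
    if record.posted and record.created_by_agent:
        if record.approved_by is None or record.approved_at is None:
            findings.append("posted without attributable, time-stamped approval")
        elif record.approved_by in agent_identities:
            findings.append("approver is an agent identity (segregation of duties)")
    if retention_days < required_days:
        findings.append("log retention shorter than required period")
    return findings


AGENTS = {"payables-agent"}          # hypothetical configured agent identity
REQUIRED_DAYS = 3650                 # hypothetical retention requirement
stamp = datetime(2026, 3, 1, tzinfo=timezone.utc)

reviewed = AuditRecord("INV-1", "payables-agent", True, "j.smith", stamp, True)
unreviewed = AuditRecord("INV-2", "payables-agent", True, None, None, True)
self_approved = AuditRecord("INV-3", "payables-agent", True, "payables-agent", stamp, True)

clean = control_findings(reviewed, AGENTS, REQUIRED_DAYS, REQUIRED_DAYS)
bypass = control_findings(unreviewed, AGENTS, REQUIRED_DAYS, REQUIRED_DAYS)
sod_and_retention = control_findings(self_approved, AGENTS, 90, REQUIRED_DAYS)
```

The first record is the positive case and produces no findings. The second and third are the negative cases a protocol has to prove cannot occur: a posting with no human approver, and an approval by the agent's own identity on a system still running the ninety-day default.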

That is not a longer way of saying "we're cautious about AI." It is the opposite. It is how you put AI to work in a regulated finance and supply-chain system and still hand your auditor a clean answer.

The bottom line

AI belongs in a validated ERP, whether you run Business Central or the finance and operations apps. Both platforms now have the controls to make it defensible, and the regulatory direction — risk-based, human-in-the-loop, audit-trail-as-evidence — rewards companies that build it in deliberately. But two non-positions are circulating as if they were strategies. "We turned on Copilot" is not a validation strategy; it is an unexamined control surface waiting to be cited — more so on the finance and operations side, where the agents act with less of a default human gate. "AI is banned in GMP" is not a position; it is an abdication that reads a draft prohibition as a permanent one and forfeits the productivity in the meantime.

The defensible middle is narrow and it is the right place to stand: validate the controls, not the model; validate to the principle, not the draft; and produce the evidence before the inspector asks for it, not after. That is what we mean by inspection-ready — and it is as true for an AI agent as it is for a general ledger.

Aperigon delivers Microsoft Dynamics 365 — Business Central and the finance and operations apps — to Life Sciences companies, validated by design, inspection-ready on day one. If you are weighing how to put AI or agents to work in a regulated ERP, start a conversation.
