Health · Briefing

Governance notes from Geisinger’s diagnostic agents rollout.

What changed when Geisinger deployed diagnostic agents, in under five minutes.


Tomás Aguilar AI Editor · Health desk · Swiss-neutral charter

AI | March 17, 2024 | 7 min read | Live

Geisinger Health System completed the first formal governance review of its diagnostic agent programme in February 2024, clearing four capabilities for expanded deployment across its primary care network in central Pennsylvania. The announcement was internal — a memo from Dr. Rafael Montoya, Geisinger's Chief Medical Information Officer, to the heads of its 11 hospital medical staffs — but its substance is consequential for any health system operating outside the major academic medical centres. Geisinger is not Mayo Clinic. It is a 600-physician rural and semi-rural integrated delivery network serving a population that is older, sicker, and more geographically dispersed than the coastal teaching systems that dominate clinical AI coverage. That context is precisely what makes its governance decisions worth reading closely.

A rural system with a research pedigree

Geisinger's institutional identity sits at an unusual intersection: a community health system with research infrastructure that would embarrass many academic medical centres. The MyCode Community Health Initiative, Geisinger's longitudinal genomic biobank, enrolled its 300,000th participant in late 2023, making it one of the largest health-linked genomic cohorts in the United States. MyCode is not incidental background. It established, over 15 years, the clinical rigour and consent architecture that Geisinger applies to any programme touching patient data at scale. When Montoya's team began designing the governance framework for diagnostic agents in early 2022, they did not start from a blank page. They started from the MyCode governance committee's precedent: phased rollout, population-stratified validation, and a standing ethics review with patient advocate representation.

The central Pennsylvania patient population creates constraints that reshape the agent design calculus. Geisinger's primary care catchment includes 45 rural counties with limited specialist access. The median drive time to a Geisinger tertiary facility for patients in the northern catchment is 78 minutes. An agent that surfaces a differential requiring specialist confirmation — but where no specialist appointment is available for six weeks — is not simply an advisory tool. It is a care-coordination instrument. Montoya's team built that reality into the governance framework's scope definitions from the start. Agents approved for primary care deployment at Geisinger carry a rural-access annotation specifying how their outputs should be interpreted in low-resource clinical contexts where specialist escalation is logistically constrained.

The governance committee Montoya chairs — formally the Clinical Artificial Intelligence Oversight Board, CAIOB — includes three patient advocates drawn from Geisinger's MyCode participant advisory council. Patient representation on governance bodies is standard in research ethics; it is unusual in operational AI deployment. Montoya's rationale is direct: the population Geisinger serves has historically been underrepresented in the training data of commercial clinical AI systems, and the only structural protection against that gap is having that population's representatives in the room when deployment decisions are made.

From biobank to deployment: what MyCode taught the governance team

The MyCode precedent shaped Geisinger's agent governance in three specific ways. First, consent architecture. MyCode established that broad, recontactable consent, in which participants agree to future uses of their data for purposes not yet specified, is operationally viable at Geisinger's scale but requires a standing ethics committee empowered to review novel applications before they proceed. The CAIOB inherited this structure. Any new agent capability that touches patient data in a way not explicitly covered by Geisinger's existing clinical AI consent notice triggers a review cycle before deployment, regardless of its technical readiness. The review cycle has a defined 45-day ceiling: by the end of that window, the CAIOB must approve, reject, or formally defer with documented rationale.

Second, population stratification. MyCode's research programme long ago identified that Geisinger's patient population carries a higher-than-average burden of hereditary disease, related in part to the historically limited genetic diversity of European-origin founder populations in rural Pennsylvania. This pattern trained Geisinger's researchers to test every output — genomic or otherwise — across population subgroups before generalising. The agent evaluation framework Montoya's team built specifies that validation case sets must be stratified by age decile, primary insurance status, county of residence (as a proxy for rural access tier), and comorbidity burden. A capability that clears the aggregate validation threshold but fails performance criteria in any single stratum does not clear for deployment. It returns to the vendor for remediation.
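The stratum-level gate described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Geisinger's implementation: the function name `clears_for_deployment`, the 0.90 threshold, the stratum labels, and every score are hypothetical. The only rule taken from the text is that a capability must clear in aggregate and in every single stratum.

```python
from typing import Mapping


def clears_for_deployment(
    aggregate_score: float,
    stratum_scores: Mapping[str, float],
    threshold: float,
) -> tuple[bool, list[str]]:
    """Return (cleared, failing_strata) for one candidate capability.

    A capability clears only if the aggregate score meets the threshold
    AND no individual stratum falls below it.
    """
    failing = sorted(
        name for name, score in stratum_scores.items() if score < threshold
    )
    cleared = aggregate_score >= threshold and not failing
    return cleared, failing


# Hypothetical scores for illustration only; these are not Geisinger data.
example_strata = {
    "age_70_79": 0.93,
    "medicaid": 0.91,
    "rural_tier_3": 0.86,  # below the assumed 0.90 threshold
    "high_comorbidity": 0.92,
}
cleared, failing = clears_for_deployment(0.92, example_strata, threshold=0.90)
print(cleared, failing)  # False ['rural_tier_3']
```

In this sketch the aggregate score passes, yet the capability is still held back because one stratum fails, which is the behaviour the framework is described as requiring.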

Third, long-horizon monitoring. MyCode participants are followed indefinitely. The governance culture that this produces manages data over decades, not quarters. Montoya applied an equivalent expectation to agent deployment: performance monitoring is not a 90-day post-launch window but a standing programme with no defined end date. CAIOB receives a quarterly performance report on every deployed capability. Capabilities that show consistent performance above threshold for eight consecutive quarters are eligible for reclassification to reduced-oversight status, but the monitoring never stops. It scales down; it does not switch off.
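The eight-quarter reclassification rule lends itself to a short check. The sketch below is one possible reading, not a documented CAIOB procedure: it assumes "consecutive" means the eight most recent quarterly reports and that "above threshold" is a strict comparison. The function name and the scores in the usage line are hypothetical.

```python
from typing import Sequence


def eligible_for_reduced_oversight(
    quarterly_scores: Sequence[float],
    threshold: float,
    required_quarters: int = 8,
) -> bool:
    """True if the most recent `required_quarters` reports all exceed threshold.

    Scores are ordered oldest to newest. Eligibility changes the oversight
    tier only; it never implies that monitoring ends.
    """
    if len(quarterly_scores) < required_quarters:
        return False
    recent = quarterly_scores[-required_quarters:]
    return all(score > threshold for score in recent)


# Hypothetical history: seven strong quarters followed by one dip.
history = [0.95] * 7 + [0.89]
print(eligible_for_reduced_oversight(history, threshold=0.90))  # False
```

A single below-threshold quarter resets the run under this reading, so the capability would need eight further clean quarters before becoming eligible.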

We serve patients who drive 78 minutes to reach us. An agent that sends them home with an incomplete differential is not a productivity tool. It is a harm at scale we have no way to walk back.

Epic as the governance surface, not just the delivery channel

Geisinger runs on Epic. Every agent capability in the deployment touches the clinical workflow through Epic's clinical decision support and ambient documentation modules. Montoya's team made an early architectural decision that is now being replicated at peer institutions: they treat the Epic integration layer as a governance surface, not merely a delivery channel. The distinction matters. A delivery-channel framing means the agent runs elsewhere and its output is piped into Epic for display. A governance-surface framing means Epic's workflow logic — the sequence in which information reaches the clinician, the permissions model that controls what actions are available at each step, the audit trail that Epic natively generates — becomes part of the governance architecture itself.

In practice, this means Geisinger's CAIOB reviews every Epic workflow configuration change that affects agent output display before it is pushed to production. The vendor partner managing the ambient documentation layer — Abridge, the Pittsburgh-based clinical AI company whose ambient documentation technology Geisinger began piloting in 2022 — cannot modify the Epic display logic unilaterally. Changes go through a joint change-control board that includes Geisinger's Epic analysts, Abridge's implementation engineers, and a CAIOB designee. The change-control board meets fortnightly and has veto power over any modification that affects how agent outputs are surfaced to clinicians, regardless of whether the change originates from the vendor or from Geisinger's own Epic team.

The second vendor relationship — covering the diagnostic differential and care-gap identification capabilities — runs through Suki AI, which provides the underlying natural language processing infrastructure integrated into Geisinger's primary care workflow. Suki's contract with Geisinger includes a governance addendum that Montoya's legal team negotiated in Q3 2022: Suki is required to notify Geisinger of any model update that affects the capabilities deployed in Geisinger's environment within five business days of internal release, and Geisinger retains the right to delay the update pending CAIOB review for up to 30 days. The addendum is unusual in vendor contracts of this type. It reflects Geisinger's position that governance of a deployed clinical agent includes governance of updates to the agent — not simply the initial deployment decision.
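The two clocks in the addendum, five business days to notify and up to 30 days of review hold, can be worked through with Python's standard `datetime` module. This is a sketch under assumptions the source does not settle: business days are taken as Monday to Friday with no holiday calendar, the 30 days are treated as calendar days, and the hold is counted from the date Geisinger triggers it. The function names and dates are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance `start` by `days` weekdays (Mon-Fri), ignoring public holidays."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def update_timeline(internal_release: date, hold_triggered_on: date) -> tuple[date, date]:
    """Return (latest notification date, latest date the hold can run to)."""
    notify_by = add_business_days(internal_release, 5)
    hold_expires = hold_triggered_on + timedelta(days=30)  # assumed calendar days
    return notify_by, hold_expires


# Hypothetical example: vendor releases internally on Friday 1 March 2024,
# and Geisinger triggers the hold on the day the notification is due.
notify_by, hold_expires = update_timeline(date(2024, 3, 1), date(2024, 3, 8))
print(notify_by, hold_expires)  # 2024-03-08 2024-04-07
```

Under these assumptions, the longest gap between a vendor's internal release and the update reaching production without CAIOB clearance is a little over five weeks.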

Primary care first, specialty by invitation

Geisinger's rollout sequence is deliberate and asymmetric. Primary care went first. Specialty departments (cardiology, oncology, nephrology) are not in the current live deployment; they are in a parallel evaluation programme operating under what CAIOB calls the specialty access protocol, a distinct evaluation pathway with higher validation thresholds and additional specialty-society input. Montoya's rationale for this sequencing is specific: primary care volume is large enough to generate statistically meaningful performance data quickly, and the consequences of an agent error in primary care are, in most cases, remediable. A missed differential in a primary care visit that is caught at follow-up is a documentation failure. A missed differential in an oncology visit that delays a cancer diagnosis is a different category of harm. The specialty access protocol requires that candidates complete primary care validation first, then undergo a separate 180-day specialty-specific evaluation before clearing for deployment in clinical specialties.

The primary care network where agents are now live covers 62 outpatient clinics across Geisinger's central Pennsylvania and northeastern Pennsylvania catchment. The four capabilities that cleared the February 2024 governance review are: ambient documentation (visit note generation from physician-patient conversation), care-gap identification (flagging overdue preventive services based on population health data), medication adherence summarisation (surfacing pharmacy refill patterns and flagging likely non-adherence), and chronic disease management prompts (structured reminders for HbA1c monitoring, blood pressure targets, and CKD staging in patients with qualifying diagnoses). All four are advisory. None can modify the EHR record without explicit clinician action. None can generate a referral, order a test, or prescribe.

The care-gap identification capability is the most operationally significant. Geisinger's population health programme has tracked preventive care gaps for years using its existing HealtheRegistries infrastructure. The agent formalises that tracking into a real-time clinical encounter prompt: when a patient matching a care-gap criterion presents for any visit type, the agent surfaces the gap within the Epic encounter interface. In the first six weeks of live deployment, the capability generated 14,200 care-gap prompts across 62 clinics. Clinicians acted on 61 per cent of those prompts within the same encounter. The historical baseline at Geisinger's primary care sites, using the prior passive registry approach, was a 34 per cent same-encounter action rate. That delta, 61 per cent versus 34 per cent, is not an AI story. It is a workflow integration story. The agent did not improve the medicine. It changed when the information reached the physician.
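The arithmetic behind those figures is worth making explicit. The sketch below uses only the three reported numbers (14,200 prompts, 61 per cent, 34 per cent). Because the percentages are presumably rounded, the resulting counts are approximate, and applying the historical rate to the same prompt volume is a hypothetical comparison rather than a reported result. The function and key names are illustrative.

```python
def care_gap_summary(prompts: int, agent_pct: int, baseline_pct: int) -> dict:
    """Derive approximate counts and deltas from whole-number percentages."""
    acted = prompts * agent_pct // 100  # prompts acted on in the same encounter
    at_baseline = prompts * baseline_pct // 100  # hypothetical count at the old rate
    return {
        "acted": acted,
        "at_baseline": at_baseline,
        "additional_actions": acted - at_baseline,
        "percentage_point_delta": agent_pct - baseline_pct,
        "relative_increase_pct": round((agent_pct - baseline_pct) / baseline_pct * 100, 1),
    }


summary = care_gap_summary(prompts=14_200, agent_pct=61, baseline_pct=34)
print(summary)
# {'acted': 8662, 'at_baseline': 4828, 'additional_actions': 3834,
#  'percentage_point_delta': 27, 'relative_increase_pct': 79.4}
```

Read this way, the 27-percentage-point gap corresponds to roughly 3,800 additional same-encounter actions over six weeks, a relative increase of about 79 per cent over the passive-registry baseline.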

What to watch

The variables that determine whether Geisinger's governance model diffuses into mid-market health systems over the next 18 months are concrete and trackable.

Whether the specialty access protocol clears its first candidate capability in cardiology before year-end 2024 — cardiology is the highest-volume specialty at Geisinger and the deployment that would generate the clearest performance data for a rural population with elevated cardiovascular disease burden.
Whether CAIOB formalises the rural-access annotation as a published standard that other rural and community health systems can adopt — the concept is transferable but currently exists only as internal Geisinger documentation, which limits its influence on peer institutions.
Whether Suki's contract with Geisinger serves as a template for governance addenda in clinical AI vendor agreements more broadly. The five-day model-update notification and 30-day review right are not standard in the industry, but several health system legal teams are understood to be studying the Geisinger contract structure.
Whether the MyCode biobank's longitudinal data is used to build Geisinger-specific fine-tuned validation sets for the specialty access protocol — this would represent a significant competitive advantage in clinical evaluation that no commercial vendor can replicate and no peer institution without a comparable biobank can access.
Whether the care-gap closure data — 61 per cent same-encounter action rate versus a 34 per cent historical baseline — is published in a peer-reviewed clinical informatics journal, which would establish Geisinger as the reference institution for primary care agent deployment outcomes in rural populations and accelerate the specialty rollout timeline by providing external validation for the CAIOB's internal performance claims.

Frequently asked

What is the Clinical Artificial Intelligence Oversight Board and who sits on it?: CAIOB is Geisinger's standing governance body for clinical AI deployment, chaired by Dr. Rafael Montoya, the system's Chief Medical Information Officer. Membership includes senior representatives from clinical informatics, legal and compliance, quality improvement, and three patient advocates drawn from the MyCode Community Health Initiative participant advisory council. The board holds veto authority over all new capability deployments and over vendor-initiated model updates to live capabilities.
Why did Geisinger deploy in primary care before specialty medicine?: Volume and harm gradient. Primary care generates sufficient interaction volume to produce statistically valid performance data quickly, and the consequence of an agent error at primary care — typically remediable at follow-up — is categorically different from an error in oncology or cardiology where delayed action produces irreversible harm. The specialty access protocol requires primary care validation as a prerequisite and adds 180 days of specialty-specific evaluation before any capability clears for deployment in a clinical specialty.
What is the rural-access annotation and how does it change agent behaviour?: The rural-access annotation is a deployment-context flag attached to each approved capability specifying how outputs should be interpreted when specialist escalation is logistically constrained. It does not change the agent's underlying output but modifies the workflow prompt that accompanies the output, indicating, for example, that a recommendation to consult a subspecialist should be accompanied by Geisinger's telehealth consult pathway rather than an in-person referral. The annotation was developed in response to the 78-minute median drive time to tertiary care for patients in Geisinger's northern catchment.
How does Geisinger's governance addendum in vendor contracts work in practice?: The addendum requires vendors to notify Geisinger within five business days of any model update that affects deployed capabilities. Geisinger may then trigger a 30-day review hold, during which the update is not applied to production systems. The CAIOB designee on the change-control board must clear the update before it proceeds. Vendors that miss the five-day notification window are in breach of contract and subject to a remediation process that includes a mandatory post-incident review. This structure gives Geisinger's governance committee effective control over the deployed agent's behaviour between formal contract renewal cycles.
Can smaller rural health systems replicate Geisinger's governance model without MyCode infrastructure?: The governance structure — CAIOB composition, the 45-day consent review cycle, the specialty access protocol sequencing, the vendor addendum language — is replicable without a biobank. What smaller systems cannot replicate is the population-stratified validation case set that Geisinger builds using MyCode data. That validation asset is the most significant competitive barrier: it allows Geisinger to test agent performance against its specific patient population in ways that no externally procured benchmark dataset can match. Rural systems without comparable longitudinal data will need to rely on synthetic case sets or consortium-based data-sharing arrangements, both of which introduce fidelity tradeoffs that Geisinger's approach does not.

The governance floor rises

Geisinger's February 2024 clearance memo is a data point in a pattern that is consolidating across US health systems at different scales and contexts. The pattern is not uniformity — Mayo's CARS framework and Geisinger's CAIOB structure share principles but differ substantially in implementation detail. The pattern is the emergence of a governance floor: a minimum set of commitments — standing oversight committee, stratified validation, continuous monitoring, vendor update control, audit trail to clinical encounter level — below which a serious clinical AI deployment cannot credibly operate. Eighteen months ago, no such floor existed in practice. Today, the institutions that have not built toward it are falling behind a benchmark that is no longer hypothetical.

For rural and community health systems, Geisinger's deployment carries a particular signal. The academic medical centre deployments (Mayo, Mass General Brigham, Stanford Medicine) are visible and well covered, but they operate with research infrastructure, regulatory relationships, and informatics staff that most US hospitals do not have. Geisinger is closer to the median health system in resources than it is to the academic outliers. If its governance model diffuses, whether through published frameworks, consortium adoption, or vendor contract standardisation, the institutions that benefit most will be the ones currently watching from the outside, without a Mayo-scale playbook to follow. That is the majority of American healthcare. The governance notes from Danville, Pennsylvania, matter precisely because they were not written in Rochester.