
March 16, 2026
by Jeff Craven

Experts propose new framework for overseeing genAI health tools

Generative artificial intelligence (AI) tools in clinical practice should have an oversight framework that resembles how human clinicians are trained and certified, with continuous learning, regular assessments, and supervised practice, experts argued in a recent Viewpoint article published in JAMA Health Forum.
 
Medical device frameworks as they exist today “are ill equipped” to evaluate safety and efficacy of generative AI tools in clinical practice, Bakul Patel, MSEE, MBA, senior director of global digital health strategy and regulatory at Google; and David Blumenthal, MD, MPP, of the Harvard T.H. Chan School of Public Health in Boston, said. Further, it would be infeasible to evaluate every possible interaction with a generative AI tool for premarket approval, and a premarket assessment would not be able to capture all the postmarketing behavior of the tool, they noted.
 
“We suggest overseeing clinical [generative AI] through a system analogous to the preparation, evaluation, and lifelong professional oversight of human clinicians,” Patel and Blumenthal said. “This is not a regulatory scheme but an oversight mechanism.”
 
The framework would consider generative AI “as a novel form of intelligence, not a traditional medical device, and would be modeled on societal approaches to assuring the competency of human intelligence applied to health care,” they said.
 
While physician certifications would not be a “literal blueprint” for oversight of generative AI in clinical practice, the framework “should enable the technology to meet standards of performance that human clinicians must meet during the course of their training for independent practice,” they explained.
 
In preparing generative AI for clinical practice, the AI tool could meet the same standards as a human clinician through traditional certification examinations, receive supervision by expert clinicians similar to residency programs, and undergo retraining and certification over time through continuous learning, Patel and Blumenthal said.
 
Before implementation of the framework, stakeholders will have to consider factors such as extensive research and development needs, the funding of the oversight and certification process through public-private partnerships, and the development of organizations with the medical and technical competence to train generative AI tools.
 
There would also be the question of what entity is responsible for certification, continued evaluation, and oversight of clinical generative AI tools. This could be performed by the US Food and Drug Administration (FDA), but could also be the domain of a theoretical public-private national review board for clinical generative AI that has the authority to certify or decertify clinical generative AI. (RELATED: Generative AI: FDA adcomm makes recommendations on postmarket performance of medical devices, Regulatory Focus 22 November 2024; RELATED: FDA questions on genAI-enabled chatbots raise concerns from expert panel, Regulatory Focus 10 November 2025)
 
“The novel characteristics of clinical [generative AI] require creative new approaches to assuring that it is safe, effective, and trustworthy when applied to real-world clinical problems. Any such approach would require considerable developmental work and collaboration between diverse private and public stakeholders,” Patel and Blumenthal said. “Nevertheless, the need for oversight of clinical [generative AI] is urgent because these technologies increasingly affect the daily work and lives of clinicians and patients.”
 
A shift in thinking about generative AI
 
Chevon Rariy, MD, a member of the FDA Digital Health Advisory Committee and chief clinical innovation officer at Visana Health, told Focus that the proposal by Patel and Blumenthal “reflects an important shift in how we think about the role and use of AI in clinical care, and particularly generative AI which evolves over time and produces new information dynamically the more it is used.”
 
While traditional regulatory frameworks that evaluate AI consider fixed tasks, generative AI is different, Rariy explained. As generative AI learns, changes, and adapts, and as the tool is used across different clinical contexts, there will be a need for ongoing monitoring and oversight.
 
“Treating oversight more like clinician training, with structured evaluation, supervised use, and ongoing recertification, acknowledges that these tools behave more like learning systems rather than static devices,” she said. “That kind of continuous evaluation may ultimately be necessary to ensure these technologies remain safe, effective, and clinically relevant over time.”
 
The implementation of an oversight system for generative AI “would require meaningful coordination across all stakeholders, along with rigorous clinical validation across diverse populations, geographies, and care settings, with particular consideration for settings beyond academic medical centers, including community-based sites, home-based care, and virtual care environments,” Rariy said.
 
“The challenge will be building infrastructure that allows for continuous monitoring and improvement without slowing innovation. Given the pace at which generative AI is entering healthcare, developing new oversight approaches is an urgent priority we should be actively exploring,” she added.
 
Rariy stressed that human oversight will remain important, and that the technology does not replace “sound clinical judgment” or years of medical training, education, and experience.
 
“However, physician oversight alone is not enough. Today’s clinicians are already burnt out and stretched thin; asking them to audit every output, bias, and inaccuracy from an AI system is not realistic or sustainable long term,” she said. “That’s why strong governance, transparency, and continuous validation remain essential, even when a clinician is involved in the decision-making process.”
 
Rariy also noted that a generative AI oversight framework needs to address factors such as gaps, underrepresentation, and biases in historical research. “Without that foundation, poorly designed AI will not simply mirror existing inequities, it may amplify them at scale across diagnosis, treatment, and care delivery,” she said.
 
JAMA Health Forum, Patel et al.