
AI News

13 Nov 2025

18 min read

FDA regulation of AI medical devices: How to ensure safety

FDA regulation of AI medical devices guides approval and postmarket checks to keep patients safe.

AI is moving from labs to clinics fast. Artificial intelligence already reads scans, flags heart rhythm problems, and guides diabetes care. Most tools use predictive models, while researchers explore generative models for images and text. The United States now lists more than 1,250 authorized AI-enabled devices.

FDA regulation of AI medical devices uses a risk-based system, clear labeling, and ongoing monitoring to protect patients while allowing innovation. This guide explains what the FDA reviews, how changes get approved, what happens after launch, and what builders and hospitals should do now. With this rapid growth, developers and clinicians need a clear picture of how oversight works, where it is strong, and where it needs to improve.

FDA regulation of AI medical devices: The basics

What makes software a regulated device

Software becomes a medical device when its intended use is to diagnose, cure, treat, mitigate, or prevent disease. If the software only schedules visits or transcribes notes, it is usually not a device. But if it suggests a diagnosis or a treatment, it can fall under the FDA’s rules.

In low-risk cases, the FDA may use “enforcement discretion.” That means the tool could be a device under the law, but the agency does not expect a formal submission. This often applies to general wellness tools, like step tracking or medication reminders. When in doubt, developers should check the FDA’s Digital Health Policy Navigator and seek regulatory advice.

Clinical decision support (CDS) sits in a gray zone. A 2016 law narrowed what the FDA regulates. CDS can fall outside the device definition if it only supports a clinician rather than replacing one, and if the clinician can understand the logic behind the advice. The FDA’s 2022 guidance explains this boundary. It warns that tools used in time-sensitive settings, or that rely on complex, opaque algorithms across many data types, may be devices. Some experts think the line is drawn too narrowly and may slow useful CDS features inside electronic health records.

Two software types: SaMD and SiMD

Software as a Medical Device (SaMD) is stand-alone software used for a medical purpose. It runs on a phone, a desktop, or in the cloud. Examples include AI that enhances images, measures lesions, or detects hidden patterns in ECG data.

Software in a Medical Device (SiMD) sits inside a hardware device. It drives the device’s function. Examples include handheld ultrasound with built-in AI that guides image capture, or apps that pull data from glucose sensors to coach patients.

Risk classes and premarket pathways

The FDA uses three risk classes:
  • Class I: Low risk (for example, tongue depressors)
  • Class II: Moderate risk (many AI tools fit here)
  • Class III: High risk (life-sustaining or high-harm potential)
The review path depends on risk and whether a similar device already exists.

The three main pathways

  • 510(k) Clearance: For devices “substantially equivalent” to a marketed predicate. It is common, fast, and often does not need clinical trials. Many AI tools clear this way by proving similar safety and performance to an existing product.
  • De Novo: For novel devices with no predicate, but low to moderate risk. It creates a new device type and sets special controls.
  • Premarket Approval (PMA): For Class III, high-risk devices. It requires strong clinical evidence and deep review.

A simple 510(k) example

  • Build: A startup trains an AI model to spot lung nodules on CT scans.
  • Find a predicate: The team identifies an already cleared image-analysis tool with a similar use.
  • Submit: The company shows performance data, validation studies, and explains differences.
  • Review: The FDA asks questions and may push back on claims that go beyond the evidence.
  • Clearance: If equivalent, the device can be marketed. Clearance does not prove clinical benefit; it confirms safety and effectiveness relative to the predicate.

Managing change: Predetermined Change Control Plans (PCCPs)

AI does not stand still. Models update. Data shifts. To handle this, the FDA uses Predetermined Change Control Plans. A PCCP is submitted with the original application. It spells out what changes are allowed later without a new review, as long as the device’s intended use and safety are maintained or improved. A strong PCCP covers three parts:
  • Description of Modifications: What will change, how often, and which components.
  • Modification Protocol: Data practices, retraining steps, testing plans, and how users will be informed. Each change must have a clear verification and validation plan.
  • Impact Assessment: Benefits, risks, mitigations, and how issues will be handled.

PCCPs improve speed and transparency. They also put a duty on makers to track versions, document updates, and communicate them. For adaptive and generative models, this is vital. It helps clinicians understand what changed and why it still works for their patients.
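
To make the three PCCP elements concrete, here is a minimal sketch of how a manufacturer might log a model update and gate its release against pre-specified performance criteria. It is illustrative only, not an FDA template; the field names, the 0.90 baseline AUC, and the subgroup tolerance are assumptions invented for this example.

```python
from dataclasses import dataclass, field

# Illustrative PCCP-style update record; field names and thresholds are
# assumptions for this example, not an FDA template.

@dataclass
class ModelUpdateRecord:
    version: str
    description: str            # Description of Modifications: what changed and why
    retraining_data: str        # Modification Protocol: data used for retraining
    validation_auc: float       # Modification Protocol: result of the pre-specified test plan
    subgroup_auc: dict          # performance broken out by subgroup
    risks_and_mitigations: list = field(default_factory=list)  # Impact Assessment
    user_notification: str = "release notes pushed to deployed sites"

def within_pccp(update: ModelUpdateRecord, baseline_auc: float = 0.90) -> bool:
    """Gate check: ship without a new submission only if overall and
    subgroup performance are maintained or improved versus the baseline."""
    overall_ok = update.validation_auc >= baseline_auc
    subgroups_ok = all(auc >= baseline_auc - 0.02 for auc in update.subgroup_auc.values())
    return overall_ok and subgroups_ok

update = ModelUpdateRecord(
    version="2.1.0",
    description="Retrained nodule detector on additional CT studies",
    retraining_data="de-identified studies from three partner sites",
    validation_auc=0.93,
    subgroup_auc={"age<50": 0.92, "age>=50": 0.93},
    risks_and_mitigations=["slightly higher false positives; thresholds re-tuned"],
)
print("Ship under PCCP:", within_pccp(update))
```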

Labeling, transparency, and security

Clear labeling helps safe use. The FDA expects AI devices to disclose, in plain language:
  • That the device uses AI and how the AI supports its intended use.
  • What data goes in, what the model outputs, and any system interactions.
  • Performance metrics, known risks, and possible bias sources.
  • How updates will occur if a PCCP is in place, and how performance is monitored.
  • Patient-facing instructions that are easy to read and follow.

Cybersecurity is also part of safety. The agency wants products to be “secure by design,” with threat modeling, risk assessments, update mechanisms, and a Software Bill of Materials (SBOM) to track vulnerabilities. Clear security guidance in labeling helps hospitals manage patches and configurations.
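
An SBOM is typically exchanged in a machine-readable format such as CycloneDX or SPDX and generated by dedicated tooling rather than written by hand. The snippet below is a minimal, hand-rolled CycloneDX-style sketch for illustration; the component names and versions are placeholders.

```python
import json

# Minimal CycloneDX-style SBOM sketch; component names/versions are placeholders.
sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "version": 1,
    "components": [
        {"type": "library", "name": "numpy", "version": "1.26.4",
         "purl": "pkg:pypi/numpy@1.26.4"},
        {"type": "library", "name": "onnxruntime", "version": "1.18.0",
         "purl": "pkg:pypi/onnxruntime@1.18.0"},
        {"type": "application", "name": "nodule-detector-model", "version": "2.1.0"},
    ],
}

with open("sbom.json", "w") as f:
    json.dump(sbom, f, indent=2)
print("Wrote sbom.json with", len(sbom["components"]), "components")
```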

Real-world performance and postmarket oversight

Premarket review is only the start. After launch, device makers must follow quality rules, manage risk, and document changes. The FDA’s Quality System Regulation is moving to the Quality Management System Regulation in 2026 to align with the international standard ISO 13485:2016. The agency encourages collection and analysis of real-world performance, especially where a PCCP allows updates. For higher-risk devices, the FDA can order postmarket surveillance under Section 522. That may require long-term studies, especially for implants, pediatric devices, or life-supporting tools used outside the hospital. The agency also uses warning letters and recalls when performance slips or marketing exceeds the evidence.

The Medical Device Reporting (MDR) program collects adverse event reports from manufacturers, importers, and user facilities. These appear in the MAUDE database. MAUDE has limits, including underreporting and uneven quality, and it struggles with AI-specific failure modes. The FDA has discussed modernizing adverse event systems and exploring automated platforms that use AI to detect signals faster. The details and timing are still developing.

Real-world monitoring for AI is hard. Models can drift as populations, practice patterns, and devices change. Larger systems and academic centers may track performance, but many rural hospitals cannot. Experts have proposed public–private registries for higher-risk AI, shared validation studies, and periodic re-checks to catch bias and decay early. In September 2025, the FDA sought public comment on practical ways to measure and evaluate AI device performance in real use.
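
One common way to quantify the drift described above is the Population Stability Index (PSI), which compares the distribution of a model input or output score at validation time with its recent distribution in production. The sketch below is a generic illustration rather than an FDA-specified method; the synthetic data and the 0.2 alert threshold (a widely used rule of thumb) are assumptions.

```python
import numpy as np

def psi(baseline: np.ndarray, current: np.ndarray, bins: int = 10) -> float:
    """Population Stability Index between a baseline and a current sample.
    Values above ~0.2 are commonly treated as meaningful drift."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    curr_pct = np.histogram(current, bins=edges)[0] / len(current)
    # Floor the percentages to avoid division by zero and log(0)
    base_pct = np.clip(base_pct, 1e-6, None)
    curr_pct = np.clip(curr_pct, 1e-6, None)
    return float(np.sum((curr_pct - base_pct) * np.log(curr_pct / base_pct)))

rng = np.random.default_rng(0)
launch_scores = rng.normal(0.30, 0.10, 5000)   # model scores at validation time
recent_scores = rng.normal(0.38, 0.12, 5000)   # scores from recent production use
value = psi(launch_scores, recent_scores)
print(f"PSI = {value:.3f}", "-> investigate drift" if value > 0.2 else "-> stable")
```

In practice, a monitoring dashboard would track a metric like this alongside outcome measures and alert clinicians and the vendor when thresholds are crossed.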

Generative AI: new promises, new risks

Most cleared devices rely on predictive models, but interest in generative AI is growing. Researchers test synthetic medical images and multimodal models that mix text and scans. These systems can produce variable outputs and can “hallucinate,” which raises new review questions.

The FDA’s Digital Health Advisory Committee advised the agency to push for transparency on intended use, training data, and rates of errors and hallucinations, and to consider standard model cards. Members discussed independent testing, not just manufacturer-led studies. They also urged expansion of the Medical Device Development Tools program to include reference datasets and benchmarks for AI. Some commercial foundation models do not disclose their training data, which makes evaluation harder. Members compared testing AI to testing clinicians with standardized cases. They also noted differences: AI does not reason like a human and can fail in unfamiliar ways, especially when prompted outside its intended use.
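
A model card is essentially a structured summary of a model’s intended use, data, performance, and known failure modes. The fields below are a hypothetical sketch of what such a disclosure might contain for an imaging device; the FDA has not specified this format, and the values are placeholders.

```python
# Illustrative model-card fields for an AI-enabled device; contents are
# placeholders, not an FDA-specified format.
model_card = {
    "model_name": "chest-ct-nodule-detector",
    "version": "2.1.0",
    "intended_use": "Flag suspected lung nodules on chest CT for radiologist review",
    "out_of_scope_uses": ["pediatric imaging", "non-contrast abdominal CT"],
    "training_data_summary": "de-identified adult chest CT studies from three US sites",
    "performance": {"sensitivity": 0.91, "specificity": 0.88},
    "known_failure_modes": ["motion artifacts", "rare nodule morphologies"],
    "hallucination_rate": None,  # relevant for generative components, if any
    "last_updated": "2025-11-13",
}

for key, value in model_card.items():
    print(f"{key}: {value}")
```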

The FDA’s capacity and its own AI tools

Regulators face heavy workloads and fast-changing tech. As of late 2025, FDA staffing was down by roughly 2,500 positions from 2023. Some lawmakers have asked whether AI could help with tasks like scanning adverse event reports.

The FDA has begun to use AI internally. In 2025, it introduced “Elsa,” a Claude-powered chatbot to help staff read, write, and summarize documents. Leaders say it could speed scientific reviews. Critics ask how it might shape decisions and how courts would view choices made with AI support. The agency also launched cross-agency councils for external AI policy and internal AI use, and the federal AI Action Plan calls for evaluation standards, sandboxes, and stronger national benchmarks led by groups like NIST.

Good Machine Learning Practice and lifecycle thinking

Two ideas anchor safe AI:
  • Total Product Life Cycle (TPLC): Plan for safety and effectiveness from design to development to deployment and postmarket monitoring. This matters most for models that evolve after launch.
  • Good Machine Learning Practice (GMLP): Ten guiding principles developed with international partners. They stress representative data, sound engineering, thoughtful human-AI interaction, transparency, and continuous monitoring.

These principles show up in PCCPs, labeling, and expectations for postmarket vigilance. The FDA also works with global regulators through the International Medical Device Regulators Forum to align on change control, validation, and labeling, which helps reduce friction across borders.

How builders can succeed under current rules

Design and data

  • Define intended use and indications clearly. Avoid vague claims.
  • Use diverse, representative data. Check performance across subgroups by age, sex, race, and comorbidities (see the sketch after this list).
  • Document data sources and preprocessing. Track versions and data lineage.
  • Plan for drift. Build dashboards to monitor key metrics after launch.
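
The subgroup check mentioned above can be as simple as computing sensitivity and specificity per group and flagging large gaps for review. The sketch below uses synthetic data; the column names, age groups, and the 0.05 gap threshold are assumptions chosen for illustration.

```python
import pandas as pd

# Synthetic example of a per-subgroup sensitivity/specificity check.
# Column names, groups, and the 0.05 gap threshold are illustrative assumptions.
df = pd.DataFrame({
    "subgroup": ["<50", "<50", "<50", ">=50", ">=50", ">=50", ">=50", "<50"],
    "label":    [1, 0, 1, 1, 0, 1, 0, 0],   # ground truth
    "pred":     [1, 0, 0, 1, 0, 1, 1, 0],   # model output
})

def rates(g: pd.DataFrame) -> pd.Series:
    tp = ((g.label == 1) & (g.pred == 1)).sum()
    fn = ((g.label == 1) & (g.pred == 0)).sum()
    tn = ((g.label == 0) & (g.pred == 0)).sum()
    fp = ((g.label == 0) & (g.pred == 1)).sum()
    return pd.Series({
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    })

by_group = df.groupby("subgroup").apply(rates)
print(by_group)

# Flag any metric whose gap between subgroups exceeds 0.05 for review
gaps = by_group.max() - by_group.min()
print("Review needed:", list(gaps[gaps > 0.05].index))
```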

Evidence and usability

  • Align evidence with risk. For 510(k), show equivalence. For De Novo or PMA, plan robust clinical studies.
  • Test usability. Make workflows clear for clinicians and easy for patients.
  • Stress-test edge cases and time-sensitive settings. Be honest about limits.

PCCPs, transparency, and security

  • Write a tight PCCP with clear modifications, methods, and impact controls.
  • Prepare high-quality labeling, including performance, bias risks, and update plans.
  • Ship with an SBOM and secure-by-design controls. Make patching simple for customers.

What hospitals and clinics should ask before buying AI

  • Intended use: Does the device’s cleared labeling match our use case and patients?
  • Performance: Do we have subgroup results for our population? Are claims supported by peer-reviewed data?
  • Updates: Is there a PCCP? How will we be notified of changes and new risks?
  • Monitoring: Does the vendor offer real-world performance dashboards? Can we export logs for audits?
  • Security: Is there an SBOM? How are vulnerabilities tracked and patched?
  • Contracts: Do we have rights to audit performance, pause updates, and report issues? How is data protected and who owns it?

Policy gaps to watch next

  • Defining intended use for adaptive and generative models that can be repurposed easily.
  • Setting shared evaluation datasets and benchmarks for common specialties.
  • Standing up practical, privacy-protective registries for higher-risk AI in the real world.
  • Clarifying CDS boundaries so helpful decision support is not chilled, while risky features are reviewed.
  • Improving adverse event reporting so AI-specific failures are easier to spot and fix.
  • Helping smaller and rural providers monitor performance without heavy costs.
  • Balancing transparency with intellectual property when training data disclosures are requested.

Understanding FDA regulation of AI medical devices helps innovators, clinicians, and patients pull in the same direction: safer care and faster gains. The core tools already exist: risk classes and review pathways, PCCPs that manage updates, clear labeling, and postmarket checks. The next step is to scale real-world monitoring, build trusted evaluations for generative AI, and keep rules simple enough for broad adoption while strong enough to protect patients.

The path forward is not about choosing innovation or safety. It is about doing both well, every day, across the full product life cycle. With smart design, honest evidence, and steady oversight, FDA regulation of AI medical devices can support better outcomes and build long-term trust. (Source: https://bipartisanpolicy.org/issue-brief/fda-oversight-understanding-the-regulation-of-health-ai-tools/)

FAQ

Q: What determines whether healthcare software is regulated as a medical device?
A: Software becomes a medical device when its intended use is to diagnose, cure, treat, mitigate, or prevent disease, which is a core threshold in FDA regulation of AI medical devices. Administrative tools like scheduling or transcription usually fall outside that scope unless the software makes diagnostic or treatment recommendations.

Q: What’s the difference between Software as a Medical Device (SaMD) and Software in a Medical Device (SiMD)?
A: SaMD is standalone software intended for a medical purpose that runs on phones, desktops, or the cloud, while SiMD is embedded in or drives a physical medical device. Examples include image-analysis apps as SaMD and handheld ultrasound systems with built-in AI as SiMD.

Q: How does the FDA classify risk and which premarket pathways apply to AI-enabled devices?
A: The FDA uses three risk classes, Class I (low), Class II (moderate), and Class III (high), and applies greater review rigor to higher-risk devices, with many AI tools falling into Class II. Depending on risk and whether a predicate exists, devices can go through 510(k) clearance, De Novo classification, or Premarket Approval (PMA), which are the primary pathways under FDA regulation of AI medical devices.

Q: What are Predetermined Change Control Plans (PCCPs) and how do they help manage AI updates?
A: PCCPs are plans submitted with an initial application that specify which algorithm changes a manufacturer may make postmarket without a new FDA review if safety and intended use are maintained. A strong PCCP includes a description of modifications, a modification protocol with verification and validation steps, and an impact assessment describing benefits, risks, and mitigations.

Q: What labeling and security information must manufacturers provide for AI-enabled devices?
A: The FDA expects labeling to state that a device uses AI and to explain in plain language how the model supports its intended use, what data are inputs and outputs, performance measures, known risks or biases, and how updates will occur if a PCCP exists. Manufacturers must also demonstrate “secure by design” practices, supply a Software Bill of Materials (SBOM), and include threat modeling and update mechanisms to manage cybersecurity risks.

Q: How does the FDA monitor real-world performance, and what limitations exist in postmarket oversight?
A: Postmarket oversight relies on quality regulations, adverse event reporting through the MDR program and MAUDE, and targeted surveillance under Section 522 for higher-risk products, while the agency is updating its rules to the Quality Management System Regulation in 2026. Limitations include MAUDE’s underreporting and variable data quality, challenges tracking model drift and real-world performance across diverse systems, and uneven monitoring capacity among smaller or rural providers.

Q: What regulatory concerns does generative AI introduce for medical devices?
A: Generative AI can produce variable outputs and “hallucinations,” which raises questions about intended use, error rates, and transparency of training data that some commercial models do not disclose. The FDA’s advisory committee has recommended disclosures such as model cards, independent testing, and expanded reference datasets and benchmarks to evaluate generative systems.

Q: What practical questions should hospitals ask vendors before deploying an AI-enabled device?
A: Hospitals should confirm that the device’s cleared intended use matches their clinical setting, request subgroup performance data, and verify whether a PCCP and vendor-provided real-world performance dashboards exist. They should also review security documentation like an SBOM, contract rights for audits and update controls, and how patient data are protected and owned.
