For some, the phrase artificial intelligence brings up visions of the rise of the machines in the Terminator movie series, computers taking over the world in The Matrix, or countless other stories of computers developing consciousness and then evolving uncontrolled. The basic questions raised in all of these stories, however, are paramount to the discussion of how we regulate the use of AI in medicine: What happens when computers return an incorrect answer? What’s an appropriate checkpoint? How should we allow AI to influence human decision making?
In this installment of our series, we’ll provide some background and a few high-level developmental and regulatory insights regarding devices that incorporate AI, to provide guidance for any entrepreneur developing an AI software package.
In 2018, the first AI-based system in ophthalmology, the IDx-DR, designed for “the evaluation of ophthalmic images for diagnostic screening to identify retinal diseases or conditions,” was granted clearance. To date, however, there’s no specific FDA guidance document regarding AI, although the FDA does reference other reports and standard software documentation guidance in its publications.
Let’s first look at the general framework of digital health. The FDA Safety and Innovation Act (FDASIA) provides a strategic framework and recommendations regarding an appropriate risk-based method for defining the oversight of health information technology that “promotes innovation, protects patient safety, and avoids regulatory duplication.”1 The framework consists of three health IT function categories:
• administrative (e.g., admissions, claims or billing);
• health management (e.g., data capture, medication management, order entry or electronic access to records); and
• medical device (e.g., disease-related claims, clinical decision support software or mobile medical apps).
These distinctions matter for the proposed risk assessment and risk mitigation: The FDA focuses more on the areas of health IT that carry higher risk.
Evaluating Risk & Modality
AI uses techniques such as machine learning (ML) to produce intelligent behavior. A system incorporating ML has the capacity to learn, based on being trained to accomplish a specific task and tracking performance measures. ML can be used to design and train software algorithms that learn from data and then act on what they’ve learned. AI/ML-based software, when used to treat, diagnose, cure, mitigate or prevent disease or other conditions, is considered a medical device by the FDA; it’s referred to as “software as a medical device” or SaMD.2
The FDA references a risk system for SaMD that was created by the International Medical Device Regulators Forum (IMDRF), based on the risk to the patient. It asks questions such as: Is there a valid clinical association with the SaMD? Does the SaMD correctly process input data to generate output data? Does use of the SaMD output achieve the intended purpose in the target patient population, in the context of clinical care?
The IMDRF risk assessment also focuses on the activities needed to validate an SaMD. The assessment is based on two things: the state of the health-care situation or condition (i.e., critical, serious or non-serious); and the purpose of the information provided by the SaMD (to treat or diagnose, to drive clinical management, or to inform clinical management). Together, these two dimensions result in a risk score ranging from I to IV. (The IMDRF report doesn’t provide specific recommendations regarding regulatory oversight, but focuses more on the level of risk as it relates to independent review.)
In addition to being assessed on a spectrum of risk to patients, SaMD is also assessed regarding where it lies on a spectrum from “locked” to “continuous learning.” Locked programs give the same result each time the same data is input. A continuous learning program, however, changes its behavior over time based on learning, so for any given set of input data, the output may change as learning progresses. This ability to adapt over time is an advantage, but it raises the question: How should algorithm changes be reviewed at the FDA prior to marketing?
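The difference between the two ends of that spectrum can be illustrated with a toy example. The sketch below is not a real diagnostic algorithm: it assumes a single numeric image score and a simple referral threshold, and the class names and the update rule are hypothetical. It only shows that a locked program answers the same input the same way every time, while a continuously learning one may not.

```python
class LockedClassifier:
    """Threshold is fixed at release; the same input always gives the same output."""

    def __init__(self, threshold: float):
        self.threshold = threshold

    def predict(self, score: float) -> bool:
        return score >= self.threshold


class ContinuousLearningClassifier:
    """Threshold shifts as labeled cases arrive, so outputs can change over time."""

    def __init__(self, threshold: float, step: float = 0.1):
        self.threshold = threshold
        self.step = step

    def predict(self, score: float) -> bool:
        return score >= self.threshold

    def update(self, score: float, has_disease: bool) -> None:
        """Adjust the threshold only when the current prediction was wrong."""
        predicted = self.predict(score)
        if predicted and not has_disease:
            self.threshold += self.step  # false positive: become stricter
        elif not predicted and has_disease:
            self.threshold -= self.step  # false negative: become more lenient


locked = LockedClassifier(threshold=0.5)
adaptive = ContinuousLearningClassifier(threshold=0.5, step=0.1)

before = adaptive.predict(0.55)
adaptive.update(0.55, has_disease=False)  # feedback: that case was a false positive
after = adaptive.predict(0.55)

print("locked:", locked.predict(0.55), locked.predict(0.55))
print("adaptive before/after feedback:", before, after)
```

In this toy case the adaptive program flips its answer for the same score after a single piece of feedback, which is exactly the behavior a reviewer can’t evaluate once and for all at the time of submission.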
Meeting FDA Criteria
The FDA is now looking at a “total product lifecycle” (TPLC) approach for facilitating rapid cycles of product improvement. On April 29, 2019, the FDA published a discussion paper titled Proposed Regulatory Framework for Modifications to Artificial Intelligence (AI/ML)-Based Software as a Medical Device (SaMD) that describes this in more detail, as well as explaining the agency’s thoughts on its potential approach for premarket review.3
This paper proposes that applications for approval include a predetermined change-control plan, with anticipated modifications and methodologies for implementing these changes in a controlled manner, as well as a process for ongoing evaluation and monitoring to ensure safety and quality. (Note: This proposed regulatory approach would apply to only those AI/ML-based SaMD that require premarket submission—not those that are exempt from requiring premarket review [i.e., Class I exempt and Class II exempt].)
The FDA’s proposed process for a TPLC is based on two principles: 1) quality systems and good machine learning practices (GMLP) that demonstrate analytical and clinical validation; and 2) an initial premarket assurance of safety and effectiveness.
When submitting a system to the FDA, the manufacturer should develop a predetermined change-control plan that lists the types of changes it plans to make while the SaMD is in use (SaMD pre-specifications, or SPS). The plan should also specify methods for controlling the risks of those changes, described in a step-by-step algorithm-change protocol (ACP). The extent to which a pre-approval of SPS and ACP can be relied on for future changes depends on multiple factors:
• changes that involve performance (assuming that performance is improved or at least maintained over time);
• changes in intended use that increase the system’s IMDRF risk level, as well as certain changes related to intended use with a specific patient population that evolve with new data;
• an approach for modification after the initial review with an established SPS and ACP. Modifications may lead to a submission of a new 510(k), or be documented for reference; and
• transparency and real-world performance monitoring of AI/ML-based SaMD. This includes transparency to the patients, users and doctors, as well as the FDA.
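One way to picture how an SPS and ACP might be used internally is as a triage step that each proposed modification passes through. The Python sketch below is a deliberate simplification under our own assumptions, not the FDA’s decision logic: every class, field and return string is hypothetical, and a real ACP would cover data management, retraining, evaluation and update procedures in far more detail.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChangeControlPlan:
    """Hypothetical, simplified stand-in for a reviewed SPS plus ACP."""
    cleared_intended_use: str
    sps_change_types: frozenset   # kinds of changes anticipated in the SPS
    baseline_sensitivity: float   # performance at the time of review
    baseline_specificity: float


@dataclass(frozen=True)
class ProposedChange:
    change_type: str
    intended_use: str
    sensitivity: float            # measured by following the ACP test steps
    specificity: float


def triage_change(plan: ChangeControlPlan, change: ProposedChange) -> str:
    """Sort a proposed modification into one of three illustrative buckets."""
    if change.intended_use != plan.cleared_intended_use:
        return "outside plan: discuss new submission"
    if change.change_type not in plan.sps_change_types:
        return "outside plan: discuss new submission"
    if (change.sensitivity < plan.baseline_sensitivity
            or change.specificity < plan.baseline_specificity):
        return "hold: performance not maintained"
    return "within plan: document for reference"


plan = ChangeControlPlan(
    cleared_intended_use="diagnostic screening for retinal disease",
    sps_change_types=frozenset({"retrain on new images", "add camera model"}),
    baseline_sensitivity=0.87,    # illustrative numbers only
    baseline_specificity=0.90,
)
retrain = ProposedChange(
    change_type="retrain on new images",
    intended_use="diagnostic screening for retinal disease",
    sensitivity=0.89,
    specificity=0.91,
)
print(triage_change(plan, retrain))
```

The order of the checks mirrors the factors listed above: a change in intended use is examined first, then whether the change was anticipated, and only then whether performance has been maintained.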
Another key element of the regulatory landscape the entrepreneur should continue to watch is the special controls that may appear as the agency grants de novo classifications, which can then serve as predicates for later 510(k) submissions. These special controls may include important requirements for future products with similar indications for use.
Approaching the Agency
With this background in mind, and some of the many abbreviations defined, where should the budding entrepreneur with an idea that incorporates AI start? Let’s walk through some of the steps when planning for the initial regulatory interactions.
Your pre-submission package should contain the following key elements:
• Device description. Be as specific as possible, including flow diagrams.
• Indications for use. Consider other examples, and address both the target patient population and user in your full statement. Some key words that you might consider as indications include telehealth (e.g., storing, managing, displaying and enhancing images); screening (diagnostic screening and detection); and diagnosis (identification of disease and/or severity, and/or providing treatment suggestions).
Be sure to describe the target user in your package. Possibilities might include health professionals such as primary care doctors or endocrinologists; eye-care professionals (MD or OD); general technicians (those with limited, specialized training who work in health care); and/or trained ophthalmic technicians, such as those with specialized training in fundus imaging or fluorescein angiography.
• Proposed performance data. Most devices of this type are Class II. Be specific about inclusion/exclusion criteria in your protocol, planned statistical analysis and endpoints. For example, if you’re looking for an indication of screening, you’ll probably need an all-comers study, with the patient population not defined too narrowly.
• Proposed device output. Representative outputs are helpful, including any different options that are possible.
General Strategies for Success
To increase the likelihood of a successful submission:
• Start regulatory considerations early. As soon as you have a solid device description and understand who your target user is, plan for a pre-submission.
• Given the complexity of this type of approval, you may need to plan on multiple interactions with the FDA prior to collecting your performance data.
• Remember that study size modeling is critical.
• Beware of biases. Examine your datasets for potential biases to make sure your real-world experience will mimic your testing.
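As a rough illustration of why study size modeling matters for a screening indication, the sketch below uses a common textbook precision-based calculation: choose how tightly you want to estimate sensitivity, compute the number of diseased subjects that requires, then scale up by the expected prevalence in an all-comers population. This is only one of several possible approaches and isn’t a method prescribed by the FDA; the function names and all input values are assumptions for illustration, and a statistician should own the real calculation.

```python
import math
from statistics import NormalDist


def diseased_subjects_needed(expected_sensitivity: float,
                             half_width: float,
                             confidence: float = 0.95) -> int:
    """Diseased subjects needed to estimate sensitivity within +/- half_width.

    Uses the normal approximation n = z^2 * p * (1 - p) / d^2.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_sensitivity
    return math.ceil(z ** 2 * p * (1 - p) / half_width ** 2)


def all_comers_needed(n_diseased: int, prevalence: float) -> int:
    """Total enrollment if only a fraction `prevalence` will have the disease."""
    return math.ceil(n_diseased / prevalence)


# Illustrative inputs only: 85% expected sensitivity, +/- 5 points, 10% prevalence.
n_diseased = diseased_subjects_needed(expected_sensitivity=0.85, half_width=0.05)
n_total = all_comers_needed(n_diseased, prevalence=0.10)
print("diseased subjects:", n_diseased)
print("approximate all-comers enrollment:", n_total)
```

Because the total scales inversely with prevalence, a low-prevalence screening population can push enrollment into the thousands, which is one reason to settle the study design with the agency before collecting data.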
Mr. Chapin is senior vice president of Corporate Development at Ora, which offers device and drug consulting, as well as clinical R&D. Mr. Bouchard is vice president for medical devices at Ora. The authors welcome your comments or questions regarding product development. Send correspondence to firstname.lastname@example.org or email@example.com, or visit www.oraclinical.com
1. FDASIA health IT report. April 2014. https://www.fda.gov/media/87886/download
2. Software as a medical device (SaMD): Clinical evaluation. IMDRF. Sept 21, 2017. http://www.imdrf.org/docs/imdrf/final/technical/imdrf-tech-170921-samd-n41-clinical-evaluation_1.pdf
3. Proposed regulatory framework for modifications to artificial intelligence (AI/ML)-based software as a medical device (SaMD). FDA. April 2019. https://www.fda.gov/media/122535/download