Therapeutic products are now being developed that target particular molecular lesions found in various types of cancers. The ability to correctly identify patients whose cancers have targetable lesions generally depends on a well-validated diagnostic test. Development and use of diagnostic tests together with therapies in clinical trials yields the information necessary to make a regulatory determination that both products are safe and effective and likely have clinical utility when used together, so that they can reach the market for patient benefit. This model, called co-development, has been developed relatively recently, and is being put to use in numerous cancer therapeutic development programs. The U.S. Food and Drug Administration (FDA) has articulated a policy that requires the coapproval of a diagnostic with a therapeutic product when the diagnostic is essential to the safe and effective use of the therapeutic product. At the same time, FDA has implemented a number of processes to manage the model without slowing the approval of the co-developed products. New diagnostic technologies, together with a rapid uptick in interest in targeted drugs, will challenge the still-evolving regulatory paradigm. They will likely lead to some simplified approaches that present new challenges in determining safety and effectiveness, but all hold the promise of greater benefit to patients with cancer.
See all articles in this CCR Focus section, “The Precision Medicine Conundrum: Approaches to Companion Diagnostic Co-development.”
Clin Cancer Res; 20(6); 1453–7. ©2014 AACR.
It has long been recognized that certain cancer drugs work better in patients with specific tumor molecular characteristics. For example, the presence or absence of hormonal receptors (estrogen receptor and progesterone receptor) in breast cancer predicts response to endocrine therapies such as tamoxifen and aromatase inhibitors.
In the late 1990s, during the development of the drug Herceptin (trastuzumab; Roche), scientists recognized that only women whose breast tumors overexpressed the v-erb-b2 avian erythroblastic leukemia viral oncogene homolog 2 (ERBB2, also called the HER2 receptor) were benefiting from the drug. Because the drug was designed to target the protein, this was hardly surprising, yet the ability to detect how much HER2 was present, and to determine how much was necessary for response to the therapy, required a development program for a new in vitro diagnostic device to accurately and reproducibly detect and quantify HER2. In 1998, Herceptin and its companion diagnostic, HercepTest (an immunohistochemistry kit; Dako), were approved together, and each referenced the other in its product label.
Thus was born, perhaps without much recognition of what would follow, the companion diagnostic model. Over the ensuing years, the Herceptin label has changed to accommodate new tests and to treat new types of cancer, and a number of different tests using different technologies have been developed and approved for use with the drug, providing an ongoing lesson in how the U.S. Food and Drug Administration (FDA) must approach the concept of companion diagnostics: simplicity will often give way to complexity, risks and benefits will be recalculated, and regulatory models must evolve. The need for companion diagnostics, however, may be ever more critical as cancers are increasingly subdivided by their molecular characteristics, and tests become more advanced and can more finely dissect cancer characteristics.
Beginning with the Herceptin/HercepTest approval, the concept of coapproval of a therapeutic product and a companion diagnostic was exercised several times over the next 5 to 8 years, with varying levels of regulatory planning and input. Key FDA–stakeholder interactions solicited critical thinking about tests used in the context of therapeutic decision making in a series of public meetings, held beginning in 2002, in which FDA and industry discussed issues around pharmacogenetic testing and submission of exploratory data on a voluntary basis to FDA. These meetings culminated in the publication of the “Guidance for Industry: Pharmacogenomic Data Submissions” (1) and a white paper (2) laying out a vision of how co-development might work, both in 2005. These activities set the stage for defining regulatory approaches to companion diagnostics and introduced pharmaceutical and biotechnology industries to FDA's Center for Devices and Radiological Health (CDRH), which regulates medical devices, including diagnostic tests.
In 2008, as targeted therapies began to make up a significant presence on FDA's radar screen, it seemed crucial to lay out a policy defining the critical tests (now called companion diagnostics) to enable appropriate use of such therapies, and to explain the need for an approved test to support the risk–benefit decision for therapeutic product approval. In 2011, FDA published the draft guidance document “In Vitro Companion Diagnostic Devices” (3). See Text Box 1 for key draft guidance points. Since 2008, FDA has worked with both the pharmaceutical/biotechnology and diagnostic industries to ensure that companion diagnostics were approved in a timely manner along with their therapeutic partners. The program, although it has been difficult to apply perfectly across the board, has yielded some spectacular successes. Notable examples are the coapproval of Zelboraf (vemurafenib; Genentech) and its companion diagnostic, the Roche cobas 4800 BRAF V600 mutation test for metastatic melanoma, and Xalkori (crizotinib; Pfizer) and its companion diagnostic, the Vysis ALK Break Apart FISH Probe Kit (Abbott Molecular) for metastatic non–small cell lung cancer (NSCLC). The companion diagnostic concept is clearly scientifically established, although other regulatory jurisdictions have approached the co-development policy differently with regard to approving the test (see refs. 4 and 5 in accompanying articles for an explanation of the European Union approach).
Text Box 1. IVD companion diagnostic devices: Key draft guidance points
Define “companion diagnostic” as essential for the safe and effective use of a therapeutic product
Need for approval of companion diagnostic to approve therapeutic product
Expectation of contemporaneous approval of therapeutic product and companion diagnostic
Labeling of therapeutic product and in vitro diagnostic (IVD) companion diagnostic device
Investigational status of IVD used to support therapeutic product clinical trial
As the co-development model has progressed, FDA has confronted many new issues in science, policy, and process, raising various questions ranging from when a test requires FDA's investigational approval, to how to rapidly deploy the test system once a drug and companion diagnostic are approved. See Text Box 2 for published guidance documents relevant to co-development. To date, no two co-development activities have been exactly the same, so the learning curve for all parties has been steep. In its efforts to be as flexible as possible within the bounds of regulatory constraints, FDA has considered each situation individually. Although this approach allows the greatest flexibility in development, it does not lend itself easily to defining a prescriptive or predictable pathway. FDA continues to believe that although predictability is important, flexibility must take precedence, as the co-development model is applied over many types of development programs in different cancers and settings.
Text Box 2. Some relevant FDA guidance documents for co-development
From FDA's vantage point, the simplest and most efficient co-development model can be implemented when the therapeutic product sponsor has early (preclinical or early clinical) information about the biomarker(s) that will be needed for successful clinical trial implementation when investigating the therapeutic product. When this is possible, the test can be designed and developed early in the process, tuned around necessary performance characteristics such as cutoffs or range of relevant mutations in a gene as trial phases progress, and be completely specified and essentially market ready when the pivotal therapeutic trial begins. In our limited experience, however, the early selection and characterization of the biomarker(s) vis-à-vis the therapy is not always possible, and at this point in time, not the norm. Therefore, other models must be accommodated that push the co-development start time into later phases, which are potentially riskier in terms of development, yet still can result in coapproval of the therapeutic product and the companion diagnostic.
Of significant concern in therapeutic product development, especially as trials may become smaller and more focused at earlier stages, is whether a biomarker selected as a candidate for a companion diagnostic is actually “predictive” of therapeutic product effect, especially when that effect is measured in length of survival, as opposed to being simply prognostic within the disease category or perhaps even insignificant either to the disease process or the drug's effect.1 Attention to stratification by biomarker status is important, even if the candidate biomarker is the target of the therapeutic product, to balance prognostic effects across treatment arms. In addition, attention to selection of a “cutoff,” or test value that defines which patients will be considered marker positive and marker negative (e.g., mutant allele frequency, limit of detection or quantitation, or other quantitative value), is critical, such that, in most cases, the cutoff will be defined and specified for trial subject selection or assignment before the initiation of the pivotal trial. In some cases, cutoff selection may be adaptive, or even defined after the pivotal trial ends, provided that the selection process is prespecified and is not biased by any knowledge of pivotal trial outcome.
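The distinction between a predictive and a prognostic marker can be illustrated with a small worked example. The sketch below is a simplification under stated assumptions: the outcome is a single summary number per arm (for instance, median survival in months), all values are invented for illustration, and the names `ArmOutcomes`, `treatment_effect`, and `interaction` are hypothetical. A real analysis would typically rely on hazard ratios and a formal statistical test of the therapy/marker interaction.

```python
from dataclasses import dataclass


@dataclass
class ArmOutcomes:
    """Summary outcome (e.g., median survival in months) for one marker stratum."""
    treated: float  # investigational therapy arm
    control: float  # comparator arm


def treatment_effect(stratum: ArmOutcomes) -> float:
    """Benefit attributable to therapy within one marker stratum."""
    return stratum.treated - stratum.control


def interaction(marker_pos: ArmOutcomes, marker_neg: ArmOutcomes) -> float:
    """Difference in treatment effect between marker-positive and marker-negative strata.

    A value near zero suggests no therapy/marker interaction (the marker is
    at most prognostic); a large value suggests a predictive marker.
    """
    return treatment_effect(marker_pos) - treatment_effect(marker_neg)


# Illustrative numbers only. Prognostic marker: marker-positive patients do
# better in BOTH arms, but the benefit of therapy is the same in each stratum.
prognostic_only = interaction(ArmOutcomes(18.0, 14.0), ArmOutcomes(12.0, 8.0))

# Predictive marker: the benefit of therapy appears only in marker positives.
predictive = interaction(ArmOutcomes(18.0, 10.0), ArmOutcomes(10.0, 10.0))

print(prognostic_only, predictive)
```

Note that estimating the interaction requires outcome data from both marker strata in both arms, which is one reason stratification by biomarker status matters.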
In cancer therapeutics, the Office of Hematology and Oncology Products (OHOP) in the Center for Drug Evaluation and Research (CDER; Silver Spring, MD) has allowed sponsors, at their own risk, to conduct pivotal trials in marker-positive patients only (where marker positivity is defined by the sponsor), when the sponsor hypothesizes that benefit will accrue to only those patients with the marker of interest.2 This situation, from the diagnostic point of view, can only yield a claim that the test selects a population in which a therapeutic product is safe and effective, but cannot support a claim that the presence/absence of the marker predicts response to the therapeutic product. This fact, taken together with the frequent desire of therapeutic product sponsor to undertake the most efficient trial with a demonstration of the greatest efficacy, has led to the approval of companion diagnostics that are simply labeled for patient selection. In these cases, the test will not have predictive value, because the marker-negative population is absent. Although this approach is often sufficient to gain approval of the therapeutic product and its companion diagnostic, it does not provide information that might enable more precise use of the therapeutic product, through understanding of the optimal marker cutoff point, and likewise, diagnostic sensitivity and specificity or negative predictive value for response. In some cases, postmarket activities have been requested by OHOP to investigate whether marker negatives can also benefit from a drug that was approved on the basis of marker positives only. Companion diagnostic labels could be modified with predictive claims and predictive performance if postmarket studies support the claim.
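To make concrete why a marker-positive-only trial cannot support predictive performance claims, the following sketch computes diagnostic performance for response from trial counts. It is an illustration only: the function name `response_metrics` and all counts are hypothetical, and it assumes response is a simple yes/no outcome with nonzero counts in each enrolled stratum.

```python
from typing import Dict, Optional


def response_metrics(
    pos_resp: int,
    pos_nonresp: int,
    neg_resp: Optional[int] = None,
    neg_nonresp: Optional[int] = None,
) -> Dict[str, Optional[float]]:
    """Diagnostic performance of a marker for response to therapy.

    Marker-negative counts are None when the trial enrolled only
    marker-positive patients; metrics that need them are returned as None.
    """
    # Response rate among marker positives (positive predictive value).
    ppv = pos_resp / (pos_resp + pos_nonresp)
    if neg_resp is None or neg_nonresp is None:
        return {"ppv": ppv, "npv": None, "sensitivity": None, "specificity": None}
    return {
        "ppv": ppv,
        "npv": neg_nonresp / (neg_resp + neg_nonresp),
        "sensitivity": pos_resp / (pos_resp + neg_resp),
        "specificity": neg_nonresp / (pos_nonresp + neg_nonresp),
    }


# Hypothetical enrichment trial: only marker-positive patients enrolled.
enriched = response_metrics(pos_resp=60, pos_nonresp=40)

# Hypothetical all-comers trial: both marker strata treated.
all_comers = response_metrics(pos_resp=60, pos_nonresp=40, neg_resp=5, neg_nonresp=95)

print(enriched)
print(all_comers)
```

In the enrichment case only the response rate among marker positives is estimable; sensitivity, specificity, and negative predictive value all depend on outcomes in marker-negative patients, who were never treated.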
To date, FDA and sponsors have followed a pattern of “one drug, one test” in coapproval situations. Although this accomplishes the immediate goal of ensuring that a test with known and independently reviewed performance characteristics3 is available to use in prescribing the drug, it has inadvertently resulted in a practical requirement for multiple tests for the same disease entity, and sometimes in having two tests that provide slightly different spectra of eligibility for closely related drugs (e.g., for EGF receptor gene mutation testing in NSCLC, two tests are approved that have similar but not completely overlapping mutation detection, each approved with a different therapeutic product). At the outset, it was not obvious to FDA that this would or could occur, as the idea of co-development was not yet fully comprehended or appreciated by FDA or sponsors. Today, it seems as though “per drug” testing is rapidly becoming impractical, as multiple development efforts converge on the same targets, and multiple targets are recognized within a disease entity that was previously not easily subclassified on a molecular basis. For lung cancer, for example, a patient may formally need three or more tests to determine which drug might be appropriate. There are clearly practical constraints on this model, not only in how three or more tests might be conducted on one sliver of available biopsy tissue, but also in the cost of performing each test individually, especially when only one, or even none, of the tests provides definitive information on the appropriate therapeutic strategy.
It is fortunate that the rapid development of new technological approaches to testing multiple analytes (biomarkers) at once may resolve some of the problems of needing more than one test, given the diagnosis of a particular disease. It is now possible to create a single test for a large panel of relevant analytes, for which only one aliquot of patient sample is needed to query all possibilities. Although this would of course be useful for biomarkers that are relatively simple and easy to interpret and have the same measurement matrix (e.g., DNA), it will not solve all issues. For example, use of gene expression signatures as biomarkers might be a one drug, one test scenario, as will other types of tests in which multiplexing biomarkers is not feasible.
Among other multiplex technologies in development, high-throughput DNA or nucleic acid sequencing (HTS, also called “next-generation sequencing,” NGS) has great potential in oncology for rapidly reporting on any number of molecular changes within a single tumor sample. This technology shows promise in having adequate measurement and detection performance, when testing is carefully designed and evaluated, to be used as a companion diagnostic device platform (6). Due to the possibility of simultaneously and independently testing numerous markers at once, it should be possible for investigational use of a “new” biomarker to reside on the same instrument system and panel as approved companion diagnostic tests (albeit with different reporting standards). As new therapeutic products are approved that require companion diagnostics, the diagnostic sponsor could simply alter reporting software to reveal the once investigational, now clinically validated companion diagnostic marker(s) in the clinical test report. Indeed, one could even contemplate the usefulness of a panel that contains not only companion diagnostic markers, but also markers for which there are as yet no approved therapeutic options, assuming that all markers had appropriate measurement performance validation. Such a panel could simplify molecular testing for therapeutic selection, as well as identify molecular lesions that could qualify a patient for a clinical trial of a new targeted therapeutic product. Application of technologies in new ways to make testing more efficient will require discussion and input from FDA and product developers, but there is no question that new models are needed.
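The idea of a single panel whose reporting software controls which markers appear in the clinical report can be sketched as follows. This is a hypothetical illustration, not a description of any actual device software: the marker names (`MARKER_A` and so on), the `Status` categories, and the function `clinical_report` are all assumptions made for the example.

```python
from enum import Enum
from typing import Dict


class Status(Enum):
    COMPANION = "approved companion diagnostic"
    INVESTIGATIONAL = "investigational"


def clinical_report(
    results: Dict[str, str], panel_status: Dict[str, Status]
) -> Dict[str, str]:
    """Return only results for markers cleared for clinical reporting."""
    return {
        marker: call
        for marker, call in results.items()
        if panel_status[marker] is Status.COMPANION
    }


# Hypothetical panel: the assay measures all three markers on every run.
panel_status = {
    "MARKER_A": Status.COMPANION,
    "MARKER_B": Status.COMPANION,
    "MARKER_C": Status.INVESTIGATIONAL,
}
results = {"MARKER_A": "mutation detected", "MARKER_B": "wild type", "MARKER_C": "mutation detected"}

before_approval = clinical_report(results, panel_status)

# Once the therapy paired with MARKER_C is approved, only the reporting
# configuration changes; the measurement itself is untouched.
panel_status["MARKER_C"] = Status.COMPANION
after_approval = clinical_report(results, panel_status)

print(sorted(before_approval), sorted(after_approval))
```

Separating measurement from reporting in this way is what would allow an investigational marker to share an instrument system and panel with approved companion diagnostic markers.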
A perplexing wrinkle in the application of companion diagnostic tests that detect nucleic acid mutations has been, and will continue to be, defining the applicable spectrum of mutations in the context of use of the therapeutic product. Without a technological barrier to limit the spectrum of mutations, and without an exact knowledge of how the therapeutic product interacts with its target(s) to produce a response, it is likely that many rare mutations of unknown significance would be detected using an HTS or other non–mutation-specific panel. How such mutations would or should be interpreted clinically has not been resolved, and will likely require consensus from the clinical community. An early decision on how findings of unknown significance should be handled would be very useful to FDA as it makes regulatory decisions on the nature and extent of diagnostic device claims that it will allow for companion diagnostics.
Although the use of new technologies seems attractive from a high-level view, the ability to transition from already approved companion diagnostic technology to a new platform, perhaps measuring a different form of a biomarker, e.g., protein expression rather than DNA mutation, could be difficult if there are few or no samples available from clinical trials in which the diagnostic and the therapeutic products were used together. This is often a problem for “follow-on” diagnostics, because once a therapeutic product is approved, especially in oncology, it is rare for a similar trial to be repeated. Without samples that have associated outcome information, it would be more difficult to assure that different biomarker forms have similar biologic significance and that different tests give comparable results. This problem is already being addressed in health technology assessments as carried out by the National Institute for Health and Care Excellence (NICE) (4). FDA and diagnostic sponsors will need to work together to define necessary performance characteristics and validation strategies to assure that new technologies will represent the status of the same biomarker in the same way as the traditional technology, as applied to selection of therapy.
The desirable measurement sensitivity and the appropriate sampling paradigm remain unresolved in companion diagnostics for oncology therapies. To date, the technologies underlying approved companion diagnostics have often dictated the sensitivity at which an alteration of clinical interest can be detected. A technology with the ability to reliably detect a change in 5% or greater of the input sample will rarely detect the same change when it is present at 1% of the input sample, yet generally there is a complete lack of knowledge about what percentage of lesion is relevant for response to a therapeutic product. It is not common for sponsors to conduct “dose/response” studies to determine the prevalence of a lesion that predicts a clinically significant response to a targeted drug. Neither is it common to resolve potential tumor heterogeneity by testing an entire tumor, so that the location of the tissue sampled would not have any effect on the potential percentage of cells with the lesion of interest. The effect of previous therapies can be another confounding factor. Currently, approved therapeutic and companion diagnostic labels provide little or no information about these issues, because such studies have not been done. Transitioning from one technology with a defined sensitivity to another technology with a different sensitivity, for example, moving from Sanger sequencing with an average sensitivity of 20% mutation allele frequency to quantitative PCR with an average sensitivity of 5% mutation allele frequency, is very likely to change the spectrum of patients who become, through testing, candidates for a therapeutic product.
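The effect of a change in analytical sensitivity on patient eligibility can be shown with a toy calculation. The cohort below is invented, and the sketch assumes an idealized assay that always detects a mutation at or above its stated sensitivity and never below it; real assays behave probabilistically near their limit of detection. The function name `marker_positive` is hypothetical.

```python
from typing import List


def marker_positive(allele_fractions: List[float], limit_of_detection: float) -> List[float]:
    """Return the allele fractions that an idealized assay would call positive."""
    return [af for af in allele_fractions if af >= limit_of_detection]


# Hypothetical mutant allele fractions for six patients' tumor samples.
cohort = [0.45, 0.22, 0.12, 0.07, 0.03, 0.0]

sanger_positive = marker_positive(cohort, limit_of_detection=0.20)  # about 20% sensitivity
qpcr_positive = marker_positive(cohort, limit_of_detection=0.05)    # about 5% sensitivity

# Patients who become candidates for therapy only because the assay changed.
newly_eligible = [af for af in qpcr_positive if af not in sanger_positive]

print(len(sanger_positive), len(qpcr_positive), newly_eligible)
```

Whether the newly eligible patients, with lower mutant allele fractions, respond as well as those selected by the original test is exactly the question that the available trial data usually cannot answer.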
Another area where important knowledge is often lacking is whether primary and metastatic tumors provide the same information for clinical decision making, given a therapeutic product that is intended for use in the metastatic setting. Frequently, primary tumor tissue is harvested and available, and patients who present with metastases are not, or cannot be, resampled. Thus, the investigational companion tests (and often also the test that is ultimately marketed) are measuring the relevant biomarker at a different point in the “lifetime” of a patient's cancer than when the intervention of interest is being made. In many cases, the molecular changes that might take place over time as tumor cells are exposed to therapies or metastasize and undergo further evolution in a new cellular milieu are unknown because they are not tested, and therefore the state of the biomarker is simply inferred from the primary tumor to the metastatic tumor. Greater knowledge of the changes in tumor biology over time and space, potentially through use of new technologies that avoid invasive sampling, e.g., through detection and monitoring of tumor-specific analytes in blood, has the potential to provide better patient selection for the most effective therapy.
The companion diagnostic paradigm has evolved from a relatively loose approach of associating test results with response to therapeutic products to a defined policy in which a companion diagnostic approval is required when a test is essential to the safe and effective use of a therapeutic product. FDA has used its experience to develop a flexible regulatory pathway, yet recognizes that therapeutic product/companion diagnostic development schemes can vary widely, so that one approach will not apply to all development programs. Even as it seems that many questions about co-development have been resolved, the rapid accumulation of new knowledge about tumor biology and the rapid evolution of diagnostic technology are challenging FDA to continually redefine its thinking on companion diagnostics. It seems almost inevitable that a consolidation of diagnostic testing should take place, to enable a single test or a few tests to garner all the necessary information for therapeutic decision making. Even so, as we continue on the path toward deeper knowledge of disease processes, greater ability to measure multiple relevant markers, and more therapeutic options, many new questions requiring creative solutions are generated. FDA is committed to flexibility and science-based approaches that bring the greatest benefit (with the least toxicity) to patients.
Disclosure of Potential Conflicts of Interest
No potential conflicts of interest were disclosed.
1. A purely predictive marker, in this sense, will predict that patients, given a particular marker status, will have better or worse outcomes than patients without the marker, solely as a result of having received the investigational therapy; that is, there is a clear therapy/marker interaction. A prognostic marker would suggest that patients with the marker would, as a consequence of the natural history of the disease, have better or worse outcomes even absent treatment with the investigational therapy; that is, the marker has little or no interaction with the therapy. Some markers may have both predictive and prognostic properties in a given disease/therapy setting.
2. Performing a trial in this manner cannot, of course, rule out by itself the possibility that marker negatives could also respond.
3. Performance characteristics of a test will include measurement or detection performance over those parameters that are critical to assure that the test is identifying the correct marker (accuracy) in the tissue specified, in a manner that is reliable (precision, reproducibility, linearity if continuous values are critical) and yields a satisfactory result at the level of detection required (limit of detection/quantitation, precision around a cutoff). Other measurement parameters may be important depending on the type of test put forward.
- Received November 1, 2013.
- Revision received December 13, 2013.
- Accepted January 10, 2014.
- ©2014 American Association for Cancer Research.