Feel Good Merck Meeting Leaves Everybody Agreeing, But Brazil Study Bodes Ill for the Future
On June 23, 1995, members of TAG met with statisticians from Merck and outside statistical experts to discuss the methodological assumptions of Merck’s Phase III program for its protease inhibitor Crixivan. TAG’s Michael Ravitch, who was instrumental in setting up the meeting, prepared this report.
The meeting came about after TAG issued its report Problems with Protease Inhibitor Development Plans at the National Task Force on AIDS Drug Development meeting in Bethesda, Maryland in February. TAG criticized many aspects of the Merck development program, from the lack of an expanded access program to the poorly designed and inadequately controlled clinical studies. Merck responded by offering to have their statisticians meet with top biostatisticians to discuss the design of their trials. TAG members Michael Ravitch, Spencer Cox and Mark Harrington attended, although the substance of the discussion mostly occurred between the statisticians.
While our critique had been wide-ranging, we decided the meeting should focus on the clinical trial in AZT-experienced patients. The only other planned trial with clinical endpoints, the Brazilian study in AZT-naive patients, has already begun, so the second-line trial was our best opportunity to have an influence. Also, since this antiretroviral-experienced population will probably be the first group of patients to take the drug after approval (expected in spring 1996), we thought it important to have some good information about clinical benefit as rapidly as possible.
The attendees from Merck included John Doorley, Dr. Jeff Chodakewitz, Dr. George Williams, Dr. Al Getson, Dr. Henrietta Ukwu and two other Merck statisticians. We had invited Dr. Victor DeGruttola from SDAC and Harvard, Dr. Thomas Fleming from the University of Washington and Dr. Terry Field from Brigham and Women’s Hospital and Harvard. Merck invited two outside consultants, Dr. Bill Brown from Stanford and Dr. Scott Zeger of Johns Hopkins. No proprietary information was revealed; in fact, we did not discuss data at all.
We began the meeting by describing the overall concerns we had about the proposed trial. We are unhappy that the trial is not slated to begin until 1996. This trial should be a priority, we argued, since this is the population with the most urgent need for information. Also, it will become increasingly difficult to ensure compliance in a controlled trial after approval of Crixivan and other protease inhibitors. We stated that sample size and statistical power were major issues to consider, since designers of antiretroviral trials had historically been over-optimistic in their assumptions, relying on surrogate marker changes which did not accurately correlate with clinical benefit. We also brought up the issue of the control arm. The results could be problematic if Merck used, as a control, a regimen whose clinical utility is as yet undefined, such as d4T or AZT/3TC. Mandating a single control regimen for a population in which no single regimen has been convincingly validated or broadly accepted could cause problems with compliance. We suggested that Merck consider the control arm being used by Abbott, Glaxo and Agouron; that is, standard-of-care (switching among the nucleosides at will) versus standard-of-care plus MK-639 (Crixivan).
Dr. Chodakewitz described the current status of Merck’s Phase III program, essentially unchanged since the February meeting with the community. Dr. Getson discussed a list of design considerations in putting together the trial. The discussion among the statisticians was organized by that list.
The population for this study is AZT-experienced patients in the lower T-cell range. Dr. DeGruttola argued that the trial should not exclude patients with fewer than 50 CD4 cells. He pointed out that including people with lower CD4 levels will provide a higher rate of endpoints, and that in the original AZT BW-02 study, many of those with CD4 < 50 had a measurable response to treatment.
The outside statisticians all supported the idea of the standard-of-care control arm. A Merck statistician wondered if the statistical “noise” of switching nucleosides would interfere with the interpretability of the data. Dr. Zeger (Merck’s consultant) was not concerned by the noise issue. Dr. Fleming said, “What we want to understand is how to use this drug in clinical practice. In clinical practice they use the standard-of-care, so add a new agent and see who does better. We’re legislating non-compliance by prescribing rigid regimens which are often inappropriate for individual patients. By using the standard-of-care, we’ll have better recruitment, better compliance and better relevance to our answer.” There seemed to be a general consensus among the outside statisticians in favor of this design.
Everyone agreed that progression (new opportunistic infections) and death should be the sole primary endpoints and that composite endpoints (such as CD4 drops) are actually surrogate endpoints and dilute our ability to measure true clinical benefit.
They asked the question: what is the minimum treatment benefit we want to detect reliably? Dr. Fleming argued that the study should be powered to detect, at the least, a reduction in relative risk from 1.5 to 1.0 (33%) with 90-95% power. Dr. Zeger felt that this magnitude of reduction (33%) was overly optimistic and argued that the trial should be able to detect a smaller treatment benefit, such as a 25% reduction in relative risk. Dr. Fields agreed. To detect a 33% reduction would require from 600 to 1,000 patients per arm. To detect a 25% reduction could require up to 2,500 per arm.
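The arithmetic behind these targets can be sketched with Schoenfeld's approximation for the number of events a two-arm survival comparison needs. This is an illustration only: the report does not say which method the statisticians used, and the function name, the two-sided 5% significance level and the 1:1 randomization are assumptions made here.

```python
from math import ceil, log
from statistics import NormalDist


def required_events(risk_reduction, power, alpha=0.05):
    """Total events (both arms combined) needed to detect a given
    relative reduction in hazard, by Schoenfeld's approximation.

    Assumes 1:1 randomization and a two-sided test at level `alpha`
    (both are illustrative assumptions, not taken from the report).
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # critical value of the test
    z_beta = std_normal.inv_cdf(power)           # quantile for the desired power
    hazard_ratio = 1 - risk_reduction            # e.g. 33% reduction -> 0.667
    return ceil(4 * (z_alpha + z_beta) ** 2 / log(hazard_ratio) ** 2)


# Compare the two treatment effects and the two power levels discussed.
for reduction in (1 / 3, 0.25):
    for power in (0.90, 0.95):
        events = required_events(reduction, power)
        print(f"{reduction:.0%} reduction, {power:.0%} power: {events} total events")
```

Under these assumptions the calculation gives roughly 256 total events for a 33% reduction at 90% power and roughly 508 for a 25% reduction, about twice as many. Converting events into patients depends on the event rate and the length of follow-up, which is one reason the patient numbers for a given target can span a wide range.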
The Merck statisticians nervously agreed that Dr. Fleming was correct, and they would never think of powering a study with any less. However, when TAG brought up the current Brazil study, it was clear to us that the study is not sufficiently powered to detect a 33% reduction in hazard, much less a 25% reduction. Merck commented, “We are not prepared to discuss that study in detail today.” So it remains to be seen whether they actually will follow the scientific standards they agree to in principle.
Dr. Fields suggested that Merck try to nail down the exact magnitude of the treatment benefit, which would be a great help to clinicians, and recruit enough patients to plan demographic sub-group analyses. Both of these ideas would require more patients, however. And Dr. Zeger, thinking of the window of opportunity to study Crixivan in advanced patients, suggested that Merck design a one-year study that would be much larger. Dr. Fleming, on the other hand, argued that the number of events depends critically on both the duration of follow-up and on the sample size. “You need to make it as big as you can,” he said, “and as long as you can: at least two years median follow-up.”
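Dr. Fleming's point that events depend on both size and duration can be illustrated with a simple constant-hazard model. This is a sketch under loud assumptions: the 15% annual event rate is a hypothetical placeholder, not a figure from the meeting, and the model ignores staggered enrollment, dropout and any treatment effect.

```python
from math import exp


def expected_events(patients, annual_hazard, years):
    """Expected number of events among `patients` who are all followed
    for `years`, assuming a constant (exponential) hazard."""
    return patients * (1 - exp(-annual_hazard * years))


# Hypothetical event rate per patient-year, for illustration only.
HYPOTHETICAL_HAZARD = 0.15

# Baseline, doubled follow-up, and doubled enrollment.
for patients, years in [(1000, 1), (1000, 2), (2000, 1)]:
    events = expected_events(patients, HYPOTHETICAL_HAZARD, years)
    print(f"{patients} patients, {years} year(s): about {events:.0f} events")
```

In this model, doubling enrollment doubles the expected events, while doubling follow-up yields somewhat less than double because some patients have already had an event.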
Overall, there seemed to be a remarkable consensus among the outside statistical experts, whether those invited by TAG or those of Merck: 1) The trial should begin as soon as possible; 2) Proper design requires 1,000 patients (or 300 events) per arm, with a median of two years of follow-up, and 90-95% power to detect a 25-33% reduction in relative risk; 3) The standard-of-care control arm would answer the question most effectively, by mimicking the real world of clinical practice, and encouraging honesty and compliance. Hopefully, Merck will incorporate these ideas into the final design of the trial.
At the end of the meeting, we briefly discussed the small expanded access program (1,400 patients). TAG told Merck the number was still too small, but once again we were told the drug supply was not available. We also recommended that they give this drug to the sickest population (CD4 < 50) as opposed to the plan of Hoffmann-La Roche. They said they had already planned on doing so. Then we discussed an idea that Merck had been grappling with internally.
Since the company is assigning the drug randomly, by means of a lottery system, they might be able to compare the survival results of those people who receive the drug and those who do not. Dr. Fields, an epidemiologist, suggested they use the National Death Index, which would allow them to measure mortality without additional cost to the company, or burden for the patients. The only difference for the patients would be that they would need to sign an informed consent, allowing Merck to look them up on the NDI.
The statisticians were very supportive of this idea, arguing that it could provide the most solid data yet about the clinical benefit of an antiretroviral drug. TAG also supported this idea, reasoning that if the expanded access is going to be limited, we might as well gain as much information from it as we can. While the primary purpose of the program should remain access to treatment, it never hurts to gather more information. Hopefully, the production facilities will come on-line in the spring of 1996, as Merck has promised, and access to Crixivan can expand exponentially.
We ended with a brief discussion of concerns about their plans for postmarketing studies. We urged them to start planning studies with combination protease inhibitors, in order to answer questions about cross-resistance and to try to identify the best possible regimen among new alternatives. They said they were in discussion with Roche about preliminary combination studies but that Abbott, which is not part of the Inter-Company Collaboration (ICC), is reluctant to collaborate. They should also be aware that if they want to market Crixivan to HIV+ asymptomatic individuals, they will eventually need to prove the value of early Crixivan treatment with a large clinical trial in that population.