Medical AI

Mirai in mammography at a safety-net hospital: what changes when AI risk stratification is prospectively deployed

Published on May 21, 2026 · 9 min read

On May 18, 2026, a UCSF team published in npj Digital Medicine the first prospective evaluation of a Mirai deployment at a U.S. safety-net hospital. Mirai is the AI model that predicts breast cancer risk from mammograms; the deployment uses its one-year risk score. Across 4,145 screening mammograms, the algorithm flagged 525 women as high-risk, and 100 accepted an expedited workflow that shortened time from screening to interpretation by 99 percent and time to biopsy by 87 percent. This is a real-world workflow study, valuable for its setting and for the modesty of its claims, but it should be read for what it is: an operational demonstration, not a diagnostic validation of the algorithm.

The context

Screening mammography rests on a fragile logistical chain. The image is acquired one day and read by a radiologist the next day or the following week. The result is sent to the patient, who must then schedule a follow-up diagnostic workup (ultrasound, additional views, sometimes MRI) and finally a biopsy if needed. In well-resourced centers, this pathway takes a few days to three weeks. In so-called safety-net hospitals, which serve uninsured or under-insured populations in the United States, it takes weeks, sometimes months, and some patients are lost along the way. This attrition is a recognized driver of inequalities in breast cancer mortality, particularly among Black American women.

AI models capable of estimating individual breast cancer risk from mammograms have existed in mature form since 2021. Mirai, published by Adam Yala and colleagues at MIT then UCSF, predicts the probability that a patient will develop breast cancer within one to five years after a screening mammogram. It has been validated on more than 200,000 exams from American, European, and Asian centers, with a one-year AUC of around 0.75–0.80 across cohorts. (AUC, the area under the ROC curve, measures the ability to rank a positive case above a negative one: 1 is perfect, 0.5 is chance.) The open question was no longer whether Mirai works on datasets, but what happens when it is integrated into an actual clinical workflow, in a real hospital, with real patients.
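To make the AUC figure concrete, here is a minimal sketch of the pairwise reading of AUC: the share of (cancer, no-cancer) pairs in which the cancer case receives the higher score. The function name and the scores are invented for illustration; they are not Mirai outputs or study data.

```python
from itertools import product


def pairwise_auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    wins = 0.0
    for pos, neg in product(scores_pos, scores_neg):
        if pos > neg:
            wins += 1.0
        elif pos == neg:
            wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical risk scores, invented for illustration only.
cancer_scores = [0.9, 0.7, 0.4]
no_cancer_scores = [0.8, 0.5, 0.3, 0.2]

auc = pairwise_auc(cancer_scores, no_cancer_scores)
print(f"AUC = {auc:.2f}")  # 9 of 12 pairs are ranked correctly: AUC = 0.75
```

An AUC of 0.75 therefore means that, given one woman who develops cancer and one who does not, the model ranks the first above the second three times out of four. It says nothing by itself about where to place a flagging threshold.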

The method

The study was run at the Avon Breast Center of Zuckerberg San Francisco General Hospital, the public hospital of San Francisco, which serves predominantly uninsured, Medi-Cal, or recently immigrated patients. The protocol is IRB-approved and HIPAA-compliant, and the design is controlled but non-randomized.

The pipeline works as follows. A screening mammogram is acquired and transmitted in real time to a local instance of Mirai. The model computes a one-year risk score. If the patient falls in the top decile of risk (top 10%), she is flagged as high-risk in the information system. On enrollment days (two to three days per week during the study period), a coordinator informs flagged patients that they can stay on site for an immediate read by a radiologist, then, if the image suggests an abnormality, for a same-day diagnostic workup (additional views, targeted ultrasound) and if needed for a biopsy scheduled at very short notice.
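The routing rule described above can be sketched in a few lines. This is an illustration under stated assumptions, not the study's software: the names `Exam`, `top_decile_cutoff`, and `route`, and the pathway labels, are hypothetical, and the sketch does not call Mirai itself. It also assumes the top-decile cutoff is taken from a reference distribution of scores fixed in advance, which the article does not specify.

```python
import statistics
from dataclasses import dataclass


@dataclass
class Exam:
    exam_id: str
    risk_score: float  # one-year risk score, assumed already computed by the model


def top_decile_cutoff(reference_scores):
    """90th percentile of a reference score distribution (an assumption of this sketch)."""
    # quantiles(n=10) returns the 9 cut points between deciles; the last is the 90th percentile.
    return statistics.quantiles(reference_scores, n=10)[-1]


def route(exam, cutoff, enrollment_day):
    """Return a hypothetical pathway label for one exam."""
    if exam.risk_score < cutoff:
        return "standard"
    # Flagged patients are offered the expedited pathway only on enrollment days;
    # on other days they follow usual care and act as controls.
    return "offer_expedited" if enrollment_day else "standard_flagged_control"


# Hypothetical reference scores 0.01, 0.02, ..., 1.00, invented for illustration.
reference = [i / 100 for i in range(1, 101)]
cutoff = top_decile_cutoff(reference)

for exam, enrollment_day in [
    (Exam("A", 0.95), True),
    (Exam("B", 0.95), False),
    (Exam("C", 0.40), True),
]:
    print(exam.exam_id, route(exam, cutoff, enrollment_day))
```

With a cutoff fixed in advance, the share of a new cohort that ends up flagged need not be exactly 10%; it depends on how that cohort's scores compare with the reference distribution.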

Non-enrollment days serve as the control group: patients flagged as high-risk by Mirai follow the usual pathway, with its standard wait of several weeks. The comparator is interesting because both arms are selected by the same model: controls are also high-risk, just not prioritized, so a difference in delay reflects the workflow rather than the selection.

Over the study window, Mirai processed 4,145 screening mammograms and flagged 525 women (12.7%) as high-risk. On enrollment days, 100 of the flagged women accepted the expedited pathway and form the experimental cohort. On other days, the flagged patients serve as controls.

The results

Of the 100 patients on the expedited pathway, 94 received an immediate read of their mammogram on the same day, and 26 required (and received) a same-day follow-up diagnostic evaluation. The standout numbers concern delays.

Among patients found to have a screen-detected cancer (six cases in the experimental arm, compared with a control group of comparable size), the time between screening and result reporting dropped from several weeks to a few hours, a reduction of 99.1%. Time to diagnostic workup also fell by 99.1%, and time to diagnostic biopsy by 87.2%.
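To get a feel for what these percentages mean in elapsed time, the sketch below converts a relative reduction into an absolute delay. The baselines (14 days to a result, 45 days to a biopsy) are hypothetical values chosen to match "several weeks" and "one or two months"; they are not figures from the study, and the function names are illustrative.

```python
def percent_reduction(baseline, expedited):
    """Relative reduction in delay, in percent."""
    return 100.0 * (baseline - expedited) / baseline


def delay_after_reduction(baseline, reduction_pct):
    """Delay that remains once a given percent reduction is applied."""
    return baseline * (1.0 - reduction_pct / 100.0)


# Hypothetical baselines, not reported values.
baseline_result_hours = 14 * 24  # assume 14 days from screening to result
baseline_biopsy_days = 45        # assume 45 days from screening to biopsy

result_hours = delay_after_reduction(baseline_result_hours, 99.1)
biopsy_days = delay_after_reduction(baseline_biopsy_days, 87.2)

print(f"Result after about {result_hours:.1f} hours")  # about 3.0 hours
print(f"Biopsy after about {biopsy_days:.1f} days")    # about 5.8 days
```

Under these assumed baselines, a 99.1% reduction turns two weeks into roughly three hours, and an 87.2% reduction turns six weeks into a little under six days. The actual delays depend on the study's own baselines.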

On cancer detection rate, the expedited cohort shows 60 cancers per 1,000 women screened, versus 2.3 per 1,000 among women not flagged as high-risk by Mirai. This difference is not a measure of model performance in the classical sense — it is mechanically the product of selection (Mirai concentrates cancers in the top 10%). But it confirms that the targeting works: in the field, the algorithm isolates a population genuinely enriched for actual cancers.
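The rates quoted above follow from simple arithmetic on the reported counts. The sketch below reproduces them and adds the enrichment ratio between the two groups; the function and variable names are illustrative, and the inputs are the figures given in this article.

```python
def per_thousand(cases, population):
    """Detection rate expressed per 1,000 women."""
    return 1000.0 * cases / population


# Counts reported in the article.
screened = 4145
flagged = 525
expedited_cohort = 100
cancers_in_expedited = 6
rate_unflagged = 2.3  # cancers per 1,000 among women not flagged

flag_share = flagged / screened
rate_expedited = per_thousand(cancers_in_expedited, expedited_cohort)
enrichment = rate_expedited / rate_unflagged

print(f"Flagged share: {100 * flag_share:.1f}%")                 # 12.7%
print(f"Expedited cohort: {rate_expedited:.0f} per 1,000")       # 60 per 1,000
print(f"Enrichment versus unflagged: about {enrichment:.0f}x")   # about 26x
```

The ratio of roughly 26 is a point estimate built on six cancers, so it is best read as an order of magnitude rather than a precise figure.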

In clinical translation, as a rough extrapolation from the figures above: of a thousand patients screened at Zuckerberg San Francisco General, about 127 are flagged by Mirai, and at 60 cancers per 1,000 roughly 8 of them carry a screen-detected cancer. The expedited pathway brings the full diagnostic workup of these cancers down from several weeks to a few hours, and the wait for biopsy from one or two months to a few days. It is a logistical improvement, not a change in the absolute number of cancers detected for the same population.

What's good

Three specific strengths.

The setting is exactly the right one. U.S. safety-net hospitals are the institutions where inequalities in access to diagnosis weigh heaviest. A significant share of AI-radiology literature is produced at Stanford, MIT, or Mayo Clinic — wealthy, digitized centers where delays are already short. Showing that a Mirai integration works in an urban public hospital, on a majority non-white, economically precarious population, usefully shifts the conversation.

The comparator is honest. Control patients are also flagged as high-risk by Mirai — the difference is not the performance of the model, but the prioritized workflow. This avoids the classical biased-comparator pitfall (comparing AI-plus-workflow to no-AI-and-no-workflow), often seen in deployment literature.

Mirai's code has been public under MIT license since 2021 (github.com/yala/Mirai), and the study uses a local deployment controlled by the team — not a proprietary black-box API. This transparency lets other safety-net systems attempt the same integration without depending on a commercial vendor.

What's less good

Three precise limitations to keep in mind.

The experimental sample is very small. Only 100 patients accepted the expedited pathway, and 6 had a screen-detected cancer. All percentage reductions in delay (99.1%, 87.2%) rest on these small numbers. No confidence intervals are reported in the available abstract. With 6 cases, a single outlier delay shifts the mean considerably, which is the classic way a summary metric misleads. The direction of the effect is credible; the exact magnitude is much less so.
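Two quick calculations show how fragile estimates built on 6 cases are. The sketch below computes a 95% Wilson score interval for 6 cancers in 100 patients, then shows how one outlier moves a six-value mean while leaving the median unchanged. The Wilson interval is a standard choice made here for illustration, not a method reported by the authors, and the delay values are invented.

```python
import math
import statistics


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


low, high = wilson_interval(6, 100)
print(f"Detection rate: {1000 * low:.0f} to {1000 * high:.0f} per 1,000")  # roughly 28 to 125

# Hypothetical delays in days for six patients, invented for illustration.
delays = [20, 22, 25, 27, 30, 32]
delays_with_outlier = [20, 22, 25, 27, 30, 120]  # one patient delayed by four months

for label, values in [("baseline", delays), ("with outlier", delays_with_outlier)]:
    print(label, "mean:", round(statistics.mean(values), 1),
          "median:", statistics.median(values))
```

The point estimate of 60 per 1,000 is compatible with anything from about 28 to about 125 per 1,000, and a single late patient lifts the hypothetical mean from 26 to about 41 days while the median stays at 26. Reporting medians and intervals alongside percentage reductions would make figures like these easier to interpret.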

The study is not randomized. The contrast between "enrollment days" and "non-enrollment days" is vulnerable to calendar biases (available staffing, the type of patients showing up on a Tuesday versus a Thursday, and so on). Concluding causally that the expedited pathway reduces delays would require a patient-randomized trial. The authors do not make that causal claim, but press coverage might oversell it. The risk of population differences between arms is real even though both arms come from the same upper Mirai decile.

There is a notable conflict of interest. Maggie Chung and Adam Yala hold shares in Voio Inc., an AI healthcare company that Adam Yala co-founded. The authors declare no financial or non-financial relationship "relevant to this study," but Mirai is Adam Yala's own earlier work, and a demonstration of its clinical utility could plausibly affect the company's valuation. This is not disqualifying, since the study remains academic and is funded by the NIH and the Breast Cancer Research Foundation, but independent replication by a team with no commercial stake would be valuable before generalizing.

Additional note: the May 18 version is an unedited early-access version that will still go through editing before final publication. Reported figures may change marginally.

What changes

For the research community, this is a useful methodological signal. Retrospective evaluations of mammographic risk models are abundant; real prospective deployments are rare, and those done in safety-net settings rarer still. The paper provides a reusable protocol blueprint — real-time integration, flag in the information system, dedicated pathway — for other teams to follow.

For clinical radiologists and care coordinators, the lesson is not that Mirai improves diagnosis, but that automatic prioritization based on an AI score makes it possible to reorganize the queue. The added value is logistical. Centers that are already fast (under seven days from screening to biopsy) will not gain much from this; those where delays exceed a month could draw inspiration from it, provided they have the staff to absorb immediate reads.

For patients and the general public, the takeaway is measured. Mirai does not detect more cancers in absolute terms in this study; it lets patients learn earlier that they have cancer, in a population for which "later" translates into a worse prognosis. This is an unglamorous use of AI in health, but probably more impactful in the short term than many demonstrations of a record-beating AUC.