GigaPath in digital pathology: what changes when a foundation model is trained on 1.3 billion tiles
The Nature paper published in May 2024 by Microsoft Research and Providence Health presents GigaPath, a foundation model for digital pathology trained on 1.3 billion image tiles extracted from 171,189 whole-slide images. The slides come from 30,060 patients across Providence's 28 cancer centres and cover 31 major tissue types. On a benchmark of 26 tasks, the paper reports state-of-the-art results on 25 and significant improvements over the second-best model on 18, with notable gains on rare cancer subtype classification and image-based mutation prediction. It is a real methodological milestone, but one to read with caution regarding generalization and model availability.
The context
Digital pathology consists of digitizing microscope slides for computational analysis. Since 2017, AI in pathology has relied largely on convolutional neural networks (CNNs) trained for specific tasks such as breast cancer detection or lymphoma subtyping. These models worked, but each required a dedicated annotated dataset that was costly to assemble.
Foundation models change this. Pre-trained on huge corpora without a specific task, they learn general representations that can be adapted quickly to a downstream task with little labeled data. This is what transformed NLP, first with BERT and then with LLMs. In pathology, the first large-scale vision foundation models appeared between 2022 and 2024: CTransPath, RudolfV, and Prov-GigaPath (the paper discussed here), among others.
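To make "adapted with little labeled data" concrete, here is a minimal sketch, assuming the foundation model stays frozen and only a tiny classifier is fitted on its embeddings. The nearest-centroid classifier, the two-dimensional toy vectors, and the labels subtype_a and subtype_b are illustrative assumptions, not the adaptation procedure used in the paper.

```python
import math
from typing import Dict, List, Sequence


def fit_centroids(
    embeddings: Sequence[Sequence[float]], labels: Sequence[str]
) -> Dict[str, List[float]]:
    """Average the embeddings of each class into one centroid per class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for vec, lab in zip(embeddings, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
            counts[lab] = 0
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


def predict(centroids: Dict[str, List[float]], vec: Sequence[float]) -> str:
    """Return the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], vec))


# Hypothetical frozen embeddings: four labeled slides, two per subtype.
train_vecs = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
train_labels = ["subtype_a", "subtype_a", "subtype_b", "subtype_b"]

centroids = fit_centroids(train_vecs, train_labels)
print(predict(centroids, [0.7, 0.3]))
```

The point of the design is that nothing in the large model is retrained: only a handful of averages are computed, which is why a few labeled examples can be enough when the embeddings are good. In practice a linear classifier or light fine-tuning would usually replace the centroid rule.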
The method
The architecture has two stages. In the first stage, a vision transformer trained with the DINOv2 self-supervised method extracts a representation (an embedding) from each 256×256 pixel tile of a slide. This tile encoder has 1.1 billion parameters and is trained without labels on 1.3 billion tiles. A transformer is a neural architecture based on the attention mechanism, standard in NLP since 2017 and more recently in vision.
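The tiling step that precedes the first stage can be sketched as follows. This is a simplified illustration, assuming non-overlapping tiles and dropping partial tiles at the borders. The function name tile_origins is hypothetical, and real pipelines also fix a magnification and discard background tiles, which is not reproduced here.

```python
from typing import Iterator, Tuple


def tile_origins(width: int, height: int, tile: int = 256) -> Iterator[Tuple[int, int]]:
    """Yield the (x, y) top-left corner of every full, non-overlapping tile."""
    if tile <= 0:
        raise ValueError("tile size must be positive")
    for y in range(0, height - tile + 1, tile):
        for x in range(0, width - tile + 1, tile):
            yield (x, y)


# A toy 1024 x 512 pixel image gives a 4 x 2 grid of 256-pixel tiles.
origins = list(tile_origins(1024, 512))
print(len(origins), origins[0], origins[-1])
```

Under the same assumptions, a 100,000 × 80,000 pixel slide would yield 390 × 312 = 121,680 candidate tiles before any background filtering, which gives a sense of why each slide becomes a very long sequence.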
In the second stage, a sequence transformer based on LongNet aggregates the thousands of tile embeddings of an entire slide into a global representation. LongNet is designed to process very long sequences without exploding memory cost. Standard self-attention compares every element with every other, so its cost grows quadratically with sequence length, which becomes a limitation for pathology slides (which typically contain 5,000 to 50,000 tiles).
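A back-of-the-envelope count shows why sequence length matters. The sketch below compares standard self-attention with a simplified scheme in the spirit of LongNet's dilated attention, where tiles attend only within fixed-size segments and only every few tiles are kept. The single segment length of 1,000 and dilation of 4 are arbitrary illustrative values, not the paper's settings.

```python
def full_attention_pairs(n_tiles: int) -> int:
    """Query-key pairs scored by standard self-attention: n squared."""
    return n_tiles * n_tiles


def dilated_attention_pairs(n_tiles: int, segment_len: int, dilation: int) -> int:
    """Pairs scored when attention stays inside fixed-size segments and
    only every `dilation`-th tile of each segment is kept."""
    if segment_len <= 0 or dilation <= 0:
        raise ValueError("segment_len and dilation must be positive")
    pairs = 0
    for start in range(0, n_tiles, segment_len):
        seg = min(segment_len, n_tiles - start)   # last segment may be shorter
        kept = len(range(0, seg, dilation))       # tiles kept in this segment
        pairs += kept * kept
    return pairs


n = 50_000  # upper end of the tile counts quoted above
full = full_attention_pairs(n)
sparse = dilated_attention_pairs(n, segment_len=1_000, dilation=4)
print(full, sparse, full // sparse)
```

With these toy settings, the full scheme scores 2.5 billion pairs and the sparse one about 3.1 million, a factor of 800. The actual LongNet design mixes several segment lengths and dilation rates so that distant tiles can still exchange information; this sketch uses a single pair only to show the order of magnitude.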
Training data come from the Providence Health hospital system in the United States: 171,189 digitized slides from 30,060 patients, collected across 28 cancer centres over the 2017-2023 period and covering 31 major tissue types. All of it is American, and all of it comes from a single hospital network. Evaluation is then done on 26 benchmark tasks built from Providence and TCGA data, covering subtype classification, image-based mutation detection, and survival prediction.
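As a quick sanity check on these orders of magnitude, the headline figures can be divided out. This is simple arithmetic on the numbers quoted in this article (the tile count is a rounded figure), not a statistic reported by the paper.

```python
TILES = 1_300_000_000   # rounded figure quoted in the article
SLIDES = 171_189
PATIENTS = 30_060

tiles_per_slide = TILES / SLIDES
slides_per_patient = SLIDES / PATIENTS

print(round(tiles_per_slide), round(slides_per_patient, 1))
```

The averages come out at roughly 7,600 tiles per slide and 5.7 slides per patient, consistent with the range of 5,000 to 50,000 tiles per slide mentioned above.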
The results
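On 18 of the 26 tasks, GigaPath significantly outperforms the second-best model, and the paper reports the best score on 25 of them. The comparison is with earlier pathology models, mainly CTransPath, the reference model published in late 2022. The biggest gains concern three areas.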
Rare cancer subtype classification, where traditional datasets lack examples. On certain lymphoma or sarcoma classification tasks, GigaPath gains 3 to 8 points of AUC (area under the ROC curve, which measures the ability to distinguish a positive from a negative — 1 is perfect, 0.5 is chance).
Genetic mutation prediction from images alone — for example detecting a PIK3CA mutation in breast cancer just by looking at histology, without sequencing DNA. This is a non-obvious use of images, and GigaPath gains several points of AUC on mutations such as TP53, KRAS, PIK3CA.
Survival prediction for certain cancers from the histology image. On glioblastoma and certain breast cancer subtypes, GigaPath improves patient stratification into risk groups.
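The AUC quoted in the results above has a simple pairwise reading: it is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition. The labels and scores are made-up toy values, and the quadratic loop is chosen for clarity, not speed.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for lab, s in zip(labels, scores) if lab == 1]
    negatives = [s for lab, s in zip(labels, scores) if lab == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy example: one positive (0.4) is ranked below one negative (0.5).
toy_labels = [1, 1, 0, 0]
toy_scores = [0.9, 0.4, 0.5, 0.1]
print(roc_auc(toy_labels, toy_scores))  # 3 of 4 pairs correct -> 0.75
```

Read this way, a gain of a few AUC points means a few more positive/negative pairs out of every hundred are ordered correctly, which is why such gains are meaningful but not transformative on their own.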
What's good
Three notable strengths.
The training scale is a large step up from earlier open models. CTransPath in 2022 used about 32,000 slides; GigaPath uses about 171,000. The scaling rule of foundation models, *more data, more parameters, better performance*, seems to hold in pathology too.
Adapting LongNet to whole slides is a genuine technical contribution. It allows the model to process an entire slide as a single sequence without artificial cropping, capturing spatial relationships between distant regions. This is useful, for example, for cancers with extended stromal components.
Code and model weights have been released on GitHub and Hugging Face under a non-commercial license open to academic research. This is better than the fully proprietary models of some competitors, and it allows replication and extension by other teams.
What's less good
Three serious limitations to keep in mind.
Training data come from a single hospital system. Providence Health is a large network (51 hospitals), but it is entirely American and probably uses fairly homogeneous fixation and staining protocols. Pathology is sensitive to technical variation between labs: the same cancer does not look exactly the same depending on the scanner, the fixation time, and the operator. No prospective validation on European, Asian, or African populations is reported. Performance outside the American context remains to be proven.
The non-commercial license blocks real clinical use. No hospital can deploy GigaPath in production diagnosis without negotiating a separate agreement with Microsoft. This is commercially understandable, but it means the model remains a research tool, not a clinical tool. Several competitors, including RudolfV (Aignostics) and Virchow (Paige), are also under restrictive or fully proprietary licenses. The field has a commons problem.
The comparative evaluation is partial. GigaPath is compared primarily to CTransPath (2022) and a few earlier models. But 2024 saw the parallel emergence of several other pathology foundation models (RudolfV, Virchow, Phikon-v2) not systematically compared. Without independent rigorous benchmarking between these models, the "state of the art" claim warrants caution.
Additional note: all lead authors work for Microsoft Research or Providence Health, which hold the model rights. Five of the seven corresponding authors are employees of the sponsor. This does not disqualify the result, but an independent replication study would be welcome.
What changes
For the research community, this is a new baseline. GigaPath joins a few other available models that can be fine-tuned on any pathology task with little annotated data. Experimentation cost drops, innovation accelerates.
For clinical pathologists, nothing changes immediately. No routine deployment is imminent: it would require multi-center prospective validation, regulatory clearance (FDA authorization as Software as a Medical Device, CE marking in Europe), and integration into existing digital slide management workflows. Realistic horizon: 3 to 7 years for widespread clinical use, starting with limited indications (rare tumor subtyping, where AI is faster than expert consultation).
For patients and the general public, the change is distant but real. Pathology is the medical discipline most likely to be profoundly transformed by AI in the next ten years, because it relies so heavily on visual pattern analysis, which is exactly what these models do. What is quietly being prepared in papers like GigaPath will eventually change the speed, consistency, and probably the accuracy of oncology diagnoses.
Further reading
Prov-GigaPath code and weights are available on GitHub and Hugging Face under non-commercial license. For an overview of other pathology foundation models, see the 2024 review by Zhang et al. in npj Digital Medicine. For the debate on FDA regulation of foundation models in imaging, the FDA's 2024 report on "AI/ML-Enabled Medical Devices" is openly available.