Towards robust foundation models for digital pathology.

Abstract

Biomedical foundation models (FMs) are transforming AI-enabled healthcare research and entering clinical validation. However, their susceptibility to learning non-biological features, including variations in laboratory procedures and scanner hardware, poses risks for clinical deployment. We introduce PathoROB, a public benchmark quantifying FM robustness to non-biological features. Representation-level robustness is assessed using the robustness index, while output-level robustness is evaluated across clinically relevant settings, including patch- and slide-level prediction, case retrieval, and clustering tasks. Our experiments reveal robustness deficits across all 20 evaluated FMs, with substantial differences between them. We find that non-robust FM representations can cause major downstream diagnostic errors that prevent safe clinical adoption. Using more robust FMs, vision-language alignment, and post-hoc robustification reduces (but does not yet eliminate) the risk of such errors. This work establishes that robustness evaluation is essential for validating pathology FMs before clinical adoption and provides a blueprint for assessing and improving robustness across biomedical domains.

More about this publication

Nature Communications
  • Volume 17
  • Issue 1
  • Publication date 11-06-2026
