
Crowdsourcing enables robust cell annotation for breast cancer pathology.

Abstract

The tumor microenvironment (TME) contains important morphological and molecular cues that help determine prognosis and therapeutic responses. Deep learning models can assist pathologists in assessing such biomarkers. However, generating high-quality annotations for computational pathology remains a significant bottleneck because of the expertise and considerable time required for fine-grained labeling. In this study, we evaluate the feasibility of using non-expert crowdsourcing for cell annotation in hematoxylin and eosin (H&E) samples of breast cancer tissue. Unlike prior work, performance assessment was based on high-fidelity ground-truth labels derived from multiplexed immunofluorescence (CODEX) data, allowing for accurate benchmarking of both experts and non-experts. We collected cell annotations from experts, semi-experts, and non-experts through Tilly, a gamified annotation application designed to train and engage users in identifying major cell types within the TME. Overall, our results show that non-expert crowdsourcing is a scalable and effective strategy for generating training data for the classification of major cell types in H&E images: tumor cells, lymphocytes, and fibroblasts. Moreover, combining large, crowdsourced datasets with smaller, high-quality subsets annotated using spatial proteomics may offer a practical annotation approach for developing more robust models while minimizing biases.

More about this publication

Scientific Reports
  • Publication date 11-05-2026
