The tumor microenvironment (TME) contains important morphological and molecular cues that help determine prognosis and therapeutic response. Deep learning models can assist pathologists in assessing such biomarkers. However, generating high-quality annotations for computational pathology remains a significant bottleneck due to the expertise and considerable time required for fine-grained labeling. In this study, we evaluate the feasibility of using non-expert crowdsourcing for cell annotation in hematoxylin and eosin (H&E) samples of breast cancer tissue. Unlike prior work, performance assessment was based on high-fidelity ground-truth labels derived from multiplexed immunofluorescence (CODEX) data, allowing accurate benchmarking of both experts and non-experts. We collected cell annotations from experts, semi-experts, and non-experts through Tilly, a gamified annotation application designed to train and engage users in identifying major cell types within the TME. Overall, our results show that non-expert crowdsourcing is a scalable and effective strategy for generating training data for the classification of major cell types in H&E images: tumor cells, lymphocytes, and fibroblasts. Moreover, combining large crowdsourced datasets with smaller, high-quality subsets annotated using spatial proteomics may offer a practical annotation strategy for developing more robust models while minimizing bias.