Accurate response evaluation is necessary to select complete responders (CRs) for a watch-and-wait approach. Deep learning may aid in this process, but has not yet been evaluated for this purpose. This study aimed to evaluate how accurately deep learning methods can assess response on endoscopic images of rectal cancer patients after neoadjuvant therapy.
This pilot study shows that deep learning has modest accuracy (AUCs 0.76-0.83) in assessing response on endoscopic images. This is not accurate enough for clinical decision making, and is lower than what is generally reported for experienced endoscopists. Deep learning models can, however, be improved further and may become useful in assisting endoscopists with response evaluation. More well-designed prospective studies are required.
A total of 226 patients were included in the study (117 (52%) were non-CRs; 109 (48%) were CRs). Across the different models, accuracy ranged from 0.67 to 0.75 and AUC from 0.76 to 0.83; positive and negative predictive values were 67-74% and 70-78%, and sensitivity and specificity were 68-79% and 66-75%, respectively. Overall, EfficientNet-B2 was the most successful model, with the highest diagnostic performance.
Rectal cancer patients diagnosed between January 2012 and December 2015 and treated with neoadjuvant (chemo)radiotherapy were retrospectively selected from a single institute. All patients underwent flexible endoscopy for response evaluation. The diagnostic performance (accuracy, area under the receiver operating characteristic curve (AUC), positive and negative predictive values, sensitivity and specificity) of different openly accessible deep learning networks was calculated. The reference standard was histology after surgery, or long-term outcome (>2 years of follow-up) in a watch-and-wait policy.
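For readers less familiar with the diagnostic performance measures listed above, the following is a minimal sketch of how they can be computed from a model's predictions. It is not the study's own analysis code. The function names (`diagnostic_metrics`, `auc`), the choice of CR as the positive class, and all counts and scores in the usage lines are illustrative assumptions, not study data.

```python
from typing import Dict, Sequence


def diagnostic_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute standard diagnostic measures from confusion-matrix counts.

    Assumption: the positive class is "complete responder" (CR), so tp is
    the number of CRs correctly predicted as CR, and so on.
    """
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    AUC equals the probability that a randomly chosen positive case receives
    a higher score than a randomly chosen negative case; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical counts and scores, for illustration only (not study data).
metrics = diagnostic_metrics(tp=8, fp=2, tn=7, fn=3)
example_auc = auc(scores=[0.9, 0.8, 0.3, 0.2], labels=[1, 0, 1, 0])
```

Note that accuracy and AUC computed this way are proportions between 0 and 1, whereas predictive values, sensitivity and specificity are conventionally reported as percentages.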