A Study on Self-Supervised Pretraining for Vision Problems in Gastrointestinal Endoscopy

Sanderson, Edward (ORCID: 0000-0002-3794-5513) and Matuszewski, Bogdan (ORCID: 0000-0001-7195-2509) (2024) A Study on Self-Supervised Pretraining for Vision Problems in Gastrointestinal Endoscopy. IEEE Access, 12. pp. 46181-46201.

PDF (VOR) - Published Version
Available under License Creative Commons Attribution.
5MB

Official URL: https://doi.org/10.1109/ACCESS.2024.3381517

Abstract

Solutions to vision tasks in gastrointestinal endoscopy (GIE) conventionally use image encoders pretrained in a supervised manner with ImageNet-1k as backbones. However, the use of modern self-supervised pretraining algorithms and a recent dataset of 100k unlabelled GIE images (Hyperkvasir-unlabelled) may allow for improvements. In this work, we study the fine-tuned performance of models with ResNet50 and ViT-B backbones pretrained in self-supervised and supervised manners with ImageNet-1k and Hyperkvasir-unlabelled (self-supervised only) in a range of GIE vision tasks. In addition to identifying the most suitable pretraining pipeline and backbone architecture for each task, out of those considered, our results suggest: that self-supervised pretraining generally produces more suitable backbones for GIE vision tasks than supervised pretraining; that self-supervised pretraining with ImageNet-1k is typically more suitable than pretraining with Hyperkvasir-unlabelled, with the notable exception of monocular depth estimation in colonoscopy; and that ViT-Bs are more suitable in polyp segmentation and monocular depth estimation in colonoscopy, ResNet50s are more suitable in polyp detection, and both architectures perform similarly in anatomical landmark recognition and pathological finding characterisation. We hope this work draws attention to the complexity of pretraining for GIE vision tasks, informs the development of more suitable approaches than the convention, and inspires further research on this topic to help advance this development.

Summary Table

Item Type:	Article
Creators (Authors or editors):	Sanderson, Edward (email: esanderson4@uclan.ac.uk; ORCID: https://orcid.org/0000-0002-3794-5513; ORCID Put Code: UNSPECIFIED); Matuszewski, Bogdan (email: bmatuszewski1@uclan.ac.uk; ORCID: https://orcid.org/0000-0001-7195-2509; ORCID Put Code: UNSPECIFIED)
Uncontrolled Keywords:	Gastrointestinal endoscopy; computer vision; self-supervised pretraining; anatomical landmark recognition; pathological finding characterisation; polyp detection; polyp segmentation; monocular depth estimation
Subjects:	I - Computer science > I440 - Computer vision
Schools:	School of Engineering and Computing > Engineering, Construction, Maths and Physics
Research Institutes:	Institute for Engineering & Technology Innovation (InETI)
Funders:	Science and Technology Facilities Council (ID: http://dx.doi.org/10.13039/501100000271)
Projects:	ST/S005404/1 (ID: UNSPECIFIED; Budget Code: UNSPECIFIED; URL: UNSPECIFIED)
ID Code:	50407
Depositing User ID:	Christopher Waddington
Date Deposited:	19 Jan 2024 10:45
Last Modified:	16 Jun 2025 20:00
