Review · BMJ · Mar 25, 2020 · Hybrid OA

Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies

Imperial College London · University College London · +7 more institutions

Indexed in: Crossref, PubMed

Abstract

Objective

To systematically examine the design, reporting standards, risk of bias, and claims of studies comparing the performance of diagnostic deep learning algorithms for medical imaging with that of expert clinicians.

Design

Systematic review.

Data sources

Medline, Embase, Cochrane Central Register of Controlled Trials, and the World Health Organization trial registry from 2010 to June 2019.

Eligibility criteria for selecting studies

Randomised trial registrations and non-randomised studies comparing the performance of a deep learning algorithm in medical imaging with a contemporary group of one or more expert clinicians. Medical imaging has seen a growing interest in deep learning research. The main distinguishing feature of convolutional neural networks (CNNs) in deep learning is that when CNNs are fed with raw data, they develop their own representations needed for pattern recognition. The algorithm learns for itself the features of an image that are important for classification rather than being told by humans which features to use. The selected studies aimed to use medical imaging for predicting absolute risk of existing disease or classification into diagnostic groups (eg, disease or non-disease). For example, raw chest radiographs are tagged with a label such as pneumothorax or no pneumothorax, and the CNN learns which pixel patterns suggest pneumothorax.

Review methods

Adherence to reporting standards was assessed by using CONSORT (consolidated standards of reporting trials) for randomised studies and TRIPOD (transparent reporting of a multivariable prediction model for individual prognosis or diagnosis) for non-randomised studies. Risk of bias was assessed by using the Cochrane risk of bias tool for randomised studies and PROBAST (prediction model risk of bias assessment tool) for non-randomised studies.
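The review methods pair each study design with one reporting standard and one risk of bias tool. A minimal sketch of that pairing as a lookup follows; the names `AppraisalTools`, `TOOLS_BY_DESIGN`, and `appraisal_tools` are hypothetical and introduced only for illustration, and nothing here is taken from the review's own materials beyond the four tool names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppraisalTools:
    """Tools applied to a study of a given design (illustrative type)."""
    reporting_standard: str
    risk_of_bias_tool: str


# Pairing as stated in the review methods; the dictionary keys are an
# assumption about how a study's design might be labelled.
TOOLS_BY_DESIGN = {
    "randomised": AppraisalTools("CONSORT", "Cochrane risk of bias tool"),
    "non-randomised": AppraisalTools("TRIPOD", "PROBAST"),
}


def appraisal_tools(design: str) -> AppraisalTools:
    """Return the reporting standard and risk of bias tool for a design."""
    key = design.strip().lower()
    if key not in TOOLS_BY_DESIGN:
        raise ValueError(f"unknown study design: {design!r}")
    return TOOLS_BY_DESIGN[key]


if __name__ == "__main__":
    for design in ("randomised", "non-randomised"):
        tools = appraisal_tools(design)
        print(f"{design}: {tools.reporting_standard}, {tools.risk_of_bias_tool}")
```

Keeping the pairing in one table makes it explicit that a study's design alone decides which checklist and which bias tool apply.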

Citation impact

  • Total citations: 1,035
  • FWCI: 40.22
  • Percentile: 100%
  • References: 40

Authors

10

Topics & keywords

Keywords
  • Artificial intelligence
  • Medicine
  • MEDLINE
  • Deep learning
  • Machine learning
  • Systematic review
  • Convolutional neural network
  • Medical physics