Designing Chest X-ray Datasets for Improving Lung Nodules Detection Through Convolutional Neural Networks
In this paper, we propose a method for building alternative training datasets for lung nodule detection in plain chest X-ray images. Our aim is to improve the classification quality of a state-of-the-art CNN solely by selecting appropriate samples from existing datasets. Our hypothesis is that high-quality models need to learn by contrasting very clean images with images containing nodules, especially nodules that are difficult for non-expert clinicians to identify. Current chest X-ray datasets mostly contain images in which more than one pathology is present and/or which show devices such as catheters. This is because most samples come from elderly patients, who are the people most commonly referred for X-ray examination. We evaluate several combinations of samples from existing datasets in the literature. The results show a large gain in performance for some of the evaluated combinations, confirming our hypothesis. The performance achieved by these models allows a considerable speed-up in the screening of patients by radiologists.
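As an illustration only, the sample-selection idea can be sketched as a filter over per-image metadata. The record fields (`image_id`, `labels`, `has_device`), the label strings `"No Finding"` and `"Nodule"`, and the `nodule_only` switch are assumptions made for this sketch; they are not taken from the paper and do not reproduce its actual selection rules.

```python
from typing import Any, Dict, Iterable, List, Tuple


def select_training_samples(
    records: Iterable[Dict[str, Any]],
    nodule_only: bool = False,
) -> List[Tuple[str, int]]:
    """Return (image_id, target) pairs: 0 for clean images, 1 for nodule images.

    Hypothetical record layout (an assumption of this sketch):
      - "image_id": str
      - "labels": list of finding names for the image
      - "has_device": True if a device such as a catheter is visible
    """
    selected: List[Tuple[str, int]] = []
    for rec in records:
        labels = set(rec["labels"])
        has_device = bool(rec.get("has_device", False))

        if "Nodule" in labels:
            # Optionally keep only images whose sole finding is a nodule
            # and that show no device.
            if nodule_only and (labels != {"Nodule"} or has_device):
                continue
            selected.append((rec["image_id"], 1))
        elif labels == {"No Finding"} and not has_device:
            # "Very clean" negative: no pathology and no device.
            selected.append((rec["image_id"], 0))
        # Everything else (other pathologies, devices without nodules)
        # is left out of the training set.
    return selected
```

Calling the function with `nodule_only=False` and then with `nodule_only=True` gives two candidate training sets, which is one simple way to compare combinations of samples in the spirit described above.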