Railway Subgrade Defect Automatic Recognition Method Based on Improved Faster R-CNN

2018 ◽  
Vol 2018 ◽  
pp. 1-12 ◽  
Author(s):  
Xinjun Xu ◽  
Yang Lei ◽  
Feng Yang

Railway subgrade defects are a serious threat to train safety. The vehicle-borne GPR method has become the main railway subgrade detection technology owing to its rapidness and nondestructiveness. However, due to the large amount of detection data and the variety of defect shapes and sizes, defect recognition is a challenging task. In this work, a deep learning-based method is proposed to recognize defects from the ground penetrating radar (GPR) profile of subgrade detection data. Based on the Faster R-CNN framework, the improvement strategies of feature cascade, adversarial spatial dropout network (ASDN), Soft-NMS, and data augmentation are integrated to improve recognition accuracy, according to the characteristics of subgrade defects. The experimental results indicate that the improved model outperforms both the traditional SVM+HOG method and the baseline Faster R-CNN. Model robustness is demonstrated by a further comparison experiment across various defect types, and the contribution of each improvement strategy is verified by an ablation experiment. This paper explores new directions for applying deep learning to railway subgrade defect recognition.
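The Soft-NMS strategy integrated above replaces the hard suppression step of standard NMS with a score decay, which helps when defects of varying shape and size overlap. A minimal NumPy sketch of Gaussian Soft-NMS (the function names, sigma, and threshold values are illustrative defaults, not taken from the paper):

```python
import numpy as np

def iou(box, boxes):
    # box: [x1, y1, x2, y2]; boxes: (N, 4) array of candidate boxes.
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.maximum(0.0, x2 - x1) * np.maximum(0.0, y2 - y1)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter + 1e-9)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    # Gaussian Soft-NMS: instead of discarding boxes that overlap the
    # current top detection, decay their scores by exp(-IoU^2 / sigma).
    boxes = boxes.astype(float).copy()
    scores = scores.astype(float).copy()
    keep = []
    idxs = np.arange(len(scores))
    while len(idxs) > 0:
        top = idxs[np.argmax(scores[idxs])]
        keep.append(top)
        idxs = idxs[idxs != top]
        if len(idxs) == 0:
            break
        overlaps = iou(boxes[top], boxes[idxs])
        scores[idxs] *= np.exp(-(overlaps ** 2) / sigma)
        idxs = idxs[scores[idxs] > score_thresh]
    return keep, scores
```

Compared with hard NMS, a heavily overlapped lower-scoring detection survives with a reduced score rather than being deleted, which preserves recall for adjacent defects.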

2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Anfu Zhu ◽  
Shuaihao Chen ◽  
Fangfang Lu ◽  
Congxiao Ma ◽  
Fengrui Zhang

Defect identification in tunnel lining is a laborious and time-consuming task that currently relies mainly on manual operation. This paper takes ground-penetrating radar images of internal lining defects as the research object, builds automatic recognition models with the popular VGG16 and ResNet34 convolutional neural networks (CNNs) for comparative study, and proposes an improved ResNet34 defect-recognition model. The SGD and Adam training algorithms are used to update network parameters, and the PyTorch deep learning framework is used to train the networks. The test results show that the ResNet34 network converges faster, achieves higher accuracy, and requires shorter training time than the VGG16 network. The ResNet34 network trained with the Adam algorithm achieves 99.08% accuracy. The improved ResNet34 network achieves an accuracy of 99.25% while reducing the parameter count by 4.22% compared with the original ResNet34, and can better identify defects in the lining. This research shows that deep learning methods can provide new ideas for the identification of tunnel lining defects.


2021 ◽  
Vol 39 (3) ◽  
pp. 408-418 ◽  
Author(s):  
Changro Lee

Purpose: Prior studies on the application of deep learning techniques have focused on enhancing computation algorithms. However, the amount of data is also a key element when attempting to achieve a goal using a quantitative approach, and it is often underestimated in practice. The problem of sparse sales data is well known in the valuation of commercial properties. This study aims to expand the limited data available to exploit the capability inherent in deep learning techniques.

Design/methodology/approach: A deep learning approach is used. First, Seoul, the capital of South Korea, is selected as the case study area. Second, data augmentation is performed for properties with low trade volume in the market using a variational autoencoder (VAE), a generative deep learning technique. Third, the generated samples are added to the original dataset of commercial properties to alleviate data insufficiency. Finally, the accuracy of price estimation is analyzed for the original and augmented datasets to assess model performance.

Findings: The results, using the sales datasets of commercial properties in Seoul, South Korea, as a case study, show that the dataset augmented by a VAE consistently yields higher price-estimation accuracy across all 30 trials, and that the capabilities inherent in deep learning techniques can be fully exploited, promoting the rapid adoption of artificial intelligence skills in the real estate industry.

Originality/value: Although deep learning-based algorithms are gaining popularity, they are likely to show limited performance when data are insufficient. This study suggests an alternative approach to overcome the lack-of-data problem in property valuation.
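The VAE-based augmentation described above hinges on the reparameterization trick, which keeps the sampling step differentiable so that synthetic samples can be drawn from the learned latent distribution and decoded into new records. A minimal NumPy sketch of that sampling step (the latent statistics below are illustrative placeholders, not values from the study):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var, rng):
    # z = mu + sigma * eps with eps ~ N(0, I): sampling stays
    # differentiable with respect to the encoder outputs mu and log_var.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Hypothetical latent statistics for a thin market segment, as a trained
# encoder might produce them (illustrative numbers only).
mu = np.zeros((5, 2))            # 5 synthetic draws, latent dimension 2
log_var = np.full((5, 2), -1.0)  # log-variance below 0 -> tight samples

z = reparameterize(mu, log_var, rng)
# A trained decoder would map each z back to a synthetic property record,
# which is then appended to the sparse original dataset.
```

The augmented training set is simply the concatenation of the original transactions and the decoded synthetic records.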


2020 ◽  
Vol 12 (22) ◽  
pp. 3715 ◽  
Author(s):  
Minsoo Park ◽  
Dai Quoc Tran ◽  
Daekyo Jung ◽  
Seunghee Park

To minimize the damage caused by wildfires, a deep learning-based wildfire-detection technology that extracts features and patterns from surveillance camera images was developed. However, many studies on deep learning-based wildfire-image classification have highlighted the problem of data imbalance between wildfire images and forest images, which degrades model performance. In this study, wildfire images were generated using a cycle-consistent generative adversarial network (CycleGAN) to eliminate the data imbalance. In addition, a densely connected convolutional networks (DenseNet)-based framework was proposed and its performance compared with pre-trained models. When trained on a set containing GAN-generated images, the proposed DenseNet-based model achieved the best results among the compared models, with an accuracy of 98.27% and an F1 score of 98.16 on the test dataset. Finally, the trained model was applied to high-quality drone images of wildfires. The experimental results showed that the proposed framework achieves high wildfire-detection accuracy.
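CycleGAN, used above to synthesize wildfire images, is trained with a cycle-consistency penalty that forces its two mapping networks to be near-inverses of each other: an image translated to the other domain and back should return to itself. A minimal NumPy sketch of that loss term (the weight lam = 10 follows common practice for CycleGAN and is an assumption here, not a value from the paper):

```python
import numpy as np

def cycle_consistency_loss(x, x_reconstructed, y, y_reconstructed, lam=10.0):
    # L1 penalty on both directions of the cycle:
    # x -> G(x) -> F(G(x)) should return to x, and y -> F(y) -> G(F(y)) to y.
    forward = np.mean(np.abs(x_reconstructed - x))
    backward = np.mean(np.abs(y_reconstructed - y))
    return lam * (forward + backward)
```

During training this term is added to the two adversarial losses, so the generators cannot hallucinate arbitrary content that the reverse mapping could not undo.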


2020 ◽  
Vol 15 (12) ◽  
pp. 1975-1988
Author(s):  
Luisa F. Sánchez-Peralta ◽  
Artzai Picón ◽  
Francisco M. Sánchez-Margallo ◽  
J. Blas Pagador

Purpose: Data augmentation is a common technique to overcome the lack of large annotated databases, a usual situation when applying deep learning to medical imaging problems. Nevertheless, there is no consensus on which transformations to apply for a particular field. This work aims at identifying the effect of different transformations on polyp segmentation using deep learning.

Methods: A set of transformations and ranges was selected, considering image-based (width and height shift, rotation, shear, zooming, horizontal and vertical flip, and elastic deformation), pixel-based (changes in brightness and contrast), and application-based (specular lights and blurry frames) transformations. A model was trained under the same conditions without data augmentation (baseline) and for each transformation and range, using CVC-EndoSceneStill and Kvasir-SEG independently. Statistical analysis was performed to compare the baseline performance against the results of each range of each transformation on the same test set for each dataset.

Results: This basic method identifies the most adequate transformations for each dataset. For CVC-EndoSceneStill, changes in brightness and contrast significantly improve model performance. By contrast, Kvasir-SEG benefits to a greater extent from the image-based transformations, especially rotation and shear. Augmentation with synthetic specular lights also improves performance.

Conclusion: Despite being infrequently used, pixel-based transformations show great potential to improve polyp segmentation in CVC-EndoSceneStill, whereas image-based transformations are more suitable for Kvasir-SEG. Application-based transformations behave similarly in both datasets. The polyp area, brightness, and contrast of each dataset influence these differences.
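The pixel-based transformations that proved most effective for CVC-EndoSceneStill amount to simple brightness and contrast adjustments of the frame. A minimal NumPy version (the mean-centered contrast formulation and the 0-255 intensity range are illustrative assumptions, not the exact implementation used in the study):

```python
import numpy as np

def adjust_brightness_contrast(img, brightness=0.0, contrast=1.0):
    # Pixel-based augmentation: scale intensities around the image mean
    # (contrast) and add a constant offset (brightness), then clip to the
    # valid 8-bit range.
    img = img.astype(float)
    mean = img.mean()
    out = (img - mean) * contrast + mean + brightness
    return np.clip(out, 0.0, 255.0)
```

At training time the `brightness` and `contrast` parameters would be drawn at random from the ranges under evaluation, producing a perturbed copy of each frame while the segmentation mask stays unchanged.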


Author(s):  
Ramaprasad Poojary ◽  
Roma Raina ◽  
Amit Kumar Mondal

During the last few years, deep learning has achieved remarkable results in the field of machine learning when used for computer vision tasks. Among its many architectures, the deep neural network-based architecture known as the convolutional neural network is now widely used for image detection and classification. Although it is a great tool for computer vision tasks, it demands a large amount of training data to yield high performance. In this paper, a data augmentation method is proposed to overcome the challenges posed by a lack of sufficient training data. To analyze the effect of data augmentation, the proposed method uses two convolutional neural network architectures. To minimize training time without compromising accuracy, the models are built by fine-tuning the pre-trained networks VGG16 and ResNet50. Loss functions and accuracies are used to evaluate model performance. The proposed models are constructed using the Keras deep learning framework and trained on a custom dataset created from the Kaggle CAT vs DOG database. Experimental results showed that both models achieved better test accuracy when data augmentation was employed, and the ResNet50-based model outperformed the VGG16-based model, with a test accuracy of 90% with data augmentation and 82% without.


2020 ◽  
Author(s):  
Xin He ◽  
Shihao Wang ◽  
Shaohuai Shi ◽  
Xiaowen Chu ◽  
Jiangping Tang ◽  
...  

The COVID-19 pandemic has spread all over the world for months. As its transmissibility and high pathogenicity seriously threaten people's lives, accurate and fast detection of COVID-19 infection is crucial. Although many recent studies have shown that deep learning-based solutions can help detect COVID-19 from chest CT scans, a consistent and systematic comparison and evaluation of these techniques has been lacking. In this paper, we first build a clean, segmented CT dataset called Clean-CC-CCII by fixing errors and removing noise in the large CT scan dataset CC-CCII, which has three classes: novel coronavirus pneumonia (NCP), common pneumonia (CP), and normal controls (Normal). After cleaning, our dataset consists of a total of 340,190 slices from 3,993 scans of 2,698 patients. We then benchmark and compare the performance of a series of state-of-the-art (SOTA) 3D and 2D convolutional neural networks (CNNs). The results show that 3D CNNs outperform 2D CNNs in general. With extensive hyperparameter tuning, we find that the 3D CNN model DenseNet3D121 achieves the highest accuracy of 88.63% (F1-score 88.14%, AUC 0.940), and another 3D CNN model, ResNet3D34, achieves the best AUC of 0.959 (accuracy 87.83%, F1-score 86.04%). We further demonstrate that the mixup data augmentation technique can largely improve model performance. Finally, we design an automated deep learning methodology to generate a lightweight deep learning model, MNas3DNet41, which achieves an accuracy of 87.14%, F1-score of 87.25%, and AUC of 0.957, on par with the best models made by AI experts. Automated deep learning design is a promising methodology that can help health-care professionals develop effective deep learning models using their private datasets. Our Clean-CC-CCII dataset and source code are available at: https://github.com/arthursdays/HKBU HPML COVID-19.
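The mixup augmentation credited above with a large performance gain forms convex combinations of pairs of inputs and their labels, smoothing the decision boundary between classes. A minimal NumPy sketch (alpha = 0.4 is a commonly used value and an assumption here, not necessarily the paper's setting):

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.4, rng=None):
    # Draw lambda ~ Beta(alpha, alpha) and form the same convex
    # combination of both the inputs and the (one-hot) labels.
    if rng is None:
        rng = np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y, lam
```

In practice the pairs are taken from a shuffled copy of the same mini-batch, so mixup adds almost no overhead per training step.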


2020 ◽  
Vol 15 (1) ◽  
Author(s):  
Ward van Rooij ◽  
Max Dahele ◽  
Hanne Nijhuis ◽  
Berend J. Slotman ◽  
Wilko F. Verbakel

Background: Deep learning-based delineation of organs-at-risk for radiotherapy has been investigated to reduce the time-intensiveness and inter-/intra-observer variability associated with manual delineation. We systematically evaluated ways to improve the performance and reliability of deep learning for organ-at-risk segmentation, with the salivary glands as the paradigm. Improving deep learning performance is clinically relevant, with applications ranging from the initial contouring process to online adaptive radiotherapy.

Methods: Various experiments were designed: increasing the amount of training data (1) with original images, (2) with traditional data augmentation and (3) with domain-specific data augmentation; (4) the influence of data quality was tested by comparing training/testing on clinical versus curated contours; (5) the effect of several custom cost functions was explored; (6) patient-specific Hounsfield unit windowing was applied during inference; and lastly, (7) the effect of model ensembles was analyzed. Model performance was measured with geometric parameters and model reliability with the variance of those parameters.

Results: A positive effect was observed from increasing the (1) training set size, (2/3) data augmentation, (6) patient-specific Hounsfield unit windowing and (7) model ensembles. The effects of the strategies on performance diminished when the base model performance was already high. Combining all beneficial strategies increased the average Sørensen–Dice coefficient by about 4% and 3% and decreased the standard deviation by about 1% and 1% for the submandibular and parotid gland, respectively.

Conclusions: A subset of the investigated strategies had a positive effect on model performance and reliability. The expected clinical impact of such strategies is a reduction in post-segmentation editing, which facilitates the adoption of deep learning for automated salivary gland segmentation.
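The Sørensen–Dice coefficient used above as the geometric performance measure quantifies the overlap between a predicted and a reference mask. A minimal NumPy version for binary masks (the smoothing term eps is an illustrative convention to avoid division by zero on empty masks):

```python
import numpy as np

def dice_coefficient(pred, target, eps=1e-7):
    # Sørensen–Dice: 2 * |A ∩ B| / (|A| + |B|) for binary masks.
    pred = pred.astype(bool)
    target = target.astype(bool)
    intersection = np.logical_and(pred, target).sum()
    return (2.0 * intersection + eps) / (pred.sum() + target.sum() + eps)
```

A value of 1 indicates a perfect match with the reference contour; reporting the variance of this score across patients gives the reliability measure used in the study.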


2021 ◽  
Vol 13 (22) ◽  
pp. 4590
Author(s):  
Yunpeng Yue ◽  
Hai Liu ◽  
Xu Meng ◽  
Yinguang Li ◽  
Yanliang Du

Deep learning models have achieved success in image recognition and have shown great potential for the interpretation of ground penetrating radar (GPR) data. However, training reliable deep learning models requires massive amounts of labeled data, which are usually not easy to obtain due to the high costs of data acquisition and field validation. This paper proposes an improved least squares generative adversarial network (LSGAN) model that combines the loss functions of LSGAN and convolutional neural networks (CNNs) to generate GPR images. The model can generate high-precision GPR data to address the scarcity of labeled GPR data. We evaluate the proposed model using the Fréchet Inception Distance (FID) and find that it outperforms two existing GAN models with a lower FID score. In addition, the suitability of the LSGAN-generated images for GPR data augmentation is investigated with a YOLOv4 model employed to detect rebars in field GPR images. Including LSGAN-generated images in the training GPR dataset increases target diversity and improves detection precision by 10% compared with a model trained on a dataset containing 500 field GPR images.
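The least squares objectives that give LSGAN its name replace the cross-entropy terms of a standard GAN with squared distances to the real/fake target labels, which yields smoother gradients for samples far from the decision boundary. A minimal NumPy sketch of the two objectives (operating on raw discriminator scores; the 0/1 target coding is the common LSGAN convention and an assumption here):

```python
import numpy as np

def lsgan_d_loss(d_real, d_fake):
    # Discriminator pushes scores on real samples toward 1
    # and scores on generated samples toward 0.
    return 0.5 * np.mean((d_real - 1.0) ** 2) + 0.5 * np.mean(d_fake ** 2)

def lsgan_g_loss(d_fake):
    # Generator pushes the discriminator's score on its samples toward 1.
    return 0.5 * np.mean((d_fake - 1.0) ** 2)
```

In the paper's setup this adversarial objective would be combined with a CNN-based loss term; the sketch above covers only the LSGAN part.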


Author(s):  
Jihun Ahn ◽  
Ye Chan Kim ◽  
So Youn Kim ◽  
Su-Mi Hur ◽  
Vikram Thapar
