Automated Threshold Selection for Cryo-EM Density Maps

Mapping Intimacies ◽

10.1101/657395 ◽

2019 ◽

Cited By ~ 1

Author(s):

Jonas Pfab ◽

Dong Si

Keyword(s):

Deep Learning ◽

Structure Prediction ◽

Selection Process ◽

Threshold Level ◽

Training Dataset ◽

Threshold Selection ◽

Threshold Levels ◽

Density Maps ◽

Selection For ◽

Density Threshold

Abstract: Recent advances in cryo-EM have made it possible to create protein density maps at near-atomic resolution. This has contributed to the method's wide popularity, resulting in a rapidly growing number of available cryo-EM density maps. To process these maps computationally, an electron density threshold level is required that defines a lower bound for density values. In the context of this paper, the threshold level is required in a pre-processing step of the backbone structure prediction project, which uses deep learning to predict the locations of the Cα atoms of a protein's backbone from its cryo-EM density map. A custom threshold level has to be selected for each prediction in order to reduce noise that could mislead the deep learning model. Automating this threshold selection makes it easier to run predictions and removes the dependence of prediction accuracy on a person's ability to choose the right threshold value. This paper presents a method to automate threshold selection for the previously mentioned project as well as for other problems that require a density threshold level. The method uses two metrics, the surface-area-to-volume ratio and the ratio of voxels above the threshold level to non-zero voxels, to derive characteristics of suitable threshold levels from a training dataset. The threshold selection was tested by integrating it into the backbone prediction project and comparing the accuracy of predictions made with automatically and manually selected thresholds. We found no loss in accuracy when using the automatically selected threshold levels, indicating that they are as good as manually selected ones. The source code related to this paper can be found at https://github.com/DrDongSi/Auto-Thresholding.
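
The two metrics named in the abstract can be illustrated with a minimal Python sketch. It is not taken from the linked repository: the function name threshold_metrics is hypothetical, the density map is assumed to be a nested list of voxel values, and the surface area is approximated by counting exposed voxel faces, which may differ from how the authors compute it.

```python
def threshold_metrics(density, threshold):
    """Return (surface_to_volume, above_to_nonzero) for a 3D grid of densities.

    Assumptions (not from the paper): surface area is the number of voxel
    faces that separate an above-threshold voxel from a voxel that is not
    above the threshold (or from the grid boundary); volume is the number of
    above-threshold voxels.
    """
    nx, ny, nz = len(density), len(density[0]), len(density[0][0])

    def above(x, y, z):
        inside = 0 <= x < nx and 0 <= y < ny and 0 <= z < nz
        return inside and density[x][y][z] > threshold

    neighbours = ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                  (0, -1, 0), (0, 0, 1), (0, 0, -1))
    volume = surface = nonzero = 0
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if density[x][y][z] != 0:
                    nonzero += 1
                if not above(x, y, z):
                    continue
                volume += 1
                # Count faces of this voxel that are exposed at this threshold.
                for dx, dy, dz in neighbours:
                    if not above(x + dx, y + dy, z + dz):
                        surface += 1

    surface_to_volume = surface / volume if volume else 0.0
    above_to_nonzero = volume / nonzero if nonzero else 0.0
    return surface_to_volume, above_to_nonzero
```

Evaluating this function over a range of candidate thresholds gives one pair of values per threshold; how those pairs are matched against characteristics learned from a training dataset is described in the paper and is not reproduced here.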

Multi-Objective Evolutionary Instance Selection for Regression Tasks

Entropy ◽

10.3390/e20100746 ◽

2018 ◽

Vol 20 (10) ◽

pp. 746 ◽

Cited By ~ 5

Author(s):

Mirosław Kordos ◽

Krystian Łapa

Keyword(s):

Pareto Front ◽

Selection Process ◽

Training Dataset ◽

Instance Selection ◽

Redundant Information ◽

NSGA-II ◽

Multi Objective ◽

Optimal Subset ◽

Dataset Size ◽

Selection For

The purpose of instance selection is to reduce the data size while preserving as much of the useful information stored in the data as possible and detecting and removing erroneous and redundant information. In this work, we analyze instance selection in regression tasks and apply the NSGA-II multi-objective evolutionary algorithm to direct the search for the optimal subset of the training dataset, using the k-NN algorithm to evaluate the solutions during the selection process. A key advantage of the method is that it yields a pool of solutions situated on the Pareto front, each of which is the best for a certain RMSE-compression balance. We discuss different parameters of the process and their influence on the results, and put special effort into reducing the computational complexity of our approach. The experimental evaluation shows that the proposed method achieves good performance in terms of minimizing both prediction error and dataset size.
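
The two objectives that the search trades off can be sketched as follows. This is an illustrative Python sketch, not the authors' implementation: the function names are hypothetical, candidate subsets are assumed to be encoded as 0/1 masks over the training instances, and the prediction error is assumed to be measured on a separate validation list. The NSGA-II search loop itself is omitted.

```python
import math


def knn_predict(train, query, k):
    """Average the targets of the k training instances nearest to query."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


def objectives(data, mask, validation, k=1):
    """Return (rmse, retained_fraction) for one candidate subset.

    data and validation are lists of (features, target) pairs; mask holds
    one 0/1 flag per instance in data. Both objectives are to be minimized.
    """
    subset = [row for row, keep in zip(data, mask) if keep]
    if not subset:
        return math.inf, 0.0  # an empty subset cannot predict anything
    squared = [(knn_predict(subset, x, k) - y) ** 2 for x, y in validation]
    rmse = math.sqrt(sum(squared) / len(squared))
    return rmse, len(subset) / len(data)


def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))
```

Solutions that no other candidate dominates form the Pareto front mentioned in the abstract, so a user can pick the subset whose error and size balance suits the task.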

Automated Threshold Selection for Cryo-EM Density Maps

Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics - BCB '19 ◽

10.1145/3307339.3342190 ◽

2019 ◽

Author(s):

Jonas Pfab ◽

Dong Si

Keyword(s):

Threshold Selection ◽

Density Maps ◽

Selection For

Optimal Threshold Selection for Realized Volatility Forecasts in the Presence of Jumps

SSRN Electronic Journal ◽

10.2139/ssrn.1714744 ◽

2010 ◽

Author(s):

Yixiao Sun ◽

Benjamin Fissel

Keyword(s):

Realized Volatility ◽

Optimal Threshold ◽

Threshold Selection ◽

Selection For ◽

Optimal Threshold Selection

Crystal Structure Prediction via Deep Learning

Journal of the American Chemical Society ◽

10.1021/jacs.8b03913 ◽

2018 ◽

Vol 140 (32) ◽

pp. 10158-10168 ◽

Cited By ~ 86

Author(s):

Kevin Ryan ◽

Jeff Lengyel ◽

Michael Shatruk

Keyword(s):

Crystal Structure ◽

Deep Learning ◽

Structure Prediction ◽

Crystal Structure Prediction

Label3DMaize: toolkit for 3D point cloud data annotation of maize shoots

GigaScience ◽

10.1093/gigascience/giab031 ◽

2021 ◽

Vol 10 (5) ◽

Author(s):

Teng Miao ◽

Weiliang Wen ◽

Yinglun Li ◽

Sheng Wu ◽

Chao Zhu ◽

...

Keyword(s):

Deep Learning ◽

Point Cloud ◽

Training Dataset ◽

Growth Stages ◽

3D Point Cloud ◽

Plant Structure ◽

Data Annotation ◽

Cloud Data ◽

Point Cloud Segmentation ◽

Different Growth Stages

Abstract: Background. The 3D point cloud is the most direct and effective data form for studying plant structure and morphology. In point cloud studies, the segmentation of individual plants into organs directly determines the accuracy of organ-level phenotype estimation and the reliability of the 3D plant reconstruction. However, highly accurate, automatic, and robust point cloud segmentation approaches for plants are unavailable, so the high-throughput segmentation of many shoots is challenging. Although deep learning can feasibly solve this issue, software tools for 3D point cloud annotation to construct the training dataset are lacking. Results. We propose a top-down point cloud segmentation algorithm using optimal transportation distance for maize shoots. We apply our point cloud annotation toolkit for maize shoots, Label3DMaize, to achieve semi-automatic point cloud segmentation and annotation of maize shoots at different growth stages through a series of operations, including stem segmentation, coarse segmentation, fine segmentation, and sample-based segmentation. The toolkit takes ∼4–10 minutes to segment a maize shoot and consumes 10–20% of the total time if only coarse segmentation is required. Fine segmentation is more detailed than coarse segmentation, especially at the organ connection regions. The accuracy of coarse segmentation can reach 97.2% of that of fine segmentation. Conclusion. Label3DMaize integrates point cloud segmentation algorithms and manual interactive operations, realizing semi-automatic point cloud segmentation of maize shoots at different growth stages. The toolkit provides a practical data annotation tool for further online segmentation research based on deep learning and is expected to promote automatic point cloud processing of various plants.

Study on Radar Echo-Filling in an Occlusion Area by a Deep Learning Algorithm

Remote Sensing ◽

10.3390/rs13091779 ◽

2021 ◽

Vol 13 (9) ◽

pp. 1779

Author(s):

Xiaoyan Yin ◽

Zhiqun Hu ◽

Jiafeng Zheng ◽

Boyong Li ◽

Yuanyuan Zuo

Keyword(s):

Deep Learning ◽

Loss Function ◽

Learning Algorithm ◽

Weather Radar ◽

Loss Functions ◽

Training Dataset ◽

Echo Intensity ◽

Common Mean ◽

Deep Learning Algorithm ◽

Radar Beam

Radar beam blockage is an important error source that affects the quality of weather radar data. An echo-filling network (EFnet) based on a deep learning algorithm is proposed to correct the echo intensity in the occlusion area of the Nanjing S-band new-generation weather radar (CINRAD/SA). The training dataset consists of labels, which are the echo intensities at the 0.5° elevation in the unblocked area, and input features, which are the intensities in the cube of multiple elevations and gates corresponding to the location of the bottom labels. Two loss functions are applied to compile the network: one is the common mean square error (MSE), and the other is a self-defined loss function that increases the weight of strong echoes. Considering that the radar beam broadens with distance and height, the 0.5° elevation scan is divided into six range bands of 25 km each to train different models. The models are evaluated by three indicators: explained variance (EVar), mean absolute error (MAE), and correlation coefficient (CC). Two cases are presented to compare the effect of the echo-filling model under the different loss functions. The results suggest that EFnet can effectively correct the echo reflectivity and improve the data quality in the occlusion area, with better results for strong echoes when the self-defined loss function is used.
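
The abstract does not give the form of the self-defined loss, so the following Python sketch shows only one plausible way to weight strong echoes more heavily than the common MSE does. The function name, the 35 dBZ cutoff, and the weight of 3 are all hypothetical values chosen for illustration.

```python
def weighted_mse(predicted, observed, strong_dbz=35.0, strong_weight=3.0):
    """Weighted mean square error that emphasizes strong echoes.

    Assumption: a sample counts as a strong echo when its observed
    reflectivity is at least strong_dbz, and it then receives strong_weight
    instead of weight 1. Both parameters are illustrative, not from the paper.
    """
    total = 0.0
    weight_sum = 0.0
    for p, o in zip(predicted, observed):
        weight = strong_weight if o >= strong_dbz else 1.0
        total += weight * (p - o) ** 2
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no samples given")
    return total / weight_sum
```

Setting strong_weight to 1 recovers the ordinary MSE, which makes the two loss functions compared in the abstract easy to contrast on the same data.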

Protein tertiary structure prediction and refinement using deep learning and Rosetta in CASP14

Proteins Structure Function and Bioinformatics ◽

10.1002/prot.26194 ◽

2021 ◽

Author(s):

Ivan Anishchenko ◽

Minkyung Baek ◽

Hahnbeom Park ◽

Naozumi Hiranuma ◽

David E. Kim ◽

...

Keyword(s):

Deep Learning ◽

Structure Prediction ◽

Tertiary Structure ◽

Tertiary Structure Prediction ◽

Protein Tertiary Structure ◽

Protein Tertiary Structure Prediction

Deep Learning Based Antenna Selection for Channel Extrapolation in FDD Massive MIMO

2020 International Conference on Wireless Communications and Signal Processing (WCSP) ◽

10.1109/wcsp49889.2020.9299795 ◽

2020 ◽

Author(s):

Yindi Yang ◽

Shun Zhang ◽

FeiFei Gao ◽

Chao Xu ◽

Jianpeng Ma ◽

...

Keyword(s):

Deep Learning ◽

Massive MIMO ◽

Antenna Selection ◽

Selection For

Evaluation of the feasibility of explainable computer-aided detection of cardiomegaly on chest radiographs using deep learning

Scientific Reports ◽

10.1038/s41598-021-96433-1 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Mu Sook Lee ◽

Yong Soo Kim ◽

Minki Kim ◽

Muhammad Usman ◽

Shi Sub Byon ◽

...

Keyword(s):

Deep Learning ◽

Diagnostic Performance ◽

Absolute Error ◽

Training Dataset ◽

Computer Aided Detection ◽

Test Dataset ◽

Cardiothoracic Ratio ◽

Computer Aided ◽

Chest X Ray ◽

Public Datasets

Abstract: We examined the feasibility of explainable computer-aided detection of cardiomegaly in routine clinical practice using segmentation-based methods. Overall, 793 retrospectively acquired posterior–anterior (PA) chest X-ray images (CXRs) of 793 patients were used to train deep learning (DL) models for lung and heart segmentation. The training dataset included PA CXRs from two public datasets and in-house PA CXRs. Two fully automated segmentation-based methods using state-of-the-art DL models for lung and heart segmentation were developed. The diagnostic performance was assessed and the reliability of the automatic cardiothoracic ratio (CTR) calculation was determined using the mean absolute error and paired t-test. The effects of thoracic pathological conditions on performance were assessed using subgroup analysis. One thousand PA CXRs of 1000 patients (480 men, 520 women; mean age 63 ± 23 years) were included. The CTR values derived from the DL models and diagnostic performance exhibited excellent agreement with reference standards for the whole test dataset. Performance of segmentation-based methods differed based on thoracic conditions. When tested using CXRs with lesions obscuring heart borders, the performance was lower than that for other thoracic pathological findings. Thus, segmentation-based methods using DL could detect cardiomegaly; however, the feasibility of computer-aided detection of cardiomegaly without human intervention was limited.
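
A segmentation-based CTR calculation can be sketched in Python as below. The abstract does not describe the exact procedure, so this assumes binary heart and lung masks given as lists of rows and takes the thoracic width as the horizontal extent of the lung mask, a simplification of the clinical measurement. All function names are hypothetical.

```python
def horizontal_extent(mask):
    """Width in pixels between the leftmost and rightmost foreground columns."""
    columns = [c for row in mask for c, value in enumerate(row) if value]
    return max(columns) - min(columns) + 1 if columns else 0


def cardiothoracic_ratio(heart_mask, lung_mask):
    """Ratio of maximal horizontal heart width to maximal thoracic width.

    Assumption: the thoracic width is approximated by the horizontal extent
    of the segmented lungs.
    """
    thoracic_width = horizontal_extent(lung_mask)
    if thoracic_width == 0:
        raise ValueError("lung mask is empty")
    return horizontal_extent(heart_mask) / thoracic_width


def mean_absolute_error(estimated, reference):
    """Mean absolute difference between automatic and reference CTR values."""
    return sum(abs(e - r) for e, r in zip(estimated, reference)) / len(reference)
```

Because the ratio is read directly off the masks, a reader can inspect the segmentation to see where the measurement came from, which is the sense in which such methods are explainable. It also shows why lesions that obscure the heart border degrade the result: the heart width depends entirely on the mask boundary.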

Effectiveness of transfer learning for enhancing tumor classification with a convolutional neural network on frozen sections

Scientific Reports ◽

10.1038/s41598-020-78129-0 ◽

2020 ◽

Vol 10 (1) ◽

Author(s):

Young-Gon Kim ◽

Sungchul Kim ◽

Cristina Eunbee Cho ◽

In Hye Song ◽

Hee Jin Lee ◽

...

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Transfer Learning ◽

Frozen Section ◽

Medical Center ◽

External Validation ◽

Model Performance ◽

Classification Model ◽

Training Dataset

Abstract: Fast and accurate confirmation of metastasis on the frozen tissue section of intraoperative sentinel lymph node biopsy is an essential tool for critical surgical decisions. However, accurate diagnosis by pathologists is difficult within the time limitations. Training a robust and accurate deep learning model is also difficult owing to the limited number of frozen datasets with high quality labels. To overcome these issues, we validated the effectiveness of transfer learning from CAMELYON16 to improve performance of the convolutional neural network (CNN)-based classification model on our frozen dataset (N = 297) from Asan Medical Center (AMC). Among the 297 whole slide images (WSIs), 157 and 40 WSIs were used to train deep learning models with different dataset ratios at 2, 4, 8, 20, 40, and 100%. The remaining 100 WSIs were used to validate model performance in terms of patch- and slide-level classification. An additional 228 WSIs from Seoul National University Bundang Hospital (SNUBH) were used for external validation. Three initial weights, i.e., scratch-based (random initialization), ImageNet-based, and CAMELYON16-based models, were used to validate their effectiveness in external validation. In the patch-level classification results on the AMC dataset, CAMELYON16-based models trained with a small dataset (up to 40%, i.e., 62 WSIs) showed a significantly higher area under the curve (AUC) of 0.929 than those of the scratch- and ImageNet-based models at 0.897 and 0.919, respectively, while CAMELYON16-based and ImageNet-based models trained with 100% of the training dataset showed comparable AUCs at 0.944 and 0.943, respectively. For the external validation, CAMELYON16-based models showed higher AUCs than those of the scratch- and ImageNet-based models. The feasibility of transfer learning to enhance model performance was validated in the case of frozen section datasets with limited numbers.
