Prediction of the functional impact of missense variants in BRCA1 and BRCA2 with BRCA-ML

Early Prediction of Sepsis in the ICU using Machine Learning: A Systematic Review.

10.1101/2020.08.31.20185207 ◽

2020 ◽

Author(s):

Michael Moor ◽

Bastian Rieck ◽

Max Horn ◽

Catherine Jutzeler ◽

Karsten Borgwardt

Keyword(s):

Machine Learning ◽

Systematic Review ◽

Quality Assessment ◽

Biomarker Discovery ◽

Learning Algorithms ◽

Quality Criteria ◽

Machine Learning Algorithms ◽

Data Sources ◽

Synthesis Methods ◽

Early Prediction

Background: Sepsis is among the leading causes of death in intensive care units (ICU) worldwide and its recognition, particularly in the early stages of the disease, remains a medical challenge. The advent of an abundance of digital health data has created a setting in which machine learning can be used for digital biomarker discovery, with the ultimate goal of advancing the early recognition of sepsis. Objective: To systematically review and evaluate studies employing machine learning for the prediction of sepsis in the ICU. Data sources: Using Embase, Google Scholar, PubMed/Medline, Scopus, and Web of Science, we systematically searched the existing literature for machine learning-driven sepsis onset prediction for patients in the ICU. Study eligibility criteria: All peer-reviewed articles using machine learning for the prediction of sepsis onset in adult ICU patients were included. Studies focusing on patient populations outside the ICU were excluded. Study appraisal and synthesis methods: A systematic review was performed according to the PRISMA guidelines. Moreover, a quality assessment of all eligible studies was performed. Results: Out of 974 identified articles, 22 and 21 met the criteria to be included in the systematic review and quality assessment, respectively. A multitude of machine learning algorithms were applied to refine the early prediction of sepsis. The quality of the studies ranged from "poor" (satisfying less than 40% of the quality criteria) to "very good" (satisfying more than 90% of the quality criteria). The majority of the studies (n = 19, 86.4%) employed an offline training scenario combined with a horizon evaluation, while two studies implemented an online scenario (n = 2, 9.1%). The massive inter-study heterogeneity in terms of model development, sepsis definition, prediction time windows, and outcomes precluded a meta-analysis. Last, only 2 studies provided publicly accessible source code and data sources fostering reproducibility. Limitations: Articles were only eligible for inclusion when employing machine learning algorithms for the prediction of sepsis onset in the ICU. This restriction led to the exclusion of studies focusing on the prediction of septic shock, sepsis-related mortality, and patient populations outside the ICU. Conclusions and key findings: A growing number of studies employ machine learning to optimise the early prediction of sepsis through digital biomarker discovery. This review, however, highlights several shortcomings of the current approaches, including low comparability and reproducibility. Finally, we gather recommendations on how these challenges can be addressed before deploying these models in prospective analyses. Systematic review registration number: CRD42020200133


Machine Learning Approach to Forecast Work Zone Mobility using Probe Vehicle Data

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198120927401 ◽

2020 ◽

Vol 2674 (9) ◽

pp. 157-167

Author(s):

Mohsen Kamyab ◽

Stephen Remias ◽

Erfan Najmi ◽

Sanaz Rabinia ◽

Jonathan M. Waddell

Keyword(s):

Machine Learning ◽

Traffic Congestion ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Work Zone ◽

Data Sources ◽

Work Zones ◽

Probe Vehicle ◽

Vehicle Data ◽

Lane Closures

The aim of deploying intelligent transportation systems (ITS) is often to help engineers and operators identify traffic congestion. The future of ITS-based traffic management is the prediction of traffic conditions using ubiquitous data sources. There are currently well-developed prediction models for recurrent traffic congestion such as during peak hour. However, there is a need to predict traffic congestion resulting from non-recurring events such as highway lane closures. As agencies begin to understand the value of collecting work zone data, rich data sets will emerge consisting of historical work zone information. In the era of big data, rich mobility data sources are becoming available that enable the application of machine learning to predict mobility for work zones. The purpose of this study is to utilize historical lane closure information with supervised machine learning algorithms to forecast spatio-temporal mobility for future lane closures. Various traffic data sources were collected from 1,160 work zones on Michigan interstates between 2014 and 2017. This study uses probe vehicle data to retrieve a mobility profile for these historical observations, and uses these profiles to apply random forest, XGBoost, and artificial neural network (ANN) classification algorithms. The mobility prediction results showed that the ANN model outperformed the other models by reaching up to 85% accuracy. The objective of this research was to show that machine learning algorithms can be used to capture patterns for non-recurrent traffic congestion even when hourly traffic volume is not available.


Comparison of Machine Learning Algorithms for Wildland-Urban Interface Fuelbreak Planning Integrating ALS and UAV-Borne LiDAR Data and Multispectral Images

Drones ◽

10.3390/drones4020021 ◽

2020 ◽

Vol 4 (2) ◽

pp. 21 ◽

Cited By ~ 1

Author(s):

Francisco Rodríguez-Puerta ◽

Rafael Alonso Ponce ◽

Fernando Pérez-Rodríguez ◽

Beatriz Águeda ◽

Saray Martín-García ◽

...

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Random Forest ◽

Learning Algorithms ◽

Remote Sensing Data ◽

Machine Learning Algorithms ◽

Data Sources ◽

Lidar Data ◽

Sensing Data ◽

Sentinel 2

Controlling vegetation fuels around human settlements is a crucial strategy for reducing fire severity in forests, buildings and infrastructure, as well as protecting human lives. Each country has its own regulations in this respect, but all share the premise that reducing fuel load in turn reduces the intensity and severity of fire. The use of Unmanned Aerial Vehicle (UAV)-acquired data combined with other passive and active remote sensing data offers the greatest performance for planning Wildland-Urban Interface (WUI) fuelbreaks through machine learning algorithms. Nine remote sensing data sources (active and passive) and four supervised classification algorithms (Random Forest, Linear and Radial Support Vector Machine and Artificial Neural Networks) were tested to classify five fuel-area types. We used very high-density Light Detection and Ranging (LiDAR) data acquired by UAV (154 returns·m−2 and an ortho-mosaic with 5-cm pixels), multispectral data from the satellites Pleiades-1B and Sentinel-2, and low-density LiDAR data acquired by Airborne Laser Scanning (ALS) (0.5 returns·m−2 and an ortho-mosaic with 25-cm pixels). Through the Variable Selection Using Random Forest (VSURF) procedure, a pre-selection of final variables was carried out to train the model. The four algorithms were compared, and it was concluded that the differences among them in overall accuracy (OA) on training datasets were negligible. Although the highest accuracy in the training step was obtained by SVML (OA=94.46%) and in testing by ANN (OA=91.91%), Random Forest was considered to be the most reliable algorithm, since it produced more consistent predictions due to the smaller differences between training and testing performance. Using a combination of Sentinel-2 and the two LiDAR datasets (UAV and ALS), Random Forest obtained an OA of 90.66% in training and of 91.80% in testing datasets. The differences in accuracy between the data sources used are much greater than between algorithms. LiDAR growth metrics calculated using point clouds on different dates and multispectral information from different seasons of the year are the most important variables in the classification. Our results support the essential role of UAVs in fuelbreak planning and management and thus in the prevention of forest fires.


Deep learning and taphonomy: high accuracy in the classification of cut marks made on fleshed and defleshed bones using convolutional neural networks

Scientific Reports ◽

10.1038/s41598-019-55439-6 ◽

2019 ◽

Vol 9 (1) ◽

Cited By ~ 6

Author(s):

Gabriel Cifuentes-Alcobendas ◽

Manuel Domínguez-Rodrigo

Keyword(s):

Machine Learning ◽

Accurate Method ◽

High Accuracy ◽

Machine Learning Algorithms ◽

Accurate Identification ◽

Cut Marks ◽

Eating Meat ◽

Microscopic Features ◽

Immense Potential

Accurate identification of bone surface modifications (BSM) is crucial for the taphonomic understanding of archaeological and paleontological sites. Critical interpretations of when humans started eating meat and animal fat, when they started using stone tools, or when they occupied new continents or interacted with predatory guilds hinge on accurate identifications of BSM. Until now, interpretations of Plio-Pleistocene BSM have been contentious because of the high uncertainty in discriminating among taphonomic agents. Recently, the use of machine learning algorithms has yielded high accuracy in the identification of BSM. A branch of machine learning methods based on imaging, computer vision (CV), has opened the door to a more objective and accurate method of BSM identification. The present work has selected two extremely similar types of BSM (cut marks made on fleshed and defleshed bones) to test the immense potential of artificial intelligence methods. This CV approach not only produced the highest accuracy to date in the classification of these types of BSM (95% on complete images of BSM and 88.89% on images of only internal mark features), but it has also enabled a method for determining which inconspicuous microscopic features determine successful BSM discrimination. The potential of this method in other areas of taphonomy and paleobiology is enormous.


Missense Variants of Uncertain Significance (VUS) Altering the Phosphorylation Patterns of BRCA1 and BRCA2

PLoS ONE ◽

10.1371/journal.pone.0062468 ◽

2013 ◽

Vol 8 (5) ◽

pp. e62468 ◽

Cited By ~ 5

Author(s):

Eric Tram ◽

Sevtap Savas ◽

Hilmi Ozcelik

Keyword(s):

Brca1 And Brca2 ◽

Variants Of Uncertain Significance ◽

Missense Variants ◽

Uncertain Significance


BRCA1 and BRCA2 mutations in breast and ovarian cancer families from south west Colombia

Colombia Medica ◽

10.25100/cm.v50i3.2385 ◽

2020 ◽

pp. 163-175

Author(s):

Laura Cifuentes-C ◽

Ana Lucia Rivera-Herrera ◽

Guillermo Barreto

Keyword(s):

Breast Cancer ◽

Ovarian Cancer ◽

In Silico ◽

Novel Mutation ◽

In Silico Analysis ◽

Moderate Increase ◽

Brca1 And Brca2 ◽

Variants Of Uncertain Significance ◽

Uncertain Significance ◽

Colombian Pacific

Introduction: Breast cancer is the most common neoplasia in women worldwide, especially in Colombia. Between 5% and 10% of all cases are caused by hereditary factors, and 25% of those cases have mutations in the BRCA1/BRCA2 genes. Objective: The purpose of this study was to identify the mutations associated with the risk of familial breast and/or ovarian cancer in a population from the Colombian Pacific. Methods: 58 high-risk breast and/or ovarian cancer families and 20 controls were screened for germline mutations in BRCA1 and BRCA2 by Single Strand Conformation Polymorphism (SSCP) and sequencing. Results: Four families (6.9%) were found to carry BRCA1 mutations and eight families (13.8%) had mutations in BRCA2. In BRCA1, we found three Variants of Uncertain Significance (VUS), of which we concluded, using in silico tools, that c.8112C>G and c.3119G>A (p.Ser1040Asn) are probably deleterious, and c.3083G>A (p.Arg1028His) is probably neutral. In BRCA2, we found three variants of uncertain significance: two were previously described and one was a novel mutation. Using in silico analysis, we concluded that c.865A>G (p.Asn289Asp) and c.6427T>C (p.Ser2143Pro) are probably deleterious and c.125A>G (p.Tyr42Cys) is probably neutral. Only one of them has previously been reported in Colombia. We also identified 13 polymorphisms (4 in BRCA1 and 9 in BRCA2), two of which are associated with a moderate increase in breast cancer risk (BRCA2 c.1114A>C and c.875566T>C). Conclusion: According to our results, the Colombian Pacific population presents a diverse mutational spectrum for BRCA genes that differs from the findings in other regions of the country.


Accelerating Battery Manufacturing Optimization by Combining Experiments, In Silico Electrodes Generation and Machine Learning

10.26434/chemrxiv.12473501 ◽

2020 ◽

Author(s):

Marc Duquesnoy ◽

Teo Lombardo ◽

Mehdi Chouchane ◽

Emiliano Primo ◽

Alejandro A. Franco

Keyword(s):

Machine Learning ◽

In Silico ◽

Active Material ◽

Current Collector ◽

Machine Learning Algorithms ◽

Material Surface ◽

High Performing ◽

Manufacturing Optimization ◽

Speed Up ◽

Li Ion

Both society and the market call for safer, high-performing and cheap Li-ion batteries (LIBs) in order to speed up the transition from an oil-based to an electric-based economy. One critical aspect to be taken into account in this modern challenge is the LIB manufacturing process, whose optimization is time- and resource-consuming due to the several interdependent physicochemical mechanisms involved. In order to tackle this challenge rapidly, digital tools able to accelerate LIB manufacturing optimization are crucially needed for both well-assessed and recently discovered chemistries. The methodology presented here encompasses experimental characterizations, in silico generation of electrode mesostructures and machine learning algorithms to track the effect of manufacturing over a wide array of mesoscale electrode properties critically linked to the electrochemical performance. In particular, features such as the interconnectivity of the particle network, the electrolyte tortuosity and effective ionic conductivity, the percentage of current collector surface covered by either active material or carbon-binder domain particles and the active material surface in contact with electrolyte were analysed and discussed in detail. This approach was tested and validated for the calendering of LiNi1/3Mn1/3Co1/3O2-based cathodes, proving its capability to ease the analysis of interdependencies between process parameters and electrode properties, paving the way to deeper understanding and then faster optimization of LIB manufacturing.


Comprehensive Fitness Landscape of a Multi-Geometry Protein Capsid Informs Machine Learning Models of Assembly

10.1101/2021.12.21.473721 ◽

2021 ◽

Author(s):

Daniel D. Brauer ◽

Celine B. Santiago ◽

Zoe N. Merz ◽

Esther McCarthy ◽

Danielle Tullman-Ercek ◽

...

Keyword(s):

Machine Learning ◽

In Silico ◽

Quaternary Structure ◽

De Novo ◽

Fitness Landscape ◽

Machine Learning Algorithms ◽

Training Data ◽

Particle Assembly ◽

Complex Particle ◽

Self Assembled

Virus-like particles (VLPs) are non-infectious viral-derived nanomaterials poised for biotechnological applications due to their well-defined, modular self-assembling architecture. Although progress has been made in understanding the complex effects that mutations may have on VLPs, a nuanced understanding of the influence particle mutability has on quaternary structure has yet to be achieved. Here, we generate and compare the apparent fitness landscapes of two capsid geometries (T=3 and T=1 icosahedral) of the bacteriophage MS2 VLP. We find significant shifts in mutability at the symmetry interfaces of the T=1 capsid when compared to the wildtype T=3 assembly. Furthermore, we use the generated landscapes to benchmark the performance of in silico mutational scanning tools in capturing the effect of missense mutation on complex particle assembly. Finding that predicted stability effects correlated relatively poorly with assembly phenotype, we used a combination of de novo features in tandem with in silico results to train machine learning algorithms for the classification of variant effects on assembly. Our findings not only reveal ways that assembly geometry affects the mutable landscape of a self-assembled particle, but also establish a template for the generation of predictive mutational models of self-assembled capsids using minimal empirical training data.


Rare Missense Functional Variants at COL4A1 and COL4A2 in Sporadic Intracerebral Hemorrhage

Neurology ◽

10.1212/wnl.0000000000012227 ◽

2021 ◽

pp. 10.1212/WNL.0000000000012227

Author(s):

Jaeyoon Chung ◽

Graham Hamilton ◽

Minsup Kim ◽

Sandro Marini ◽

Bailey Montgomery ◽

...

Keyword(s):

Intracerebral Hemorrhage ◽

In Silico ◽

Minor Allele ◽

Sequencing Data ◽

Functional Impact ◽

Impact Prediction ◽

Missense Variants ◽

Association Testing ◽

Rare Variant Analysis ◽

Functional Variants

Objective: To test the genetic contribution of rare missense variants in COL4A1 and COL4A2 in which common variants are genetically associated with sporadic intracerebral hemorrhage (ICH), we performed rare variant analysis in multiple sequencing data for the risk for sporadic ICH. Methods: We performed sequencing across 559 kbp at 13q34 including COL4A1 and COL4A2 among 2,133 individuals (1,055 ICH cases; 1,078 controls) in US-based and 1,492 individuals (192 ICH cases; 1,189 controls) from Scotland-based cohorts, followed by sequence annotation, functional impact prediction, genetic association testing, and in silico thermodynamic modeling. Results: We identified 107 rare nonsynonymous variants in sporadic ICH, of which two missense variants, rs138269346 (COL4A1 I110T) and rs201716258 (COL4A2 H203L), were predicted to be highly functional and occurred in multiple ICH cases but not in controls from the US-based cohort. The minor allele of rs201716258 was also present in Scottish ICH patients, and rs138269346 was observed in two ICH-free controls with a history of hypertension and myocardial infarction. Rs138269346 was nominally associated with non-lobar ICH risk (P=0.05), but not with lobar ICH (P=0.08), while associations between rs201716258 and ICH subtypes were non-significant (P>0.12). Both variants were considered pathogenic based on minor allele frequency (<0.00035 in EUR), predicted functional impact (deleterious or probably damaging), and in silico modeling studies (substantially altered physical length and thermal stability of collagen). Conclusions: We identified rare missense variants in COL4A1/A2 in association with sporadic ICH. Our annotation and simulation studies suggest that these variants are highly functional and may represent targets for translational follow-up.


Accelerating Battery Manufacturing Optimization by Combining Experiments, In Silico Electrodes Generation and Machine Learning

10.26434/chemrxiv.12473501.v2 ◽

2020 ◽

Author(s):

Marc Duquesnoy ◽

Teo Lombardo ◽

Mehdi Chouchane ◽

Emiliano Primo ◽

Alejandro A. Franco

Keyword(s):

Machine Learning ◽

In Silico ◽

Active Material ◽

Current Collector ◽

Machine Learning Algorithms ◽

Material Surface ◽

High Performing ◽

Manufacturing Optimization ◽

Speed Up ◽

Li Ion

Both society and the market call for safer, high-performing and cheap Li-ion batteries (LIBs) in order to speed up the transition from an oil-based to an electric-based economy. One critical aspect to be taken into account in this modern challenge is the LIB manufacturing process, whose optimization is time- and resource-consuming due to the several interdependent physicochemical mechanisms involved. In order to tackle this challenge rapidly, digital tools able to accelerate LIB manufacturing optimization are crucially needed for both well-assessed and recently discovered chemistries. The methodology presented here encompasses experimental characterizations, in silico generation of electrode mesostructures and machine learning algorithms to track the effect of manufacturing over a wide array of mesoscale electrode properties critically linked to the electrochemical performance. In particular, features such as the interconnectivity of the particle network, the electrolyte tortuosity and effective ionic conductivity, the percentage of current collector surface covered by either active material or carbon-binder domain particles and the active material surface in contact with electrolyte were analysed and discussed in detail. This approach was tested and validated for the calendering of LiNi1/3Mn1/3Co1/3O2-based cathodes, proving its capability to ease the analysis of interdependencies between process parameters and electrode properties, paving the way to deeper understanding and then faster optimization of LIB manufacturing.
