Automated Angiographic Labeling Pipeline

2021 ◽  
Vol 4 (1) ◽  
Author(s):  
Jacob Cantrell ◽  
Kolten Kersey ◽  
Anush Motaganahalli ◽  
Amy Li ◽  
Hunter Maxwell ◽  
...  

Background & Hypothesis: Treatment decisions for medical management, endovascular therapy, open surgery, and hybrid approaches for peripheral artery disease (PAD) are largely driven by imaging. While catheter-directed angiography remains the gold standard for endoluminal vessel analysis, machine learning is not yet in widespread clinical use for automated segmentation. This project aims to develop an active learning pipeline to automate the labeling of vascular structures in angiographic images.  Methods: We queried the picture archiving and communication system (PACS) databases for Indiana University Health and Eskenazi Health to identify studies with catheter-directed angiograms of the extremities. From this dataset we randomly selected an initial convenience sample of 50 angiograms to manually label using the 3D Slicer software. We compared three workflow approaches for labeling this training data: (1) human-only single-pass labeling, whereby one person labels each image; (2) human-only multi-pass labeling, whereby three humans label a vessel with increasing precision; (3) a "human-in-the-middle" approach using NVIDIA's AI-Assisted Annotation client, whereby the image is auto-segmented and then manually checked for accuracy.  Results: We are currently evaluating the speed and accuracy of each of these approaches. However, our preliminary data suggest that human-only multi-pass labeling is the most efficient approach. We will be validating the following three-step process. First, a thresholding tool was used to leverage differences in contrast gradations to approximate the locations of vascular structures. Second, an eraser tool was used to refine the vessel boundaries. Finally, major blood vessels contributing to axial flow to the foot were manually labeled. These labeled angiograms will be used to develop an active learning algorithm to automate future labeling of the remaining dataset.  
Conclusion: A machine learning approach to interpreting lower extremity images can dramatically improve the efficiency of triaging patients with PAD. Further work is underway to develop and implement this program clinically. 
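The thresholding step in the Methods above can be illustrated with a toy sketch. The function and array below are hypothetical stand-ins (the study used 3D Slicer's thresholding tool, not this code); the sketch only shows the idea of keeping the brightest, contrast-filled pixels as a first-pass vessel mask for manual refinement.

```python
import numpy as np

def threshold_vessels(image, percentile=90):
    """First-pass vessel mask: keep the brightest pixels.

    Contrast-filled vessels appear as high-intensity regions, so a
    percentile threshold approximates their location; the mask would
    then be refined manually (steps 2 and 3 in the abstract).
    """
    cutoff = np.percentile(image, percentile)
    return image >= cutoff

# Toy "angiogram": a bright vertical stripe on a dark background.
img = np.zeros((8, 8))
img[:, 3] = 1.0                               # simulated contrast-filled vessel
mask = threshold_vessels(img, percentile=90)  # True only along the stripe
```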

Author(s):  
Daniel Kottke ◽  
Marek Herde ◽  
Tuan Pham Minh ◽  
Alexander Benz ◽  
Pascal Mergard ◽  
...  

Machine learning applications often need large amounts of training data to perform well. Whereas unlabeled data can be gathered easily, the labeling process is difficult, time-consuming, or expensive in most applications. Active learning can help solve this problem by querying labels for those data points that will improve performance the most, so that the learning algorithm performs sufficiently well with fewer labels. We provide a library called scikit-activeml that covers the most relevant query strategies and implements tools to work with partially labeled data. It is programmed in Python and builds on top of scikit-learn.
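Pool-based uncertainty sampling, one of the query strategies such a library covers, can be sketched with plain scikit-learn. This is a generic illustration, not scikit-activeml's API; the dataset, model, and query budget are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# A large unlabeled pool plus a small labeled seed set (5 per class).
X, y_true = make_classification(n_samples=200, n_features=5, random_state=0)
labeled = [int(i) for c in (0, 1) for i in np.where(y_true == c)[0][:5]]
pool = [i for i in range(200) if i not in labeled]

clf = LogisticRegression(max_iter=1000)
for _ in range(20):                      # query 20 labels, one at a time
    clf.fit(X[labeled], y_true[labeled])
    proba = clf.predict_proba(X[pool])
    # Least-confidence sampling: query the point whose most probable
    # class is least certain.
    idx = int(np.argmin(proba.max(axis=1)))
    labeled.append(pool.pop(idx))

clf.fit(X[labeled], y_true[labeled])
accuracy = clf.score(X, y_true)          # performance with only 30 labels
```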


2019 ◽  
Author(s):  
Andrew Medford ◽  
Shengchun Yang ◽  
Fuzhu Liu

Understanding the interaction of multiple types of adsorbate molecules on solid surfaces is crucial to establishing the stability of catalysts under various chemical environments. Computational studies of high and mixed coverages of reaction intermediates remain challenging, especially for transition-metal compounds. In this work, we present a framework to predict differential adsorption energies and identify low-energy structures under high- and mixed-adsorbate coverages on oxide materials. The approach uses Gaussian process machine-learning models with quantified uncertainty in conjunction with an iterative training algorithm to actively identify the training set. The framework is demonstrated for the mixed adsorption of CHx, NHx, and OHx species on the oxygen-vacancy and pristine rutile TiO2(110) surface sites. The results indicate that the proposed algorithm is highly efficient at identifying the most valuable training data, and is able to predict differential adsorption energies with a mean absolute error of ~0.3 eV based on <25% of the total DFT data. The algorithm is also used to identify 76% of the low-energy structures based on <30% of the total DFT data, enabling construction of surface phase diagrams that account for high and mixed coverage as a function of the chemical potential of C, H, O, and N. Furthermore, the computational scaling indicates the algorithm scales nearly linearly (N^1.12) as the number of adsorbates increases. This framework can be directly extended to metals, metal oxides, and other materials, providing a practical route toward the investigation of the behavior of catalysts under high-coverage conditions.
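The iterative training idea, repeatedly querying the configuration about which the Gaussian process is most uncertain, can be sketched on a toy 1D surrogate. The sine function below is a hypothetical stand-in for the adsorption-energy landscape, not data from the paper.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Toy 1D energy landscape standing in for the adsorption-energy surface.
X_all = np.linspace(0.0, 10.0, 200).reshape(-1, 1)
y_all = np.sin(X_all).ravel()

train = [0, 199]                          # seed with the two endpoints
gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0))
for _ in range(10):
    gp.fit(X_all[train], y_all[train])
    _, std = gp.predict(X_all, return_std=True)
    std[train] = 0.0                      # never re-query known points
    train.append(int(np.argmax(std)))     # label the most uncertain point

mae = np.abs(gp.predict(X_all) - y_all).mean()  # error over the whole grid
```

The quantified uncertainty thus drives data selection: each queried point is the one the model currently knows least about, which is the mechanism that lets the paper's framework reach low error with a small fraction of the DFT data.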


2021 ◽  
Vol 11 (7) ◽  
pp. 885
Author(s):  
Maher Abujelala ◽  
Rohith Karthikeyan ◽  
Oshin Tyagi ◽  
Jing Du ◽  
Ranjana K. Mehta

The nature of firefighters' duties requires them to work for long periods under unfavorable conditions. To perform their jobs effectively, they are required to endure long hours of extensive, stressful training. Creating such training environments is very expensive, and it is difficult to guarantee trainees' safety. In this study, firefighters are trained in a virtual environment that includes virtual perturbations such as fires, alarms, and smoke. The objective of this paper is to use machine learning methods to discern encoding and retrieval states in firefighters during a visuospatial episodic memory task and to explore which regions of the brain provide suitable signals to solve this classification problem. Our results show that the Random Forest algorithm can distinguish between information encoding and retrieval using features extracted from fNIRS data. Our algorithm achieved an F-1 score of 0.844 and an accuracy of 79.10% when the training and testing data were obtained under similar environmental conditions. However, the algorithm's performance dropped to an F-1 score of 0.723 and an accuracy of 60.61% when evaluated on data collected under different environmental conditions than the training data. We also found that, when the training and evaluation data were recorded under the same environmental conditions, the RPM, LDLPFC, and RDLPFC were the most relevant brain regions under non-stressful, stressful, and mixed stressful and non-stressful conditions, respectively.
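The evaluation setup, training a Random Forest and scoring it with F-1 and accuracy under matched versus mismatched conditions, can be sketched with synthetic data. Everything below is a toy stand-in for the fNIRS features, not the study's data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Synthetic two-class data standing in for encoding vs. retrieval features.
X, y = make_classification(n_samples=300, n_features=10, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

clf = RandomForestClassifier(n_estimators=100, random_state=1).fit(X_tr, y_tr)
pred = clf.predict(X_te)
f1 = f1_score(y_te, pred)          # matched-condition F-1
acc = accuracy_score(y_te, pred)   # matched-condition accuracy

# A crude stand-in for changed environmental conditions: perturb the test
# features; performance typically degrades, mirroring the reported drop.
rng = np.random.default_rng(1)
pred_shift = clf.predict(X_te + rng.normal(scale=2.0, size=X_te.shape))
acc_shift = accuracy_score(y_te, pred_shift)
```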


2019 ◽  
Vol 109 (05) ◽  
pp. 352-357
Author(s):  
C. Brecher ◽  
L. Gründel ◽  
L. Lienenlüke ◽  
S. Storms

The position control of conventional industrial robots is not designed for the dynamic milling process. One possibility to optimize the behavior of the control loops is a model-based feed-forward torque control, which in this work is extended by a machine learning approach owing to its many advantages. The implementation in Matlab and the simulative evaluation are explained, which subsequently confirm the potential of this concept.


Author(s):  
Sarmad Mahar ◽  
Sahar Zafar ◽  
Kamran Nishat

Headnotes are the precise explanation and summary of legal points in an issued judgment. Law journals hire experienced lawyers to write these headnotes, which help the reader quickly determine the issues discussed in the case. Headnotes comprise two parts: the first states the topic discussed in the judgment, and the second contains a summary of that judgment. In this thesis, we design, develop, and evaluate headnote prediction using machine learning, without human involvement. We divided this task into a two-step process. In the first step, we predict the law points used in the judgment by using text classification algorithms. The second step generates a summary of the judgment using text summarization techniques. To achieve this task, we created a databank by extracting data from different law sources in Pakistan. Training data were labelled based on Pakistani law websites. We tested different feature extraction methods on judiciary data to improve our system, and used them to develop a dictionary of terminology for ease of reference and utility. Our approach achieves 65% accuracy using Linear Support Vector Classification with tri-grams and without a stemmer. Using active learning, our system can continuously improve its accuracy as users of the system provide more labelled examples.
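The first step's reported configuration (Linear Support Vector Classification over tri-grams, no stemmer) can be sketched as a scikit-learn pipeline. The toy corpus and topic labels below are invented for illustration; the thesis used a databank extracted from Pakistani law sources.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Invented toy corpus: judgments labeled with a law-point topic.
texts = [
    "the appellant was convicted under the penal code",
    "bail was granted subject to furnishing surety",
    "the conviction was upheld on appeal",
    "the bail application was dismissed for risk of absconding",
]
labels = ["criminal", "bail", "criminal", "bail"]

# Word uni- to tri-gram TF-IDF features, no stemming, linear SVC.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 3)),
    LinearSVC(),
)
model.fit(texts, labels)
pred = model.predict(["the appeal against conviction was dismissed"])
```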


Author(s):  
X.-F. Xing ◽  
M. A. Mostafavi ◽  
G. Edwards ◽  
N. Sabo

Abstract. Automatic semantic segmentation of point clouds observed in a complex 3D urban scene is a challenging issue. Semantic segmentation of urban scenes based on machine learning algorithms requires appropriate features to distinguish objects in mobile terrestrial and airborne LiDAR point clouds at the point level. In this paper, we propose a pointwise semantic segmentation method based on features derived from the Difference of Normals and on "directional height above" features that compare the height difference between a given point and its neighbors in eight directions, in addition to features based on normal estimation. A random forest classifier is chosen to classify points in mobile terrestrial and airborne LiDAR point clouds. The results obtained from our experiments show that the proposed features are effective for semantic segmentation of mobile terrestrial and airborne LiDAR point clouds, especially for the vegetation, building, and ground classes in airborne LiDAR point clouds in urban areas.


2017 ◽  
Author(s):  
Aymen A. Elfiky ◽  
Maximilian J. Pany ◽  
Ravi B. Parikh ◽  
Ziad Obermeyer

Abstract
Background: Cancer patients who die soon after starting chemotherapy incur the costs of treatment without its benefits. Accurately predicting mortality risk from chemotherapy is important, but few patient-data-driven tools exist. We sought to create and validate a machine learning model predicting mortality for patients starting new chemotherapy.
Methods: We obtained electronic health records for patients treated at a large cancer center (26,946 patients; 51,774 new regimens) over 2004-14, linked to Social Security data for date of death. The model was derived using 2004-11 data, and performance was measured on non-overlapping 2012-14 data.
Findings: 30-day mortality from chemotherapy start was 2.1%. Common cancers included breast (21.1%), colorectal (19.3%), and lung (18.0%). Model predictions were accurate for all patients (AUC 0.94). Predictions for patients starting palliative chemotherapy (46.6% of regimens), for whom prognosis is particularly important, remained highly accurate (AUC 0.92). To illustrate model discrimination, we ranked patients initiating palliative chemotherapy by model-predicted mortality risk and calculated observed mortality by risk decile. 30-day mortality in the highest-risk decile was 22.6%; in the lowest-risk decile, no patients died. Predictions remained accurate across all primary cancers, stages, and chemotherapies, even for clinical trial regimens that first appeared in years after the model was trained (AUC 0.94). The model also performed well for prediction of 180-day mortality (AUC 0.87; mortality 74.8% in the highest-risk decile vs. 0.2% in the lowest). Predictions were more accurate than data from randomized trials of individual chemotherapies or SEER estimates.
Interpretation: A machine learning algorithm accurately predicted short-term mortality in patients starting chemotherapy using EHR data. Further research is necessary to determine generalizability and the feasibility of applying this algorithm in clinical settings.
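The risk-decile analysis described in the Findings can be sketched as follows. The risk scores and outcomes below are simulated, not the study's data; the function only shows the ranking-and-binning computation.

```python
import numpy as np

def mortality_by_decile(risk, died):
    """Observed mortality per model-predicted risk decile (highest first)."""
    order = np.argsort(risk)[::-1]              # rank by predicted risk
    deciles = np.array_split(died[order], 10)   # ten equal-size groups
    return [float(d.mean()) for d in deciles]

# Simulated cohort: outcome probability rises with predicted risk.
rng = np.random.default_rng(0)
risk = rng.uniform(size=1000)
died = (rng.uniform(size=1000) < 0.4 * risk).astype(float)
rates = mortality_by_decile(risk, died)         # rates[0] = highest-risk decile
```

A well-discriminating model shows a steep gradient from the first to the last decile, which is the pattern the paper reports (22.6% vs. zero mortality for 30-day risk).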


2009 ◽  
Vol 15 (2) ◽  
pp. 241-271 ◽  
Author(s):  
YAOYONG LI ◽  
KALINA BONTCHEVA ◽  
HAMISH CUNNINGHAM

Abstract
Support Vector Machines (SVM) have been used successfully in many Natural Language Processing (NLP) tasks. The novel contribution of this paper is in investigating two techniques for making SVM more suitable for language learning tasks. Firstly, we propose an SVM with uneven margins (SVMUM) model to deal with the problem of imbalanced training data. Secondly, SVM active learning is employed in order to alleviate the difficulty in obtaining labelled training data. The algorithms are presented and evaluated on several Information Extraction (IE) tasks, where they achieved better performance than the standard SVM and the SVM with passive learning, respectively. Moreover, by combining SVMUM with the active learning algorithm, we achieve the best reported results on the seminars and jobs corpora, which are benchmark data sets used for evaluation and comparison of machine learning algorithms for IE. In addition, we also evaluate the token-based classification framework for IE with three different entity tagging schemes. In comparison to previous methods dealing with the same problems, our methods are both effective and efficient, which are valuable features for real-world applications. Due to the similarity in the formulation of the learning problem for IE and for other NLP tasks, the two techniques are likely to be beneficial in a wide range of applications.
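Scikit-learn does not implement uneven margins directly, so the sketch below uses per-class weighting, a related and widely used handle for imbalanced training data, as an analogy rather than as the paper's SVMUM algorithm. The dataset is synthetic, with roughly 10% positives as in token-level entity labeling.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import f1_score
from sklearn.svm import LinearSVC

# Imbalanced data: ~10% positives, as in token-level entity labeling.
X, y = make_classification(n_samples=1000, n_features=20,
                           weights=[0.9, 0.1], random_state=0)

# Unweighted SVM vs. one that up-weights the rare positive class.
plain = LinearSVC(max_iter=10000).fit(X, y)
weighted = LinearSVC(class_weight="balanced", max_iter=10000).fit(X, y)

f1_plain = f1_score(y, plain.predict(X))
f1_weighted = f1_score(y, weighted.predict(X))
```

Both SVMUM and class weighting shift the decision boundary away from the minority class; SVMUM does so by allowing different margins on the two sides of the hyperplane, while class weighting rescales the misclassification penalty.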


2021 ◽  
Author(s):  
Marian Popescu ◽  
Rebecca Head ◽  
Tim Ferriday ◽  
Kate Evans ◽  
Jose Montero ◽  
...  

Abstract. This paper presents advancements in machine learning and cloud deployment that enable rapid and accurate automated lithology interpretation. A supervised machine learning technique is described that enables rapid, consistent, and accurate lithology prediction, alongside quantified uncertainty, from large wireline or logging-while-drilling (LWD) datasets. To leverage supervised machine learning, a team of geoscientists and petrophysicists made detailed lithology interpretations of wells to generate a comprehensive training dataset. Lithology interpretations were based on deterministic cross-plotting, utilizing and combining various raw logs. This training dataset was used to develop a model and test a machine learning pipeline. The pipeline was applied to a dataset previously unseen by the algorithm to predict lithology. A quality-checking process was performed by a petrophysicist to validate new predictions delivered by the pipeline against human interpretations. Confidence in the interpretations was assessed in two ways. The prior probability was calculated, a measure of confidence in the input data being recognized by the model. The posterior probability was calculated, which quantifies the likelihood that a specified depth interval comprises a given lithology. The supervised machine learning algorithm ensured that the wells were interpreted consistently by removing interpreter biases and inconsistencies. The scalability of cloud computing enabled a large log dataset to be interpreted rapidly: >100 wells were interpreted consistently in five minutes, yielding a >70% lithological match to the human petrophysical interpretation. Supervised machine learning methods have strong potential for classifying lithology from log data because: 1) they can automatically define complex, non-parametric, multi-variate relationships across several input logs; and 2) they allow classifications to be quantified confidently. 
Furthermore, this approach captured the knowledge and nuances of an interpreter's decisions by training the algorithm using human-interpreted labels. In the hydrocarbon industry, the quantity of generated data is predicted to increase by >300% between 2018 and 2023 (IDC, Worldwide Global DataSphere Forecast, 2019-2023). Additionally, the industry holds vast legacy data. This supervised machine learning approach can unlock the potential of some of these datasets by providing consistent lithology interpretations rapidly, allowing resources to be used more effectively.
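The posterior probability described above, the likelihood that a depth interval comprises a given lithology, corresponds to a classifier's per-class probability output. A toy sketch with invented log features and lithology classes (the paper does not specify its classifier; a random forest is used here purely for illustration):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Invented log features (e.g. gamma ray, density) for three lithologies,
# each class clustered around its own mean response.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 2)) + np.repeat([[0, 0], [3, 0], [0, 3]], 100, axis=0)
y = np.repeat(["shale", "sandstone", "limestone"], 100)

clf = RandomForestClassifier(random_state=0).fit(X, y)

# Posterior probability for one depth sample: likelihood of each lithology.
proba = clf.predict_proba([[3.0, 0.0]])[0]
best = clf.classes_[np.argmax(proba)]   # most likely lithology at this depth
```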

