Forecasting Internally Displaced Population Migration Patterns in Syria and Yemen

Benjamin Q. Huynh; Sanjay Basu

doi:10.1017/dmp.2019.73

Disaster Medicine and Public Health Preparedness ◽

10.1017/dmp.2019.73 ◽

2019 ◽

Vol 14 (3) ◽

pp. 302-307

Author(s):

Benjamin Q. Huynh ◽

Sanjay Basu

Keyword(s):

Machine Learning ◽

Food Prices ◽

Internally Displaced ◽

Fuel Prices ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Using Data ◽

Diverse Data ◽

Internally Displaced Population ◽

Persistence Model

Abstract

Objectives: Armed conflict has contributed to an unprecedented number of internally displaced persons (IDPs), individuals who are forced out of their homes but remain within their country. IDPs often urgently require shelter, food, and healthcare, yet prediction of when IDPs will migrate to an area remains a major challenge for aid delivery organizations. We sought to develop an IDP migration forecasting framework that could empower humanitarian aid groups to more effectively allocate resources during conflicts.

Methods: We modeled monthly IDP migration between provinces within Syria and within Yemen using data on food prices, fuel prices, wages, location, time, and conflict reports. We compared machine learning methods with baseline persistence methods of forecasting.

Results: We found a machine learning approach that more accurately forecast migration trends than baseline persistence methods. A random forest model outperformed the best persistence model in terms of root mean square error of log migration by 26% and 17% for the Syria and Yemen datasets, respectively.

Conclusions: Integrating diverse data sources into a machine learning model appears to improve IDP migration prediction. Further work should examine whether implementation of such models can enable proactive aid allocation for IDPs in anticipation of forecast arrivals.
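The evaluation metric named in the Results, root mean square error of log migration compared against a persistence baseline, can be sketched as follows. This is only an illustration: the arrival counts and the "model" forecasts are hypothetical numbers, not the study's data, the persistence rule shown (carry the previous month forward) is one common variant, and the `log(1 + x)` transform is an assumption, since the abstract says only "log migration".

```python
import math

def rmse_log(actual, forecast):
    """Root mean square error computed on log(1 + x) of the counts."""
    errors = [(math.log1p(a) - math.log1p(f)) ** 2 for a, f in zip(actual, forecast)]
    return math.sqrt(sum(errors) / len(errors))

def persistence_forecast(series):
    """Forecast each month as the previous month's observed value."""
    return series[:-1]

# Hypothetical monthly IDP arrivals for one province (illustrative only).
arrivals = [1200, 1500, 900, 2400, 2100, 1800]

actual = arrivals[1:]                      # months that have a previous month
baseline = persistence_forecast(arrivals)  # previous month carried forward
# Hypothetical forecasts standing in for any learned model; not real results.
model = [1400, 1000, 2000, 2200, 1900]

baseline_err = rmse_log(actual, baseline)
model_err = rmse_log(actual, model)
# Relative reduction in error, the form in which the abstract reports 26% and 17%.
improvement = 1 - model_err / baseline_err

print(f"persistence RMSE(log): {baseline_err:.3f}")
print(f"model RMSE(log):       {model_err:.3f}")
print(f"relative improvement:  {improvement:.0%}")
```

Working on the log scale keeps a few very large displacement events from dominating the error, which is presumably why the metric is reported that way.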

A COMPARATIVE ANALYSIS OF COVID–19 BETWEEN INDIA AND OTHERS USING DATA MINING AND MACHINE LEARNING APPROACH

International Journal of Engineering Applied Sciences and Technology ◽

10.33564/ijeast.2020.v05i08.015 ◽

2020 ◽

Vol 5 (8) ◽

Author(s):

Kanika Bhalla ◽

Ashish Kumar

Keyword(s):

Machine Learning ◽

Initial Phase ◽

Respiratory Disorder ◽

Learning Approach ◽

Virus Spread ◽

Global Pandemic ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Using Data ◽

Novel Coronavirus

The novel coronavirus has caused a global pandemic and leads to acute respiratory disorder in humans. This study analyzes the transmission of COVID-19 in India. The machine learning model compares India with other countries during the initial phase of the virus's spread in India, and then with the countries that were hit hard early in the pandemic. Finally, we perform time series analysis with Prophet to predict confirmed cases, recoveries, and deaths over the next seven days.

A machine learning approach to predicting short-term mortality risk in patients starting chemotherapy

10.1101/204081 ◽

2017 ◽

Cited By ~ 2

Author(s):

Aymen A. Elfiky ◽

Maximilian J. Pany ◽

Ravi B. Parikh ◽

Ziad Obermeyer

Keyword(s):

Machine Learning ◽

Mortality Risk ◽

Palliative Chemotherapy ◽

Learning Algorithm ◽

Cancer Center ◽

Short Term ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Short Term Mortality ◽

And Performance

Abstract

Background: Cancer patients who die soon after starting chemotherapy incur costs of treatment without benefits. Accurately predicting mortality risk from chemotherapy is important, but few patient data-driven tools exist. We sought to create and validate a machine learning model predicting mortality for patients starting new chemotherapy.

Methods: We obtained electronic health records for patients treated at a large cancer center (26,946 patients; 51,774 new regimens) over 2004-14, linked to Social Security data for date of death. The model was derived using 2004-11 data, and performance measured on non-overlapping 2012-14 data.

Findings: 30-day mortality from chemotherapy start was 2.1%. Common cancers included breast (21.1%), colorectal (19.3%), and lung (18.0%). Model predictions were accurate for all patients (AUC 0.94). Predictions for patients starting palliative chemotherapy (46.6% of regimens), for whom prognosis is particularly important, remained highly accurate (AUC 0.92). To illustrate model discrimination, we ranked patients initiating palliative chemotherapy by model-predicted mortality risk, and calculated observed mortality by risk decile. 30-day mortality in the highest-risk decile was 22.6%; in the lowest-risk decile, no patients died. Predictions remained accurate across all primary cancers, stages, and chemotherapies, even for clinical trial regimens that first appeared in years after the model was trained (AUC 0.94). The model also performed well for prediction of 180-day mortality (AUC 0.87; mortality 74.8% in the highest risk decile vs. 0.2% in the lowest). Predictions were more accurate than data from randomized trials of individual chemotherapies, or SEER estimates.

Interpretation: A machine learning algorithm accurately predicted short-term mortality in patients starting chemotherapy using EHR data. Further research is necessary to determine generalizability and the feasibility of applying this algorithm in clinical settings.
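The two discrimination checks described in the Findings, AUC and observed mortality by predicted-risk decile, can be sketched in a few lines. This is an illustration under stated assumptions, not the study's code: the 20-patient cohort and its risk scores are invented, and the function names are hypothetical.

```python
def observed_rate_by_decile(risks, died):
    """Rank patients by predicted risk and return the observed death rate per decile.

    Index 0 holds the lowest predicted risks, index 9 the highest.
    Assumes at least 10 patients so that no decile is empty.
    """
    ranked = sorted(zip(risks, died), key=lambda pair: pair[0])
    n = len(ranked)
    rates = []
    for d in range(10):
        chunk = ranked[d * n // 10:(d + 1) * n // 10]
        rates.append(sum(outcome for _, outcome in chunk) / len(chunk))
    return rates

def auc(risks, died):
    """Probability that a patient who died was ranked above one who survived.

    Computed by comparing every (death, survivor) pair; ties count as half.
    """
    pos = [r for r, y in zip(risks, died) if y == 1]
    neg = [r for r, y in zip(risks, died) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical cohort: predicted 30-day mortality risk and observed outcome (1 = died).
risks = [i / 20 for i in range(20)]   # 0.00, 0.05, ..., 0.95
died = [0] * 16 + [1, 0, 1, 1]        # deaths concentrated among high-risk patients

rates = observed_rate_by_decile(risks, died)
print("observed mortality, lowest to highest risk decile:", rates)
print("AUC:", round(auc(risks, died), 3))
```

A well-discriminating model shows near-zero observed mortality in the low deciles and a sharp rise in the top decile, which is the pattern the abstract reports.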

Quantifying changes in bicycle volumes using crowdsourced data

Environment and Planning B Urban Analytics and City Science ◽

10.1177/23998083211066103 ◽

2022 ◽

pp. 239980832110661

Author(s):

Ali Al-Ramini ◽

Mohammad A Takallou ◽

Daniel P Piatkowski ◽

Fadi Alsaleem

Keyword(s):

Machine Learning ◽

The United States ◽

Crowdsourced Data ◽

Machine Learning Approach ◽

Bicycle Infrastructure ◽

The Difference ◽

Infrastructure Investments ◽

Using Data ◽

The Impact ◽

The City

Most cities in the United States lack comprehensive or connected bicycle infrastructure; therefore, inexpensive and easy-to-implement solutions for connecting existing bicycle infrastructure are increasingly being employed. Signage is one of the promising solutions. However, the necessary data for evaluating its effect on cycling ridership is lacking. To overcome this challenge, this study tests the potential of using readily-available crowdsourced data in concert with machine-learning methods to provide insight into signage intervention effectiveness. We do this by assessing a natural experiment to identify the potential effects of adding or replacing signage within existing bicycle infrastructure in 2019 in the city of Omaha, Nebraska. Specifically, we first visually compare cycling traffic changes in 2019 to those from the previous two years (2017–2018) using data extracted from the Strava fitness app. Then, we use a new three-step machine-learning approach to quantify the impact of signage while controlling for weather, demographics, and street characteristics. The steps are as follows: Step 1 (modeling and validation) build and train a model from the available 2017 crowdsourced data (i.e., Strava, Census, and weather) that accurately predicts the cycling traffic data for any street within the study area in 2018; Step 2 (prediction) use the model from Step 1 to predict bicycle traffic in 2019 while assuming new signage was not added; Step 3 (impact evaluation) use the difference in prediction from actual traffic in 2019 as evidence of the likely impact of signage. While our work does not demonstrate causality, it does demonstrate an inexpensive method, using readily-available data, to identify changing trends in bicycling over the same time that new infrastructure investments are being added.
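The three steps described above can be sketched as a counterfactual comparison. Everything here is an assumption made for illustration: a one-variable least-squares line stands in for the study's machine-learning model, temperature is the only feature, and all trip counts are invented.

```python
def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def predict(model, x):
    """Apply a fitted (slope, intercept) pair to new feature values."""
    slope, intercept = model
    return [slope * a + intercept for a in x]

def mean_abs_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data: mean temperature (deg C) and cycling trips per street-month.
temp_2017, trips_2017 = [5, 10, 15, 20, 25], [20, 40, 60, 80, 100]
temp_2018, trips_2018 = [8, 12, 18, 22], [33, 47, 73, 87]
temp_2019, trips_2019 = [6, 14, 21, 24], [34, 66, 94, 106]

# Step 1 (modeling and validation): train on 2017, check accuracy on 2018.
model = fit_line(temp_2017, trips_2017)
validation_error = mean_abs_error(trips_2018, predict(model, temp_2018))

# Step 2 (prediction): expected 2019 traffic if no signage had been added.
expected_2019 = predict(model, temp_2019)

# Step 3 (impact evaluation): actual minus expected traffic in 2019.
excess = [a - e for a, e in zip(trips_2019, expected_2019)]

print("validation MAE on 2018:", validation_error)
print("2019 actual minus predicted:", excess)
```

The validation error from Step 1 sets the scale for reading Step 3: a 2019 gap well above the 2018 error suggests a change beyond normal model noise, although, as the abstract notes, it does not demonstrate causality.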

A Comparative Study to analyze crime threats using data mining and machine learning approach

10.1109/icscan53069.2021.9526489 ◽

2021 ◽

Author(s):

Puninder Kaur ◽

Geeta Rani ◽

Taruna Sharma ◽

Avinash Sharma

Keyword(s):

Machine Learning ◽

Data Mining ◽

Comparative Study ◽

Learning Approach ◽

Machine Learning Approach ◽

Using Data

Modeling of apartment prices in a Colombian context from a machine learning approach with stable-important attributes

DYNA ◽

10.15446/dyna.v87n212.80202 ◽

2020 ◽

Vol 87 (212) ◽

pp. 63-72

Author(s):

Jorge Iván Pérez Rave ◽

Favián González Echavarría ◽

Juan Carlos Correa Morales

Keyword(s):

Machine Learning ◽

Random Forest ◽

Learning Approach ◽

Predictive Capability ◽

Predictive Capacity ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Property Price ◽

Object Of Study ◽

Online Pricing

The objective of this work is to develop a machine learning model for online pricing of apartments in a Colombian context. This article addresses three aspects: i) it compares the predictive capacity of linear regression, regression trees, random forest and bagging; ii) it studies the effect of a group of text attributes on the predictive capability of the models; and iii) it identifies the more stable-important attributes and interprets them from an inferential perspective to better understand the object of study. The sample consists of 15,177 observations of real estate. The ensemble methods (random forest and bagging) show predictive superiority over the others. The attributes derived from the text had a significant relationship with the property price (on a log scale). However, their contribution to the predictive capacity was almost nil, since four different attributes achieved highly accurate predictions and remained stable when the sample changed.

A Study of Cross-National Differences in Happiness Factors Using Machine Learning Approach

International Journal of Software Engineering and Knowledge Engineering ◽

10.1142/s0218194015710023 ◽

2015 ◽

Vol 25 (09n10) ◽

pp. 1699-1702 ◽

Cited By ~ 2

Author(s):

Theresia Ratih Dewi Saputri ◽

Seok-Won Lee

Keyword(s):

Machine Learning ◽

Information Gain ◽

Development Project ◽

Support Vector ◽

National Differences ◽

Dimensionality Reduction Technique ◽

Machine Learning Approach ◽

National Happiness ◽

Using Data ◽

Cross National

National happiness has been actively studied in recent years. The factors that contribute to happiness vary with human perspectives. The factors used in this work cover both the physical and the mental needs of humanity, for example, the educational factor. This work identified more than 90 features that can be used to predict a country's happiness. With so many features, manual analysis is an unreliable way to predict national happiness. Therefore, this work used a machine learning technique, the Support Vector Machine (SVM), to learn and predict a country's happiness. To improve prediction accuracy, information gain was also used as a dimensionality reduction technique. This technique was chosen for its ability to explore the interrelationships among a set of variables. Using data on 187 countries from the UN Development Project, this work is able to identify which factors a given country needs to improve to increase the happiness of its citizens.
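Information gain, which the abstract names as its feature-reduction step, scores a feature by how much knowing its value reduces the entropy of the target label. The sketch below assumes categorical features and a two-level happiness label; the six-country table and the feature names are invented for illustration and do not come from the study.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Hypothetical country-level data (illustrative only).
happiness = ["high", "high", "low", "low", "high", "low"]
features = {
    "education": ["good", "good", "poor", "poor", "good", "poor"],  # separates the labels
    "coastline": ["yes", "no", "yes", "no", "yes", "no"],           # nearly uninformative
}

gains = {name: information_gain(values, happiness) for name, values in features.items()}
ranked = sorted(gains, key=gains.get, reverse=True)

for name in ranked:
    print(f"{name}: {gains[name]:.3f} bits")
```

Keeping only the top-ranked features before training the SVM is one way such a score can cut a set of 90-plus candidate features down to a manageable subset.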

Covid-19 Analysis and Prediction using Data Science and Machine Learning

International Journal for Research in Applied Science and Engineering Technology ◽

10.22214/ijraset.2021.39272 ◽

2021 ◽

Vol 9 (12) ◽

pp. 303-307

Author(s):

Akshata Kulkarni

Keyword(s):

Machine Learning ◽

Data Science ◽

Information Dissemination ◽

Prediction Models ◽

Learning Model ◽

Future Trend ◽

Control Measures ◽

Machine Learning Model ◽

Using Data ◽

Novel Coronavirus

Abstract: Officials around the world are using several COVID-19 outbreak prediction models to make educated decisions and enact necessary control measures. In this study, we developed a Machine Learning model which predicts and forecasts the COVID-19 outbreak in India, with the goal of determining the best regression model for an in-depth examination of the novel coronavirus. Based on data available from January 31 to October 31, 2020, collected from Kaggle, this model predicts the number of confirmed cases in Maharashtra. We use a Machine Learning model to forecast the future trend of these cases. The project has the potential to demonstrate the importance of information dissemination in improving response time and planning ahead of time to help reduce risk.

Automatic subtyping of individuals with Primary Progressive Aphasia

10.1101/2020.04.04.025593 ◽

2020 ◽

Author(s):

Charalambos Themistocleous ◽

Bronte Ficek ◽

Kimberly Webster ◽

Dirk-Bart den Ouden ◽

Argye E. Hillis ◽

...

Keyword(s):

Machine Learning ◽

Classification Accuracy ◽

Primary Progressive Aphasia ◽

Support Vector ◽

Progressive Aphasia ◽

Primary Progressive ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Automated Machine Learning

Abstract

Background: The classification of patients with Primary Progressive Aphasia (PPA) into variants is time-consuming, costly, and requires combined expertise by clinical neurologists, neuropsychologists, speech pathologists, and radiologists.

Objective: The aim of the present study is to determine whether acoustic and linguistic variables provide accurate classification of PPA patients into one of three variants: nonfluent PPA, semantic PPA, and logopenic PPA.

Methods: In this paper, we present a machine learning model based on Deep Neural Networks (DNN) for the subtyping of patients with PPA into three main variants, using combined acoustic and linguistic information elicited automatically via acoustic and linguistic analysis. The performance of the DNN was compared to the classification accuracy of Random Forests, Support Vector Machines, and Decision Trees, as well as expert clinicians’ classifications.

Results: The DNN model outperformed the other machine learning models with 80% classification accuracy, providing reliable subtyping of patients with PPA into variants and it even outperformed auditory classification of patients into variants by clinicians.

Conclusions: We show that the combined speech and language markers from connected speech productions provide information about symptoms and variant subtyping in PPA. The end-to-end automated machine learning approach we present can enable clinicians and researchers to provide an easy, quick and inexpensive classification of patients with PPA.

Quantitative Toxicity Prediction via Ensembling of Heterogeneous Predictors

10.21203/rs.2.19338/v1 ◽

2019 ◽

Author(s):

Abdul Karim ◽

Vahid Riahi ◽

Avinash Mishra ◽

Abdollah Dehzangi ◽

M. A. Hakim Newton ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Prediction Models ◽

Individual Performance ◽

Learning Model ◽

Data Representation ◽

Toxicity Prediction ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Benchmark Datasets

Abstract: Representing molecules in the form of only one type of features and using those features to predict their activities is one of the most important approaches for machine-learning-based chemical-activity-prediction. For molecular activities like quantitative toxicity prediction, the performance depends on the type of features extracted and the machine learning approach used. For such cases, using one type of features and machine learning model restricts the prediction performance to the specific representation and model used. In this paper, we study quantitative toxicity prediction and propose a machine learning model for the same. Our model uses an ensemble of heterogeneous predictors instead of typically using homogeneous predictors. The predictors that we use vary either on the type of features used or on the deep learning architecture employed. Each of these predictors presumably has its own strengths and weaknesses in terms of toxicity prediction. Our motivation is to make a combined model that utilizes different types of features and architectures to obtain better collective performance that could go beyond the performance of each individual predictor. We use six predictors in our model and test the model on four standard quantitative toxicity benchmark datasets. Experimental results show that our model outperforms the state-of-the-art toxicity prediction models in 8 out of 12 accuracy measures. Our experiments show that ensembling heterogeneous predictors improves the performance over single predictors and homogeneous ensembling of single predictors. The results show that each data representation or deep learning based predictor has its own strengths and weaknesses, thus employing a model ensembling multiple heterogeneous predictors could go beyond the individual performance of each data representation or each predictor type.
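The intuition behind combining heterogeneous predictors can be shown with a toy example. The abstract does not say how the six predictors are combined, so the unweighted mean used here is an assumption, and the two prediction sets are invented values chosen so that their errors point in opposite directions.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length lists."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def ensemble_mean(prediction_sets):
    """Average the predictions of several models, sample by sample."""
    return [sum(values) / len(values) for values in zip(*prediction_sets)]

# Hypothetical toxicity values and two predictors built on different features.
actual = [2.0, 3.5, 1.0, 4.2]
pred_a = [2.4, 3.9, 1.4, 4.6]   # hypothetical predictor that over-predicts
pred_b = [1.8, 3.3, 0.8, 4.0]   # hypothetical predictor that under-predicts

combined = ensemble_mean([pred_a, pred_b])

print("predictor A RMSE:", round(rmse(actual, pred_a), 3))
print("predictor B RMSE:", round(rmse(actual, pred_b), 3))
print("ensemble RMSE:   ", round(rmse(actual, combined), 3))
```

The gain here comes entirely from the two predictors erring in different directions; predictors that make the same mistakes would average to the same mistake, which is the argument for heterogeneity over homogeneous ensembling.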

Machine Learning Model for GSM BSC Control Plane Units

International Journal of Engineering and Advanced Technology - Regular Issue ◽

10.35940/ijeat.f1044.0886s19 ◽

2019 ◽

Vol 8 (6S) ◽

pp. 219-223

Keyword(s):

Machine Learning ◽

Back Propagation ◽

Back Propagation Neural Network ◽

Model Parameters ◽

Large Set ◽

Data Set ◽

Wide Acceptance ◽

Machine Learning Model ◽

Machine Learning Approach ◽

Accuracy Of Prediction

At maximum traffic intensity, i.e., during the busy hour, the measured CPU load of the GSM BSC signalling units (BSUs) is at its peak. A BSU's CPU load is a function of the number of transceivers (TRXs) mapped to it and hence of the volume of offered traffic handled by the unit. The CPU load is also a function of the nature of the offered load: for the same volume of offered load, the CPU load under the nominal traffic profile differs from that under some other arbitrary traffic profile. To manage future traffic growth, a model that estimates BSU CPU load is essential. In recent times, using Machine Learning (ML) to develop such a model has gained wide acceptance. Because CPU load depends on a large set of parameters and is therefore difficult to estimate, a machine learning approach is more scalable. In this paper, we describe a back-propagation neural network model that was developed to estimate BSU CPU load. We describe the model parameters, design choices, and implementation architecture, and estimate the model's prediction accuracy on an evaluation data set. We also discuss alternative ML architectures and compare their prediction accuracies with that of the primary ML model.
