Validating Machine Learning Algorithms for Twitter Data Against Established Measures of Suicidality

2016 ◽  
Vol 3 (2) ◽  
pp. e21 ◽  
Author(s):  
Scott R Braithwaite ◽  
Christophe Giraud-Carrier ◽  
Josh West ◽  
Michael D Barnes ◽  
Carl Lee Hanson

Background: Suicide is one of the leading causes of death in the United States (US), and new methods of assessment are needed to track its risk in real time.
Objective: Our objective is to validate the use of machine learning algorithms for Twitter data against empirically validated measures of suicidality in the US population.
Methods: Using a machine learning algorithm, the Twitter feeds of 135 Mechanical Turk (MTurk) participants were compared with validated, self-report measures of suicide risk.
Results: Our findings show that people who are at high suicidal risk can be easily differentiated from those who are not by machine learning algorithms, which accurately identify the clinically significant suicidal rate in 92% of cases (sensitivity: 53%, specificity: 97%, positive predictive value: 75%, negative predictive value: 93%).
Conclusions: Machine learning algorithms are efficient in differentiating people who are at suicidal risk from those who are not. Evidence for suicidality can be measured in nonclinical populations using social media data.
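The five figures in the Results can be read as ratios over a 2x2 confusion matrix. The sketch below shows how they relate. The abstract does not report the underlying counts, so the counts used here are hypothetical, chosen only because they sum to 135 participants and round to the reported percentages. The function name `diagnostic_metrics` is likewise an illustration, not the authors' code.

```python
def diagnostic_metrics(tp, fn, fp, tn):
    """Return standard screening metrics from a 2x2 confusion matrix."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # share of at-risk cases that were flagged
        "specificity": tn / (tn + fp),      # share of not-at-risk cases that were cleared
        "ppv": tp / (tp + fp),              # share of flagged cases that were truly at risk
        "npv": tn / (tn + fn),              # share of cleared cases that were truly not at risk
    }


# Hypothetical counts (NOT reported in the abstract): 17 at-risk and
# 118 not-at-risk participants, 135 in total.
metrics = diagnostic_metrics(tp=9, fn=8, fp=3, tn=115)
for name, value in metrics.items():
    print(f"{name}: {value:.0%}")
```

With these assumed counts, high accuracy and specificity coexist with a sensitivity near one half, because the at-risk group is small relative to the sample.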

2020 ◽  
pp. 1-11
Author(s):  
Jie Liu ◽  
Lin Lin ◽  
Xiufang Liang

Online English teaching systems place specific demands on intelligent scoring, and the most difficult stage of intelligent scoring in an English test is scoring the composition with an intelligent model. To improve the intelligence of English composition scoring, this study combines machine learning algorithms with intelligent image recognition technology, and proposes an improved MSER-based character candidate region extraction algorithm and a convolutional neural network-based pseudo-character region filtering algorithm. To verify the feasibility of the algorithm, that is, whether the proposed model meets the requirements of the group text, the performance of the model is analyzed through designed experiments. Moreover, the basic conditions for composition scoring are input into the model as a constraint model. The results show that the proposed algorithm has a certain practical effect and can be applied to English assessment systems and to online homework evaluation systems.


2021 ◽  
Author(s):  
Yingxian Liu ◽  
Cunliang Chen ◽  
Hanqing Zhao ◽  
Yu Wang ◽  
Xiaodong Han

Abstract: Fluid properties are key factors in single-well productivity prediction, well test interpretation, and oilfield recovery prediction, and they directly affect the success of ODP program design. The most accurate and direct method of acquiring them is underground sampling. However, not every well has samples, for technical reasons such as excessive well deviation or because of high cost during the exploration stage. In many cases, therefore, analogies or empirical formulas have to be adopted. A large number of oilfield developments have shown that the errors caused by these methods are very large, so obtaining fluid physical properties quickly and accurately is of great significance. In recent years, with the development and improvement of artificial intelligence and machine learning algorithms, their applications in the oilfield have become more and more extensive. This paper proposes a method for predicting crude oil physical properties based on machine learning algorithms. The method uses PVT data from nearly 100 wells in Bohai Oilfield: 75% of the data is used for training to obtain the prediction model, and the remaining 25% is used for testing. Practice shows that the prediction results of the machine learning algorithm are very close to the actual data, with a very small error. Finally, the method was applied to the preliminary plan design of the BZ29 oilfield, a new oilfield, in particular to predict the fluid physical properties of the unsampled sand bodies. The paper also compares the influence of the analogy method on the scheme, which provides potential and risk analysis for scheme design. This method will be applied in more oil fields in the Bohai Sea in the future and has important promotion value.
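The 75%/25% division described above can be sketched as a shuffled hold-out split. This is a minimal illustration using only the standard library. The record names, the `seed` parameter, and the `train_test_split` helper are assumptions, and the abstract does not say how the authors actually partitioned their wells.

```python
import random


def train_test_split(records, test_fraction=0.25, seed=0):
    """Shuffle records and split them into (train, test) lists."""
    shuffled = list(records)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)    # seeded for a reproducible split
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical stand-ins for per-well PVT records.
wells = [f"well_{i:03d}" for i in range(100)]
train, test = train_test_split(wells)
print(len(train), len(test))
```

Holding the test wells out of training entirely is what lets the reported error say something about unsampled sand bodies rather than about wells the model has already seen.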


2022 ◽  
Vol 14 (1) ◽  
pp. 0-0

Identifying chronic obstructive pulmonary disease (COPD) severity stages is of great importance for controlling the related mortality rates and reducing the associated costs. This study aims to build prediction models for COPD stages and to compare the relative performance of five machine learning algorithms in order to determine the optimal prediction algorithm. The research is based on data collected from a private hospital in Egypt for the two calendar years 2018 and 2019. Five machine learning algorithms were used for the comparison. The F1 score, specificity, sensitivity, accuracy, positive predictive value, and negative predictive value were the performance measures used for algorithm comparison. The analysis included 211 patients’ records. Our results show that the best-performing algorithm in most of the disease stages is the PNN, which has the optimal prediction accuracy and can therefore be considered a powerful tool for decision makers predicting the severity stages of COPD.
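Because severity staging is a multi-class problem, measures such as the F1 score are usually computed per stage by treating each stage in turn as the positive class. The sketch below shows one common way to do that, one-vs-rest with a macro average. The stage labels and predictions are hypothetical, and the abstract does not state which averaging scheme the authors used.

```python
def per_class_f1(y_true, y_pred):
    """Compute a one-vs-rest F1 score for every label that appears."""
    scores = {}
    pairs = list(zip(y_true, y_pred))
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical stage labels, for illustration only.
y_true = ["mild", "mild", "moderate", "moderate", "severe", "severe"]
y_pred = ["mild", "moderate", "moderate", "moderate", "severe", "mild"]
print(per_class_f1(y_true, y_pred))
print(macro_f1(y_true, y_pred))
```

Reporting the per-stage scores alongside the average is what makes a statement like "best in most of the disease stages" checkable.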


The aim of this research is to perform risk modelling based on sentiment analysis of Twitter posts. We analyze the posts of several users, or of a particular user, to check whether they may be a cause of concern to society. Each sentiment, such as happiness, sadness, anger, and other emotions, provides a severity scaling in the final table to which the machine learning algorithm is applied. The data supplied to the machine learning algorithms is monitored over a period of time and relates to a particular topic in an area.


InterConf ◽  
2021 ◽  
pp. 393-403
Author(s):  
Olexander Shmatko ◽  
Volodimir Fedorchenko ◽  
Dmytro Prochukhan

Today the banking sector offers its clients many different financial services, such as ATM cards, Internet banking, debit cards, and credit cards, which helps attract a large number of new customers. This article proposes an information system for detecting credit card fraud using a machine learning algorithm. Customers typically use credit cards around the clock, so the bank's server can track all transactions using machine learning algorithms and must detect or predict fraud. The dataset contains characteristics for each transaction, and fraudulent transactions need to be classified and detected. For these purposes, the work proposes the use of the Random Forest algorithm.


Author(s):  
Virendra Tiwari ◽  
Balendra Garg ◽  
Uday Prakash Sharma

Machine learning algorithms are capable of managing multi-dimensional data in a dynamic environment. Despite their many vital features, some challenges remain. Machine learning algorithms still require additional mechanisms or procedures to predict a large number of new classes while managing privacy. These deficiencies show that the reliable use of a machine learning algorithm depends on human experts, because raw data may complicate the learning process and generate inaccurate results. Interpreting outcomes therefore requires expertise in machine learning mechanisms, which is a significant challenge. Machine learning techniques suffer from issues of high dimensionality, adaptability, distributed computing, scalability, streaming data, and duplicity. The main weakness of machine learning algorithms is their vulnerability in managing errors. Furthermore, machine learning techniques are also found to lack variability. This paper studies how the computational complexity of machine learning algorithms can be reduced by making predictions with an improved algorithm.


2021 ◽  
Author(s):  
Jason Williams ◽  
Sally Potter-McIntyre ◽  
Justin Filiberto ◽  
Shaunna Morrison ◽  
Daniel Hummer

Indicator minerals have special physical and chemical properties that can be analyzed to glean information concerning the composition of host rocks and formational (or altering) fluids. Clay, zeolite, and tourmaline mineral groups are all ubiquitous at the Earth’s surface and shallow crust and distributed through a wide variety of sedimentary, igneous, metamorphic, and hydrothermal systems. Traditional studies of indicator mineral-bearing deposits have provided a wealth of data that could be integral to discovering new insights into the formation and evolution of naturally occurring systems. This study evaluates the relationships that exist between different environmental indicator mineral groups through the implementation of machine learning algorithms and network diagrams. Mineral occurrence data for thousands of localities hosting clay, zeolite, and tourmaline minerals were retrieved from mineral databases. Clustering techniques (e.g., agglomerative hierarchical clustering and density-based spatial clustering of applications with noise) combined with network analyses were used to analyze the compiled dataset in an effort to characterize and identify geological processes operating at different localities across the United States. Ultimately, this study evaluates the ability of machine learning algorithms to act as supplementary diagnostic and interpretive tools in geoscientific studies.
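Density-based spatial clustering of applications with noise (DBSCAN), one of the techniques named above, groups points that have enough close neighbours and marks isolated points as noise. The following is a minimal, unoptimized sketch in plain Python. The 2-D points, the `eps` radius, and the `min_pts` threshold are hypothetical, and nothing here reflects the authors' actual features or parameters.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or NOISE (-1) if isolated."""
    labels = [None] * len(points)            # None means "not visited yet"

    def neighbors(i):
        # All points within eps of point i (including i itself).
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = NOISE                # may later become a border point
            continue
        cluster += 1                         # point i is a core point: new cluster
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster          # noise reachable from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:       # j is also a core point: keep expanding
                queue.extend(j_nbrs)
    return labels


# Hypothetical 2-D coordinates standing in for locality features.
localities = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(localities, eps=1.5, min_pts=3))
```

Unlike hierarchical clustering, this approach does not need the number of clusters in advance, and it leaves outlying localities unassigned rather than forcing them into a group.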


Author(s):  
Namrata Dhanda ◽  
Stuti Shukla Datta ◽  
Mudrika Dhanda

Human intelligence is deeply involved in creating efficient and faster systems that can work independently. Creating such smart systems requires efficient training algorithms. The aim of this chapter is therefore to introduce readers to the concept of machine learning and to the learning algorithms commonly employed for developing efficient and intelligent systems. The chapter draws a clear distinction between supervised and unsupervised learning methods. Each algorithm is explained with the help of a suitable example to give insight into the learning process.


2022 ◽  
pp. 34-46
Author(s):  
Amtul Waheed ◽  
Jana Shafi ◽  
Saritha V.

In today's world of advanced IoT and ITS technologies in smart-city scenarios, there are many different projections, such as improved data propagation in smart roads and cooperative transportation networks, autonomous and continuously connected vehicles, and low-latency applications in high-capacity environments with heterogeneous connectivity and speed. This chapter presents the performance of machine learning methods in predicting the speed of vehicles on roadways. The input variables for each learning algorithm are density, measured in vehicles per mile, and volume, measured in vehicles per hour. The output variable is speed, measured in miles per hour. The performance of the machine learning algorithms is assessed by comparing the predictions made by the different algorithms with the true speed using a histogram. The results suggest that speed varies according to the histogram.
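The three quantities named above are linked by the fundamental traffic-flow relation: volume equals density times speed, so speed is volume divided by density. The sketch below uses that relation as a simple non-learned baseline and scores it against measured speeds with the mean absolute error. This is an assumption-laden illustration, not the chapter's method. The observations, the measured speeds, and both function names are hypothetical.

```python
def speed_from_flow(volume_vph, density_vpm):
    """Speed in mph from volume (vehicles/hour) and density (vehicles/mile)."""
    if density_vpm <= 0:
        raise ValueError("density must be positive")
    return volume_vph / density_vpm


def mean_absolute_error(true_speeds, predicted_speeds):
    """Average absolute gap between measured and predicted speeds."""
    pairs = list(zip(true_speeds, predicted_speeds))
    return sum(abs(t - p) for t, p in pairs) / len(pairs)


# Hypothetical (volume, density) observations and measured speeds.
observations = [(1200, 20), (1800, 40), (1500, 60)]
true_speeds = [58.0, 46.0, 25.0]

predicted = [speed_from_flow(q, k) for q, k in observations]
print(predicted)
print(mean_absolute_error(true_speeds, predicted))
```

A learned model is worth its added complexity only if its error against the true speed is lower than that of a baseline of this kind.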

