Identifying Fraudulent Behaviors in Healthcare Claims Using Random Forest Classifier With SMOTEchnique

2020 ◽  
Vol 16 (4) ◽  
pp. 30-47
Author(s):  
Naga Jyothi P. ◽  
Rajya Lakshmi D. ◽  
Rama Rao K. V. S. N.

Detecting fraudulent and abusive cases in healthcare is one of the most challenging problems for data mining studies. Existing studies lack real data for analysis and address only part of the problem by covering a specific actor, healthcare service, or disease. In this article, the proposed strategy identifies fraudulent behaviors in Medicare claims data using several predictors as model inputs. The methodology involves preprocessing and model development phases. In the initial phase, features are selected by estimating their feature importance scores. Instances are then labeled by applying classification rules to the whole dataset, which yields a transformed dataset. In the development phase, random forest (RF) with SMOTE is applied to the training and testing data. Specifically, SMOTE is adapted to balance the data, sort misclassified instances, and find the interesting instances. The results show that RF with SMOTE improves classifier performance when contrasted with the RF method alone.
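
The balancing step described in this abstract can be illustrated with a minimal sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote_oversample`, the toy rows, and the choice of `k` are assumptions made for illustration. In practice a library implementation (for example `SMOTE` from imbalanced-learn, followed by `RandomForestClassifier` from scikit-learn) would normally be used instead.

```python
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority rows by SMOTE-style interpolation.

    Requires at least two minority rows. Each synthetic row lies on the
    segment between a random minority row and one of its k nearest
    minority neighbours.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical): 4 "fraud" rows against 10 legitimate rows.
fraud = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
legit_count = 10
new_rows = smote_oversample(fraud, n_new=legit_count - len(fraud))
balanced_fraud = fraud + new_rows
```

Because every synthetic row is an interpolation between two real minority rows, the new rows stay inside the region already occupied by the minority class rather than duplicating existing rows.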

Author(s):  
Olga Mikhaylovna Tikhonova ◽  
Alexander Fedorovich Rezchikov ◽  
Vladimir Andreevich Ivashchenko ◽  
Vadim Alekseevich Kushnikov

The paper presents a system for predicting the accreditation indicators of technical universities based on J. Forrester's system dynamics mechanism. An oriented graph was built from an analysis of cause-and-effect relationships between selected variables of the system (the university's accreditation indicators). The complex of mathematical models developed to control the quality of engineering training in Russian higher educational institutions is based on this graph. The article presents an algorithm for constructing a model using one of the simulated variables as an example. The model is a system of non-linear differential equations, and the modelled characteristics of the educational process are determined from the solution of this system. The proposed algorithm for calculating these indicators is based on the system dynamics model and the regression model. The mathematical model is constructed on the basis of the system dynamics model and is then tested for compliance with real data using the regression model. The regression model is built on the available statistical data accumulated over the university's period of operation. The proposed approach is aimed at solving complex problems of managing the educational process in universities. The structure of the proposed model mirrors the structure of cause-effect relationships in the system and lets the person responsible for quality control assess the performance of the system quickly and adequately.


2019 ◽  
Vol XVI (2) ◽  
pp. 1-11
Author(s):  
Farrukh Jamal ◽  
Hesham Mohammed Reyad ◽  
Soha Othman Ahmed ◽  
Muhammad Akbar Ali Shah ◽  
Emrah Altun

A new three-parameter continuous model called the exponentiated half-logistic Lomax distribution is introduced in this paper. Basic mathematical properties of the proposed model were investigated, including raw and incomplete moments, skewness, kurtosis, generating functions, Rényi entropy, Lorenz, Bonferroni and Zenga curves, probability weighted moments, the stress-strength model, order statistics, and record statistics. The model parameters were estimated using the maximum likelihood criterion, and the behaviour of these estimates was examined in a simulation study. The applicability of the new model is illustrated by applying it to a real data set.


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier transform inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, which is motivated by the Fourier conversion. The decomposition is aided by Long Short-Term Memory (LSTM), which captures the temporal dependency in the signal and then produces encoded sequences. The sequences, once arranged into a 2D array, can represent the fingerprints of the signals. The benefit of this transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is therefore a combination of LSTM and CNN. We evaluate the model on two data sets. For the first data set, which is more standardized than the other, our model outperforms or at least equals previous works. For the second data set, we devise schemes to generate training and testing data by changing the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy is over 95% in some cases. We also analyze the effect of the parameters on the performance.
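
The window size, sliding size, and labeling scheme mentioned in this abstract can be sketched as follows. The abstract does not specify the labeling scheme, so the majority-vote rule below is an assumption, as are the function name `sliding_windows` and the toy signal.

```python
from collections import Counter


def sliding_windows(signal, labels, window_size, step):
    """Cut a 1D signal into overlapping windows, each carrying one label."""
    windows = []
    for start in range(0, len(signal) - window_size + 1, step):
        end = start + window_size
        # Assumed labeling scheme: the most frequent label inside the window.
        majority_label = Counter(labels[start:end]).most_common(1)[0][0]
        windows.append((signal[start:end], majority_label))
    return windows


# Hypothetical accelerometer magnitudes with per-sample activity labels.
signal = [0.1, 0.2, 0.1, 0.9, 1.1, 1.0, 0.8, 0.2]
labels = ["sit", "sit", "sit", "walk", "walk", "walk", "walk", "sit"]
segments = sliding_windows(signal, labels, window_size=4, step=2)
```

A smaller `step` produces more, heavily overlapping windows, while a larger `window_size` makes it more likely that a window spans two activities, which is where the labeling rule starts to matter.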


Polymers ◽  
2021 ◽  
Vol 13 (9) ◽  
pp. 1393
Author(s):  
Xiaochang Duan ◽  
Hongwei Yuan ◽  
Wei Tang ◽  
Jingjing He ◽  
Xuefei Guan

This study develops a general temperature-dependent stress–strain constitutive model for polymer-bonded composite materials, allowing for the prediction of deformation behaviors under tension and compression in the testing temperature range. Laboratory testing of the material specimens in uniaxial tension and compression at multiple temperatures ranging from −40 °C to 75 °C is performed. The testing data reveal that the stress–strain response can be divided into two general regimes, namely, a short elastic part followed by the plastic part; therefore, the Ramberg–Osgood relationship is proposed to build the stress–strain constitutive model at a single temperature. By correlating the model parameters with the corresponding temperature using a response surface, a general temperature-dependent stress–strain constitutive model is established. The effectiveness and accuracy of the proposed model are validated using several independent sets of testing data and third-party data. The performance of the proposed model is compared with an existing reference model. The validation and comparison results show that the proposed model has a lower number of parameters and yields smaller relative errors. The proposed constitutive model is further implemented as a user material routine in a finite element package. A simple structural example using the developed user material is presented and its accuracy is verified.
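
As a rough illustration of the single-temperature building block named in this abstract, the sketch below evaluates one common textbook form of the Ramberg–Osgood relationship, ε = σ/E + α(σ/E)(σ/σ₀)^(n−1). The abstract does not state which parameterization the authors use, and all parameter values here are hypothetical, not fitted to any data. The function expects a non-negative stress magnitude.

```python
def ramberg_osgood_strain(stress, modulus, alpha, ref_stress, n):
    """Total strain as an elastic part plus a power-law plastic part."""
    elastic = stress / modulus
    plastic = alpha * (stress / modulus) * (stress / ref_stress) ** (n - 1)
    return elastic + plastic


# Hypothetical parameters (stresses in MPa) for illustration only.
E, ALPHA, SIGMA_0, N = 1000.0, 0.5, 10.0, 5.0
curve = [(s, ramberg_osgood_strain(s, E, ALPHA, SIGMA_0, N))
         for s in (2.0, 5.0, 10.0)]
```

At low stress the plastic term is negligible and the curve is nearly linear; near the reference stress the power-law term grows quickly, which reproduces the short elastic regime followed by a plastic regime.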


Sensors ◽  
2021 ◽  
Vol 21 (3) ◽  
pp. 772
Author(s):  
Houshyar Honar Pajooh ◽  
Mohammad Rashid ◽  
Fakhrul Alam ◽  
Serge Demidenko

The proliferation of smart devices in Internet of Things (IoT) networks creates significant security challenges for the communications between such devices. Blockchain is a decentralized and distributed technology that can potentially tackle the security problems within 5G-enabled IoT networks. This paper proposes a Multi-Layer Blockchain Security model to protect IoT networks while simplifying the implementation. The concept of clustering is used to facilitate the multi-layer architecture. The K-unknown clusters are defined within the IoT network by applying techniques that utilize a hybrid Evolutionary Computation Algorithm combining Simulated Annealing and Genetic Algorithms. The chosen cluster heads are responsible for local authentication and authorization. A local private blockchain implementation facilitates communications between the cluster heads and the relevant base stations. Such a blockchain enhances credibility assurance and security while also providing a network authentication mechanism. The open-source Hyperledger Fabric Blockchain platform is deployed for the proposed model development. Base stations adopt a global blockchain approach to communicate with each other securely. The simulation results demonstrate that the proposed clustering algorithm performs well when compared to earlier reported approaches. The proposed lightweight blockchain model is also shown to be better suited to balancing network latency and throughput than a traditional global blockchain.


2021 ◽  
Vol 10 (s1) ◽  
Author(s):  
Said Gounane ◽  
Yassir Barkouch ◽  
Abdelghafour Atlas ◽  
Mostafa Bendahmane ◽  
Fahd Karami ◽  
...  

Abstract Recently, various mathematical models have been proposed to model the COVID-19 outbreak. These models are an effective tool for studying the mechanisms of coronavirus spreading and for predicting the future course of COVID-19. They are also used to evaluate strategies to control the pandemic. Generally, SIR compartmental models are appropriate for understanding and predicting the dynamics of infectious diseases like COVID-19. The classical SIR model was initially introduced by Kermack and McKendrick (cf. (Anderson, R. M. 1991. “Discussion: the Kermack–McKendrick Epidemic Threshold Theorem.” Bulletin of Mathematical Biology 53 (1): 3–32; Kermack, W. O., and A. G. McKendrick. 1927. “A Contribution to the Mathematical Theory of Epidemics.” Proceedings of the Royal Society 115 (772): 700–21)) to describe the evolution of the susceptible, infected and recovered compartments. Focusing on the impact of public policies designed to contain the pandemic, we develop a new nonlinear SIR epidemic problem modeling the spread of coronavirus under the effect of social distancing induced by government measures. To find the parameters for each country (e.g. Germany, Spain, Italy, France, Algeria and Morocco), we fit the proposed model to real data. We also evaluate the government measures in each country with respect to the evolution of the pandemic. Our numerical simulations can be used as an effective tool for predicting the spread of the disease.
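
For orientation, the classical Kermack–McKendrick SIR model that this abstract builds on can be simulated in a few lines. This sketch is not the authors' nonlinear model: the forward Euler scheme, the function name `simulate_sir`, and the rates are assumptions, and lowering `beta` is used only as a crude stand-in for the effect of social distancing.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.1):
    """Integrate dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I,
    dR/dt = gamma*I with forward Euler steps of size dt (in days)."""
    n = s0 + i0 + r0
    s, i, r = float(s0), float(i0), float(r0)
    history = [(s, i, r)]
    for _ in range(round(days / dt)):
        new_infections = beta * s * i / n * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


# Hypothetical rates; the lower beta mimics a distancing measure.
baseline = simulate_sir(beta=0.30, gamma=0.10, s0=9990, i0=10, r0=0, days=160)
distanced = simulate_sir(beta=0.15, gamma=0.10, s0=9990, i0=10, r0=0, days=160)
peak_baseline = max(i for _, i, _ in baseline)
peak_distanced = max(i for _, i, _ in distanced)
```

Comparing `peak_baseline` with `peak_distanced` shows the familiar flattening of the infection curve when the contact rate falls. Fitting `beta` and `gamma` to country data, as the abstract describes, would require an optimization step that is not shown here.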


2020 ◽  
Vol 70 (4) ◽  
pp. 953-978
Author(s):  
Mustafa Ç. Korkmaz ◽  
G. G. Hamedani

Abstract This paper proposes a new extended Lindley distribution, based on the mixture distribution structure, which has more flexible density and hazard rate shapes than the Lindley and Power Lindley distributions and is intended to model real data phenomena with new distributional characteristics. Some of its distributional properties, such as the shapes, moments, quantile function, Bonferroni and Lorenz curves, mean deviations and order statistics, have been obtained. Characterizations based on two truncated moments and conditional expectation, as well as in terms of the hazard function, are presented. Different estimation procedures have been employed to estimate the unknown parameters, and their performances are compared via Monte Carlo simulations. The flexibility and importance of the proposed model are illustrated by two real data sets.


2021 ◽  
Vol 99 (Supplement_1) ◽  
pp. 55-56
Author(s):  
Christian D Ramirez-Camba ◽  
Crystal L Levesque

Abstract A mechanistic model was developed to characterize weight gain and essential amino acid (EAA) deposition in the different tissue pools that make up the pregnant sow: placenta, allantoic fluid, amniotic fluid, fetus, uterus, mammary gland, and maternal body were considered. The data used in this modelling approach were obtained from published scientific articles reporting weights, crude protein (CP), and EAA composition in the previously mentioned tissues; only studies reporting at least 5 data points across gestation were considered. A total of 12 scientific articles published between 1977 and 2020 were selected for the development of the model, and the model was validated using 11 separate scientific papers. The model consists of three connected sub-models: a protein deposition (Pd) model, a weight gain model, and an EAA deposition model. Weight gain, Pd, and EAA deposition curves were developed with nonparametric statistics using spline regression. The validation of the model showed strong agreement between observed and predicted growth (r² = 0.92, root mean square error = 3%). The proposed model also offered descriptive insights into weight gain and Pd during gestation. The model suggests that the definition of time-dependent Pd is more accurately described as an increase in fluid deposition during mid-gestation coinciding with a reduction in Pd. In addition, due to differences in CP composition between pregnancy-related tissues and maternal body, Pd by itself may not be the best measurement criterion for estimating the EAA requirement of pregnant sows. The proposed model also captures the negative maternal Pd that occurs in late gestation and indicates that litter size influences maternal tissue mobilization more than parity. The model predicts that the EAA requirements in early and mid-gestation are 75, 55 and 50% lower for primiparous sows than parity 2, 3 and 4+ sows, respectively, which suggests the potential benefits of parity-segregated feeding.


Author(s):  
Moritz Berger ◽  
Gerhard Tutz

Abstract A flexible semiparametric class of models is introduced that offers an alternative to classical regression models for count data, such as the Poisson and Negative Binomial models, as well as to more general models accounting for excess zeros that are also based on fixed distributional assumptions. The model allows the data themselves to determine the distribution of the response variable but, in its basic form, uses a parametric term that specifies the effect of explanatory variables. In addition, an extended version is considered in which the effects of covariates are specified nonparametrically. The proposed model and traditional models are compared in simulations and in several real data applications from the areas of health and social science.


Polymers ◽  
2021 ◽  
Vol 13 (14) ◽  
pp. 2353
Author(s):  
Xiaochang Duan ◽  
Hongwei Yuan ◽  
Wei Tang ◽  
Jingjing He ◽  
Xuefei Guan

This study develops a unified phenomenological creep model for polymer-bonded composite materials that predicts the creep behavior in the three creep stages, namely the primary, secondary, and tertiary stages, under sustained compressive stresses. Creep testing is performed using material specimens under several conditions with a temperature range of 20 °C–50 °C and a compressive stress range of 15 MPa–25 MPa. The testing data reveal that the strain rate–time response exhibits the transient, steady, and unstable stages under each of the testing conditions. A rational function-based creep rate equation is proposed to describe the full creep behavior under each of the testing conditions. By further correlating the resulting model parameters with temperature and stress and developing a Larson–Miller parameter-based rupture time prediction model, a unified phenomenological model is established. An independent validation dataset and third-party testing data are used to verify the effectiveness and accuracy of the proposed model. The performance of the proposed model is compared with that of an existing reference model. The verification and comparison results show that the model can describe all three stages of the creep process and that it outperforms the reference model by yielding 28.5% smaller root mean squared errors on average.
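
The rupture time component mentioned in this abstract rests on the Larson–Miller parameter, whose standard textbook definition is LMP = T(C + log₁₀ t_r), with T in kelvin and t_r in hours. The sketch below applies that definition only; the abstract does not give the authors' fitted form, and the constant `c=20.0` (a value often quoted for metals) and the example rupture time are assumptions.

```python
import math


def larson_miller_parameter(temp_kelvin, rupture_hours, c=20.0):
    """LMP = T * (C + log10(t_r)), T in kelvin and t_r in hours."""
    return temp_kelvin * (c + math.log10(rupture_hours))


def rupture_time_hours(lmp, temp_kelvin, c=20.0):
    """Invert the definition: t_r = 10 ** (LMP / T - C)."""
    return 10 ** (lmp / temp_kelvin - c)


# Hypothetical: a specimen ruptures after 100 h at 50 °C (323.15 K).
lmp = larson_miller_parameter(323.15, 100.0)
# Same stress level, hence the same LMP, evaluated at 20 °C (293.15 K).
predicted = rupture_time_hours(lmp, 293.15)
```

The usual assumption is that the parameter depends on stress alone, so a rupture time measured at one temperature can be translated to another temperature at the same stress. The lower temperature here gives a much longer predicted rupture time.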

