Statistical Analysis and Discovery of Heterogeneous Catalysts Based on Machine Learning from Diverse Published Data

Keisuke Suzuki; Takashi Toyao; Zen Maeno; Satoru Takakusagi; Ken‐ichi Shimizu; Ichigaku Takigawa

doi:10.1002/cctc.201901456

Front Cover: Statistical Analysis and Discovery of Heterogeneous Catalysts Based on Machine Learning from Diverse Published Data (ChemCatChem 18/2019)

ChemCatChem ◽

10.1002/cctc.201901455 ◽

2019 ◽

Vol 11 (18) ◽

pp. 4443-4443

Author(s):

Keisuke Suzuki ◽

Takashi Toyao ◽

Zen Maeno ◽

Satoru Takakusagi ◽

Ken‐ichi Shimizu ◽

...

Keyword(s):

Machine Learning ◽

Statistical Analysis ◽

Heterogeneous Catalysts ◽

Published Data ◽

Front Cover

Statistical Analysis and Discovery of Heterogeneous Catalysts Based on Machine Learning from Diverse Published Data

ChemCatChem ◽

10.1002/cctc.201900971 ◽

2019 ◽

Vol 11 (18) ◽

pp. 4537-4547 ◽

Cited By ~ 6

Author(s):

Keisuke Suzuki ◽

Takashi Toyao ◽

Zen Maeno ◽

Satoru Takakusagi ◽

Ken‐ichi Shimizu ◽

...

Keyword(s):

Machine Learning ◽

Statistical Analysis ◽

Heterogeneous Catalysts ◽

Published Data

Stiffness and Strength of Stabilized Organic Soils—Part I/II: Experimental Database and Statistical Description for Machine Learning Modelling

Geosciences ◽

10.3390/geosciences11060243 ◽

2021 ◽

Vol 11 (6) ◽

pp. 243

Author(s):

Francisco G. Hernandez-Martinez ◽

Abir Al-Tabbaa ◽

Zenon Medina-Cetina ◽

Negin Yousefpour

Keyword(s):

Machine Learning ◽

Statistical Analysis ◽

Portland Cement ◽

Soil Stabilization ◽

Data Availability ◽

Data Repository ◽

Organic Soils ◽

Experimental Database ◽

Soil Mixing ◽

The Impact

This paper presents the experimental database and corresponding statistical analysis (Part I), which serves as a basis for the parametric analysis and machine learning modelling (Part II) of a comprehensive study on the strength and stiffness of organic soils stabilized via the wet soil mixing method. The experimental database includes unconfined compression tests performed under laboratory-controlled conditions to investigate the impact of soil type, the soil’s organic content, the soil’s initial natural water content, binder type, binder quantity, grout to soil ratio, water to binder ratio, curing time, temperature, curing relative humidity and carbon dioxide content on the stabilized organic specimens’ stiffness and strength. A descriptive statistical analysis complements the description of the experimental database, along with a qualitative study on the stabilization hydration process via scanning electron microscopy images. Results confirmed findings on the use of Portland cement alone and a mix of Portland cement with ground granulated blast furnace slag as suitable binders for soil stabilization. Mixes including lime and magnesium oxide cements demonstrated minimal stabilization. Specimen size affected the stiffness, but not the strength, of mixes of peat and Portland cement. The experimental database, along with all produced data analyses, is available at the Texas Data Repository as indicated in the Data Availability Statement below, to allow for data reproducibility and to promote the use of competing artificial intelligence and machine learning modelling techniques such as the ones presented in Part II of this paper.

Finding Homogeneous Climate Zones in Bangladesh From Statistical Analysis of Climate Data Using Machine Learning Technique

2020 23rd International Conference on Computer and Information Technology (ICCIT) ◽

10.1109/iccit51783.2020.9392689 ◽

2020 ◽

Author(s):

Faisal Bin Ashraf ◽

Md Rayhan Kabir ◽

Md Shafiur Raihan Shafi ◽

Jubair Ibn Malik Rifat

Keyword(s):

Machine Learning ◽

Statistical Analysis ◽

Climate Data ◽

Machine Learning Technique ◽

Climate Zones ◽

Learning Technique

Advancement of Statistical Analysis, Machine Learning and Decision Analysis Based on the Fourteenth ICMSEM Proceedings

Proceedings of the Fourteenth International Conference on Management Science and Engineering Management - Advances in Intelligent Systems and Computing ◽

10.1007/978-3-030-49829-0_1 ◽

2020 ◽

pp. 1-9

Author(s):

Jiuping Xu

Keyword(s):

Machine Learning ◽

Statistical Analysis ◽

Decision Analysis

Evaluation of Biomarkers in Critical Care and Perioperative Medicine

Anesthesiology ◽

10.1097/aln.0000000000003600 ◽

2020 ◽

Vol 134 (1) ◽

pp. 15-25

Author(s):

Sabri Soussi ◽

Gary S. Collins ◽

Peter Jüni ◽

Alexandre Mebazaa ◽

Etienne Gayat ◽

...

Keyword(s):

Machine Learning ◽

Critical Care ◽

Statistical Analysis ◽

Research Methods ◽

Statistical Methods ◽

Perioperative Medicine ◽

Scientific Rigor ◽

Starting Point ◽

Novel Biomarkers

SUMMARY Interest in developing and using novel biomarkers in critical care and perioperative medicine is increasing. Biomarker studies are often presented with flaws in the statistical analysis that preclude them from providing a scientifically valid and clinically relevant message for clinicians. To improve scientific rigor, the proper application and reporting of traditional and emerging statistical methods (e.g., machine learning) in biomarker studies is required. This Readers’ Toolbox article aims to be a starting point for nonexpert readers and investigators to understand traditional and emerging research methods to assess biomarkers in critical care and perioperative medicine.

Techniques and Methods That Help to Make Big Data the Simplest Recipe for Success

Big Data Analytics for Entrepreneurial Success - Advances in Business Information Systems and Analytics ◽

10.4018/978-1-5225-7609-9.ch006 ◽

2019 ◽

pp. 161-194

Keyword(s):

Machine Learning ◽

Big Data ◽

Data Analysis ◽

Statistical Analysis ◽

Data Analytics ◽

Big Data Analysis ◽

Customer Segmentation ◽

Learning Context ◽

Feature Vectors

Data analytics has grown in a machine learning context. Whatever the purpose for which data is used or exploited, whether customer segmentation or marketing targeting, it must first be processed and represented as feature vectors. Many algorithms, such as clustering, regression, classification, and others, need to be presented and clarified in order to facilitate processing and statistical analysis. Having seen, through the previous chapters, the importance of big data analysis (the Why?), we note that, as with every major innovation, the biggest confusion lies in its exact scope (What?) and its implementation (How?). In this chapter, we will take a look at the different analytics algorithms and techniques that we can use in order to exploit large amounts of data.
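The abstract above notes that data must be processed and represented as feature vectors before algorithms such as clustering or classification can be applied. A minimal sketch of that step follows, assuming hypothetical customer records; the field names, values, and helper functions are invented for illustration and are not taken from the chapter. Numeric fields are min-max scaled and the categorical field is one-hot encoded.

```python
# Hypothetical customer records; field names and values are invented for illustration.
CUSTOMERS = [
    {"age": 25, "annual_spend": 1200.0, "channel": "web"},
    {"age": 40, "annual_spend": 300.0, "channel": "store"},
    {"age": 55, "annual_spend": 2100.0, "channel": "web"},
]

NUMERIC_FIELDS = ("age", "annual_spend")
CHANNELS = ("store", "web")  # fixed category order for the one-hot encoding


def min_max(values):
    """Scale a numeric column to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [(v - lo) / span if span else 0.0 for v in values]


def to_feature_vectors(records):
    """Return one vector per record: scaled numeric fields, then one-hot channel."""
    columns = [min_max([float(r[f]) for r in records]) for f in NUMERIC_FIELDS]
    vectors = []
    for i, record in enumerate(records):
        numeric = [column[i] for column in columns]
        one_hot = [1.0 if record["channel"] == c else 0.0 for c in CHANNELS]
        vectors.append(numeric + one_hot)
    return vectors


VECTORS = to_feature_vectors(CUSTOMERS)
```

Each resulting vector has four components (scaled age, scaled spend, and two channel indicators), which is the kind of fixed-length numeric input that clustering or classification routines typically expect.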

A Statistical Analysis of Risk Factors and Biological Behavior in Canine Mammary Tumors: A Multicenter Study

Animals ◽

10.3390/ani10091687 ◽

2020 ◽

Vol 10 (9) ◽

pp. 1687

Author(s):

Giovanni P. Burrai ◽

Andrea Gabrieli ◽

Valentina Moccia ◽

Valentina Zappulli ◽

Ilaria Porcellato ◽

...

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Statistical Analysis ◽

Multicenter Study ◽

Malignant Tumors ◽

Mammary Tumors ◽

Supervised Machine Learning ◽

Biological Behavior ◽

Clinical Staging ◽

Canine Mammary Tumors

Canine mammary tumors (CMTs) represent a serious issue in worldwide veterinary practice and several risk factors are variably implicated in the biology of CMTs. The present study examines the relationship between risk factors and histological diagnosis of a large CMT dataset from three academic institutions by classical statistical analysis and supervised machine learning methods. Epidemiological, clinical, and histopathological data of 1866 CMTs were included. Dogs with malignant tumors were significantly older than dogs with benign tumors (9.6 versus 8.7 years, p < 0.001). Malignant tumors were significantly larger than benign counterparts (2.69 versus 1.7 cm, p < 0.001). Interestingly, 18% of malignant tumors were smaller than 1 cm in diameter, providing compelling evidence that the size of the tumor should be reconsidered during the assessment of the TNM-WHO clinical staging. Logistic regression and the machine learning model identified age and tumor size as the best predictors, with an overall diagnostic accuracy of 0.63, suggesting that these risk factors are sufficient but not exhaustive indicators of the malignancy of CMTs. This multicenter study increases the general knowledge of the main epidemiological-clinical risk factors involved in the onset of CMTs and paves the way for further investigations of these factors in association with CMTs and in the application of machine learning technology.
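The abstract above reports a logistic regression with age and tumor size as predictors of malignancy. The sketch below illustrates that kind of model on a small synthetic dataset; the records are invented and are not the study's data, and plain batch gradient descent is an assumption made for illustration, since the abstract does not describe the fitting procedure.

```python
import math

# Synthetic, invented records: (age in years, tumor size in cm, 1 = malignant).
# These values are NOT taken from the study.
DATA = [
    (6.0, 0.8, 0), (7.0, 1.0, 0), (8.0, 1.5, 0), (8.0, 2.0, 0), (10.0, 1.4, 0),
    (9.0, 1.2, 1), (10.0, 2.5, 1), (11.0, 3.0, 1), (9.0, 3.5, 1), (7.0, 2.8, 1),
]


def standardize(column):
    """Rescale a column to zero mean and unit (population) standard deviation."""
    mean = sum(column) / len(column)
    std = math.sqrt(sum((v - mean) ** 2 for v in column) / len(column))
    return [(v - mean) / std for v in column]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, x):
    """Modelled probability of malignancy for one feature row."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def log_loss(weights, rows, labels):
    """Mean negative log-likelihood of the labels under the model."""
    total = 0.0
    for x, y in zip(rows, labels):
        p = predict_proba(weights, x)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(rows)


def fit(rows, labels, lr=0.1, n_iters=500):
    """Batch gradient descent on the mean log-loss, starting from zero weights."""
    weights = [0.0] * len(rows[0])
    for _ in range(n_iters):
        grad = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            err = predict_proba(weights, x) - y
            for j, xi in enumerate(x):
                grad[j] += err * xi / len(rows)
        weights = [w - lr * g for w, g in zip(weights, grad)]
    return weights


ages = standardize([r[0] for r in DATA])
sizes = standardize([r[1] for r in DATA])
ROWS = [[1.0, a, s] for a, s in zip(ages, sizes)]  # leading 1.0 is the intercept term
LABELS = [r[2] for r in DATA]

WEIGHTS = fit(ROWS, LABELS)
PREDICTED = [1 if predict_proba(WEIGHTS, x) >= 0.5 else 0 for x in ROWS]
ACCURACY = sum(p == y for p, y in zip(PREDICTED, LABELS)) / len(LABELS)
```

Here `ACCURACY` is the in-sample fraction of correctly classified records. It is a property of the toy data only and is not comparable to the 0.63 reported in the study.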

Prediction of Ultrasound-Mediated Disruption of Cell Membranes Using Machine Learning Techniques and Statistical Analysis of Acoustic Spectra

IEEE Transactions on Biomedical Engineering ◽

10.1109/tbme.2003.820323 ◽

2004 ◽

Vol 51 (1) ◽

pp. 82-89 ◽

Cited By ~ 10

Author(s):

E.K. Lee ◽

R.J. Gallagher ◽

A.M. Campbell ◽

M.R. Prausnitz

Keyword(s):

Machine Learning ◽

Statistical Analysis ◽

Cell Membranes ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Acoustic Spectra

Assessing the Heterogeneity of Complaints Related to Tinnitus and Hyperacusis from an Unsupervised Machine Learning Approach: An Exploratory Study

Audiology and Neurotology ◽

10.1159/000504741 ◽

2020 ◽

Vol 25 (4) ◽

pp. 174-189 ◽

Cited By ~ 1

Author(s):

Guillaume Palacios ◽

Arnaud Noreña ◽

Alain Londero

Keyword(s):

Machine Learning ◽

Statistical Analysis ◽

Language Processing ◽

Exploratory Study ◽

Latent Dirichlet Allocation ◽

Suicide Attempts ◽

Real Life ◽

Supervised Machine Learning ◽

Learning Approach ◽

Machine Learning Approach

Introduction: Subjective tinnitus (ST) and hyperacusis (HA) are common auditory symptoms that may become incapacitating in a subgroup of patients who thereby seek medical advice. Both conditions can result from many different mechanisms, and as a consequence, patients may report a vast repertoire of associated symptoms and comorbidities that can dramatically reduce the quality of life and even lead to suicide attempts in the most severe cases. The present exploratory study is aimed at investigating patients’ symptoms and complaints using an in-depth statistical analysis of patients’ natural narratives in a real-life environment in which, thanks to the anonymization of contributions and the peer-to-peer interaction, it is supposed that the wording used is totally free of any self-limitation and self-censorship. Methods: We applied a purely statistical, non-supervised machine learning approach to the analysis of patients’ verbatim exchanges on an Internet forum. After automated data extraction, the dataset was preprocessed in order to make it suitable for statistical analysis. We used a variant of the Latent Dirichlet Allocation (LDA) algorithm to reveal clusters of symptoms and complaints of HA patients (topics). The probability distribution of words within a topic uniquely characterizes it. The convergence of the log-likelihood of the LDA model was reached after 2,000 iterations. Several statistical parameters were tested for topic modeling and for the word relevance factor within each topic. Results: Despite a rather small dataset, this exploratory study demonstrates that patients’ free speech available on the Internet constitutes valuable material for machine learning and statistical analysis aimed at categorizing ST/HA complaints. The LDA model with K = 15 topics seems to be the most relevant in terms of relative weights and correlations, with the capability of individualizing subgroups of patients displaying specific characteristics. The study of the relevance factor may be useful to unveil weak but important signals that are present in patients’ narratives. Discussion/Conclusion: We claim that the LDA non-supervised approach would make it possible to gain knowledge on the patterns of ST- and HA-related complaints and on patient-centered domains of interest. The merits and limitations of the LDA algorithms are compared with other natural language processing methods and with more conventional methods of qualitative analysis of patients’ output. Future directions and research topics emerging from this innovative algorithmic analysis are proposed.
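The abstract above describes topic modeling of forum text with a variant of Latent Dirichlet Allocation, where each topic is characterized by its probability distribution over words. The sketch below shows plain LDA fitted by collapsed Gibbs sampling on a toy corpus. It is not the authors' variant: the documents, the choice of two topics, the hyperparameters `alpha` and `beta`, and the iteration count are all assumptions made for illustration.

```python
import random

# Invented toy "posts", already tokenized; NOT data from the study.
TOY_DOCS = [
    ["ringing", "ear", "night", "sleep"],
    ["loud", "noise", "pain", "ear"],
    ["sleep", "night", "ringing", "anxiety"],
    ["noise", "pain", "loud", "avoid"],
]


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Plain LDA via collapsed Gibbs sampling; returns (vocab, topic-word probabilities)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (document, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics                           # tokens per topic
    assignments = []

    # Random initial topic for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed word distribution for each topic; each row sums to 1.
    phi = [
        [(topic_word[t][v] + beta) / (topic_total[t] + n_words * beta) for v in range(n_words)]
        for t in range(n_topics)
    ]
    return vocab, phi


def top_words(vocab, phi_row, n=3):
    """The n most probable words of one topic."""
    ranked = sorted(range(len(vocab)), key=lambda v: phi_row[v], reverse=True)
    return [vocab[v] for v in ranked[:n]]


VOCAB, PHI = lda_gibbs(TOY_DOCS, n_topics=2)
TOPICS = [top_words(VOCAB, row) for row in PHI]
```

Each row of `PHI` is the word distribution of one topic, and `TOPICS` lists the most probable words per topic. On a corpus this small the recovered topics depend heavily on the seed, so the output should be read as a demonstration of the mechanics rather than a meaningful clustering.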
