Sports prediction and betting models in the machine learning age: The case of tennis

Journal of Sports Analytics ◽

10.3233/jsa-200463 ◽

2021 ◽

pp. 1-19

Author(s):

Sascha Wilkens

Keyword(s):

Machine Learning ◽

Management Strategies ◽

Relevant Information ◽

Sports Betting ◽

Machine Learning Techniques ◽

Sports Events ◽

Wide Range ◽

High Volatility ◽

Betting Markets ◽

Model Ensembles

Machine learning and its numerous variants have become established tools in many areas of society. Several attempts have been made to apply machine learning to predicting the outcome of professional sports events and to exploit “inefficiencies” in the corresponding betting markets. Using tennis as an example, this paper extends previous research by conducting one of the most extensive studies of its kind and applying a wide range of machine learning techniques to male and female professional singles matches. The paper shows that the average prediction accuracy cannot be increased to more than about 70%. Irrespective of the model used, most of the relevant information is embedded in the betting markets, and adding other match- and player-specific data does not lead to any significant improvement. Returns from applying predictions to the sports betting market are subject to high volatility and are mainly negative over the longer term. This conclusion holds across most tested models, across various money management strategies, and for backing either the match favorites or the outsiders. The use of model ensembles that combine the predictions from multiple approaches proves to be the most promising choice.
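As a rough illustration of the quantities this abstract refers to, the sketch below converts decimal betting odds into market-implied probabilities, averages several model predictions into a simple ensemble, and computes the expected return of a one-unit bet. It is not the paper's method: the abstract does not say how the ensembles are built, so the unweighted average is an assumption, and the function names, odds, and model probabilities are all hypothetical values chosen for the example.

```python
from statistics import mean


def implied_probabilities(odds_a: float, odds_b: float) -> tuple[float, float]:
    """Convert decimal odds for a two-player match into probabilities.

    The raw inverses sum to more than 1 because of the bookmaker margin,
    so they are rescaled to sum to exactly 1.
    """
    raw_a, raw_b = 1.0 / odds_a, 1.0 / odds_b
    total = raw_a + raw_b
    return raw_a / total, raw_b / total


def ensemble_probability(model_probs: list[float]) -> float:
    """Unweighted average of several models' win probabilities (an assumption)."""
    return mean(model_probs)


def expected_return_per_unit(prob_win: float, decimal_odds: float) -> float:
    """Expected profit of a 1-unit bet: a win pays (odds - 1), a loss costs 1."""
    return prob_win * (decimal_odds - 1.0) - (1.0 - prob_win)


# Hypothetical match: favorite priced at 1.50, outsider at 2.60.
market_a, market_b = implied_probabilities(1.50, 2.60)

# Hypothetical win probabilities for the favorite from three models.
p_a = ensemble_probability([0.62, 0.66, 0.64])

# Expected return of backing the favorite at the offered odds.
edge_a = expected_return_per_unit(p_a, 1.50)
```

With these made-up inputs the ensemble estimate sits close to the market-implied probability, and the expected return of backing the favorite is slightly negative once the bookmaker margin is accounted for. This matches the qualitative picture in the abstract, but it demonstrates nothing about the paper's data.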


Efficient Prediction of Structural and Electronic Properties of Hybrid 2D Materials Using DFT and Machine Learning

10.26434/chemrxiv.6254756.v1 ◽

2018 ◽

Author(s):

Sherif Tawfik ◽

Olexandr Isayev ◽

Catherine Stampfl ◽

Joseph Shapter ◽

David Winkler ◽

...

Keyword(s):

Machine Learning ◽

Band Gap ◽

Density Functional ◽

2D Materials ◽

Van Der Waals ◽

Building Blocks ◽

Machine Learning Techniques ◽

Interlayer Distance ◽

Computational Screening ◽

Wide Range

Materials constructed from different van der Waals two-dimensional (2D) heterostructures offer a wide range of benefits, but these systems have been little studied because of their experimental and computational complexity, and because of the very large number of possible combinations of 2D building blocks. The simulation of the interface between two different 2D materials is computationally challenging due to the lattice mismatch problem, which sometimes necessitates the creation of very large simulation cells for performing density-functional theory (DFT) calculations. Here we use a combination of DFT, linear regression and machine learning techniques in order to rapidly determine the interlayer distance between two different 2D heterostructures that are stacked in a bilayer heterostructure, as well as the band gap of the bilayer. Our work provides an excellent proof of concept by quickly and accurately predicting a structural property (the interlayer distance) and an electronic property (the band gap) for a large number of hybrid 2D materials. This work paves the way for rapid computational screening of the vast parameter space of van der Waals heterostructures to identify new hybrid materials with useful and interesting properties.


Recent Advances in Unmanned Aerial Vehicle Forest Remote Sensing—A Systematic Review. Part I: A General Framework

Forests ◽

10.3390/f12030327 ◽

2021 ◽

Vol 12 (3) ◽

pp. 327 ◽

Cited By ~ 1

Author(s):

Riccardo Dainelli ◽

Piero Toscano ◽

Salvatore Filippo Di Gennaro ◽

Alessandro Matese

Keyword(s):

Remote Sensing ◽

Unmanned Aerial Vehicle ◽

Relevant Information ◽

Forest Monitoring ◽

Machine Learning Techniques ◽

Planted Forests ◽

Systematic Analysis ◽

Aerial Photogrammetry ◽

Wide Range ◽

Aerial Vehicle

Natural, semi-natural, and planted forests are a key asset worldwide, providing a broad range of positive externalities. For sustainable forest planning and management, remote sensing (RS) platforms are rapidly going mainstream. In a framework where scientific production is growing exponentially, a systematic analysis of unmanned aerial vehicle (UAV)-based forestry research papers is of paramount importance to understand trends, overlaps and gaps. The present review is organized into two parts (Part I and Part II). Part II inspects specific technical issues regarding the application of UAV-RS in forestry, together with the pros and cons of different UAV solutions and activities where additional effort is needed, such as technology transfer. Part I systematically analyzes and discusses general aspects of applying UAV in natural, semi-natural and artificial forestry ecosystems in the recent peer-reviewed literature (2018–mid-2020). The specific goals are threefold: (i) create a carefully selected bibliographic dataset that other researchers can draw on for their scientific works; (ii) analyze general and recent trends in RS forest monitoring; (iii) reveal gaps in the general research framework where additional activity is needed. Through double-step filtering of research items found in the Web of Science search engine, the study gathers and analyzes a comprehensive dataset (226 articles). Papers have been categorized into six main topics, and the relevant information has been subsequently extracted. The strong points emerging from this study concern the wide range of topics in the forestry sector and in particular the retrieval of tree inventory parameters, often through Digital Aerial Photogrammetry (DAP), RGB sensors, and machine learning techniques. Nevertheless, challenges still exist regarding the promotion of UAV-RS in specific parts of the world, mostly in the tropical and equatorial forests. Much additional research is required for the full exploitation of hyperspectral sensors and for planning long-term monitoring.


Integration of image segmentation and fuzzy theory to improve the accuracy of damage detection areas in traffic accidents

Journal Of Big Data ◽

10.1186/s40537-021-00539-2 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Majid Amirfakhrian ◽

Mahboub Parhizkar

Keyword(s):

Machine Learning ◽

Image Processing ◽

Machine Vision ◽

Traffic Accidents ◽

Fuzzy Theory ◽

Machine Learning Techniques ◽

Industrial Setting ◽

Technological Advances ◽

Learning Techniques ◽

Wide Range

In the next decade, machine vision technology will have an enormous impact on industrial works because of the latest technological advances in this field. These advances are so significant that the use of this technology is now essential. Machine vision is the process of using a wide range of technologies and methods in providing automated inspections in an industrial setting based on imaging, process control, and robot guidance. One of the applications of machine vision is to diagnose traffic accidents. Moreover, car vision is utilized for detecting the amount of damage to vehicles during traffic accidents. In this article, using image processing and machine learning techniques, a new method is presented to improve the accuracy of detecting damaged areas in traffic accidents. Evaluating the proposed method and comparing it with previous works showed that the proposed method is more accurate in identifying damaged areas and it has a shorter execution time.


Near real-time air quality forecasts using the NASA GEOS model

10.5194/egusphere-egu21-13587 ◽

2021 ◽

Author(s):

K. Emma Knowland ◽

Christoph Keller ◽

Krzysztof Wargan ◽

Brad Weir ◽

Pamela Wales ◽

...

Keyword(s):

Machine Learning ◽

Air Quality ◽

Real Time ◽

Weather Forecasting ◽

Atmospheric Composition ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Wide Range ◽

Reactive Trace Gases ◽

Assimilation System

NASA's Global Modeling and Assimilation Office (GMAO) produces high-resolution global forecasts for weather, aerosols, and air quality. The NASA Global Earth Observing System (GEOS) model has been expanded to provide global near-real-time 5-day forecasts of atmospheric composition at unprecedented horizontal resolution of 0.25 degrees (~25 km). This composition forecast system (GEOS-CF) combines the operational GEOS weather forecasting model with the state-of-the-science GEOS-Chem chemistry module (version 12) to provide detailed analysis of a wide range of air pollutants such as ozone, carbon monoxide, nitrogen oxides, and fine particulate matter (PM2.5). Satellite observations are assimilated into the system for improved representation of weather and smoke. The assimilation system is being expanded to include chemically reactive trace gases. We discuss current capabilities of the GEOS Constituent Data Assimilation System (CoDAS) to improve atmospheric composition modeling and possible future directions, notably incorporating new observations (TROPOMI, geostationary satellites) and machine learning techniques. We show how machine learning techniques can be used to correct for sub-grid-scale variability, which further improves model estimates at a given observation site.


Tree‐Based Machine Learning to Identify and Understand Major Determinants for Stroke at the Neighborhood Level

Journal of the American Heart Association ◽

10.1161/jaha.120.016745 ◽

2020 ◽

Vol 9 (22) ◽

Cited By ~ 4

Author(s):

Liangyuan Hu ◽

Bian Liu ◽

Jiayi Ji ◽

Yan Li

Keyword(s):

Physical Activity ◽

Machine Learning ◽

Cardiovascular Disease ◽

Cardiovascular Health ◽

Black People ◽

The United States ◽

Machine Learning Techniques ◽

Data Set ◽

Wide Range ◽

Neighborhood Level

Background: Stroke is a major cardiovascular disease that causes significant health and economic burden in the United States. Neighborhood community‐based interventions have been shown to be both effective and cost‐effective in preventing cardiovascular disease. There is a dearth of robust studies identifying the key determinants of cardiovascular disease and the underlying effect mechanisms at the neighborhood level. We aim to contribute to the evidence base for neighborhood cardiovascular health research. Methods and Results: We created a new neighborhood health data set at the census tract level by integrating 4 types of potential predictors, including unhealthy behaviors, prevention measures, sociodemographic factors, and environmental measures from multiple data sources. We used 4 tree‐based machine learning techniques to identify the most critical neighborhood‐level factors in predicting the neighborhood‐level prevalence of stroke, and compared their predictive performance for variable selection. We further quantified the effects of the identified determinants on stroke prevalence using a Bayesian linear regression model. Of the 5 most important predictors identified by our method, higher prevalence of low physical activity, larger share of older adults, higher percentage of non‐Hispanic Black people, and higher ozone levels were associated with higher prevalence of stroke at the neighborhood level. Higher median household income was linked to lower prevalence. The most important interaction term showed an exacerbated adverse effect of aging and low physical activity on the neighborhood‐level prevalence of stroke. Conclusions: Tree‐based machine learning provides insights into underlying drivers of neighborhood cardiovascular health by discovering the most important determinants from a wide range of factors in an agnostic, data‐driven, and reproducible way. The identified major determinants and the interactive mechanism can be used to prioritize and allocate resources to optimize community‐level interventions for stroke prevention.


Proximal Methods for Plant Stress Detection Using Optical Sensors and Machine Learning

Biosensors ◽

10.3390/bios10120193 ◽

2020 ◽

Vol 10 (12) ◽

pp. 193

Author(s):

Alanna V. Zubler ◽

Jeong-Yeol Yoon

Keyword(s):

Machine Learning ◽

Optical Sensors ◽

Near Infrared ◽

Plant Stress ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Detection Methods ◽

Stress Detection ◽

Recent Advances ◽

Wide Range

Plant stresses have been monitored using the imaging or spectrometry of plant leaves in the visible (red-green-blue or RGB), near-infrared (NIR), infrared (IR), and ultraviolet (UV) wavebands, often augmented by fluorescence imaging or fluorescence spectrometry. Imaging at multiple specific wavelengths (multi-spectral imaging) or across a wide range of wavelengths (hyperspectral imaging) can provide exceptional information on plant stress and subsequent diseases. Digital cameras, thermal cameras, and optical filters have become available at a low cost in recent years, while hyperspectral cameras have become increasingly more compact and portable. Furthermore, smartphone cameras have dramatically improved in quality, making them a viable option for rapid, on-site stress detection. Due to these developments in imaging technology, plant stresses can be monitored more easily using handheld and field-deployable methods. Recent advances in machine learning algorithms have allowed for images and spectra to be analyzed and classified in a fully automated and reproducible manner, without the need for complicated image or spectrum analysis methods. This review will highlight recent advances in portable (including smartphone-based) detection methods for biotic and abiotic stresses, discuss data processing and machine learning techniques that can produce results for stress identification and classification, and suggest future directions towards the successful translation of these methods into practical use.


Deep Learning Prediction of Adverse Drug Reactions in Drug Discovery Using Open TG–GATEs and FAERS Databases

Frontiers in Drug Discovery ◽

10.3389/fddsv.2021.768792 ◽

2021 ◽

Vol 1 ◽

Author(s):

Attayeb Mohsen ◽

Lokesh P. Tripathi ◽

Kenji Mizuguchi

Keyword(s):

Machine Learning ◽

Drug Discovery ◽

Adverse Drug Reactions ◽

Predictive Models ◽

Prediction Models ◽

Expression Profiles ◽

Fine Tuning ◽

Machine Learning Techniques ◽

Drug Reactions ◽

Wide Range

Machine learning techniques are being increasingly used in the analysis of clinical and omics data. This increase is primarily due to the advancements in artificial intelligence (AI) and the build-up of health-related big data. In this paper we have aimed at estimating the likelihood of adverse drug reactions or events (ADRs) in the course of drug discovery using various machine learning methods. We have also described a novel machine learning-based framework for predicting the likelihood of ADRs. Our framework combines two distinct datasets, drug-induced gene expression profiles from Open TG–GATEs (Toxicogenomics Project–Genomics Assisted Toxicity Evaluation Systems) and ADR occurrence information from the FAERS (FDA [Food and Drug Administration] Adverse Events Reporting System) database, and can be applied to many different ADRs. It incorporates data filtering and cleaning as well as feature selection and hyperparameter fine-tuning. Using this framework with Deep Neural Networks (DNN), we built a total of 14 predictive models with a mean validation accuracy of 89.4%, indicating that our approach successfully and consistently predicted ADRs for a wide range of drugs. As case studies, we have investigated the performance of our prediction models in the context of Duodenal ulcer and Hepatitis fulminant, highlighting mechanistic insights into those ADRs. We have generated predictive models to help assess the likelihood of ADRs in testing novel pharmaceutical compounds. We believe that our findings offer a promising approach for ADR prediction and will be useful for researchers in drug discovery.


Classification of Macronutrient Deficiencies in Maize Plant Using Machine Learning

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v8i6.pp4197-4203 ◽

2018 ◽

Vol 8 (6) ◽

pp. 4197

Author(s):

Leena N ◽

K. K. Saju

Keyword(s):

Machine Learning ◽

Feature Extraction ◽

Nutrient Management ◽

Management Strategies ◽

Nutrient Deficiency ◽

Crop Productivity ◽

Machine Learning Techniques ◽

Nutritional Deficiencies ◽

Feature Sets ◽

Learning Techniques

Detection of nutritional deficiencies in plants is vital for improving crop productivity. Timely identification of nutrient deficiency through visual symptoms in the plants can help farmers take quick corrective action by appropriate nutrient management strategies. The application of computer vision and machine learning techniques offers new prospects in non-destructive field-based analysis for nutrient deficiency. Color and shape are important parameters in feature extraction. In this work, two different techniques are used for image segmentation and feature extraction to generate two different feature sets from the same image sets. These are then used for classification using different machine learning techniques. The experimental results are analyzed and compared in terms of classification accuracy to find the best algorithm for the two feature sets.


Machine Learning: A Review

Semiconductor Science and Information Devices ◽

10.30564/ssid.v2i2.1931 ◽

2020 ◽

Vol 2 (2) ◽

Author(s):

Isonkobong Christopher Udousoro

Keyword(s):

Machine Learning ◽

Image Processing ◽

Data Interpretation ◽

Relevant Information ◽

Machine Learning Techniques ◽

Predictive Analysis ◽

Learning Approaches ◽

Processing Data ◽

Learning Techniques ◽

Machine Learning Applications

Due to the complexity of data, interpretation of patterns or extraction of information becomes difficult; therefore, machine learning is used to teach machines how to handle data more efficiently. With the growth of datasets, various organizations now apply machine learning applications and algorithms. Many industries apply machine learning to extract relevant information for analysis purposes. Many scholars, mathematicians and programmers have carried out research and applied several machine learning approaches in order to find solutions to problems. In this paper, we focus on a general review of machine learning, including various machine learning techniques. These techniques can be applied to different fields such as image processing, data mining, and predictive analysis. The paper aims at reviewing machine learning techniques and algorithms. The research methodology is based on qualitative analysis in which the literature on machine learning is reviewed.


Proposed Improvements for Automated Chemical Safety Evaluations Using In-Silico Techniques

10.20944/preprints202005.0408.v1 ◽

2020 ◽

Author(s):

Bryan Jordan

Keyword(s):

Machine Learning ◽

Drug Discovery ◽

Chemical Space ◽

Machine Learning Techniques ◽

Training Dataset ◽

Generative Adversarial Networks ◽

Chemical Safety ◽

Adversarial Networks ◽

Wide Range ◽

Traditional Drug

The vastness of chemical-space constrains traditional drug-discovery methods to the organic laws that are guiding the chemistry involved in filtering through candidates. Leveraging computing with machine-learning to intelligently generate compounds that meet a wide range of objectives can bring significant gains in time and effort needed to filter through a broad range of candidates. This paper details how the use of Generative-Adversarial-Networks, novel machine learning techniques to format the training dataset and the use of quantum computing offer new ways to expedite drug-discovery.
