Wineinformatics: Regression on the Grade and Price of Wines through Their Sensory Attributes

Fermentation ◽  
2018 ◽  
Vol 4 (4) ◽  
pp. 84 ◽  
Author(s):  
James Palmer ◽  
Bernard Chen

Wineinformatics is a field that uses machine-learning and data-mining techniques to glean useful information from wine. In this work, attributes extracted from a large dataset of over 100,000 wine reviews are used to make predictions on two variables: quality based on a “100-point scale”, and price per 750 mL bottle. These predictions were built using support vector regression. Several evaluation metrics were used for model evaluation. In addition, these regression models were compared to classification accuracies achieved in a prior work. When regression was used for classification, the results were somewhat poor; however, this was expected since the main purpose of the regression was not to classify the wines. Therefore, this paper also compares the advantages and disadvantages of both classification and regression. Regression models can successfully predict within a few points of the correct grade of a wine. On average, the model was only 1.6 points away from the actual grade and off by about $13 per bottle of wine. To the best of our knowledge, this is the first work to use a large-scale dataset of wine reviews to perform regression predictions on grade and price.
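The gap between a small regression error and a weaker classification result can be illustrated with a minimal sketch. It assumes a hypothetical 90-point cutoff separating two quality classes and uses made-up grades; neither the cutoff nor the data comes from the paper, and no support vector regression model is fitted here.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average absolute distance between predicted and actual values."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def threshold_accuracy(
    actual: Sequence[float], predicted: Sequence[float], cutoff: float = 90.0
) -> float:
    """Share of wines whose predicted grade lands on the same side of the
    cutoff as the actual grade. The 90-point cutoff is an assumption."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = sum((a >= cutoff) == (p >= cutoff) for a, p in zip(actual, predicted))
    return hits / len(actual)


# Hypothetical grades on the 100-point scale (illustrative only).
actual_grades = [88.0, 91.0, 93.0, 86.0]
predicted_grades = [89.5, 89.0, 92.0, 87.5]

mae = mean_absolute_error(actual_grades, predicted_grades)
accuracy = threshold_accuracy(actual_grades, predicted_grades)
print(f"MAE: {mae} points, class accuracy: {accuracy}")
```

In this toy data every prediction is within two points of the true grade, yet one wine near the cutoff is still placed in the wrong class, which is the kind of effect the abstract describes when regression output is reused for classification.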

2021 ◽  
Author(s):  
Lance F Merrick ◽  
Dennis N Lozada ◽  
Xianming Chen ◽  
Arron H Carter

Most genomic prediction models are linear regression models that assume continuous and normally distributed phenotypes, but responses to diseases such as stripe rust (caused by Puccinia striiformis f. sp. tritici) are commonly recorded in ordinal scales and percentages. Disease severity (SEV) and infection type (IT) data in germplasm screening nurseries generally do not follow these assumptions. In this regard, researchers may ignore the lack of normality, transform the phenotypes, use generalized linear models, or use supervised learning algorithms and classification models with no restriction on the distribution of response variables, which are less sensitive when modeling ordinal scores. The goal of this research was to compare classification and regression genomic selection models for skewed phenotypes using stripe rust SEV and IT in winter wheat. We extensively compared both regression and classification prediction models using two training populations composed of breeding lines phenotyped in four years (2016-2018, and 2020) and a diversity panel phenotyped in four years (2013-2016). The prediction models used 19,861 genotyping-by-sequencing single-nucleotide polymorphism markers. Overall, square root transformed phenotypes using rrBLUP and support vector machine regression models displayed the highest combination of accuracy and relative efficiency across the regression and classification models. Further, a classification system based on support vector machine and ordinal Bayesian models with a 2-Class scale for SEV reached the highest class accuracy of 0.99. This study showed that breeders can use linear and non-parametric regression models within their own breeding lines over combined years to accurately predict skewed phenotypes.
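As a rough illustration of why a square root transformation helps with skewed phenotypes, the sketch below compares the sample skewness of some hypothetical severity percentages before and after the transform. The data values and the moment-based skewness estimator are assumptions made for illustration; no genomic prediction model (rrBLUP or otherwise) is fitted.

```python
import math
from typing import Sequence


def skewness(values: Sequence[float]) -> float:
    """Moment-based sample skewness: m3 / m2 ** 1.5 (population moments)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical disease severity scores in percent (illustrative only):
# most lines show little disease, a few are heavily infected.
severity = [0.0, 0.0, 5.0, 5.0, 10.0, 20.0, 80.0]

# Square root transform applied before fitting a regression model.
severity_sqrt = [math.sqrt(v) for v in severity]

# Predictions made on the transformed scale are squared to return to percent.
back_transformed = [v ** 2 for v in severity_sqrt]

print(f"skewness before: {skewness(severity):.2f}")
print(f"skewness after:  {skewness(severity_sqrt):.2f}")
```

The transform compresses the long right tail, so the transformed values sit closer to the symmetric distribution that linear models assume.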


Author(s):  
Sahil Sangani

Data generated in the past few years cannot be handled efficiently with traditional storage techniques because the datasets are large-scale and may be structured, semi-structured, or unstructured. To deal with such enormous datasets, the Hadoop framework is used, which supports the processing of large datasets in a distributed computing environment. Hadoop uses a technique named MapReduce for processing and generating large datasets with a parallel, distributed algorithm on a cluster. It automatically handles failures and data loss thanks to its fault-tolerance property. The scheduler is a pluggable component of the MapReduce framework, and Hadoop MapReduce uses various schedulers according to the requirements of the task. FIFO (First In First Out) is the default algorithm used by Hadoop, in which jobs are executed in the order of their arrival. This paper discusses a range of schedulers, including FIFO, Capacity Scheduler, LATE Scheduler, Fair Scheduler, Delay Scheduler, Deadline Constraint Scheduler, and Resource Aware Scheduler. Besides these schedulers, we also present a comparative study of schedulers such as Round Robin, Weighted Round Robin, Self-adaptive Reduce Scheduling (SARS), Self-adaptive MapReduce Scheduling (SAMR), Dynamic Priority Scheduling, Learning Scheduling, Classification & Optimization-based Scheduler (COSHH), Network-Aware, Match-matching, and Energy-Aware Scheduler. We hope this study will improve the understanding of these schedulers and help other developers and users make well-informed decisions for their specific research interests.
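The FIFO policy described above can be sketched as a toy simulation. This is not Hadoop code: the `Job` type, the single execution slot, and the job names and durations are all hypothetical, chosen only to show how jobs run strictly in arrival order.

```python
from collections import deque
from typing import List, NamedTuple, Tuple


class Job(NamedTuple):
    name: str
    arrival: int   # time at which the job is submitted
    duration: int  # time the job needs once it starts


def fifo_schedule(jobs: List[Job]) -> List[Tuple[str, int, int]]:
    """Run jobs one at a time in arrival order on a single slot.

    Returns (name, start, finish) for each job in execution order.
    """
    queue = deque(sorted(jobs, key=lambda job: job.arrival))
    clock = 0
    timeline = []
    while queue:
        job = queue.popleft()
        start = max(clock, job.arrival)  # wait if the slot is still busy
        finish = start + job.duration
        timeline.append((job.name, start, finish))
        clock = finish
    return timeline


# Hypothetical workload: one long job followed by two short ones.
jobs = [Job("etl", 0, 5), Job("report", 1, 2), Job("index", 2, 1)]
timeline = fifo_schedule(jobs)
for name, start, finish in timeline:
    print(f"{name}: start={start}, finish={finish}")
```

In this workload the one-unit "index" job finishes at time 8 because it queues behind the long "etl" job. That waiting behavior is a common motivation for alternatives such as the Fair and Capacity schedulers listed in the abstract.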


Sensors ◽  
2019 ◽  
Vol 19 (9) ◽  
pp. 2040 ◽  
Author(s):  
Antoine d’Acremont ◽  
Ronan Fablet ◽  
Alexandre Baussard ◽  
Guillaume Quin

Convolutional neural networks (CNNs) have rapidly become the state-of-the-art models for image classification applications. They usually require large groundtruthed datasets for training. Here, we address object identification and recognition in the wild for infrared (IR) imaging in defense applications, where no such large-scale dataset is available. With a focus on robustness issues, especially viewpoint invariance, we introduce a compact and fully convolutional CNN architecture with global average pooling. We show that this model, trained on realistic simulation datasets, reaches state-of-the-art performance compared with other CNNs without data augmentation or fine-tuning steps. We also demonstrate a significant improvement in the robustness to viewpoint changes with respect to an operational support vector machine (SVM)-based scheme.
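Global average pooling, mentioned above, reduces each feature map to a single number by averaging over its spatial positions. The sketch below shows only that operation on plain nested lists; the feature-map values are made up, and it does not reproduce the authors' architecture.

```python
from typing import List

FeatureMap = List[List[float]]  # one channel: rows of activations


def global_average_pool(feature_maps: List[FeatureMap]) -> List[float]:
    """Return one mean activation per channel, whatever the spatial size."""
    pooled = []
    for channel in feature_maps:
        values = [v for row in channel for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


# Two hypothetical inputs with the same number of channels (2)
# but different spatial sizes (2x2 and 3x3).
small = [
    [[1.0, 3.0], [5.0, 7.0]],
    [[0.0, 0.0], [0.0, 8.0]],
]
large = [
    [[4.0, 4.0, 4.0], [4.0, 4.0, 4.0], [4.0, 4.0, 4.0]],
    [[2.0, 2.0, 2.0], [2.0, 2.0, 2.0], [2.0, 2.0, 2.0]],
]

print(global_average_pool(small))
print(global_average_pool(large))
```

Both inputs yield a vector with one entry per channel, which is why a fully convolutional network ending in global average pooling can accept images of varying size without a fixed-size dense layer.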


Sensors ◽  
2019 ◽  
Vol 19 (18) ◽  
pp. 4008 ◽  
Author(s):  
Henry Griffith ◽  
Yan Shi ◽  
Subir Biswas

Various sensors have been proposed to address the negative health ramifications of inadequate fluid consumption. Amongst these solutions, motion-based sensors estimate fluid intake using the characteristics of drinking kinematics. This sensing approach is complicated due to the mutual influence of both the drink volume and the current fill level on the resulting motion pattern, along with differences in biomechanics across individuals. While motion-based strategies are a promising approach due to the proliferation of inertial sensors, previous studies have been characterized by limited accuracy and substantial variability in performance across subjects. This research seeks to address these limitations for a container-attachable triaxial accelerometer sensor. Drink volume is computed using support vector machine regression models with hand-engineered features describing the container’s estimated inclination. Results are presented for a large-scale data collection consisting of 1908 drinks consumed from a refillable bottle by 84 individuals. Per-drink mean absolute percentage error is reduced by 11.05% versus previous state-of-the-art results for a single wrist-wearable inertial measurement unit (IMU) sensor assessed using a similar experimental protocol. Estimates of aggregate consumption are also improved versus previously reported results for an attachable sensor architecture. An alternative tracking approach using the fill level from which a drink is consumed is also explored herein. Fill level regression models are shown to exhibit improved accuracy and reduced inter-subject variability versus volume estimators. A technique for segmenting the entire drink motion sequence into transport and sip phases is also assessed, along with a multi-target framework for addressing the known interdependence of volume and fill level on the resulting drink motion signature.
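The two kinds of error mentioned above, per-drink and aggregate, can be made concrete with a small sketch. The volumes are hypothetical, and the aggregate measure used here (percentage error of the summed volume) is an assumed definition that may differ from the one in the paper.

```python
from typing import Sequence


def mean_absolute_percentage_error(
    actual: Sequence[float], estimated: Sequence[float]
) -> float:
    """Per-drink MAPE in percent; actual volumes must be non-zero."""
    if not actual or len(actual) != len(estimated):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - e) / abs(a) for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)


def aggregate_percentage_error(
    actual: Sequence[float], estimated: Sequence[float]
) -> float:
    """Percentage error of total consumption (assumed definition)."""
    return 100.0 * abs(sum(estimated) - sum(actual)) / sum(actual)


# Hypothetical drink volumes in millilitres (illustrative only).
actual_ml = [50.0, 100.0, 200.0]
estimated_ml = [45.0, 110.0, 200.0]

print(f"per-drink MAPE:  {mean_absolute_percentage_error(actual_ml, estimated_ml):.2f}%")
print(f"aggregate error: {aggregate_percentage_error(actual_ml, estimated_ml):.2f}%")
```

Because over- and under-estimates partly cancel when volumes are summed, the aggregate error in this toy example is smaller than the per-drink error, which is one reason the two are reported separately.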


Author(s):  
Ramin Saedi ◽  
Rajat Verma ◽  
Ali Zockaie ◽  
Mehrnaz Ghamami ◽  
Timothy J. Gates

Estimation of vehicular emissions at network level is a prominent issue in transportation planning and management of urban areas. For large networks, macroscopic emission models are preferred because of their simplicity. However, these models do not consider traffic flow dynamics that significantly affect emissions production. This study proposes a network-level emission modeling framework based on the network-wide fundamental diagram (NFD), via integrating the NFD properties with an existing microscopic emission model. The NFD and microscopic emission models are estimated using microscopic and mesoscopic traffic simulation tools at different scales for various traffic compositions. The major contribution is to consider heterogeneous vehicle types with different emission generation rates in a network-level model. This framework is applied to the large-scale network of Chicago as well as its central business district. Non-linear and support vector regression models are developed using simulated trajectory data of 13 simulated scenarios. The results show a satisfactory calibration and successful validation with acceptable deviations from the underlying microscopic emissions model regardless of the simulation tool that is used to calibrate the network-level emissions model. The microscopic traffic simulation is appropriate for smaller networks, while mesoscopic traffic simulation is a proper means to calibrate models for larger networks. The proposed model is also used to demonstrate the relationship between macroscopic emissions and flow characteristics in the form of a network emissions diagram. The results of this study provide a tool for planners to analyze vehicular emissions in real time and find optimal policies to control the level of emissions in large cities.


Author(s):  
Stefano Vassanelli

Establishing direct communication with the brain through physical interfaces is a fundamental strategy to investigate brain function. Starting with the patch-clamp technique in the seventies, neuroscience has moved from detailed characterization of ionic channels to the analysis of single neurons and, more recently, microcircuits in brain neuronal networks. Development of new biohybrid probes with electrodes for recording and stimulating neurons in the living animal is a natural consequence of this trend. The recent introduction of optogenetic stimulation and advanced high-resolution large-scale electrical recording approaches demonstrates this need. Brain implants for real-time neurophysiology are also opening new avenues for neuroprosthetics to restore brain function after injury or in neurological disorders. This chapter provides an overview on existing and emergent neurophysiology technologies with particular focus on those intended to interface neuronal microcircuits in vivo. Chemical, electrical, and optogenetic-based interfaces are presented, with an analysis of advantages and disadvantages of the different technical approaches.


Author(s):  
Jin Zhou ◽  
Qing Zhang ◽  
Jian-Hao Fan ◽  
Wei Sun ◽  
Wei-Shi Zheng

Recent image aesthetic assessment methods have achieved remarkable progress due to the emergence of deep convolutional neural networks (CNNs). However, these methods focus primarily on predicting the generally perceived preference for an image, which usually limits their practicability, since each user may have completely different preferences for the same image. To address this problem, this paper presents a novel approach for predicting personalized image aesthetics that fit an individual user’s personal taste. We achieve this in a coarse-to-fine manner, by joint regression and learning from pairwise rankings. Specifically, we first collect a small subset of personal images from a user and invite him/her to rank the preference of some randomly sampled image pairs. We then search for the K-nearest neighbors of the personal images within a large-scale dataset labeled with average human aesthetic scores, and use these images as well as the associated scores to train a generic aesthetic assessment model by CNN-based regression. Next, we fine-tune the generic model to accommodate the personal preference by training over the rankings with a pairwise hinge loss. Experiments demonstrate that our method can effectively learn personalized image aesthetic preferences, clearly outperforming state-of-the-art methods. Moreover, we show that the learned personalized image aesthetics benefit a wide variety of applications.
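A pairwise hinge loss of the kind referred to above penalizes the model whenever the image a user prefers does not outscore the other image by at least some margin. The sketch below is a generic formulation with an assumed margin of 1.0 and made-up scores; the exact loss and margin used in the paper are not given in the abstract.

```python
from typing import Sequence, Tuple


def pairwise_hinge_loss(
    score_preferred: float, score_other: float, margin: float = 1.0
) -> float:
    """Zero when the preferred image wins by at least `margin`,
    otherwise grows linearly with the shortfall. The margin is an assumption."""
    return max(0.0, margin - (score_preferred - score_other))


def mean_ranking_loss(pairs: Sequence[Tuple[float, float]], margin: float = 1.0) -> float:
    """Average hinge loss over (preferred_score, other_score) pairs."""
    return sum(pairwise_hinge_loss(p, o, margin) for p, o in pairs) / len(pairs)


# Hypothetical model scores for three ranked image pairs.
pairs = [
    (3.0, 1.0),  # correct order, wide gap   -> loss 0.0
    (2.0, 1.5),  # correct order, narrow gap -> loss 0.5
    (1.0, 3.0),  # wrong order               -> loss 3.0
]

for preferred, other in pairs:
    print(preferred, other, pairwise_hinge_loss(preferred, other))
print("mean loss:", mean_ranking_loss(pairs))
```

Only pairs that violate the margin contribute to the loss, so fine-tuning with it leaves already well-ordered pairs untouched and adjusts the model where it disagrees with the user's rankings.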

