Conceptual K-Means Algorithm Based on Complex Features

Missing value or behaviour: how to increase the signal of social media data

METRON ◽

10.1007/s40300-021-00216-7 ◽

2021 ◽

Author(s):

Paolo Mariani ◽

Andrea Marletta

Keyword(s):

Social Media ◽

Missing Data ◽

Everyday Life ◽

Processing Technique ◽

Missing Value ◽

Social Media Data ◽

Practical Strategy ◽

Specific Behaviour ◽

Complex Features ◽

Media Data

Abstract: Social media has become a widespread element of people’s everyday life, used to communicate and to generate content. Among the several ways to express a reaction to social media content, “Likes” are critical. Indeed, they convey preferences, which drive existing markets or allow the creation of new ones. Nevertheless, appreciation indicators have some complex features, for example the interpretation of the absence of “Likes”. In this case, the lack of approval may be considered a specific behaviour. The present study aimed to determine whether the absence of Likes may indicate the presence of a specific behaviour, by contextualizing the treatment of missing data in real cases. We provided a practical strategy for extracting more knowledge from social media data, whose synthesis raises several measurement problems. We proposed an approach based on the disambiguation of missing data into two modalities: “Dislike” and “Nothing”. Finally, a data pre-processing technique was suggested to increase the signal of social media data.

SMPLIP-Score: predicting ligand binding affinity from simple and interpretable on-the-fly interaction fingerprint pattern descriptors

Journal of Cheminformatics ◽

10.1186/s13321-021-00507-1 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Surendra Kumar ◽

Mi-hyun Kim

Keyword(s):

Ligand Binding ◽

Binding Affinity ◽

Scoring Functions ◽

Binding Affinities ◽

Ligand Interaction ◽

Fingerprint Pattern ◽

Comparable Performance ◽

Direct Interpretation ◽

Benchmark Datasets ◽

Complex Features

Abstract: In drug discovery, rapid and accurate prediction of protein–ligand binding affinities is a pivotal task for lead optimization with acceptable on-target potency as well as pharmacological efficacy. Furthermore, researchers hope for a high correlation between docking score and pose with key interactive residues, although scoring functions as free energy surrogates of protein–ligand complexes have failed to provide collinearity. Recently, various machine learning or deep learning methods have been proposed to overcome the drawbacks of scoring functions. Despite being highly accurate, their featurization process is complex and the meaning of the embedded features cannot be interpreted directly by a human without an additional feature analysis. Here, we propose SMPLIP-Score (Substructural Molecular and Protein–Ligand Interaction Pattern Score), a directly interpretable predictor of absolute binding affinity. Our simple featurization embeds the interaction fingerprint pattern on the ligand-binding site environment and molecular fragments of ligands into an input vectorized matrix for learning layers (random forest or deep neural network). Despite using less complex features than other state-of-the-art models, SMPLIP-Score achieved comparable performance, a Pearson’s correlation coefficient up to 0.80, and a root mean square error up to 1.18 in pK units with several benchmark datasets (PDBbind v.2015, Astex Diverse Set, CSAR NRC HiQ, FEP, PDBbind NMR, and CASF-2016). For this model, generality, predictive power, ranking power, and robustness were examined using direct interpretation of feature matrices for specific targets.
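The two figures of merit quoted in the abstract above, Pearson's correlation coefficient and root mean square error, can be computed as in the following minimal sketch. It is not the authors' evaluation code; the function names and the pK values are made up purely for illustration.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def rmse(predicted, observed):
    """Root mean square error, in the same units as the inputs."""
    n = len(predicted)
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed)) / n)


# Hypothetical experimental and predicted affinities in pK units
# (illustrative numbers only, not taken from any benchmark).
experimental_pk = [4.0, 5.5, 6.2, 7.1, 8.3]
predicted_pk = [4.6, 5.1, 6.8, 6.9, 7.7]

print("Pearson r:", pearson_r(predicted_pk, experimental_pk))
print("RMSE (pK units):", rmse(predicted_pk, experimental_pk))
```

A higher Pearson's r reflects better agreement in trend between predicted and measured affinities, while a lower RMSE reflects smaller absolute errors, so the two are usually reported together.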

A Lightweight Formalism for Reference Lifetimes and Borrowing in Rust

ACM Transactions on Programming Languages and Systems ◽

10.1145/3443420 ◽

2021 ◽

Vol 43 (1) ◽

pp. 1-73

Author(s):

David J. Pearce

Keyword(s):

Memory Management ◽

Programming Model ◽

Type System ◽

Control Flow ◽

Two Phase ◽

Type Checking ◽

Strong Focus ◽

Reference Implementation ◽

Two Phases ◽

Complex Features

Rust is a relatively new programming language that has gained significant traction since its v1.0 release in 2015. Rust aims to be a systems language that competes with C/C++. A claimed advantage of Rust is a strong focus on memory safety without garbage collection. This is primarily achieved through two concepts, namely, reference lifetimes and borrowing. Both of these are well-known ideas stemming from the literature on region-based memory management and linearity/uniqueness. Rust brings both of these ideas together to form a coherent programming model. Furthermore, Rust has a strong focus on stack-allocated data and, like C/C++ but unlike Java, permits references to local variables. Type checking in Rust can be viewed as a two-phase process: First, a traditional type checker operates in a flow-insensitive fashion; second, a borrow checker enforces an ownership invariant using a flow-sensitive analysis. In this article, we present a lightweight formalism that captures these two phases using a flow-sensitive type system that enforces “type and borrow safety.” In particular, programs that are type and borrow safe will not attempt to dereference dangling pointers. Our calculus core captures many aspects of Rust, including copy- and move-semantics, mutable borrowing, reborrowing, partial moves, and lifetimes. In particular, it remains sufficiently lightweight to be easily digested and understood and, we argue, still captures the salient aspects of reference lifetimes and borrowing. Furthermore, extensions to the core can easily add more complex features (e.g., control-flow, tuples, method invocation). We provide a soundness proof to verify our key claims of the calculus. We also provide a reference implementation in Java with which we have model checked our calculus using over 500B input programs. We have also fuzz tested the Rust compiler using our calculus against 2B programs and, to date, found one confirmed compiler bug and several other possible issues.

Complex features of the photoluminescence from ZnO nanorods grown by vapor-phase transport method

Materials Science in Semiconductor Processing ◽

10.1016/j.mssp.2021.105783 ◽

2021 ◽

Vol 128 ◽

pp. 105783

Author(s):

R.R. Jalolov ◽

Sh.Z. Urolov ◽

Z.Sh. Shaymardanov ◽

S.S. Kurbanov ◽

B.N. Rustamova

Keyword(s):

Vapor Phase ◽

Zno Nanorods ◽

Vapor Phase Transport ◽

Vapor Phase Transport Method ◽

Complex Features ◽

Phase Transport ◽

Transport Method

Tomato Leaf Disease Diagnosis Based on Improved Convolution Neural Network by Attention Module

Agriculture ◽

10.3390/agriculture11070651 ◽

2021 ◽

Vol 11 (7) ◽

pp. 651

Author(s):

Shengyi Zhao ◽

Yun Peng ◽

Jizhan Liu ◽

Shuo Wu

Keyword(s):

Neural Network ◽

High Performance ◽

Model Comparison ◽

Research Direction ◽

Disease Diagnosis ◽

Tomato Leaf ◽

Identification Accuracy ◽

Main Research ◽

Proposed Model ◽

Complex Features

Crop disease diagnosis is of great significance to crop yield and agricultural production. Deep learning methods have become the main research direction for diagnosing crop diseases. This paper proposes a deep convolutional neural network that integrates an attention mechanism and can better adapt to the diagnosis of a variety of tomato leaf diseases. The network structure mainly includes residual blocks and attention extraction modules. The model can accurately extract complex features of various diseases. Extensive comparative experiments show that the proposed model achieves an average identification accuracy of 96.81% on the tomato leaf diseases dataset. This shows that the model has significant advantages in terms of network complexity and real-time performance compared with other models. Moreover, in a model comparison experiment on the public grape leaf diseases dataset, the proposed model also achieves better results, with an average identification accuracy of 99.24%. This confirms that adding the attention module extracts the complex features of a variety of diseases more accurately while using fewer parameters. The proposed model provides a high-performance solution for crop diagnosis in a real agricultural environment.

Unsupervised Anomaly Detection with Distillated Teacher-Student Network Ensemble

Entropy ◽

10.3390/e23020201 ◽

2021 ◽

Vol 23 (2) ◽

pp. 201

Author(s):

Qinfeng Xiao ◽

Jing Wang ◽

Youfang Lin ◽

Wenbo Gongsa ◽

Ganghui Hu ◽

...

Keyword(s):

Anomaly Detection ◽

Multivariate Data ◽

Failure Detection ◽

Superior Performance ◽

Detection Algorithms ◽

Teacher Student ◽

Model Complex ◽

Unsupervised Anomaly Detection ◽

Real World Datasets ◽

Complex Features

We address the problem of unsupervised anomaly detection for multivariate data. Traditional machine learning based anomaly detection algorithms rely on specific assumptions about normal patterns and fail to model complex feature interactions and relations. Recent deep learning based methods are promising for extracting representations from complex features. These methods train an auxiliary task, e.g., reconstruction or prediction, on normal samples. They further assume that anomalies perform poorly on the auxiliary task, since the model is never trained on them during optimization. However, this assumption does not always hold in practice. Deep models may also perform the auxiliary task well on anomalous samples, so that anomalies go undetected. To effectively detect anomalies in multivariate data, this paper introduces a teacher-student distillation based framework, Distillated Teacher-Student Network Ensemble (DTSNE). The teacher-student distillation paradigm is able to deal with high-dimensional complex features. In addition, an ensemble of student networks is better able to avoid generalizing the auxiliary task performance to anomalous samples. To validate the effectiveness of our model, we conduct extensive experiments on real-world datasets. Experimental results show superior performance of DTSNE over competing methods. Analysis and discussion of the behavior of our model are also provided in the experiment section.
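The general idea of scoring anomalies by teacher-student disagreement can be illustrated with a toy sketch. This is not DTSNE: the abstract does not give the architecture or the scoring rule, so everything here is an assumption. The teacher is a fixed `tanh` function standing in for a frozen network, the students are one-dimensional least-squares lines fitted on bootstrap samples of normal data, and the score is the mean squared teacher-student gap.

```python
import math
import random


def teacher(x):
    # Hypothetical stand-in for a frozen, pre-trained teacher network.
    return math.tanh(x)


def fit_linear_student(xs):
    # Fit a least-squares line that mimics the teacher on the given samples.
    ys = [teacher(x) for x in xs]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


def train_ensemble(normal_xs, n_students=5, seed=0):
    # Each student sees a different bootstrap resample of the normal data.
    rng = random.Random(seed)
    students = []
    for _ in range(n_students):
        sample = [rng.choice(normal_xs) for _ in normal_xs]
        students.append(fit_linear_student(sample))
    return students


def anomaly_score(x, students):
    # Mean squared disagreement between the teacher and every student.
    errors = [(teacher(x) - (slope * x + intercept)) ** 2
              for slope, intercept in students]
    return sum(errors) / len(errors)


# Normal samples lie in [-0.5, 0.5]; a point at 4.0 lies far outside them.
normal_xs = [i / 100 - 0.5 for i in range(101)]
students = train_ensemble(normal_xs)
score_normal = anomaly_score(0.2, students)
score_anomaly = anomaly_score(4.0, students)
print(score_normal, score_anomaly)
```

Because the students only learned to imitate the teacher on normal inputs, their predictions drift away from the teacher outside that region, which is what makes the disagreement usable as an anomaly score in this toy setting.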

Local and Global Dynamics in a Discrete Time Growth Model with Nonconcave Production Function

Discrete Dynamics in Nature and Society ◽

10.1155/2012/536570 ◽

2012 ◽

Vol 2012 ◽

pp. 1-22 ◽

Cited By ~ 13

Author(s):

Serena Brianzoni ◽

Cristiana Mammana ◽

Elisabetta Michetti

Keyword(s):

Production Function ◽

Growth Model ◽

Discrete Time ◽

Production Functions ◽

Elasticity Of Substitution ◽

Global Dynamics ◽

Production Factors ◽

Local And Global Dynamics ◽

Complex Features

We study the dynamics shown by the discrete time neoclassical one-sector growth model with differential savings while assuming a nonconcave production function. We prove that the complex features exhibited are related both to the structure of the coexisting attractors and to their basins. We also show that complexity emerges if the elasticity of substitution between production factors is low enough and shareholders save more than workers, confirming the results obtained while considering concave production functions.

The MPS Reconstruction of Porous Media Using Multiple-Grid Templates

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.462-463.462 ◽

2013 ◽

Vol 462-463 ◽

pp. 462-465 ◽

Cited By ~ 1

Author(s):

Yi Du ◽

Ting Zhang

Keyword(s):

Porous Media ◽

Large Scale ◽

Grid Method ◽

Training Image ◽

Multiple Point ◽

Grid Data ◽

Large Scale Structures ◽

Complex Features ◽

Grid Nodes ◽

Data Template

In the reconstruction of porous media, it is difficult to recover unknown information from only sparse known data. Multiple-point geostatistics (MPS) has proven to be a powerful tool for capturing curvilinear structures or complex features in training images. The multiple-grid method provides one way to capture large-scale structures while keeping a data template with a reasonably small number of grid nodes. This method consists of scanning a training image using increasingly finer multiple-grid data templates instead of a single big and dense data template. The experimental results demonstrate that multiple-grid data templates and MPS are practical in porous media reconstruction.
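The multiple-grid idea described above can be sketched as follows. The abstract does not state the grid spacing, so this sketch assumes the conventional choice in which the template on grid level g has its nodes spaced 2^(g-1) cells apart; the function names are hypothetical.

```python
def base_template(radius):
    """Offsets of a small square data template, excluding the central node."""
    return [(dx, dy)
            for dy in range(-radius, radius + 1)
            for dx in range(-radius, radius + 1)
            if (dx, dy) != (0, 0)]


def multigrid_template(offsets, level):
    """Expand a template for a given grid level (level 1 is the finest).

    Assumed spacing rule: nodes on level g are 2**(g - 1) cells apart, so the
    node count stays fixed while the spatial footprint grows.
    """
    step = 2 ** (level - 1)
    return [(dx * step, dy * step) for dx, dy in offsets]


def coarse_to_fine_templates(offsets, n_levels):
    """Templates ordered from the coarsest grid down to the finest."""
    return [multigrid_template(offsets, level)
            for level in range(n_levels, 0, -1)]


templates = coarse_to_fine_templates(base_template(1), n_levels=3)
for level, template in zip(range(3, 0, -1), templates):
    reach = max(abs(dx) for dx, _ in template)
    print(f"level {level}: {len(template)} nodes, reach {reach} cells")
```

Every level uses the same eight nodes, but the coarsest template reaches four cells from the centre, which is how large-scale structures can be scanned without a big and dense template.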

Materials Matter: An Exploration of Text Complexity and Its Effects on Middle School Readers' Comprehension Processing

Language Speech and Hearing Services in Schools ◽

10.1044/2021_lshss-20-00117 ◽

2021 ◽

pp. 1-15

Author(s):

Amanda C. Dahl ◽

Sarah E. Carlson ◽

Maggie Renken ◽

Kathryn S. McCarthy ◽

Erin Reynolds

Keyword(s):

Middle School ◽

Science Classroom ◽

Deep Level ◽

Informational Text ◽

Think Aloud ◽

Text Complexity ◽

Science Texts ◽

Complex Syntax ◽

Abstract Words ◽

Complex Features

Purpose: Complex features of science texts present idiosyncratic challenges for middle grade readers, especially in a post–Common Core educational world where students' learning is dependent on understanding informational text. The primary aim of this study was to explore how middle school readers process science texts and whether such comprehension processes differed due to features of complexity in two science texts. Method: Thirty 7th grade students read two science texts with different profiles of text complexity in a think-aloud task. Think-aloud protocols were coded for six comprehension processes: connecting inferences, elaborative inferences, evaluative comments, paraphrases, metacognitive comments, and associations. We analyzed the quantity and type of comprehension processes generated across both texts in order to explore how features of text complexity contributed to the comprehension processes students produced while reading. Results: Students made significantly more elaborative and connecting inferences when reading a text with deep cohesion, simple syntax, and concrete words, while students made more evaluative comments, paraphrases, and metacognitive comments when reading a text with referential cohesion, complex syntax, and abstract words. Conclusions: The current study provides exploratory evidence that features of text complexity affect the type of comprehension processes middle school readers generate while reading science texts. Accordingly, science classroom texts and materials can be evaluated for word, sentence, and passage features of text complexity in order to encourage deep-level comprehension in middle school readers.

Comparisons between high-resolution profiles of squared refractive index gradient M² measured by the Middle and Upper Atmosphere Radar and unmanned aerial vehicles (UAVs) during the Shigaraki UAV-Radar Experiment 2015 campaign

Annales Geophysicae ◽

10.5194/angeo-35-423-2017 ◽

2017 ◽

Vol 35 (3) ◽

pp. 423-441 ◽

Cited By ~ 5

Author(s):

Hubert Luce ◽

Lakshmi Kantha ◽

Hiroyuki Hashiguchi ◽

Dale Lawrence ◽

Masanori Yabuki ◽

...

Keyword(s):

Refractive Index ◽

Convective Cloud ◽

Upper Atmosphere ◽

Refractive Index Gradient ◽

Very High Frequency ◽

Range Resolution ◽

Mu Radar ◽

Radar Echo ◽

Index Gradient ◽

Complex Features

Abstract. New comparisons between the square of the generalized potential refractive index gradient M², estimated from the very high-frequency (VHF) Middle and Upper Atmosphere (MU) Radar, located at Shigaraki, Japan, and unmanned aerial vehicle (UAV) measurements are presented. These comparisons were performed at unprecedented temporal and range resolutions (1–4 min and ∼20 m, respectively) in the altitude range ∼1.27–4.5 km from simultaneous and nearly collocated measurements made during the ShUREX (Shigaraki UAV-Radar Experiment) 2015 campaign. Seven consecutive UAV flights made during daytime on 7 June 2015 were used for this purpose. The MU Radar was operated in range imaging mode for improving the range resolution at vertical incidence (typically a few tens of meters). The proportionality of the radar echo power to M² is reported for the first time at such high time and range resolutions for stratified conditions for which Fresnel scatter or a reflection mechanism is expected. In more complex features obtained for a range of turbulent layers generated by shear instabilities or associated with convective cloud cells, M² estimated from UAV data does not reproduce observed radar echo power profiles. Proposed interpretations of this discrepancy are presented.