Self-Organising Map Approach to Individual Profiles: Age, Sex and Culture in Internet Dating

2006 · Vol 11 (1) · pp. 114-129
Author(s): Teemu Suna, Michael Hardey, Jouni Huhtinen, Yrjö Hiltunen, Kimmo Kaski, ...

A marked feature of recent developments in the networked society has been the growth in the number of people using Internet dating services. These services accumulate large amounts of personal information, which individuals use to find others and potentially arrange offline meetings. The resulting data represent a challenge to conventional analysis: the service that provided the data used in this paper had approximately 5,000 users, all of whom completed an extensive questionnaire yielding some 300 parameters. This creates an opportunity to apply innovative analytical techniques that may provide new sociological insights into complex data. In this paper we use the self-organising map (SOM), an unsupervised neural network methodology, to explore Internet dating data. The resulting visual maps demonstrate the ability of SOMs to reveal interrelated parameters. The SOM process brought out correlations that were obscured in the original data and pointed to the role of what we call ‘cultural age’ in the profiles and partnership preferences of the individuals. Our results suggest that the SOM approach offers a well-established methodology that can easily be applied to complex sociological data sets. The SOM outcomes are discussed in relation to other research on identifying others and forming relationships in a network society.
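As an illustration of the kind of analysis described here, the sketch below fits a self-organising map to a numerically encoded profile matrix using the third-party MiniSom package as one possible implementation; the data shape, grid size and training parameters are illustrative assumptions rather than the authors' settings.

```python
# Minimal SOM sketch for questionnaire-style profile data (assumptions noted above).
import numpy as np
from minisom import MiniSom

# Assume `profiles` is an (n_users x n_parameters) matrix of numerically encoded
# answers, e.g. roughly 5,000 x 300 as in the study; random placeholder data here.
rng = np.random.default_rng(0)
profiles = rng.random((5000, 300))

# Standardise each parameter so no single question dominates the map.
profiles = (profiles - profiles.mean(axis=0)) / profiles.std(axis=0)

som = MiniSom(x=20, y=20, input_len=profiles.shape[1],
              sigma=1.5, learning_rate=0.5, random_seed=42)
som.random_weights_init(profiles)
som.train_random(profiles, num_iteration=10000)

# Each user maps to a best-matching unit; inspecting how parameters (age,
# stated preferences, ...) distribute across units is what reveals the kind of
# interrelated parameters discussed in the abstract.
bmus = np.array([som.winner(p) for p in profiles])
weights = som.get_weights()   # (20, 20, 300) codebook vectors ("component planes")
```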

2007 · Vol 4 (3) · pp. 27-40
Author(s): Jan Taubert, Klaus Peter Sieren, Matthew Hindle, Berend Hoekman, Rainer Winnenburg, ...

A prerequisite for systems biology is the integration and analysis of heterogeneous experimental data stored in hundreds of life-science databases and millions of scientific publications. Several standardised formats exist for the exchange of specific kinds of biological information. Such exchange languages facilitate the integration process; however, they are not designed to transport integrated datasets. A format for exchanging integrated datasets needs to i) cover data from a broad range of application domains, ii) be flexible and extensible enough to combine many different complex data structures, iii) include metadata and semantic definitions, iv) include inferred information, v) identify the original data source for integrated entities and vi) transport large integrated datasets. Unfortunately, none of the exchange formats from the biological domain (e.g. BioPAX, MAGE-ML, PSI-MI, SBML) or the generic approaches (RDF, OWL) fulfil these requirements in a systematic way. We present OXL, a format for the exchange of integrated datasets, and detail how the aforementioned requirements are met within the OXL format. OXL is the native format of the data integration and text mining system ONDEX. Although OXL was developed with the ONDEX system in mind, it also has the potential to be used in several other biological and non-biological applications described in this paper.
Availability: The OXL format is an integral part of the ONDEX system, which is freely available under the GPL at http://ondex.sourceforge.net/. Sample files can be found at http://prdownloads.sourceforge.net/ondex/ and the XML Schema at http://ondex.svn.sf.net/viewvc/*checkout*/ondex/trunk/backend/data/xml/ondex.xsd.


2020 · Vol 16
Author(s): Mustafa Çelebier, Merve Nenni

Background: Metabolomics has gained importance in clinical applications over the last decade. Metabolomics studies are significant because the systemic metabolome is directly affected by disease conditions. Metabolome-based biomarkers are actively being developed for early diagnosis and for indicating the stage of specific diseases. Additionally, understanding the effect of an intervention on a living organism at the molecular level is a crucial strategy for uncovering novel or unexpected biological processes.
Results: Simultaneous improvements in advanced analytical techniques, sample preparation techniques, computer technology, and databank contents have enabled more valuable scientific information to be gained from metabolomics than ever before. With over 15,000 known endogenous metabolites, no single analytical technique is capable of analyzing the whole metabolome. However, capillary electrophoresis-mass spectrometry (CE-MS) is uniquely able to analyze an important portion of metabolites not accessible by liquid chromatography or gas chromatography techniques. The analytical capability of CE, combined with recent sample preparation techniques focused on extracting polar ionic compounds, makes CE-MS a highly suitable technique for metabolomic studies.
Conclusion: Here, previous reviews of CE-MS based metabolomics are evaluated to highlight recent improvements in this technique. Specifically, we review papers from the last two years (2018 and 2019) on CE-MS based metabolomics. The current situation and the challenges facing metabolomic studies are discussed to reveal the high potential of CE-MS for further studies, especially in biomarker development studies.


Information · 2021 · Vol 12 (5) · pp. 202
Author(s): Louai Alarabi, Saleh Basalamah, Abdeltawab Hendawi, Mohammed Abdalla

The rapid spread of infectious diseases is a major public health problem. Recent developments in fighting these diseases have heightened the need for a contact tracing process. Contact tracing can be considered an ideal method for controlling the transmission of infectious diseases. The result of the contact tracing process is diagnostic testing, treatment or self-isolation for suspected cases, and then treatment for infected persons; this eventually limits the spread of disease. This paper proposes a technique named TraceAll that traces all contacts exposed to an infected patient and produces a list of these contacts to be considered potentially infected. Initially, it treats the infected patient as the querying user and fetches the contacts exposed to that user. Secondly, it obtains all the trajectories that belong to objects that moved near the querying user. Next, it examines these trajectories, considering social distance and exposure period, to identify whether these objects have become infected or not. The experimental evaluation of the proposed technique with real data sets illustrates the effectiveness of this solution. Comparative experiments confirm that TraceAll outperforms baseline methods by 40% in terms of the efficiency of answering contact tracing queries.
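The query logic described above, starting from the infected user, gathering nearby trajectories, then testing social distance and exposure period, can be sketched roughly as follows; the thresholds, data layout and helper names are illustrative assumptions, not the authors' TraceAll implementation.

```python
# Sketch of a trajectory-based contact tracing check (illustrative assumptions only).
from dataclasses import dataclass
from math import hypot

@dataclass
class Point:
    t: float   # timestamp (seconds)
    x: float   # position (metres)
    y: float

def exposed(query_traj, other_traj, max_dist=2.0, min_exposure=900.0):
    """True if the two trajectories stayed within `max_dist` metres of each
    other for a cumulative `min_exposure` seconds."""
    # For brevity, assume both trajectories are sampled at the same timestamps
    # at a 1-second interval; a real system would interpolate or join on time.
    contact_time = 0.0
    for p, q in zip(query_traj, other_traj):
        if hypot(p.x - q.x, p.y - q.y) <= max_dist:
            contact_time += 1.0
    return contact_time >= min_exposure

def trace_all(query_traj, candidate_trajs):
    """Return the ids of all objects flagged as potentially infected."""
    return [oid for oid, traj in candidate_trajs.items()
            if exposed(query_traj, traj)]
```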


Geophysics · 2011 · Vol 76 (6) · pp. V115-V128
Author(s): Ning Wu, Yue Li, Baojun Yang

To remove surface waves from seismic records while preserving other seismic events of interest, we introduce a transform and a filter based on recent developments in image processing. The transform can be seen as a weighted Radon transform, in particular along linear trajectories. The weights in the transform are data dependent and designed to introduce large amplitude differences between surface waves and other events, so that surface waves can be separated by a simple amplitude threshold. This is a key property of the filter and distinguishes the approach from conventional ones that use information on moveout ranges to apply a mask in the transform domain. Initial experiments with synthetic records and field data demonstrate that, with appropriate parameters, the proposed trace transform filter performs better than conventional methods in terms of both surface-wave attenuation and reflected-signal preservation. Further experiments on larger data sets are needed to fully assess the method.
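A rough sketch of the underlying idea, a linear (slant-stack) Radon transform followed by an amplitude threshold in the transform domain, is given below. It omits the paper's data-dependent weighting, and the acquisition geometry, slowness range and threshold are illustrative assumptions.

```python
# Unweighted linear Radon (tau-p) transform plus amplitude-threshold separation.
import numpy as np

def linear_radon(data, dt, offsets, slownesses):
    """Slant-stack `data` (n_samples x n_traces) along linear trajectories."""
    n_t, n_x = data.shape
    model = np.zeros((n_t, len(slownesses)))
    for ip, p in enumerate(slownesses):
        for ix, x in enumerate(offsets):
            shift = int(round(p * x / dt))            # time shift in samples
            if 0 <= shift < n_t:
                model[: n_t - shift, ip] += data[shift:, ix]
    return model / n_x

def adjoint_radon(model, dt, offsets, slownesses, n_t):
    """Map a tau-p model back to the data domain (adjoint of linear_radon)."""
    data = np.zeros((n_t, len(offsets)))
    for ip, p in enumerate(slownesses):
        for ix, x in enumerate(offsets):
            shift = int(round(p * x / dt))
            if 0 <= shift < n_t:
                data[shift:, ix] += model[: n_t - shift, ip]
    return data / len(slownesses)

# Keep only transform-domain amplitudes above a simple threshold (assumed to
# capture the strong surface waves) and subtract their reconstruction.
dt, offsets = 0.004, np.arange(48) * 25.0
slownesses = np.linspace(0.0, 2.0e-3, 101)            # s/m (low apparent velocities)
gather = np.random.randn(1000, 48)                    # placeholder shot gather
m = linear_radon(gather, dt, offsets, slownesses)
mask = np.abs(m) > 3 * np.std(m)
surface_waves = adjoint_radon(m * mask, dt, offsets, slownesses, gather.shape[0])
filtered = gather - surface_waves
```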


Author(s): M. E. de Burgh, A. B. West, F. Jeal

The possibility that marine invertebrates might obtain part of their nutritional requirements by direct absorption of dissolved molecules through the epidermis has recently received considerable attention. This revival of interest in a field which had been virtually abandoned since the early part of the century was led by the findings of Stephens & Schinske (1957, 1958, 1961). Modern analytical techniques have revealed that the amount of dissolved nutrients in coastal waters is much greater than was formerly realized; total amino acids have been recorded in concentrations of up to 10⁻⁴ mole/litre in south-east Alaskan waters (Schell, 1974) and 7 × 10⁻⁵ mole/litre off Helgoland (Bohling, 1970). Direct absorption of amino acids has been conclusively established in several phyla (see reviews by Stephens, 1968, 1972), and one of the major aims of current research is to show that dissolved organic molecules taken up from available concentrations could be of nutritional significance. Recent developments concerning the possible roles of uptake in marine ecosystems have been reviewed by West, de Burgh & Jeal (1977).


2021 · Vol ahead-of-print (ahead-of-print)
Author(s): Jillian Carmody, Samir Shringarpure, Gerhard Van de Venter

Purpose: The purpose of this paper is to demonstrate privacy concerns arising from the rapidly increasing advancement and use of artificial intelligence (AI) technology and the challenges existing privacy regimes face in ensuring the ongoing protection of an individual's sensitive private information. The authors illustrate this through a case study of energy smart meters and suggest a novel combination of four solutions to strengthen privacy protection.
Design/methodology/approach: The authors illustrate how, through energy data obtained from smart meters, home energy providers can use AI to reveal private consumer information such as a household's electrical appliances and their time and frequency of usage, including the number and model of each appliance. The authors show how, thanks to advances in AI technologies, these data can further be combined with other data to infer sensitive personal information such as lifestyle and household income.
Findings: The authors highlight data protection and privacy concerns which are not immediately obvious to consumers, owing to the capabilities of advanced AI technology and its ability to extract sensitive personal information when applied to large, overlapping, granular data sets.
Social implications: The authors question the adequacy of existing privacy legislation to protect sensitive inferred consumer data from AI-driven technology. To address this, the authors suggest alternative solutions.
Originality/value: The original value of this paper is that it illustrates new privacy issues brought about by advances in AI, shows failings in current privacy legislation and its implementation, and opens the dialogue between stakeholders to protect vulnerable consumers.


MycoKeys · 2018 · Vol 39 · pp. 29-40
Author(s): Sten Anslan, R. Henrik Nilsson, Christian Wurzbacher, Petr Baldrian, Leho Tedersoo, ...

Along with recent developments in high-throughput sequencing (HTS) technologies, and the consequent rapid accumulation of HTS data, there has been a growing need and interest in developing tools for HTS data processing and communication. In particular, a number of bioinformatics tools have been designed for analysing metabarcoding data, each with specific features, assumptions and outputs. To evaluate the potential effect of applying different bioinformatics workflows on the results, we compared the performance of different analysis platforms on two contrasting high-throughput sequencing data sets. Our analysis revealed that the computation time, the quality of error filtering and hence the output of a specific bioinformatics process largely depend on the platform used. Our results show that none of the bioinformatics workflows appears to perfectly filter out the accumulated errors when generating Operational Taxonomic Units (OTUs), although PipeCraft, LotuS and PIPITS perform better than QIIME2 and Galaxy for the tested fungal amplicon dataset. We conclude that the output of each platform requires manual validation of the OTUs by examining the taxonomy assignment values.
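For readers unfamiliar with the error-filtering step whose stringency differs between the compared platforms, the following sketch shows one simple form of it, a mean-quality read filter written with Biopython; the cut-off and file names are illustrative assumptions and do not reproduce any of the compared pipelines.

```python
# Simple mean-Phred-quality filter for amplicon reads (illustrative only).
from Bio import SeqIO

def quality_filter(in_fastq, out_fastq, min_mean_phred=25):
    """Write reads whose mean Phred quality is at least `min_mean_phred`."""
    kept = (rec for rec in SeqIO.parse(in_fastq, "fastq")
            if sum(rec.letter_annotations["phred_quality"]) / len(rec) >= min_mean_phred)
    return SeqIO.write(kept, out_fastq, "fastq")   # returns number of reads written

# Example (hypothetical file names):
# n_kept = quality_filter("run1_amplicons.fastq", "run1_filtered.fastq")
```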


2006 · Vol 129 (1) · pp. 211-215
Author(s): John D. Fishburn

Within the current design codes for boilers, piping, and pressure vessels, there are many different equations for the required thickness of a cylindrical section under internal pressure. A reassessment of these various formulations, using the original data, is described, together with more recent developments in the state of the art. A single formula, which can be demonstrated to retain the same design margin in both the time-dependent and time-independent regimes, is shown to give the best correlation with the experimental data and is proposed for consideration for inclusion in the design codes.
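For context, the sketch below evaluates one typical code-style expression for the required wall thickness of a cylinder under internal pressure, the outside-diameter form t = P·Do / (2(SE + Py)) familiar from piping codes; it is given purely for illustration and is not necessarily the single unified formula proposed in the paper.

```python
# Illustrative code-style wall-thickness calculation (not the paper's formula).
def required_thickness(p, d_o, s, e=1.0, y=0.4):
    """Minimum wall thickness, in the same units as d_o.

    p   : internal design pressure
    d_o : outside diameter
    s   : allowable stress at design temperature
    e   : weld joint efficiency factor (assumed 1.0 for seamless pipe)
    y   : temperature-dependent coefficient (0.4 is a typical value in the
          time-independent regime)
    """
    return p * d_o / (2.0 * (s * e + p * y))

# Example: 10 MPa internal pressure, 500 mm OD, 120 MPa allowable stress
# required_thickness(10.0, 500.0, 120.0)  # ~20.2 mm
```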


2002 · Vol 124 (3) · pp. 358-364
Author(s): Avraam A. Konstantinidis, Elias C. Aifantis

Wavelet analysis is used to describe heterogeneous deformation at different scales. Experimental measurements of slip step heights on monocrystalline alloy specimens subjected to compression are considered. The experimental data are subjected to a discrete wavelet transform, and the spatial distribution of deformation at different scales (resolutions) is calculated. At the finer scale the wavelet-analyzed data are identical to the experimental measurements, while at the coarser scale the profile predicted by the wavelet analysis resembles the shear band solution profile provided by gradient theory, in agreement with experimental observations. The different data sets provided by the wavelet analysis are used to train a neural network in order to predict the spatial distribution of strain at resolutions higher than those attainable with the available experimental probes. In addition, applications of wavelet analysis to interpreting size-effect data in torsion and bending at the micron scale are examined by deriving scale-dependent constitutive equations for this purpose.
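The multiscale decomposition step can be illustrated with a short PyWavelets sketch: a one-dimensional profile is decomposed with a discrete wavelet transform and reconstructed at the coarsest and finest scales. The wavelet family, decomposition level and signal below are illustrative assumptions, not the settings used in the paper.

```python
# Multiresolution view of a 1-D deformation profile via the discrete wavelet transform.
import numpy as np
import pywt

x = np.linspace(0, 1, 512)
# Placeholder "slip step height" profile: a localized band plus measurement noise.
profile = np.where(np.abs(x - 0.5) < 0.05, 1.0, 0.0) + 0.05 * np.random.randn(512)

coeffs = pywt.wavedec(profile, "db4", level=4)         # multiresolution analysis

# Coarsest scale: zero all detail coefficients so only the smooth
# approximation (the shear-band-like trend) remains.
coarse = [coeffs[0]] + [np.zeros_like(c) for c in coeffs[1:]]
coarse_profile = pywt.waverec(coarse, "db4")

# Finest scale: keep everything, which reproduces the original measurements
# up to numerical precision.
fine_profile = pywt.waverec(coeffs, "db4")
```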


Author(s): Danlei Xu, Lan Du, Hongwei Liu, Penghui Wang

A Bayesian classifier for sparsity-promoting feature selection is developed in this paper, in which a set of nonlinear mappings of the original data is applied as a pre-processing step. A linear classification model with such mappings from the original input space to a nonlinear transformation space can not only construct a nonlinear classification boundary but also perform feature selection on the original data. A zero-mean Gaussian prior with Gamma precision and a finite approximation of the Beta process prior are used to promote sparsity in the utilization of features and nonlinear mappings in our model, respectively. We derive a Variational Bayesian (VB) inference algorithm for the proposed linear classifier. Experimental results on a synthetic data set, a measured radar data set, a high-dimensional gene expression data set, and several benchmark data sets demonstrate the aggressive and robust feature selection capability and the comparable classification accuracy of our method compared with other existing classifiers.
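To illustrate the two ingredients combined here, a nonlinear mapping of the original features followed by a sparsity-promoting Bayesian linear model, the sketch below uses a random-Fourier-feature mapping and scikit-learn's ARD regressor fitted to ±1 labels; it conveys the idea only and is not the variational Bayes algorithm with a Beta-process prior derived in the paper.

```python
# Nonlinear feature mapping + sparsity-promoting Bayesian linear model (illustration only).
import numpy as np
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import ARDRegression

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))                                   # placeholder data
y = np.sign(X[:, 0] * X[:, 1] + 0.1 * rng.standard_normal(200))      # nonlinear +/-1 labels

phi = RBFSampler(gamma=0.5, n_components=100, random_state=0)
Z = phi.fit_transform(X)                                             # nonlinear mapping

ard = ARDRegression()        # zero-mean Gaussian prior with per-weight Gamma precisions
ard.fit(Z, y)

# Mapped features whose posterior weights shrink to (near) zero are effectively
# pruned, which is the sparsity-promoting selection effect.
selected = np.flatnonzero(np.abs(ard.coef_) > 1e-3)
decision = np.sign(ard.predict(Z))                                   # crude classification rule
accuracy = (decision == y).mean()
```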

