Danube sediment contamination with polychlorinated biphenyls: New interpretation of sediment quality assessment

Acta periodica technologica ◽

10.2298/apt1950043b ◽

2019 ◽

pp. 43-50

Author(s):

Maja Brboric ◽

Borivoj Stepanov ◽

Jelena Radonic ◽

Maja Turk-Sekulic

Keyword(s):

Polychlorinated Biphenyls ◽

Sampling Site ◽

Sediment Quality ◽

Hierarchical Cluster ◽

Multivariate Techniques ◽

Clustering Methods ◽

Environmental Matrices ◽

Data Set ◽

Local Pollution ◽

New Interpretation

Among the contaminants of greatest concern, "old" classics such as polychlorinated biphenyls (PCBs) can still be detected in aquatic systems. Because PCBs are detected in all environmental matrices and have been identified as harmful substances due to their toxicity, persistence, and bioaccumulation in humans and wildlife, they remain one of the important groups of persistent organic pollutants (POPs). For this reason, this original approach examines the toxicological influence of PCBs, quantified in sediment samples collected at ten sites along the river Danube, by applying advanced classification and clustering methods, namely hierarchical cluster analysis (HCA) and Kohonen's self-organising maps (SOMs). The selected multivariate techniques were applied to the monitoring data set in order to obtain visual images of the components distributed at each sampling site when all components are included in the classification and data projection procedure. Analysis of the data set with both techniques isolated groups that exhibit similar behavior. In the hexagon and dendrogram of variables, three main clusters were distinguished. With regard to identifying pollutant spatial patterns, the SOM did not isolate a clear phenomenon, probably due to the absence of local pollution sources contributing to the elevated concentrations of these compounds. The presented assumptions indicate that the complementary application of SOM and HCA offers advantages over the usual rough interpretation of PCB patterns and over the use of either method alone.
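
As a rough illustration of the hierarchical cluster analysis mentioned in this abstract, the following sketch groups sampling sites by average-linkage agglomerative clustering. It is a minimal sketch and not the authors' procedure: the linkage rule, the Euclidean distance, the function name `average_linkage`, and the concentration values are all assumptions made for illustration.

```python
import math


def average_linkage(points, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Cluster-to-cluster distance is the mean pairwise Euclidean distance
    between their members (average linkage).
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(
                    math.dist(points[p], points[q])
                    for p in clusters[i]
                    for q in clusters[j]
                )
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical concentrations of two PCB congeners at six sites
# (illustrative numbers only, not data from the paper).
sites = [
    (0.10, 0.20), (0.15, 0.25),
    (1.00, 1.10), (1.05, 1.00),
    (3.00, 2.90), (3.10, 3.00),
]
groups = average_linkage(sites, n_clusters=3)
print(groups)
```

Stopping at a fixed number of clusters corresponds to cutting a dendrogram at one level; a full analysis would normally keep the whole merge history and inspect it.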

Data on sediment quality and concentrations of polychlorinated biphenyls from the Lower Neponset River, Massachusetts, 2002-03

Open-File Report ◽

10.3133/ofr20041280 ◽

2004 ◽

Author(s):

Robert F. Breault ◽

Matthew G. Cooke ◽

Michael Merrill

Keyword(s):

Polychlorinated Biphenyls ◽

Sediment Quality

Heterogeneity of government social spending in European Union countries

Future Business Journal ◽

10.1186/s43093-021-00084-7 ◽

2021 ◽

Vol 7 (1) ◽

Author(s):

Anna Magdalena Korzeniowska

Keyword(s):

European Union ◽

Social Factors ◽

Hierarchical Cluster ◽

Social Spending ◽

Clustering Methods ◽

Social Expenditure ◽

Homogeneous Groups ◽

Eu Countries ◽

Eu Member States ◽

More Care

Social expenditure plays an important role in European Union (EU) countries. It improves the lives of citizens whose welfare is endangered due to poverty or illness. However, social expenditure represents a considerable share of the budgets of EU member states. Despite evident similarities in their levels of development, EU countries show apparent differences in social expenditure levels. Therefore, this work aims to determine the similarities and differences between EU countries in this regard. The analysis uses clustering methods, such as hierarchical cluster analysis and k-means, to divide countries into homogeneous groups. The research demonstrates significant differences between EU countries in the years 2008–2018, which resulted in a low number of objects (countries) in the identified groups. In the case of 6 out of 28 countries, it was not possible to assign them to any group. The research proves that EU countries should take more care when organising their social policy, taking into consideration cultural and social factors.
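
The k-means step named in this abstract can be sketched as follows. This is only a minimal illustration of Lloyd's algorithm, assuming Euclidean distance and caller-supplied starting centroids; the function name `kmeans` and the spending figures are hypothetical and are not taken from the study.

```python
import math


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm with explicit initial centroids (deterministic)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids


# Hypothetical (social protection, health) spending as % of GDP for six
# countries; illustrative numbers only.
spending = [
    (15.0, 5.0), (16.0, 5.5), (14.5, 4.8),
    (28.0, 9.0), (29.5, 9.5), (30.0, 10.0),
]
labels, centres = kmeans(spending, centroids=[spending[0], spending[3]])
print(labels)
```

Plain k-means assigns every object to some cluster, so leaving countries unassigned, as the abstract reports for 6 of 28, would need an extra rule that this sketch does not attempt.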

SEMIPARAMETRIC CLUSTERING METHOD FOR MICROARRAY DATA ANALYSIS

Journal of Bioinformatics and Computational Biology ◽

10.1142/s021972000800345x ◽

2008 ◽

Vol 06 (02) ◽

pp. 261-282 ◽

Cited By ~ 2

Author(s):

AO YUAN ◽

WENQING HE

Keyword(s):

Data Analysis ◽

Microarray Data ◽

Mixture Distribution ◽

Information Criterion ◽

Optimal Number ◽

Microarray Data Analysis ◽

Parametric Methods ◽

Clustering Methods ◽

Microarray Gene Expression ◽

Data Set

Clustering is a major tool for microarray gene expression data analysis. The existing clustering methods fall mainly into two categories: parametric and nonparametric. The parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data-generating mechanism, the parametric methods perform well, but not when there is nonnegligible deviation between them. On the other hand, the nonparametric methods, which usually do not make distributional assumptions, are robust but pay the price of efficiency loss. In an attempt to utilize the known mixture form to increase efficiency, and to drop assumptions about the unknown subdistributions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach has the form of a parametric mixture, with no assumptions about the subdistributions. The subdistributions are estimated nonparametrically, with constraints imposed only on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and the robustness of the proposed method. The results show that the proposed method yields a reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.
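
To make the EM-plus-BIC workflow in this abstract concrete, here is a minimal sketch that swaps in a fully parametric one-dimensional Gaussian mixture and the standard BIC. It is not the semiparametric estimator or the modified criterion the authors propose; the function names, the quantile-based initialisation, the variance floor, and the data are assumptions for illustration.

```python
import math


def normal_pdf(x, mu, var):
    return math.exp(-((x - mu) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_gmm_1d(data, k, n_iter=200):
    """EM for a 1-D Gaussian mixture; returns (weights, means, variances, loglik)."""
    n = len(data)
    srt = sorted(data)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [srt[int((i + 0.5) * n / k)] for i in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each observation.
        resp = []
        for x in data:
            dens = [w * normal_pdf(x, m, v)
                    for w, m, v in zip(weights, means, variances)]
            total = sum(dens)
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # guard against a starved component
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, 1e-6)  # variance floor (an assumption)
    loglik = sum(
        math.log(sum(w * normal_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)))
        for x in data
    )
    return weights, means, variances, loglik


def bic(loglik, k, n):
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    return -2.0 * loglik + n_params * math.log(n)


# Hypothetical expression values forming two well-separated groups.
values = [1.0, 1.2, 0.8, 1.1, 0.9, 5.0, 5.2, 4.8, 5.1, 4.9]
fits = {k: fit_gmm_1d(values, k) for k in (1, 2)}
bics = {k: bic(fits[k][3], k, len(values)) for k in fits}
best_k = min(bics, key=bics.get)
print(best_k, bics)
```

Only k = 1 and k = 2 are compared here because likelihood-based selection on ten observations becomes unstable for larger k; the lower BIC marks the preferred model.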

A simple clustering technique to extract subsets of data for function approximation

Journal of Hydroinformatics ◽

10.2166/hydro.2015.065 ◽

2015 ◽

Vol 17 (5) ◽

pp. 719-732

Author(s):

Dulakshi Santhusitha Kumari Karunasingha ◽

Shie-Yui Liong

Keyword(s):

Function Approximation ◽

Prediction Models ◽

Data Extraction ◽

Single Parameter ◽

Subtractive Clustering ◽

Data Sets ◽

Clustering Methods ◽

Clustering Method ◽

Data Set ◽

Functional Relationships

A simple clustering method is proposed for extracting representative subsets from lengthy data sets. The main purpose of the extracted subset of data is to use it to build prediction models (of the form of approximating functional relationships) instead of using the entire large data set. Such smaller subsets of data are often required in exploratory analysis stages of studies that involve resource-consuming investigations. A few recent studies have used a subtractive clustering method (SCM) for such data extraction, in the absence of clustering methods for function approximation. SCM, however, requires several parameters to be specified. This study proposes a clustering method that requires only a single parameter to be specified, yet is shown to be as effective as the SCM. A method to find suitable values for the parameter is also proposed. Because it has only a single parameter, the proposed clustering method is shown to be orders of magnitude more efficient to use than SCM. The effectiveness of the proposed method is demonstrated on phase space prediction of three univariate time series and prediction of two multivariate data sets. Some drawbacks of SCM when applied for data extraction are identified, and the proposed method is shown to be a solution for them.
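
The abstract does not spell out the proposed algorithm, so the sketch below only illustrates the general idea of extracting a representative subset with one tuning parameter. It uses a generic leader-style scheme, which is an assumption and not the authors' method: a point is kept only if it lies farther than `radius` from every point already kept. The function name `extract_subset` and the data are hypothetical.

```python
import math


def extract_subset(points, radius):
    """Keep a point only if it is farther than `radius` from all kept points."""
    kept = []
    for p in points:
        if all(math.dist(p, q) > radius for q in kept):
            kept.append(p)
    return kept


# Hypothetical two-dimensional samples (for example, lagged values of a series).
samples = [
    (0.00, 0.00), (0.10, 0.00), (0.05, 0.05),
    (1.00, 1.00), (1.05, 0.95),
    (2.00, 0.00),
]
subset = extract_subset(samples, radius=0.3)
print(subset)
```

A larger `radius` gives a smaller subset, so the single parameter directly trades model-building cost against coverage of the input space.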

Constraint-Based Hierarchical Cluster Selection in Automotive Radar Data

Sensors ◽

10.3390/s21103410 ◽

2021 ◽

Vol 21 (10) ◽

pp. 3410

Author(s):

Claudia Malzer ◽

Marcus Baum

Keyword(s):

Moving Objects ◽

Hierarchical Cluster ◽

Autonomous Driving ◽

Radar Data ◽

Background Information ◽

Automotive Radar ◽

Data Set ◽

Radar Sensors ◽

Small Clusters ◽

Cluster Level

High-resolution automotive radar sensors play an increasing role in detection, classification and tracking of moving objects in traffic scenes. Clustering is frequently used to group detection points in this context. However, this is a particularly challenging task due to variations in number and density of available data points across different scans. Modified versions of the density-based clustering method DBSCAN have mostly been used so far, while hierarchical approaches are rarely considered. In this article, we explore the applicability of HDBSCAN, a hierarchical DBSCAN variant, for clustering radar measurements. To improve results achieved by its unsupervised version, we propose the use of cluster-level constraints based on aggregated background information from cluster candidates. Further, we propose the application of a distance threshold to avoid selection of small clusters at low hierarchy levels. Based on exemplary traffic scenes from nuScenes, a publicly available autonomous driving data set, we test our constraint-based approach along with other methods, including label-based semi-supervised HDBSCAN. Our experiments demonstrate that cluster-level constraints help to adjust HDBSCAN to the given application context and can therefore achieve considerably better results than the unsupervised method. However, the approach requires carefully selected constraint criteria that can be difficult to choose in constantly changing environments.

Empirical Evaluation of Genetic Clustering Methods Using Multilocus Genotypes From 20 Chicken Breeds

Genetics ◽

10.1093/genetics/159.2.699 ◽

2001 ◽

Vol 159 (2) ◽

pp. 699-713

Author(s):

Noah A Rosenberg ◽

Terry Burke ◽

Kari Elo ◽

Marcus W Feldman ◽

Paul J Freidlin ◽

...

Keyword(s):

Cluster Analysis ◽

Population Structure ◽

Clustering Algorithm ◽

Empirical Evaluation ◽

Unknown Origin ◽

Clustering Methods ◽

Genetic Cluster ◽

Data Set ◽

Multilocus Genotypes ◽

Chicken Breeds

We tested the utility of genetic cluster analysis in ascertaining population structure of a large data set for which population structure was previously known. Each of 600 individuals representing 20 distinct chicken breeds was genotyped for 27 microsatellite loci, and individual multilocus genotypes were used to infer genetic clusters. Individuals from each breed were inferred to belong mostly to the same cluster. The clustering success rate, measuring the fraction of individuals that were properly inferred to belong to their correct breeds, was consistently ~98%. When markers of highest expected heterozygosity were used, genotypes that included at least 8–10 highly variable markers from among the 27 markers genotyped also achieved >95% clustering success. When 12–15 highly variable markers and only 15–20 of the 30 individuals per breed were used, clustering success was at least 90%. We suggest that in species for which population structure is of interest, databases of multilocus genotypes at highly variable markers should be compiled. These genotypes could then be used as training samples for genetic cluster analysis and to facilitate assignments of individuals of unknown origin to populations. The clustering algorithm has potential applications in defining the within-species genetic units that are useful in problems of conservation.
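
The clustering success rate described in this abstract can be computed along the following lines. This is a minimal sketch under one stated assumption: each inferred cluster is matched to the breed that forms its majority, which may differ from the matching rule used in the paper. The function name and the toy labels are hypothetical.

```python
from collections import Counter


def clustering_success(true_breeds, cluster_ids):
    """Fraction of individuals whose cluster's majority breed is their own breed."""
    majority = {}
    for c in set(cluster_ids):
        members = [b for b, cid in zip(true_breeds, cluster_ids) if cid == c]
        majority[c] = Counter(members).most_common(1)[0][0]
    hits = sum(1 for b, c in zip(true_breeds, cluster_ids) if majority[c] == b)
    return hits / len(true_breeds)


# Toy example: ten individuals from three breeds, one of them misplaced.
breeds = ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"]
clusters = [0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
rate = clustering_success(breeds, clusters)
print(rate)
```

Here the third individual of breed A falls into the cluster dominated by breed B, so nine of the ten individuals count as correctly placed.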

Cluster Analysis of Dynamic Parameters of Gene Expression

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720003000307 ◽

2003 ◽

Vol 01 (03) ◽

pp. 447-458 ◽

Cited By ~ 5

Author(s):

Xiwei Wu ◽

T. Gregory Dewey

Keyword(s):

Gene Expression ◽

Cell Cycle ◽

Cluster Analysis ◽

Hierarchical Cluster ◽

Specific Model ◽

Model Parameters ◽

Dynamic Parameters ◽

Clustering Methods ◽

Transition Matrices ◽

Great Utility

Cluster analysis has proven to be a valuable statistical method for analyzing whole genome expression data. Although clustering methods have great utility, they represent a lower-level statistical analysis that is not directly tied to a specific model. To extend such methods and to allow for more sophisticated lines of inference, we use cluster analysis in conjunction with a specific model of gene expression dynamics. This model provides phenomenological dynamic parameters on both linear and non-linear responses of the system. This analysis determines the parameters of two different transition matrices (linear and nonlinear) that describe the influence of one gene expression level on another. Using yeast cell cycle microarray data as a test set, we calculated the transition matrices and used these dynamic parameters as a metric for cluster analysis. Hierarchical cluster analysis of this transition matrix reveals how a set of genes influences the expression of other genes activated during different cell cycle phases. Most strikingly, genes in different stages of the cell cycle preferentially activate or inactivate genes in other stages of the cell cycle, and this relationship can be readily visualized in a two-way clustering image. This observation is made without any prior knowledge of the chronological characteristics of the cell cycle process. This method shows the utility of using model parameters as a metric in cluster analysis.

Cluster analysis of the results of intraoperative optical spectroscopic diagnostics in brain glioma neurosurgery

Biomedical Photonics ◽

10.24931/2413-9432-2018-7-4-23-34 ◽

2019 ◽

Vol 7 (4) ◽

pp. 23-34

Author(s):

I. A. Osmakov ◽

T. A. Savelieva ◽

V. B. Loschenov ◽

S. A. Goryajnov ◽

A. A. Potapov

Keyword(s):

Cluster Analysis ◽

Optical Spectroscopy ◽

Protoporphyrin Ix ◽

Spectroscopy Data ◽

Clustering Methods ◽

Agglomerative Clustering ◽

Data Set ◽

Broadband Radiation ◽

Degree Of Malignancy ◽

Normal White Matter

The paper presents the results of a comparative study of methods of cluster analysis of optical intraoperative spectroscopy data during surgery of glial tumors with varying degree of malignancy. The analysis was carried out both for individual patients and for the entire dataset. The data were obtained using a combined optical spectroscopy technique, which allowed simultaneous registration of diffuse reflectance spectra of broadband radiation in the 500–600 nm spectral range (for the analysis of tissue blood supply and the degree of hemoglobin oxygenation), fluorescence spectra of 5-ALA-induced protoporphyrin IX (Pp IX) (for analysis of the malignancy degree) and the signal of diffusely reflected laser light used to excite Pp IX fluorescence (to take into account the scattering properties of tissues). To determine the threshold values of these parameters for the tumor, the infiltration zone and the normal white matter, we searched for the natural clusters in the available intraoperative optical spectroscopy data and compared them with the results of the pathomorphology. It was shown that, among the considered clustering methods, the EM algorithm and k-means methods are optimal for the considered data set and can be used to build a decision support system (DSS) for spectroscopic intraoperative navigation in neurosurgery. Results of clustering relevant to the pathological studies were also obtained using the methods of spectral and agglomerative clustering. These methods can be used to postprocess combined spectroscopy data.

Classification Connection of Twitter Data using K-Means Clustering

International Journal of Innovative Technology and Exploring Engineering - Special Issue ◽

10.35940/ijitee.f1004.0486s419 ◽

2019 ◽

Vol 8 (6S4) ◽

pp. 14-22

Keyword(s):

Social Media ◽

Real Time ◽

Clustering Methods ◽

Time Analysis ◽

Data Set ◽

Dynamic Data ◽

Real Time Analysis ◽

Twitter Data ◽

Social Media Platforms ◽

Insight Into

The rise of social media platforms like Twitter, and their increasing adoption by people who want to stay connected, provides a large source of data for analysis of trends, events, and even personalities. Such analysis also provides insight into a person's likes and inclinations in real time, independent of the data size. Several techniques have been created to retrieve such data; however, the most efficient technique is clustering. This paper provides an overview of the algorithms of the various clustering methods and examines their efficiency in determining trending information. The clustered data may be further classified by topic for real-time analysis on a large dynamic data set. In this paper, data classification is performed and analyzed for flaws, followed by another classification on the same data set.

Soil Quality Assessment Based on a Minimum Data Set: A Case Study of a County in the Typical River Delta Wetlands

Sustainability ◽

10.3390/su12219033 ◽

2020 ◽

Vol 12 (21) ◽

pp. 9033

Author(s):

Mingliang Jiang ◽

Ligang Xu ◽

Xiaobing Chen ◽

Hua Zhu ◽

Hongxiang Fan

Keyword(s):

Soil Quality ◽

Yellow River ◽

Sampling Site ◽

Yellow River Delta ◽

River Delta ◽

The Yellow River ◽

Land Resources ◽

Data Set ◽

Land Type ◽

Natural Land

Purpose: The Yellow River delta boasts rich land resources but lacks fresh water and exhibits poor natural conditions. To rationally develop and utilize the land resources therein, it is necessary to evaluate the soil quality. Methods: Adopting specific screening conditions, principal component analysis (PCA) was used to construct a minimum data set (MDS) from 10 soil indicators. Then, a complete soil quality evaluation index system of the Yellow River delta was developed. The soil quality comprehensive index (SQI) method was used to assess the soil quality in the Kenli District, and the soil quality grades and spatial distribution were analyzed. Results: (1) The average SQI of the Kenli District is 0.523, and the best soil quality is concentrated near the Yellow River, especially in Huanghekou town. (2) The normalized difference vegetation index was positively correlated with SQI, whereas Dr (nearest distance between the sampling site and Yellow River) and Ds (nearest distance between the sampling site and Bohai Sea) were negatively correlated with SQI. Elev (sampling site elevation) was not correlated with SQI. (3) The SQI of agricultural planting is greater than that of the natural land type and significantly greater than that of nudation. The main factors limiting farmland soil quality are SK (water-soluble potassium) and pH, whereas the factors limiting the natural land type are the soil nutrient indicators. Conclusions: To improve soil quality and develop and utilize land resources, the towns should adopt systematic land development/utilization methods based on local conditions. These results have important guiding significance and practical value for the more objective and accurate evaluation of soil quality in coastal areas and the development and utilization of land resources.
