Danube sediment contamination with polychlorinated biphenyls: New interpretation of sediment quality assessment

2019 ◽  
pp. 43-50
Author(s):  
Maja Brboric ◽  
Borivoj Stepanov ◽  
Jelena Radonic ◽  
Maja Turk-Sekulic

Among the contaminants of greatest concern, "old" classics such as polychlorinated biphenyls (PCBs) can still be detected in aquatic systems. Because PCBs are detected in all environmental matrices and have been identified as harmful substances due to their toxicity, persistence, and bioaccumulation in humans and wildlife, they remain one of the important groups of persistent organic pollutants (POPs). For this reason, this original approach studies the toxicological influence of PCBs, quantified in sediment samples collected at ten sites along the river Danube, by applying advanced classification and clustering methods such as hierarchical cluster analysis (HCA) and Kohonen's self-organising maps (SOMs). The selected multivariate techniques were applied to the monitoring dataset in order to obtain visual images of the components distributed at each sampling site when all components are included in the classification and data projection procedure. After the data set was analyzed using both techniques, groups exhibiting similar behavior were isolated. In the hexagon and dendrogram of variables, three main clusters were distinguished. With regard to the identification of pollutant spatial patterns, the SOM did not isolate a clear phenomenon, probably due to the absence of local pollution sources contributing to elevated concentrations of these compounds. The presented assumptions indicated that the supplemental application of SOM and HCA offers advantageous features over the usually rough interpretation of PCB patterns and over the use of either method alone.
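As a rough illustration of the hierarchical cluster analysis step mentioned in this abstract, the following minimal sketch merges sampling sites bottom-up until a requested number of groups remains. The abstract does not state the linkage rule or distance used, so average linkage with Euclidean distance is an assumption here, and the `sites` values are made up purely for demonstration; the SOM part is not shown.

```python
from itertools import combinations
from math import dist

def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering; returns clusters as lists of point indices."""
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between the members of clusters a and b
        return sum(dist(points[i], points[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters (a < b by construction) and merge b into a
        a, b = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical, made-up data: two measured variables per sampling site
sites = [(0.1, 0.2), (0.15, 0.22), (1.0, 1.1), (1.05, 0.95), (3.0, 3.2), (3.1, 3.0)]
groups = agglomerative(sites, 3)
print(groups)
```

Recording the order and distance of each merge, rather than stopping at a fixed number of groups, is what would produce the data for a dendrogram.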

2021 ◽  
Vol 7 (1) ◽  
Author(s):  
Anna Magdalena Korzeniowska

Social expenditure plays an important role in European Union (EU) countries. It improves the lives of citizens whose welfare is endangered due to poverty or illness. However, social expenditure represents a considerable share of the budgets of EU member states. Despite evident similarities in their levels of development, EU countries show apparent differences in social expenditure levels. Therefore, this work aims to determine the similarities and differences between EU countries in this regard. The analysis uses clustering methods, such as hierarchical cluster analysis and k-means, to divide countries into homogeneous groups. The research demonstrates significant differences between EU countries in the years 2008–2018, which resulted in a low number of objects (countries) in the identified groups. In the case of 6 out of 28 countries, it was not possible to assign them to any group. The research proves that EU countries should take more care when organising their social policy, taking into consideration cultural and social factors.
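To make the k-means step mentioned above concrete, here is a minimal sketch of Lloyd's algorithm. The data points are invented for illustration and are not taken from the study. Seeding the centroids with the first `k` points is a simplification chosen so that the run is deterministic; practical implementations usually use random or k-means++ initialization.

```python
from math import dist

def kmeans(points, k, iters=100):
    """Lloyd's algorithm, using the first k points as initial centroids (a simplification)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point
        new_labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        if new_labels == labels:
            break  # assignments no longer change: converged
        labels = new_labels
        # Update step: move each centroid to the mean of its current members
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels, centroids

# Hypothetical, made-up indicators for six countries (two features each)
countries = [(10.0, 2.0), (30.0, 8.0), (11.0, 2.5), (29.0, 7.5), (10.5, 2.2), (31.0, 8.2)]
labels, centroids = kmeans(countries, k=2)
print(labels)
```

Note that plain k-means assigns every object to some group, so leaving countries unassigned, as the abstract reports for 6 of 28, would require an additional criterion that is not shown here.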


2008 ◽  
Vol 06 (02) ◽  
pp. 261-282 ◽  
Author(s):  
AO YUAN ◽  
WENQING HE

Clustering is a major tool for microarray gene expression data analysis. The existing clustering methods fall mainly into two categories: parametric and nonparametric. The parametric methods generally assume a mixture of parametric subdistributions. When the mixture distribution approximately fits the true data generating mechanism, the parametric methods perform well, but not so when there is nonnegligible deviation between them. On the other hand, the nonparametric methods, which usually do not make distributional assumptions, are robust but pay the price of efficiency loss. In an attempt to utilize the known mixture form to increase efficiency, and to free assumptions about the unknown subdistributions to enhance robustness, we propose a semiparametric method for clustering. The proposed approach possesses the form of a parametric mixture, with no assumptions about the subdistributions. The subdistributions are estimated nonparametrically, with constraints imposed only on the modes. An expectation-maximization (EM) algorithm along with a classification step is invoked to cluster the data, and a modified Bayesian information criterion (BIC) is employed to guide the determination of the optimal number of clusters. Simulation studies are conducted to assess the performance and the robustness of the proposed method. The results show that the proposed method yields a reasonable partition of the data. As an illustration, the proposed method is applied to a real microarray data set to cluster genes.
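For orientation, the standard (unmodified) BIC for a mixture with $K$ components can be written as below. The abstract does not specify how the authors modify the criterion, so this is only the textbook form; the symbols $\hat{L}_K$ (maximized likelihood), $p_K$ (number of free parameters) and $n$ (number of observations) are notation introduced here.

```latex
\mathrm{BIC}(K) = -2 \ln \hat{L}_K + p_K \ln n,
\qquad
\hat{K} = \arg\min_{K} \mathrm{BIC}(K)
```

Under this convention, the number of clusters is chosen as the $K$ with the smallest BIC, trading goodness of fit against model complexity.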


2015 ◽  
Vol 17 (5) ◽  
pp. 719-732
Author(s):  
Dulakshi Santhusitha Kumari Karunasingha ◽  
Shie-Yui Liong

A simple clustering method is proposed for extracting representative subsets from lengthy data sets. The main purpose of the extracted subset of data is to use it to build prediction models (of the form of approximating functional relationships) instead of using the entire large data set. Such smaller subsets of data are often required in exploratory analysis stages of studies that involve resource-consuming investigations. A few recent studies have used a subtractive clustering method (SCM) for such data extraction, in the absence of clustering methods for function approximation. SCM, however, requires several parameters to be specified. This study proposes a clustering method that requires only a single parameter to be specified, yet is shown to be as effective as the SCM. A method to find suitable values for the parameter is also proposed. Because it has only a single parameter, the proposed clustering method is shown to be orders of magnitude more efficient to use than SCM. The effectiveness of the proposed method is demonstrated on phase space prediction of three univariate time series and prediction of two multivariate data sets. Some drawbacks of SCM when applied for data extraction are identified, and the proposed method is shown to be a solution for them.


Sensors ◽  
2021 ◽  
Vol 21 (10) ◽  
pp. 3410
Author(s):  
Claudia Malzer ◽  
Marcus Baum

High-resolution automotive radar sensors play an increasing role in detection, classification and tracking of moving objects in traffic scenes. Clustering is frequently used to group detection points in this context. However, this is a particularly challenging task due to variations in number and density of available data points across different scans. Modified versions of the density-based clustering method DBSCAN have mostly been used so far, while hierarchical approaches are rarely considered. In this article, we explore the applicability of HDBSCAN, a hierarchical DBSCAN variant, for clustering radar measurements. To improve results achieved by its unsupervised version, we propose the use of cluster-level constraints based on aggregated background information from cluster candidates. Further, we propose the application of a distance threshold to avoid selection of small clusters at low hierarchy levels. Based on exemplary traffic scenes from nuScenes, a publicly available autonomous driving data set, we test our constraint-based approach along with other methods, including label-based semi-supervised HDBSCAN. Our experiments demonstrate that cluster-level constraints help to adjust HDBSCAN to the given application context and can therefore achieve considerably better results than the unsupervised method. However, the approach requires carefully selected constraint criteria that can be difficult to choose in constantly changing environments.
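As background for the density-based methods discussed above, the following is a minimal sketch of plain DBSCAN, the baseline that HDBSCAN extends. It is not HDBSCAN and does not implement the authors' cluster-level constraints or distance threshold. The detection points and the `eps` and `min_pts` values are invented for illustration.

```python
from math import dist

def dbscan(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise."""
    labels = [None] * len(points)

    def neighbors(i):
        # All points within eps of point i (the point itself is included)
        return [j for j in range(len(points)) if dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point becomes a border point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            nbrs = neighbors(j)
            if len(nbrs) >= min_pts:
                queue.extend(nbrs)  # j is a core point, so the cluster expands through it
    return labels

# Hypothetical 2D detection points: two dense groups and one isolated point
detections = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (10.5, 10), (10, 10.5), (50, 50)]
det_labels = dbscan(detections, eps=1.0, min_pts=3)
print(det_labels)
```

A single global `eps` is what makes plain DBSCAN sensitive to the varying point densities described in the abstract, which is the motivation for hierarchical variants.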


Genetics ◽  
2001 ◽  
Vol 159 (2) ◽  
pp. 699-713
Author(s):  
Noah A Rosenberg ◽  
Terry Burke ◽  
Kari Elo ◽  
Marcus W Feldman ◽  
Paul J Freidlin ◽  
...  

We tested the utility of genetic cluster analysis in ascertaining population structure of a large data set for which population structure was previously known. Each of 600 individuals representing 20 distinct chicken breeds was genotyped for 27 microsatellite loci, and individual multilocus genotypes were used to infer genetic clusters. Individuals from each breed were inferred to belong mostly to the same cluster. The clustering success rate, measuring the fraction of individuals that were properly inferred to belong to their correct breeds, was consistently ~98%. When markers of highest expected heterozygosity were used, genotypes that included at least 8–10 highly variable markers from among the 27 markers genotyped also achieved >95% clustering success. When 12–15 highly variable markers and only 15–20 of the 30 individuals per breed were used, clustering success was at least 90%. We suggest that in species for which population structure is of interest, databases of multilocus genotypes at highly variable markers should be compiled. These genotypes could then be used as training samples for genetic cluster analysis and to facilitate assignments of individuals of unknown origin to populations. The clustering algorithm has potential applications in defining the within-species genetic units that are useful in problems of conservation.
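One simple way to compute a clustering success rate of the kind described above is sketched below. It assumes that each inferred cluster is identified with the breed that forms its majority, which is a simplification and not necessarily the procedure used in the study. The function name and the toy labels are hypothetical.

```python
from collections import Counter

def clustering_success(true_breeds, inferred_clusters):
    """Fraction of individuals whose breed is the majority breed of their inferred cluster."""
    members = {}
    for breed, cluster in zip(true_breeds, inferred_clusters):
        members.setdefault(cluster, []).append(breed)
    # In each cluster, count the individuals that belong to its most common breed
    correct = sum(Counter(breeds).most_common(1)[0][1] for breeds in members.values())
    return correct / len(true_breeds)

# Hypothetical toy example: six individuals, two breeds, one misplaced individual
breeds = ["A", "A", "A", "B", "B", "B"]
clusters = [0, 0, 1, 1, 1, 1]
rate = clustering_success(breeds, clusters)
print(rate)
```

This majority-vote rule can map two clusters to the same breed, so a stricter one-to-one matching between clusters and breeds would give a more conservative figure.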


2003 ◽  
Vol 01 (03) ◽  
pp. 447-458 ◽  
Author(s):  
Xiwei Wu ◽  
T. Gregory Dewey

Cluster analysis has proven to be a valuable statistical method for analyzing whole genome expression data. Although clustering methods have great utility, they do represent a lower level statistical analysis that is not directly tied to a specific model. To extend such methods and to allow for more sophisticated lines of inference, we use cluster analysis in conjunction with a specific model of gene expression dynamics. This model provides phenomenological dynamic parameters on both linear and nonlinear responses of the system. This analysis determines the parameters of two different transition matrices (linear and nonlinear) that describe the influence of one gene expression level on another. Using yeast cell cycle microarray data as a test set, we calculated the transition matrices and used these dynamic parameters as a metric for cluster analysis. Hierarchical cluster analysis of this transition matrix reveals how a set of genes influences the expression of other genes activated during different cell cycle phases. Most strikingly, genes in different stages of the cell cycle preferentially activate or inactivate genes in other stages of the cell cycle, and this relationship can be readily visualized in a two-way clustering image. This observation is obtained without any prior knowledge of the chronological characteristics of the cell cycle process. This method shows the utility of using model parameters as a metric in cluster analysis.


2019 ◽  
Vol 7 (4) ◽  
pp. 23-34
Author(s):  
I. A. Osmakov ◽  
T. A. Savelieva ◽  
V. B. Loschenov ◽  
S. A. Goryajnov ◽  
A. A. Potapov

The paper presents the results of a comparative study of methods of cluster analysis of optical intraoperative spectroscopy data during surgery of glial tumors with varying degrees of malignancy. The analysis was carried out both for individual patients and for the entire dataset. The data were obtained using a combined optical spectroscopy technique, which allowed simultaneous registration of diffuse reflectance spectra of broadband radiation in the 500–600 nm spectral range (for the analysis of tissue blood supply and the degree of hemoglobin oxygenation), fluorescence spectra of 5-ALA induced protoporphyrin IX (Pp IX) (for analysis of the malignancy degree) and the signal of diffusely reflected laser light used to excite Pp IX fluorescence (to take into account the scattering properties of tissues). To determine the threshold values of these parameters for the tumor, the infiltration zone and the normal white matter, we searched for the natural clusters in the available intraoperative optical spectroscopy data and compared them with the results of the pathomorphology. It was shown that, among the considered clustering methods, the EM algorithm and k-means methods are optimal for the considered data set and can be used to build a decision support system (DSS) for spectroscopic intraoperative navigation in neurosurgery. Clustering results relevant to the pathological studies were also obtained using the methods of spectral and agglomerative clustering. These methods can be used to postprocess combined spectroscopy data.


The rise of social media platforms like Twitter, and their increasing adoption by people who want to stay connected, provides a large source of data for analysis of various trends, events and even personalities. Such analysis also provides insight into a person's likes and inclinations in real time, independent of the data size. Several techniques have been created to retrieve such data; however, the most efficient technique is clustering. This paper provides an overview of the algorithms behind the various clustering methods and examines their efficiency in determining trending information. The clustered data may be further classified by topic for real-time analysis on a large dynamic data set. In this paper, data classification is performed and analyzed for flaws, followed by another classification on the same data set.


2020 ◽  
Vol 12 (21) ◽  
pp. 9033
Author(s):  
Mingliang Jiang ◽  
Ligang Xu ◽  
Xiaobing Chen ◽  
Hua Zhu ◽  
Hongxiang Fan

Purpose: The Yellow River delta boasts rich land resources but lacks fresh water and exhibits poor natural conditions. To rationally develop and utilize the land resources therein, it is necessary to evaluate the soil quality. Methods: Adopting specific screening conditions, principal component analysis (PCA) was used to construct a minimum data set (MDS) from 10 soil indicators. Then, a complete soil quality evaluation index system of the Yellow River delta was developed. The soil quality comprehensive index (SQI) method was used to assess the soil quality in the Kenli District, and the soil quality grades and spatial distribution were analyzed. Results: (1) The average SQI of the Kenli District is 0.523, and the best soil quality is concentrated near the Yellow River, especially in Huanghekou town. (2) The normalized difference vegetation index was positively correlated with SQI, whereas Dr (nearest distance between the sampling site and Yellow River) and Ds (nearest distance between the sampling site and Bohai Sea) were negatively correlated with SQI. Elev (sampling site elevation) was not correlated with SQI. (3) The SQI of agricultural planting is greater than that of the natural land type and significantly greater than that of nudation. The main factors limiting farmland soil quality are SK (water-soluble potassium) and pH, whereas the factors limiting the natural land type are the soil nutrient indicators. Conclusions: To improve soil quality and develop and utilize land resources, the towns should adopt systematic land development/utilization methods based on local conditions. These results have important guiding significance and practical value for the more objective and accurate evaluation of soil quality in coastal areas and the development and utilization of land resources.

