Use of principal component analysis, factor analysis and discriminant analysis to evaluate spatial and temporal variations in water quality of the Mekong River

2008 ◽  
Vol 10 (1) ◽  
pp. 43-56 ◽  
Author(s):  
Sangam Shrestha ◽  
Futaba Kazama ◽  
Takashi Nakamura

Multivariate statistical techniques, such as principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA), were applied to evaluate temporal and spatial variations and to interpret a large, complex water quality dataset for the Mekong River. The data were generated during 6 years (1995–2000) of monitoring of 18 parameters (16,848 observations) at 13 different sites. The results of PCA/FA revealed that most of the variation is explained by dissolved mineral salts, both along the whole Mekong River and at individual stations. Discriminant analysis gave the best results for data reduction and pattern recognition in both the spatial and the temporal analysis. Spatial DA revealed that 8 parameters (total suspended solids, calcium, sodium, alkalinity, chloride, iron, nitrate nitrogen, total phosphorus) and 12 parameters (total suspended solids, calcium, sodium, potassium, alkalinity, chloride, sulfate, iron, nitrate nitrogen, total phosphorus, silicon, dissolved oxygen) are responsible for significant variations between monitoring regions and between countries, respectively. Temporal DA identified 3 parameters (conductivity, alkalinity, nitrate nitrogen) between monitoring regions; 3 parameters (total suspended solids, conductivity, silicon) in the midstream region; and 2 parameters (conductivity, silicon) in the upstream, lower stream and delta regions as the most significant parameters for discriminating between the four seasons (spring, summer, autumn, winter). Thus, this study illustrates the usefulness of principal component analysis, factor analysis and discriminant analysis for the analysis and interpretation of complex datasets, and in water quality assessment, identification of pollution sources/factors, and understanding of temporal and spatial variations in water quality for effective river water quality management.
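As an illustration of the PCA step described in this abstract, the following is a minimal sketch of a correlation-matrix PCA in NumPy. It is not the authors' procedure or data: the function name `pca_correlation`, the six synthetic parameters, and the latent "dissolved salts" driver are all assumptions made for demonstration only.

```python
import numpy as np

def pca_correlation(X):
    """PCA on the correlation matrix of X (rows = samples, columns = parameters)."""
    # Correlation matrix of the parameters (equivalent to PCA on standardized data)
    R = np.corrcoef(X, rowvar=False)
    # eigh is appropriate because R is symmetric; it returns ascending eigenvalues
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]          # sort components by variance, descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]
    explained = eigvals / eigvals.sum()        # fraction of total variance per component
    loadings = eigvecs * np.sqrt(eigvals)      # correlations between parameters and components
    return eigvals, explained, loadings

# Synthetic data (hypothetical): four parameters driven by one latent
# "dissolved salts" signal, plus two unrelated parameters.
rng = np.random.default_rng(0)
n_samples = 200
salts = rng.normal(size=n_samples)
X = np.column_stack(
    [salts + 0.2 * rng.normal(size=n_samples) for _ in range(4)]
    + [rng.normal(size=n_samples) for _ in range(2)]
)

eigvals, explained, loadings = pca_correlation(X)
print("explained variance ratio:", np.round(explained, 3))
print("loadings on first component:", np.round(loadings[:, 0], 3))
```

In this toy setup the first component is dominated by the four salt-driven columns, which mirrors in miniature the kind of finding the abstract reports (most variation explained by dissolved mineral salts).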

2016 ◽  
Vol 2 (4) ◽  
pp. 211
Author(s):  
Girdhari Lal Chaurasia ◽  
Mahesh Kumar Gupta ◽  
Praveen Kumar Tandon

Water is an essential resource for all organisms, plants and animals, including human beings. It is the backbone of the agricultural and industrial sectors and of all small business units. Increases in human population and economic activity have tremendously increased the demand for large-scale supplies of fresh water for various competing end users. Water quality is evaluated in terms of physical, chemical and biological parameters. A particular problem in water quality monitoring is the complexity of analyzing the large number of measured variables. The data sets contain rich information about the behavior of the water resources. Multivariate statistical approaches allow hidden information about the possible influences of the environment on water quality to be derived from the data sets. Classification, modeling and interpretation of monitored data are the most important steps in the assessment of water quality. The application of multivariate statistical techniques, such as cluster analysis (CA), principal component analysis (PCA) and factor analysis (FA), helps to identify the important components or factors accounting for most of the variance of a system. In the present study, water samples were analyzed for various physicochemical parameters by different methods following the standards of APHA, BIS and WHO, and the results were subjected to further statistical analysis, viz. cluster analysis, to understand the similarities and differences among the sampling stations. Three clusters were found: cluster 1 comprised sampling locations 1, 3 and 5; cluster 2 comprised sampling location 2; and cluster 3 comprised sampling location 4. Principal component analysis/factor analysis is a pattern recognition technique used to assess the correlation between observations in terms of underlying factors that are not directly observable. Observations that are correlated, either positively or negatively, are likely to be affected by the same factors, while observations that are not correlated are influenced by different factors. In our study, three factors explained 99.827% of the variance. Factor 1 (F1) accounted for 51.619% of the total variance, with strong positive loadings on TSS, TS, temperature, TDS and phosphate and a moderate loading on electrical conductivity (loading values of 0.986, 0.970, 0.792, 0.744, 0.695 and 0.701, respectively). Factor 2 (F2) accounted for 27.236% of the total variance, with moderate positive loadings on total alkalinity and temperature (0.723 and 0.606, respectively) and moderate negative loadings on conductivity, TDS and chloride (-0.698, -0.690 and -0.582, respectively). Factor 3 (F3) accounted for 20.972% of the variance, with a strong positive loading on pH (0.872) and moderate positive loadings on chloride and phosphate (0.721 and 0.569, respectively).
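The variance figures reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below simply accumulates the three per-factor percentages quoted above; the variable names are illustrative and no data beyond the abstract's own numbers is used.

```python
# Percent of total variance reported for each factor in the abstract
reported_variance = {"F1": 51.619, "F2": 27.236, "F3": 20.972}

# Running (cumulative) percentage after each factor
cumulative = {}
running = 0.0
for name, pct in reported_variance.items():
    running += pct
    cumulative[name] = round(running, 3)

total = cumulative["F3"]
print(cumulative)
print("total explained variance (%):", total)
```

The cumulative total agrees with the 99.827% stated for the three factors together.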

