Mutual Information Input Selector and Probabilistic Machine Learning Utilisation for Air Pollution Proxies

2019 ◽  
Vol 9 (20) ◽  
pp. 4475 ◽  
Author(s):  
Martha A. Zaidan ◽  
Lubna Dada ◽  
Mansour A. Alghamdi ◽  
Hisham Al-Jeelani ◽  
Heikki Lihavainen ◽  
...  

An air pollutant proxy is a mathematical model that estimates an unobserved air pollutant using other measured variables. A proxy is useful for filling missing data in a research campaign or for substituting a real measurement to minimise cost and the number of operators involved (i.e., a virtual sensor). In this paper, we present a generic concept of pollutant proxy development based on an optimised data-driven approach. We propose a mutual information concept to determine the interdependence of different variables and thus select the most correlated inputs. The most relevant variables are selected as the proxy inputs, with several metrics and the resulting data loss used for guidance. The input selection method determines the data used for training pollutant proxies based on a probabilistic machine learning method. In particular, we use a Bayesian neural network that naturally prevents overfitting and provides confidence intervals around its output prediction, so that the prediction uncertainty can be assessed and evaluated. To demonstrate the effectiveness of our approach, we test it on an extensive air pollution database to estimate ozone concentration.
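The input-selection idea in this abstract can be illustrated with a minimal sketch. It assumes a simple histogram (plug-in) estimator of mutual information; the abstract does not say which estimator the authors use. The function names, the bin count, and the synthetic "ozone" data are all hypothetical and serve only to show how candidate inputs might be ranked.

```python
import math
import random
from collections import Counter


def mutual_information(x, y, bins=8):
    """Histogram (plug-in) estimate of I(X;Y) in nats; an assumed estimator."""
    def discretise(values):
        lo, hi = min(values), max(values)
        width = (hi - lo) / bins or 1.0  # fall back to 1.0 for constant inputs
        return [min(int((v - lo) / width), bins - 1) for v in values]

    bx, by = discretise(x), discretise(y)
    n = len(x)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # I(X;Y) = sum over cells of p(x,y) * log(p(x,y) / (p(x) * p(y)))
    return sum(
        (c / n) * math.log(c * n / (px[i] * py[j]))
        for (i, j), c in joint.items()
    )


def rank_inputs(candidates, target, bins=8):
    """Return (name, MI) pairs sorted by decreasing MI with the target."""
    scores = {
        name: mutual_information(column, target, bins)
        for name, column in candidates.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic data (hypothetical): the target depends non-linearly on temperature only.
rng = random.Random(0)
n = 2000
temperature = [rng.uniform(0, 30) for _ in range(n)]
unrelated = [rng.uniform(0, 1) for _ in range(n)]
ozone = [(t - 15) ** 2 + rng.gauss(0, 5) for t in temperature]

ranking = rank_inputs({"temperature": temperature, "unrelated": unrelated}, ozone)
for name, score in ranking:
    print(f"{name}: {score:.3f} nats")
```

The synthetic target is a symmetric quadratic in temperature, so a linear correlation coefficient would be close to zero, whereas mutual information still ranks temperature first. This is one reason a mutual information criterion may suit input selection when relationships are non-linear.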

2019 ◽  
Vol 9 (19) ◽  
pp. 4069 ◽  
Author(s):  
Huixiang Liu ◽  
Qing Li ◽  
Dongbing Yu ◽  
Yu Gu

Air pollution has become an important environmental issue in recent decades. Air quality forecasts play an important role in warning people about air pollution and in controlling it. We used support vector regression (SVR) and random forest regression (RFR) to build regression models for predicting the Air Quality Index (AQI) in Beijing and the nitrogen oxides (NOX) concentration in an Italian city, based on two publicly available datasets. The root-mean-square error (RMSE), correlation coefficient (r), and coefficient of determination (R2) were used to evaluate the performance of the regression models. Experimental results showed that the SVR-based model performed better in the prediction of the AQI (RMSE = 7.666, R2 = 0.9776, and r = 0.9887), and the RFR-based model performed better in the prediction of the NOX concentration (RMSE = 83.6716, R2 = 0.8401, and r = 0.9180). This work also illustrates that combining machine learning with air quality prediction is an efficient and convenient way to solve some related environmental problems.
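The three evaluation metrics named in this abstract can be computed as in the following minimal sketch. It uses the standard textbook definitions of RMSE, Pearson's r, and R2, which are assumed to match those used by the authors. The function name and the sample values are hypothetical and are not taken from the paper's datasets.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (RMSE, Pearson r, R2) for paired observations and predictions."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)             # total sum of squares
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    rmse = math.sqrt(ss_res / n)
    r = cov / math.sqrt(ss_tot * var_p)
    r2 = 1.0 - ss_res / ss_tot
    return rmse, r, r2


# Hypothetical AQI-like values, for illustration only.
observed = [50.0, 80.0, 120.0, 95.0, 60.0]
predicted = [55.0, 78.0, 110.0, 100.0, 62.0]
rmse, r, r2 = regression_metrics(observed, predicted)
print(f"RMSE = {rmse:.3f}, r = {r:.4f}, R2 = {r2:.4f}")
```

RMSE is expressed in the units of the predicted quantity, so the two RMSE values reported above (for AQI and for NOX concentration) are not directly comparable, whereas r and R2 are dimensionless.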


2020 ◽  
Author(s):  
Roland Stirnberg ◽  
Jan Cermak ◽  
Simone Kotthaus ◽  
Martial Haeffelin ◽  
Hendrik Andersen ◽  
...  

Abstract. Air pollution, in particular high concentrations of particulate matter smaller than 1 µm in diameter (PM1), continues to be a major health problem, and meteorology is known to substantially contribute to atmospheric PM concentrations. However, the scientific understanding of the complex mechanisms leading to high pollution episodes is inconclusive, as the effects of meteorological variables are not easy to separate and quantify. In this study, a novel, data-driven approach based on empirical relationships is used to characterise the role of meteorology on atmospheric concentrations of PM1. A tree-based machine learning model is set up to reproduce concentrations of speciated PM1 at a suburban site southwest of Paris, France, using meteorological variables as input features. The contributions of each meteorological feature to modeled PM1 concentrations are quantified using SHapley Additive exPlanation (SHAP) regression values. Meteorological contributions to PM1 concentrations are analysed in selected high-resolution case studies, contrasting season-specific processes. Model results suggest that winter pollution episodes are often driven by a combination of shallow mixed layer heights (MLH), low temperatures, low wind speeds or inflow from northeastern wind directions. Contributions of MLHs to the winter pollution episodes are quantified to be on average ~ 5 µg/m³ for MLHs below 500 m agl. Temperatures below freezing initiate formation processes and increase local emissions related to residential heating, amounting to a contribution of as much as ~ 9 µg/m³. Northeasterly winds are found to contribute ~ 5 µg/m³ to total PM1 concentrations (combined effects of u- and v-wind components), by advecting particles from source regions, e.g. central Europe or the Paris region. However, in calm conditions (i.e. wind speeds


2019 ◽  
Vol 2 (2) ◽  
pp. 42
Author(s):  
Paulo Andre Lima De Castro ◽  
Anderson R.B. Teodoro

Financial operations involve a significant amount of resources and can directly or indirectly affect the lives of virtually all people. For efficiency and transparency in this context, it is essential to identify financial crimes and to punish those responsible. However, the large number of operations makes analysis performed exclusively by humans infeasible. Thus, the application of automated data analysis techniques is essential. Within this scenario, this work presents a method that identifies anomalies that may be associated with operations in the stock exchange market that are prohibited by law. Specifically, we seek to find patterns related to insider trading. These types of operations can generate large losses for investors. In this work, information made publicly available by the SEC and CVM, based on real cases on the BOVESPA, NYSE and NASDAQ stock exchanges, is used as a training base. The method includes the creation of several candidate variables and the identification of relevant variables. With this definition, classifiers based on decision trees and Bayesian networks are constructed and then evaluated and selected. The computational cost of performing such tasks can be quite significant, and it grows quickly with the amount of analyzed data. For this reason, the method considers the use of machine learning algorithms distributed in a computational cluster. To perform such tasks, we use the Weka framework with modules that allow the processing load to be distributed in a Hadoop cluster. The use of a computational cluster to execute learning algorithms on a large amount of data has been an active area of research, and this work contributes to the analysis of data in the specific context of financial operations. The obtained results show the feasibility of the approach, although the quality of the results is limited by the exclusive use of publicly available data.


Atmosphere ◽  
2021 ◽  
Vol 12 (10) ◽  
pp. 1295
Author(s):  
Tserenpurev Chuluunsaikhan ◽  
Menghok Heak ◽  
Aziz Nasridinov ◽  
Sanghyun Choi

Air pollution is a critical problem that is of major concern worldwide. South Korea is one of the countries most affected by air pollution. Rapid urbanization and industrialization in South Korea have induced air pollution in multiple forms, such as smoke from factories and exhaust from vehicles. In this paper, we perform a comparative analysis of predictive models for fine particulate matter in Daejeon, the fifth largest city in South Korea. This study is conducted for three purposes. The first purpose is to determine the factors that may cause air pollution. Two main factors are considered: meteorological and traffic. The second purpose is to find an optimal predictive model for air pollutant concentration. We apply machine learning and deep learning models to the collected dataset to predict hourly air pollutant concentrations. The accuracy of the deep learning models is better than that of the machine learning models. The third purpose is to analyze the influence of road conditions on predicting air pollutant concentration. Experimental results demonstrate that considering wind direction and wind speed could significantly decrease the error rate of the predictive models.


10.1596/33038 ◽  
2019 ◽  
Author(s):  
Lelia Croitoru ◽  
Jiyoun Christina Chang ◽  
Andrew Kelly

2020 ◽  
Author(s):  
Rıdvan Karacan

Today, production depends on fossil fuels. Fossil fuels pollute the air because they contain high levels of carbon. Many studies have been carried out on the economic costs of air pollution. In the present study, unlike earlier ones, the relationship of economic growth with the COVID-19 virus, in addition to air pollution, was examined. The COVID-19 virus, which was initially reported in Wuhan, China in December 2019 and affected the whole world, has caused many cases and deaths. Researchers continue to study how the virus is transmitted. Some of these studies suggest that the number of virus-related cases increases in regions with a high level of air pollution. On this basis, it is thought that air pollution will increase the number of COVID-19 cases in G7 countries, where industrial production is widespread. The study therefore attempts to reveal the negative aspects of economic growth, which currently depends on fossil fuels. The research covers the period 2000-2019. A panel cointegration test and panel causality analysis were used for the empirical analysis. Particulate matter known as PM2.5 [1] was used as an indicator of air pollution. A positive long-term relationship was identified between PM2.5 and economic growth. This relationship also affects the number of COVID-19 cases.

[1] "Fine particulate matter (PM2.5) is an air pollutant that poses the greatest risk to health globally, affecting more people than any other pollutant (WHO, 2018). Chronic exposure to PM2.5 considerably increases the risk of respiratory and cardiovascular diseases in particular (WHO, 2018). For these reasons, population exposure to (outdoor or ambient) PM2.5 has been identified as an OECD Green Growth headline indicator" (OECD.Stat).


2017 ◽  
Vol 68 (4) ◽  
pp. 858-863
Author(s):  
Mihaela Oprea ◽  
Marius Olteanu ◽  
Radu Teodor Ianache

Fine particulate matter with a diameter of less than 2.5 µm (i.e. PM2.5) is an air pollutant of special concern for urban areas due to its potentially significant negative effects on human health, especially for children and elderly people. To reduce these effects, local and regional environmental management systems need new tools based on PM2.5 monitoring infrastructures tailored to specific urban regions, both to provide expert support to decision makers in air quality planning for cities and to inform the vulnerable population in real time when PM2.5-related air pollution episodes occur. The paper focuses on urban air pollution early warning based on PM2.5 prediction. It describes the methodology used, the prediction approach, and the experimental system developed under the ROKIDAIR project for the analysis of PM2.5 air pollution levels, health impact assessment and early warning of sensitive people in the city of Ploiesti. The predicted evolution of the PM2.5 concentration is correlated with the analysis of PM2.5 air pollution and health effects, and the final result is processed by the ROKIDAIR Early Warning System (EWS) and sent as a message to the affected population via email or SMS. ROKIDAIR EWS is included in the ROKIDAIR decision support system.


2019 ◽  
Vol 28 (1) ◽  
pp. 349-354 ◽  
Author(s):  
Ahmed Samy Abd El Aziz Moursi ◽  
Marwa Shouman ◽  
Ezz El-din Hemdan ◽  
Nawal El-Fishawy

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Zahra Khorrami ◽  
Mohsen Pourkhosravani ◽  
Maysam Rezapour ◽  
Koorosh Etemad ◽  
Seyed Mahmood Taghavi-Shahri ◽  
...  

Abstract. Lung cancer is the most rapidly increasing malignancy worldwide, with an estimated 2.1 million cases in the latest (2018) World Health Organization (WHO) report. The objective of this study was to investigate the association between air pollution and lung cancer in Tehran, Iran. Residential area information for the most recently registered lung cancer cases, diagnosed between 2014 and 2016 (N = 1,850), was obtained from the population-based cancer registry of Tehran. Long-term average exposure to PM10, SO2, NO, NO2, NOX, benzene, toluene, ethylbenzene, m-xylene, p-xylene, o-xylene (BTEX), and total BTEX (TBTEX) in 22 districts of Tehran was estimated using land use regression models. Latent profile analysis (LPA) was used to generate multi-pollutant exposure profiles. Negative binomial regression analysis was used to examine the association between air pollutants and lung cancer incidence. The districts with higher concentrations of all pollutants were mostly downtown and around the railway station. Higher district-level concentrations of NOX (IRR = 1.05, for each 10 unit increase in air pollutant), benzene (IRR = 3.86), toluene (IRR = 1.50), ethylbenzene (IRR = 5.16), p-xylene (IRR = 9.41), o-xylene (IRR = 7.93), m-xylene (IRR = 2.63) and TBTEX (IRR = 1.21) were significantly associated with higher lung cancer incidence. Districts with a higher multiple air-pollution profile were also associated with higher lung cancer incidence (IRR = 1.01). Our study shows a positive association between air pollution and lung cancer incidence. The association was strongest, in descending order, for p-xylene, o-xylene, ethylbenzene, benzene, m-xylene and toluene.
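As a reading aid for the incidence rate ratios (IRR) reported above, the relation between a regression coefficient and an IRR can be sketched as follows. It assumes the usual log-link form of a negative binomial count model with a single pollutant term. The symbols Y_d (case count in district d), x_d (pollutant concentration), β0 and β1 are illustrative notation, not the authors' specification, which may include further covariates or a population offset.

```latex
\log \mathbb{E}[Y_d] = \beta_0 + \beta_1 x_d,
\qquad
\mathrm{IRR}_{\Delta}
  = \frac{\mathbb{E}[Y_d \mid x_d + \Delta]}{\mathbb{E}[Y_d \mid x_d]}
  = e^{\beta_1 \Delta}
```

Under this assumed form, the reported IRR of 1.05 per 10 unit increase in NOX would correspond to β1 = ln(1.05)/10 ≈ 0.0049 per unit, i.e. roughly a 5% higher expected incidence for every 10 units of NOX.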

