Prediction of PM2.5 pollution in Tehran air based on temperature and pressure using Markovian regime-switching non-parametric additive transitive regression model

Article Type:
Research/Original Article (accredited rank)
Abstract:

In this paper, we introduce the Markovian regime-switching regression model, a graphical model based on the hidden Markov model. It can be viewed as a clustered regression model in which a Markov process governs the transition from one cluster to another. The clusters are the hidden states of the hidden Markov model and are assumed to form a first-order Markov process. The other assumptions of the hidden Markov model are retained, while the emission distribution is taken to be the conditional distribution of the response given the covariates and the state. As an application, we study the prediction of PM2.5 pollution in Tehran's air from temperature and pressure during 2015-2017 using the Markovian regime-switching non-parametric additive transitive model. We also introduce the R package hhsmm as a powerful tool for fitting these models.

Introduction

State-switching models are models in which the distribution of a sequence of observations (usually recorded over a time interval) is controlled by a sequence of hidden states, such that the conditional distribution of the observations differs from one state to another. Hidden Markov and hidden semi-Markov models [27] are the most common instances of state-switching models; in them, the hidden state sequence is a Markov or semi-Markov process. Other models, including regime-switching models and the Kalman filter model, also belong to this category. Researchers have applied such models to speech recognition [12], cognitive learning [24], brain performance modeling [15], modeling of environmental processes [4, 5, 6], sequential analysis, reliability theory [7], biological analysis [8, 9, 27], and many other areas.

Main Results

A hidden Markov model is specified by the following components:

(1) Transition probability matrix $\pmb{\Gamma} =(\gamma_{ij})$, where
\begin{equation*}\gamma_{ij}=\Pr(S_{t+1}=j \mid S_t=i), \quad i,j=1,\ldots,J,\end{equation*}
such that each row sums to one:
\begin{equation*}\sum_{j=1}^{J}{\gamma_{ij}}=1, \quad i=1,\ldots,J.\end{equation*}

(2) Initial state probabilities $\pmb{\delta}=(\delta_j)$, where
\begin{equation*}\delta_j=\Pr(S_1=j), \quad j=1,\ldots,J; \quad \sum_{j=1}^{J}{\delta_j}=1.\end{equation*}

(3) Observation distributions $f_1(y),\ldots,f_J(y)$, where
$$f_j(y)=\Pr(Y_t=y \mid S_t=j), \quad j=1,\ldots,J,$$
which are also called state-dependent distributions or emission distributions. When $Y_t$ is a continuous random variable, $f_j(y)$ is a probability density function, usually a normal density or a mixture of normal densities.

The regime-switching regression model is introduced by [14] as follows:
\begin{equation*}y_{t} = x_{t}^T \beta_{s_t} + \sigma_{s_t}\epsilon_t, \tag{2.1}\end{equation*}
in which $\{y_t\}$ is the sequence of responses, $\{x_t\}$ is the sequence of covariates, $\{\epsilon_t\}$ is a sequence of (usually) i.i.d. normally distributed errors with zero mean and unit variance, and $\beta_{s_t}$ and $\sigma_{s_t}$ are the regression coefficients and the standard deviation of the errors in state $s_t$, respectively.

A generalization of model (2.1) to the additive regime-switching regression model is introduced by [20] as follows:
\begin{equation*}y_{t} = \mu_{s_t} + \sum_{j=1}^p f_{j,s_t}(x_{j,t}) + \sigma_{s_t}\epsilon_t, \tag{2.2}\end{equation*}
in which $\mu_{s_t}$ is the intercept in state $s_t$ and $f_{j,s_t}$ is the non-parametric component for the $j$th covariate $x_{j,t}$ in state $s_t$. Letting $x_t = (y_{t-\ell},\ldots,y_{t-L}, z_{t-\ell},\ldots,z_{t-L})$ in (2.2), for lags $L > \ell \geq 1$, gives the non-parametric additive transitive regime-switching regression model, in which the covariates are lagged values of the response and of a further covariate sequence $\{z_t\}$.

All models in this paper, together with the tools needed for model specification, initialization, fitting, and prediction, are included in the R package hhsmm, which can be downloaded from https://cran.r-project.org/package=hhsmm. The reader is referred to [3] for more information and examples on the hhsmm package.
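To make the roles of $\pmb{\Gamma}$, $\pmb{\delta}$, and the state-dependent regression in (2.1) concrete, the following is a minimal Python sketch, not the hhsmm implementation. It simulates the linear regime-switching regression model and evaluates its log-likelihood with the standard scaled forward recursion for hidden Markov models. The function names `simulate_rs_regression` and `forward_loglik`, the normal emission density, and all parameter values are assumptions made for illustration; the paper itself fits its models with the R package hhsmm.

```python
import numpy as np

def simulate_rs_regression(n, Gamma, delta, betas, sigmas, seed=0):
    """Simulate y_t = x_t^T beta_{s_t} + sigma_{s_t} * eps_t (illustrative helper).

    Gamma[i, j] = Pr(S_{t+1} = j | S_t = i); each row must sum to one.
    betas has shape (J, p); the first covariate is a constant 1 (intercept).
    """
    rng = np.random.default_rng(seed)
    J, p = betas.shape
    X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
    s = np.empty(n, dtype=int)
    s[0] = rng.choice(J, p=delta)                 # initial state from delta
    for t in range(1, n):
        s[t] = rng.choice(J, p=Gamma[s[t - 1]])   # first-order Markov transition
    y = np.einsum("tp,tp->t", X, betas[s]) + sigmas[s] * rng.normal(size=n)
    return y, X, s

def forward_loglik(y, X, Gamma, delta, betas, sigmas):
    """Log-likelihood of model (2.1) via the scaled forward recursion."""
    mu = X @ betas.T                              # (n, J) state-wise means
    dens = np.exp(-0.5 * ((y[:, None] - mu) / sigmas) ** 2) / (
        sigmas * np.sqrt(2.0 * np.pi)
    )                                             # normal emission densities
    alpha = delta * dens[0]
    loglik = 0.0
    for t in range(len(y)):
        if t > 0:
            alpha = (alpha @ Gamma) * dens[t]     # propagate, then weight by emission
        c = alpha.sum()
        loglik += np.log(c)                       # accumulate the scaling factors
        alpha = alpha / c                         # rescale to avoid underflow
    return loglik

# Assumed two-state example (values chosen only for illustration).
Gamma = np.array([[0.9, 0.1], [0.2, 0.8]])
delta = np.array([0.5, 0.5])
betas = np.array([[0.0, 1.0], [3.0, -1.0]])
sigmas = np.array([0.5, 1.0])
y, X, s = simulate_rs_regression(200, Gamma, delta, betas, sigmas, seed=1)
ll = forward_loglik(y, X, Gamma, delta, betas, sigmas)
```

The emission density here depends on both the covariates and the state, which is the defining feature of the regime-switching regression model. Replacing the linear term `X @ betas.T` by state-specific smooth functions would give the additive model (2.2); that step is omitted from this sketch.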

Summary of Proofs/Conclusions

The data set for this paper is obtained from two sources. The AQI data (PM2.5 values) are obtained from https://airnow.tehran.ir/, while the air temperature and pressure are obtained from the Iran Meteorological Organization. Figure 3 shows the time-series plots of this data set.

To visualize the additive regime-switching regression model, we first consider temperature as the only covariate. Figure 4 presents the prediction of PM2.5 in Tehran's air using a non-parametric additive regime-switching regression model, based only on the air temperature, in each of the four hidden states. The points in each state are shown in a different color, and the prediction curve for that state is drawn in the same color. As the figure shows, the prediction curves fit the points in each state reasonably well.

As a competitor, we consider the single-state additive regression model. Figure 5 compares the prediction precision for PM2.5 in Tehran's air under the two fitted models: the regime-switching regression model with four hidden states and the single-state non-parametric additive regression model. The mean squared error of each model is reported in the corresponding plot. The non-parametric additive regime-switching regression model with four hidden states performs better than the single-state non-parametric additive regression model.

The other model introduced is the additive transitive regime-switching regression model. Figure 6 shows the prediction of PM2.5 in Tehran's air using a transitive regime-switching non-parametric additive regression model with a 1-day lag. The mean squared error of the model is reported in the plot, and this model performs better than the two competitors above.

Finally, the additive transitive regime-switching model is used to predict future values of PM2.5. Figure 7 presents the out-of-sample prediction of PM2.5 in Tehran's air using a transitive regime-switching non-parametric additive regression model with a 10-day lag. The mean squared error of this model is 99.3, and the prediction is satisfactory.
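The transitive models above use lagged values of the response and of a covariate as predictors, and the comparisons rely on the mean squared error. The following minimal Python sketch shows one way to build the lagged covariate vector $x_t = (y_{t-\ell},\ldots,y_{t-L}, z_{t-\ell},\ldots,z_{t-L})$ and to compute the mean squared error. The helper names `lagged_design` and `mse`, the use of a single covariate series `z`, and the toy numbers are assumptions for illustration; they are not taken from the paper or from hhsmm.

```python
import numpy as np

def lagged_design(y, z, ell, L):
    """Build rows x_t = (y_{t-ell},...,y_{t-L}, z_{t-ell},...,z_{t-L}).

    Rows correspond to t = L, ..., n-1 (0-based), so the first L
    observations are used only as lags. Returns (X, target) with target = y_t.
    """
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    if not (L >= ell >= 1) or len(y) != len(z) or len(y) <= L:
        raise ValueError("need L >= ell >= 1 and equal-length series longer than L")
    n = len(y)
    lags = range(ell, L + 1)
    cols = [y[L - k : n - k] for k in lags] + [z[L - k : n - k] for k in lags]
    return np.column_stack(cols), y[L:]

def mse(y_true, y_pred):
    """Mean squared error between observed and predicted values."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(diff ** 2))

# Toy series (made-up numbers, not the Tehran data).
y_toy = np.arange(10.0)          # stands in for the PM2.5 series
z_toy = 10.0 * np.arange(10.0)   # stands in for one covariate series
X_lag, target = lagged_design(y_toy, z_toy, ell=1, L=3)
```

With `ell=1` and `L=3`, the first row of `X_lag` holds the three most recent values of each series preceding the first usable time point. In an out-of-sample comparison, a fitted model would be applied to the rows of a held-out period and `mse` evaluated on the corresponding targets.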

Language:
Persian
Published:
Journal of Mathematics and Society, Volume:8 Issue: 4, 2024
Pages:
1 to 21
https://magiran.com/p2695655  