1. Introduction
The rise of urban pollutants is a major concern due to their harmful effects on respiratory health, as seen in the pollution-related mass fatalities in Belgium (1930), Pennsylvania (1948), and London (1952) (Rodríguez et al., 2011). Furthermore, environmental pollution promotes the development of affective disorders and brain diseases and significantly impacts mental health, owing to the toxicity of specific pollutants to the central nervous system (Ordóñez-Iriarte, 2020).
Among the physical and chemical properties of solid pollutants, particle size plays a fundamental role, directly impacting climate and public health. Myridakis and Stephanou (2025) found a strong correlation between increased mortality and the presence of particles smaller than 10 μm (PM10) and 2.5 μm (PM2.5).
PM10 particles originate from various sources, such as dust, pollen, smoke, and soot, combined with liquid droplets of different substances (Paital and Agrawal, 2022). Their formation can result from both natural processes, such as volcanic eruptions, sandstorms, wildfires, soil erosion, or chemical reactions between gases released into the atmosphere, and anthropogenic activities, including industrial processes and the combustion of fossil fuels (Ukaogo et al., 2020).
Pollution affects the immune system and causes hormonal changes that can lead to immunosuppressive or autoimmune diseases (Huang et al., 2019). At the genetic level, exposure to solid pollutants has been observed to damage DNA and proteins, promoting the development of various types of cancer (Vallabani et al., 2023), as well as to alter proteins involved in the regulation of genotoxins, increasing their harmful effects (Badran et al., 2020).
Recently, Scapini et al. (2023) found a positive relationship between atmospheric pollutant concentrations and the spread of the SARS-CoV-2 virus, with a proportional increase in weekly cases corresponding to PM10 levels. On the other hand, Buoli et al. (2018) reported an increased risk of psychotropic drug prescriptions in children and adolescents for every 10 μg m-3 rise in solid pollutant concentrations. Higher pollution levels raise the risk of emergency room visits, making pollutant modeling and forecasting vital for public health and policy planning.
Additionally, studies using geostatistical techniques such as the nearest neighbor index (NNI) and nearest neighbor hierarchical clustering (NNHC) have linked areas of high population concentration to an increased incidence of breast cancer (Gasca-Sánchez et al., 2021). These conditions contribute to health problems, especially in individuals with pre-existing respiratory or visual diseases. Inhaled particles can trigger both local and systemic effects by activating monocytes even at very low concentrations (ng mL-1). Such activation increases the expression of adhesion molecule receptors, enhancing interactions between monocytes and endothelial cells, potentially exacerbating systemic inflammation (Quintana-Belmares et al., 2018; Leal-Iga, 2019).
Some of the most used models for forecasting atmospheric pollutant concentrations include time series models, neural networks, and multivariate adaptive regression splines (MARS), which combine the flexibility of machine learning methods with the interpretability of traditional statistical models (Alvarado et al., 2010).
In this context, the use of time series is widely employed to model the behavior and prediction of PM10 on a large scale. An example of this is the case of China, where 272 cities were analyzed, and a relationship was found between pollution levels and mortality from non-accidental causes such as cardiopulmonary diseases (Renjie et al., 2025).
Silva et al. (1994) used time series models that incorporate transfer functions with meteorological variables, which account for up to 40% of the mean absolute error in their predictions. With the same kind of models, Alvarado et al. (2010) recorded peak values of 240 μg m-3 in large metropolises such as Santiago de Chile.
Owing to its high population density, the Mexico City Metropolitan Area, where PM10 concentrations average 75 μg m-3 at an hourly frequency (Villaseñor et al., 2000), receives the greatest emphasis in air pollution monitoring. Given its persistently high pollutant levels, this area is considered one of the most polluted urban regions in the world (Cárdenas-Moreno et al., 2021).
In the case of the city of Monterrey, located in the northern state of Nuevo León, records on PM10 pollution have not been as extensive as in Mexico City, where air quality has been systematically recorded for over 40 years (Raga et al., 2001). Records in Monterrey were initiated in the early 2000s (Aguirre-López et al., 2022), and an annual PM10 concentration of 61 to 80 μg m-3 has been reported (Cong et al., 2019). Levels exceeding 60 μg m-3 have been associated with industrial activities, particularly the operation of cement factories and food processing plants in the region. On the other hand, the Mexican Official Standard NOM-025-SSA1-2014, which regulates the permissible air quality limits for PM10 at the national level, establishes maximum values of 75 μg m-3 as a 24-h average and of 40 μg m-3 as an annual average (SSA, 2014).
Likewise, it has been reported that PM10 exhibits a stationary behavior over long periods, while over short periods it shows non-stationary characteristics with a seven-day periodicity, fitting Gamma, Weibull, and log-normal distributions (Cárdenas-Moreno et al., 2021).
The present study aims to model and forecast the behavior of PM10 with a daily frequency in the city of Monterrey, Nuevo León, Mexico, using ARIMA models with exogenous variables. Additionally, a descriptive analysis of PM10 concentrations over the study period is provided, detailing significant variations, general trends, and specific episodes of elevated levels to understand and characterize the behavior of PM10.
2. Materials and methods
2.1 Data
The PM10 data used in this study were obtained from the National Institute of Ecology and Climate Change (INECC, 2019), with hourly information available for the period 2002 to 2014. The selected monitoring station was Obispado (code: CE), located in Monterrey, Nuevo León, with geographic coordinates 25.67598 latitude and -100.3384 longitude.
The climatological variables were extracted on an hourly basis from the POWER project (v. 2.1.17) of NASA’s Langley Research Center (NASA, 2025). These variables include precipitation (PP, mm hr-1), wind speed at 2 m above the ground (WS2M, m s-1), wind speed at 10 m above the ground (WS10M, m s-1), and atmospheric pressure (PS, kPa).
2.2 Data processing
2.2.1 Imputation and removal of outliers
Data processing and analysis were conducted using R (R Core Team, 2023). Outliers were removed using the interquartile range method; the daily mean value of the pollutant was then calculated as the average of the values recorded throughout the day.
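The interquartile-range step can be sketched as follows. This is a minimal pure-Python illustration of the standard IQR rule (the `k = 1.5` fence and the sample values are assumptions for illustration; the study itself used R):

```python
# Sketch of IQR-based outlier removal followed by the daily mean.
# The 1.5*IQR fence is the conventional choice, assumed here.

def iqr_filter(values, k=1.5):
    """Replace values outside [Q1 - k*IQR, Q3 + k*IQR] with None."""
    s = sorted(values)
    n = len(s)
    q1 = s[int(0.25 * (n - 1))]   # simple quartile approximation
    q3 = s[int(0.75 * (n - 1))]
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [v if lo <= v <= hi else None for v in values]

def daily_mean(hourly):
    """Average the non-missing hourly values of one day."""
    kept = [v for v in hourly if v is not None]
    return sum(kept) / len(kept) if kept else None

# Hypothetical hourly PM10 values; 500 is an obvious outlier.
hourly_pm10 = [60.0, 62.0, 58.0, 61.0, 500.0, 59.0, 63.0, 60.0]
filtered = iqr_filter(hourly_pm10)
print(daily_mean(filtered))
```

In practice the quartiles would be computed with an interpolating quantile estimator, but the fence logic is the same.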
Subsequently, missing data were imputed using the Kalman filter method, particularly suitable for time series with strong seasonality. This method utilizes a state-space model and the Kalman filter to estimate missing values, taking into account trends, noise, seasonal patterns, and underlying temporal relationships. Imputation and removal of outliers were performed with the imputeTS R library (Moritz and Bartz-Beielstein, 2017).
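The state-space idea behind Kalman-filter imputation can be illustrated with a deliberately simplified local-level model (a random walk plus observation noise). This is only a sketch of the mechanism: the imputeTS routine fits a richer structural model with trend and seasonal states, and the variances `q` and `r` here are assumed known rather than estimated:

```python
# Minimal local-level Kalman filter for gap imputation.
# State: the latent level x_t, assumed to follow a random walk.
# Missing observations (None) are replaced by the predicted level.

def kalman_impute(y, q=1.0, r=1.0):
    """Filter a series with missing values (None); return an imputed copy.
    q: level (state) variance, r: observation variance -- assumed known."""
    first = next(v for v in y if v is not None)
    x, p = first, 1.0            # initial state and its variance
    out = []
    for obs in y:
        p = p + q                # predict: random-walk level
        if obs is None:
            out.append(x)        # impute with the predicted level
        else:
            k = p / (p + r)      # Kalman gain
            x = x + k * (obs - x)
            p = (1 - k) * p
            out.append(obs)      # observed values are kept as-is
    return out

series = [10.0, 10.5, None, None, 12.0, 11.8]
print(kalman_impute(series))
```

Because the level is a random walk, consecutive gaps are filled with the last filtered level; a model with trend or seasonal states (as used by imputeTS) would interpolate more smoothly.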
2.2.2 Yeo-Johnson transformation
Yeo-Johnson transformations are a generalization of Box-Cox transformations, designed to handle both positive and negative data using a parameter λ, which defines the transformation (Table I). The mathematical expression of the transformation varies depending on the sign of x and the value of λ, dividing into different cases according to these conditions, as shown in Eq. (1).
Table I Estimated λ values in the Yeo-Johnson transformation.
| Variable | λ |
| PM10 | 0.0990325 |
| WS2M | -0.1796091 |
| WS10M | -0.1654505 |
| PS | -2.847771 |
PM10: particles smaller than 10 μm; WS2M (WS10M): wind at 2 (10) m above ground; PS: atmospheric pressure.
where x represents the original variable to be transformed and λ is the transformation parameter. Likewise, inverse transformations (Eq. [2]) can be obtained with the bestNormalize R library (Peterson, 2023).
PM10 and the climatological variables were transformed with the first term of Eq. (1). The corresponding λ values applied to the data are shown in Table I.
Inverse transformations of PM10 and the climatological variables were obtained with the first term of Eq. (2).
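For non-negative data, the "first term" of the Yeo-Johnson transformation and its inverse reduce to a simple pair of formulas. The sketch below re-implements that branch in Python for illustration (the study used the bestNormalize R library); the λ value is the PM10 estimate from Table I and the input is the mean PM10 concentration from Table II:

```python
import math

# Yeo-Johnson transform for x >= 0 (the "first term" of Eq. 1) and its inverse.

def yeo_johnson(x, lam):
    """Forward transform for x >= 0."""
    if lam == 0:
        return math.log(x + 1)
    return ((x + 1) ** lam - 1) / lam

def yeo_johnson_inv(y, lam):
    """Inverse transform for the x >= 0 branch (Eq. 2, first term)."""
    if lam == 0:
        return math.exp(y) - 1
    return (lam * y + 1) ** (1 / lam) - 1

lam_pm10 = 0.0990325            # λ for PM10 (Table I)
x = 66.07                       # mean PM10 concentration (Table II)
y = yeo_johnson(x, lam_pm10)
print(round(yeo_johnson_inv(y, lam_pm10), 4))  # round trip recovers 66.07
```

The round trip confirms that the inverse branch exactly undoes the forward branch, which is what allows model coefficients to be reported back on the original µg m-3 scale.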
2.2.3 Training and testing data
The data were partitioned into training and testing sets using an 80-20% split, resulting in 3798 data points for training and 950 for testing; an average daily profile was computed from the test data for model evaluation. The training set covered the period from 2002 to 2012, while the test set spanned from 2012 to 2014.
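The split itself is a simple index computation; the sketch below reproduces the counts stated above (the total of 4748 daily observations is taken from the text):

```python
# 80-20 chronological split of n daily observations.

def train_test_split(n, train_frac=0.8):
    cut = round(n * train_frac)
    return cut, n - cut

n_total = 3798 + 950            # daily observations, 2002-2014
n_train, n_test = train_test_split(n_total)
print(n_train, n_test)          # 3798 950
```

Note that for time series the split must be chronological (earliest 80% for training), never random, to avoid leaking future information into the fit.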
2.3 ARIMA model
2.3.1 Univariate ARIMA
An ARIMA model is a statistical model used to analyze and predict future values of a time series based on its past values. Its structure is based on the combination of autoregressive (AR), differencing (d, D), and moving average (MA) components, allowing it to capture both trends and seasonal patterns in the data, as shown in Eq. (3).
where Y t represents the value of the series at time t (in this case, the PM10 pollutant). The parameters d and D indicate the regular and seasonal differencing orders, respectively, while p and q represent the orders of the AR and MA terms. P and Q correspond to the orders of the seasonal AR and MA terms. S defines the periodicity of the time series, which in this case is 365, corresponding to the daily frequency, and μ represents the mean of the output series. The coefficients ϕ i and Φ j are associated with the AR terms, in their regular and seasonal versions, respectively, while θ k and Θ m correspond to the coefficients of the regular and seasonal MA terms. Finally, a t is the error term at time t, assumed to have zero mean.
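Written out with the symbols defined above, the multiplicative seasonal ARIMA model takes the standard Box-Jenkins form (a reconstruction under the usual sign conventions, since Eq. 3 is not reproduced in this text):

```latex
\left(1-\sum_{i=1}^{p}\phi_i B^{i}\right)
\left(1-\sum_{j=1}^{P}\Phi_j B^{jS}\right)
(1-B)^{d}\,(1-B^{S})^{D}\,(Y_t-\mu)
=\left(1+\sum_{k=1}^{q}\theta_k B^{k}\right)
\left(1+\sum_{m=1}^{Q}\Theta_m B^{mS}\right) a_t
```

where B denotes the backshift operator, B Y t = Y t-1.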
2.3.2 ARIMA with transfer functions
A general transfer function model can include AR terms for the output series, MA terms, and transfer function terms representing the dynamic relationship between the input and output variables.
The general form of a transfer function model can be expressed as Eq. (4):
where ωk(B) = ωk,0 + ωk,1 B + … + ωk,rk B^rk is the numerator (or gain) polynomial for the exogenous variable Xk,t; δk(B) = 1 − δk,1 B − … − δk,sk B^sk is the denominator (or feedback) polynomial; and B^dk represents a pure delay of dk steps before Xk,t impacts Yt. If the effects of Xk,t were instantaneous and had no memory, then ωk(B) and δk(B) would reduce to constants (order 0) and dk = 0. Together, the quotient ωk(B) B^dk / δk(B) represents the transfer component of each exogenous variable.
Yt can be explicitly written as Eq. (5):
The inverse operator of Eq. (5) can be expanded as an infinite series in B, reflecting the dependence of Y t on its lags (ARIMA) and the lags of the exogenous variables.
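The rational-lag filtering just described can be sketched numerically. The example below applies a first-order transfer component, v_t = δ1·v_{t-1} + ω0·x_{t-d}, to a unit pulse; the polynomial orders and coefficient values are chosen purely for illustration and are not the fitted model of this study:

```python
# Rational-lag (transfer function) filter: contribution of one exogenous
# input X_t to Y_t via v_t = [ω(B) B^d / δ(B)] X_t, with
# ω(B) = ω0 and δ(B) = 1 − δ1·B (first-order, for illustration).

def transfer_output(x, omega0, delta1, d=1):
    """v_t = delta1 * v_{t-1} + omega0 * x_{t-d}; x before t=0 taken as 0."""
    v = []
    prev = 0.0
    for t in range(len(x)):
        x_lag = x[t - d] if t - d >= 0 else 0.0
        prev = delta1 * prev + omega0 * x_lag
        v.append(prev)
    return v

# A unit pulse in X at t = 0 appears in v after d steps, then decays
# geometrically at rate delta1 -- the "memory" of the transfer component.
pulse = [1.0, 0.0, 0.0, 0.0, 0.0]
print(transfer_output(pulse, omega0=2.0, delta1=0.5, d=1))  # [0.0, 2.0, 1.0, 0.5, 0.25]
```

The geometric decay is exactly the infinite-series expansion of the inverse operator mentioned above: 1/(1 − δ1·B) = 1 + δ1·B + δ1²·B² + ….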
On the other hand, in the “static” transfer model or ARIMAX without exogenous lags, each variable X k,t influences Y t only with its contemporary value (at the same instant t). No polynomials in B are included to describe possible delays or attenuations over time, so the effect of each X k,t is reflected instantaneously in Y t , as shown in Eq. (6), based on the work of Box et al. (2015).
where N t is the noise represented by the ARIMA (p, d, q) model, as shown in Eq. (7).
where a t is the white noise of the model, which can finally be expressed as Eq. (8):
2.4 GARCH models
The generalized autoregressive conditional heteroskedasticity (GARCH) model is widely used in time series analysis, especially in finance, to model and predict volatility. It extends the autoregressive conditional heteroskedasticity (ARCH) model by incorporating past conditional variances into the equation. The GARCH model helps capture volatility clustering, a common feature of financial time series in which periods of high volatility tend to cluster together, as do calm periods. The volatility is modeled through the conditional variance of the mean (μ) process (which can itself be modeled by an ARIMA model) and random shocks (Eq. [9]).
where σ t ² is the conditional variance at time t; α 0 > 0, α i ≥ 0, and β j ≥ 0 are parameters to be estimated; p is the number of GARCH terms (effect of past variances) and q the number of ARCH terms (effect of past squared residuals). GARCH models are useful for modeling the residuals of an ARIMA model.
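The GARCH(1,1) special case of Eq. (9) is a one-line recursion, σ²_t = α0 + α1·a²_{t-1} + β1·σ²_{t-1}, sketched below on a hypothetical residual series (the parameter values are illustrative, not estimates from this study):

```python
# GARCH(1,1) conditional-variance recursion from Eq. (9).
# Requires alpha1 + beta1 < 1 for a finite unconditional variance.

def garch_variance(residuals, alpha0, alpha1, beta1):
    """Return the conditional variance path σ²_t for t = 1..n."""
    # start the recursion at the unconditional variance α0 / (1 − α1 − β1)
    sigma2 = [alpha0 / (1 - alpha1 - beta1)]
    for a in residuals[:-1]:
        sigma2.append(alpha0 + alpha1 * a ** 2 + beta1 * sigma2[-1])
    return sigma2

resid = [0.1, -2.0, 0.3, 0.2, -0.1]          # a large shock at t = 2
path = garch_variance(resid, alpha0=0.1, alpha1=0.2, beta1=0.7)
print([round(v, 4) for v in path])           # variance jumps after the shock
```

The variance spikes immediately after the large residual and then decays at rate α1 + β1, which is the volatility-clustering behavior described above.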
2.5 Metrics for model selection and evaluation of prediction accuracy
The metrics used in time series analysis can be divided into two categories based on the phase in which they are applied: training and testing. During the training phase, criteria such as the Akaike information criterion (AIC), its corrected version (AICc), and the Bayesian information criterion (BIC) are used to assess the model’s fit to the observed data, penalizing complexity to avoid overfitting (Hyndman and Athanasopoulos, 2018).
On the other hand, during the testing phase, metrics are used to measure the error between the model’s predictions and the test values (which were not used to fit the model). These include the mean absolute error (MAE) (Cao, 2024) and the mean squared error (MSE), along with its square root (RMSE) (Gladkova and Saychenko, 2022). These metrics help evaluate the model’s ability to generate accurate predictions on unseen data, ensuring its generalization and usefulness in future scenarios.
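These test-phase metrics follow directly from their definitions; a minimal sketch (the observation/prediction values are illustrative, not from the study):

```python
import math

# MAE, MSE, and RMSE computed from their textbook definitions.

def mae(y, yhat):
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    return math.sqrt(mse(y, yhat))

obs  = [66.0, 70.0, 64.0, 75.0]   # hypothetical observed PM10 values
pred = [65.0, 72.0, 63.0, 71.0]   # hypothetical model predictions
print(mae(obs, pred), mse(obs, pred), round(rmse(obs, pred), 4))
```

MAE weights all errors equally, while MSE (and hence RMSE) squares them, so a single large miss dominates MSE much more than MAE.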
Granger causality analysis is an econometric method based on the premise that if one variable contributes to the prediction of another, its past values must contain relevant information that improves the estimation of the second variable beyond what its past values can provide. This approach has been used in previous studies to analyze the relationship between pollutants and various economic variables (Xuan, 2024). In the context of this study, it will be applied to assess whether there is a causal relationship between meteorological variables and PM10 concentration, determining to what extent meteorological conditions can influence the variability of this pollutant.
3. Results and discussion
3.1 Descriptive statistics of the time series
The variables with the highest dispersion are precipitation (PP) and PM10 (Fig. 1). This is evident in the descriptive statistics (Table II), which reveal a considerable difference between minimum and maximum values. Precipitation shows a positive skewness coefficient, indicating a non-normal distribution, and its kurtosis is also high and positive, particularly pronounced at the daily frequency used in this study. This contrasts with analyses based on annually averaged data, where variability tends to decrease (Villarreal-Macés and Díaz-Viera, 2018).

Fig. 1 Graphs of time series for the following variables: (a) precipitation, (b) wind speed at 2 m above the ground, (c) wind speed at 10 m above the ground, (d) atmospheric pressure, and (e) particles smaller than 10 μm.
Table II Descriptive statistics of climatology variables and PM10.
| Statistics | PP | WS2M | WS10M | PS | PM10 |
| Minimum | 0.00 | 0.65 | 1.11 | 82.73 | 9.76 |
| Maximum | 5.92 | 7.60 | 10.87 | 85.29 | 141.60 |
| Quartile 1 | 0.00 | 1.92 | 2.94 | 83.79 | 49.79 |
| Quartile 3 | 0.04 | 2.89 | 4.19 | 84.16 | 81.78 |
| Mean | 0.06 | 2.44 | 3.62 | 83.98 | 66.07 |
| Median | 0.00 | 2.38 | 3.53 | 83.98 | 65.54 |
| Variance | 0.0347 | 0.4975 | 0.9173 | 0.0853 | 501.7 |
| Standard Deviation | 0.186 | 0.705 | 0.958 | 0.292 | 22.4 |
| Skewness | 11.47 | 0.84 | 0.98 | 0.047 | 0.184 |
| Kurtosis | 258.43 | 2.25 | 3.12 | 0.46 | 0.404 |
PP: precipitation; WS2M (WS10M): wind at 2 (10) m above ground; PS: atmospheric pressure; PM10: particles smaller than 10 μm.
The variance of PM10 is moderate to high, a feature that may be masked in analyses based on only a single year of records. These results highlight the importance of considering both the frequency and the period of observation when analyzing dispersion and variability in time series (Contreras-Arreola and González, 1999).
In turn, the variables WS2M and WS10M exhibit relatively close values, with WS10M consistently higher, as expected due to the well-known increase of horizontal wind speed with altitude within the atmospheric surface layer. Finally, atmospheric pressure (PS) remains relatively constant throughout the period analyzed, exhibiting the lowest skewness and kurtosis values, which suggests an approximately normal distribution (Table II). This normality assumption is statistically relevant because it allows the use of parametric statistical methods that rely on normality, such as linear regression, correlation analyses, and hypothesis testing procedures, without requiring data transformation. Additionally, this characteristic simplifies modeling and forecasting procedures, improving the reliability and interpretability of results derived from standard parametric techniques.
3.2 Decomposition of the PM10 series
Before any transformation is applied to the data, the time series is decomposed into its components using the multiplicative method, which captures the interaction between trend, seasonality, and random variability more precisely, improving interpretation.
Figure 2a shows a significant decrease in PM10 levels, which becomes more pronounced toward the end of 2012. However, abrupt increases are identified during 2002-2004, 2006-2008, and 2011-2012. Figure 2b also reveals a well-defined seasonal pattern in the PM10 series, indicating recurring fluctuations throughout the year, likely associated with changes in meteorological conditions and other environmental factors. The random component in Figure 2c exhibits relatively constant variability, except for certain outliers, mainly in 2013 and 2014.
3.3 Breakpoints in the PM10 time series
Breakpoint analysis in time series is a technique that allows for detecting structural changes based on the lowest values of the Bayesian information criterion (BIC) and the residual sum of squares (RSS) (Bai and Perron, 1998, 2003; Zeileis et al., 2003).
Based on Figure 3, it was determined that the model with four structural breakpoints is appropriate. Adding a fifth breakpoint increases complexity without substantially improving model fit, making it unnecessary according to the principle of parsimony (Table III).

Fig. 3 Bayesian information criterion (BIC) and residual sum of squares (RSS) values for the time series of PM10.
Table III Model fit for different numbers of breakpoints.
| Number of breakpoints | 0 | 1 | 2 | 3 | 4 | 5 |
| RSS | 2419141 | 2092984 | 2070101 | 2055377 | 2048705 | 2050614 |
| BIC | 40200 | 39581 | 39550 | 39535 | 39537 | 39558 |
RSS: residual sum of squares; BIC: Bayesian information criterion.
Figures in bold correspond to the lowest RSS and BIC values.
Based on Table IV, breakpoints were identified in 2004, 2006, 2009, and 2012. Moderate variance between segments is observed at the initial breakpoints (Fig. 4); however, variance reaches its highest level in 2009 compared to other segments. Subsequently, there is a considerable decrease in variance at the last breakpoint. This suggests a significant reduction in pollutant concentrations starting from 2006, possibly linked to changes in environmental policies or specific meteorological conditions. Nevertheless, when analyzing the last segment, an increase in pollutant concentration becomes evident. Further data are required to determine whether this upward trend will persist or if atmospheric concentrations will decrease again.
Table IV Breakpoints based on RSS and BIC.
| m | Break years | | | | |
| 1 | 2011 | | | | |
| 2 | 2006 | 2012 | | | |
| 3 | 2004 | 2006 | 2012 | | |
| 4 | 2004 | 2006 | 2009 | 2012 | |
| 5 | 2004 | 2006 | 2008 | 2009 | 2012 |
m: number of breakpoints; BIC: Bayesian information criterion; RSS: residual sum of squares.
Figures in bold represent the number of breakpoints in the time series selected as the best.
3.4 Transformation of time series
Based on the skewness and kurtosis statistics presented in Table II and the Jarque-Bera normality test results, which yielded significant p-values, the null hypothesis of normality is rejected. This indicates that the meteorological variables and PM10 pollutant variables do not follow a normal distribution.
In this context, series transformations were performed (Fig. 5) to reduce their variance, following the algorithm’s recommendations for ARIMA model selection. Although these transformations did not fully achieve normality, they significantly reduced the variance, which is essential for stabilizing the time series and facilitating model fitting, as stated by Hyndman and Athanasopoulos (2018).

Fig. 5 Transformed time series for (a) precipitation, (b) wind at 2 m above ground, (c) wind at 10 m above ground, (d) atmospheric pressure, and (e) PM10.
The autocorrelation (ACF) and partial autocorrelation (PACF) analysis of the PM10 pollutant reveals specific characteristics of the time series. In the ACF (Fig. 6a), a gradual decay is observed, along with significant peaks reflecting the influence of seasonality and persistence over time. This behavior suggests a prolonged correlation between past and future values of the pollutant.
On the other hand, the PACF (Fig. 6b) shows a more limited behavior, with only a few significant initial lags, indicating that direct effects from specific lags rapidly diminish. However, certain persistent lags reflect strong seasonal patterns, consistent with the observations from the ACF.
3.5 Granger causality tests
Significant F-values were obtained for all comparisons conducted (Table V), suggesting that meteorological conditions have a significant effect on the dynamics of PM10. This reinforces the importance of including these variables in the model, consistent with Granger’s (1969) hypothesis, which states:
H 0 : The lags of x do not provide significant information for predicting y.
H 1 : The past values of x contain useful information for predicting y.
Table V Results of Granger causality tests.
| Comparison | F-value | p-value |
| PM10 vs. PP | 144 | < 2.2e-16 |
| PM10 vs. WS2M | 242.08 | < 2.2e-16 |
| PM10 vs. WS10M | 169.14 | < 2.2e-16 |
| PM10 vs. PS | 371.59 | < 2.2e-16 |
PP: precipitation; WS2M (WS10M): wind at 2 (10) m above the ground; PS: atmospheric pressure; PM10: particles smaller than 10 μm.
Figures in bold indicate significance.
3.6 ARIMA models
3.6.1 Univariate model
The ARIMA (1,1,2)(0,0,0)[365] model fitted to the training data for the PM10 variable was evaluated based on the significance of its coefficients, which were highly significant (Table VI), indicating that the model effectively captures the dynamics of the PM10 time series.
Table VI Coefficient test for the univariate ARIMA (1,1,2)(0,0,0)[365] model.
| Coefficient | Estimate | Standard error | z value | Pr (>|z|) |
| ϕ 1 | 0.278737 | 0.035123 | 7.9359 | 2.089e-15 |
| θ 1 | 0.719585 | 0.034952 | 20.5878 | < 2.2e-16 |
| θ 2 | 0.030916 | 7.7386 | 1.005e-14 |
Figures in bold indicate significance.
The non-seasonal component of the model included an autoregressive term, indicating that the pollutant levels on any given day are directly influenced by the previous day’s values. Differencing was necessary to stabilize the series and ensure stationarity. Additionally, the model included two moving average terms, reflecting the influence of past errors on current predictions.
The fitted ARIMA (1,1,2)(0,0,0)[365] model (Eq. [10]) satisfied the criteria of residual independence, as assessed by the Ljung-Box test, and stationarity, verified through the augmented Dickey-Fuller (ADF) test (Table VII).
Table VII Selection of ARIMA models and accuracy metrics for PM10.
| Accuracy metrics | |
| AIC | 9044.21 |
| AICc | 9044.22 |
| BIC | 9069.18 |
| Ljung-box (p-value) | 0.9658 |
| Augmented Dickey-Fuller (p-value) | < 0.01 |
| RMSE | 1.1328 |
| MSE | 1.2831 |
| MAE | 0.9208 |
| MAPE | 611.64 |
AIC: Akaike information criterion; AICc: Akaike information criterion corrected version; BIC: Bayesian information criterion; RMSE: root mean squared error; MSE: mean squared error; MAE: mean absolute error; MAPE: mean absolute percentage error.
Furthermore, when comparing the MAE obtained in this study with the results reported by Hernández-Vega et al. (2021), the univariate ARIMA model exhibits a markedly lower MAE (1.8664 µg m-3) than the one obtained using the evolutionary algorithm (12.5 µg m-3). This difference might be attributed to the design of the latter study, which covered only seven months of data at an hourly frequency and used a nearby, though not identical, monitoring station.
It is worth noting that an evolutionary algorithm is an optimization method inspired by the principles of natural selection and biological evolution, in which a population of candidate solutions evolves through multiple iterations using operators such as mutation, crossover, and selection, to progressively find optimal or near-optimal solutions for a given problem.
However, the mean absolute percentage error (MAPE) of the ARIMA model is considerably higher due to noise that is not fully explained by the model, distorting the calculations by inflating errors. This results in a higher percentage error compared to other metrics, reflecting unexplained variability (noise) within the series.
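This inflation is easy to reproduce: when actual values are close to zero (as can happen on the transformed scale), the percentage error explodes even though the absolute error stays small. A toy illustration with synthetic numbers (not data from this study):

```python
# Why MAPE can be huge while MAE stays small: near-zero actuals
# inflate the percentage error. Synthetic values, for illustration only.

def mape(y, yhat):
    return 100 * sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y)

def mae(y, yhat):
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

actual    = [0.05, 1.0, 1.2, 0.9]   # one value close to zero
predicted = [0.55, 1.1, 1.1, 1.0]   # every absolute error is modest

print(round(mae(actual, predicted), 3))   # small absolute error
print(round(mape(actual, predicted), 1))  # percentage error in the hundreds
```

A single near-zero actual contributes a 1000% term to the average, which is why MAPE is a poor headline metric for transformed or zero-crossing series.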
3.6.2 Residual analysis of univariate models
The residuals of the ARIMA models exhibited stationarity and independence, indicating that their mean and variance remain constant over time, and no evidence of significant autocorrelation between lags was found. This behavior further supports the overall validity of the model.
In the ACF plots (Fig. 7), showing residual correlations up to lag 40, most values fall within the confidence limits. This indicates that the fitted model accurately captures the dynamics of the time series and does not leave significant autocorrelations unaccounted for. However, two significant peaks are identified at lags 7 and 21, which could be associated with sporadic events.

Fig. 7 Autocorrelation function (ACF) and partial autocorrelation function (PACF) of residuals from the ARIMA (1,1,2)(0,0,0)[365] model.
The skewness observed in the residual histogram reflects a moderate asymmetry, while the kurtosis suggests a slight deviation from normality. Despite these deviations, residuals predominantly remain within acceptable statistical limits, supporting the validity of the model fit for the PM10 time series. Additionally, the PACF plot (Fig. 7) illustrates correlations between a value and its specific lag while controlling for intermediate lag effects. In our case, it indicates that most correlations remain within confidence bounds for residuals, which implies that the fitted model adequately captures the dynamics of the PM10 pollutant time series.
3.7 Transfer function ARIMA model for PM10
A general transfer function model incorporates AR and MA terms to capture the internal dynamics of the output series, as well as transfer terms that represent the dynamic relationship between input variables and the output variable. In this case, the fitted model for the PM10 series is an ARIMA (1,1,1)(0,0,0)[365], which includes transfer function terms for the meteorological variables (precipitation, wind speed at 2 and 10 m above ground, and atmospheric pressure) (Eq. [11]).
The estimated coefficients for the transfer function model were significant (Table VIII), indicating that each term significantly contributes to the model’s fit, similar to the study by Analitis et al. (2020), where meteorological variables were used to analyze their relationship with particulate matter.
Table VIII Estimated coefficients of the ARIMA PM10 model with transfer function terms.
| Coefficient | Estimate | Standard error | p-values |
| ϕ 1 | 0.4367 | 0.0185 | < 2.2e-16 |
| θ 1 | 0.9662 | 0.0069 | < 2.2e-16 |
| ω 0 | 0.0726 | 0.015 | 1.402e-6 |
| ω 1 | 1.471 | 0.0973 | < 2.2e-16 |
| ω 2 | 1.4729 | 0.0948 | < 2.2e-16 |
| ω 3 | 0.2699 | 0.0169 | < 2.2e-16 |
ω 0 : precipitation; ω 1 : wind speed at 2 m above ground; ω 2 : wind speed at 10 m above ground; ω 3 : atmospheric pressure.
Figures in bold indicate significance.
The autoregressive coefficient (ϕ 1 ) represents the dependence of Y t on its own lagged value Y t-1 ; a positive value indicates that if the change in the previous period was large and/or positive, the current period’s change tends to follow the same pattern. A θ 1 value close to one (the coefficient on a t-1 ) suggests that the error from the previous period strongly influences the current one, representing a “rebound” effect: a large error in one period tends to be compensated in the following one, which is common in phenomena with rapid fluctuations. In this case, it indicates that exogenous variables cause abrupt adjustments in the model’s response to external shocks.
The model coefficients were transformed back to the original scale (Eqs. [12] to [15]). The suffix “tr” indicates that the coefficient is estimated on the original scale of the PM10 data, measured in µg m-3.
All the estimates for the effects of the variables are negative, suggesting that an increase in the values of these variables is associated with a reduction in PM10 concentrations. It is estimated that each 1-unit increase in precipitation reduces, on average, the concentration of PM10 by 0.064 µg m-3, keeping all other variables constant.
A stronger wind near the surface (2 m, ω 1 tr ) favors the dispersion of pollutants, reducing the concentration of PM10: each 1-unit increase in wind speed at 2 m is associated with a decrease in PM10 of 0.729 µg m-3.
For wind at 10 m (ω 2 tr ), a 1-unit increase in wind speed at this level correlates with a decrease of 0.938 µg m-3 in PM10, which could reflect a more complex dynamic of vertical transport or advection, similar to what was reported by Contreras-Arreola and González (1999), who showed that wind direction and persistence influence PM10 concentrations, with more pronounced seasonal differences in summer than in winter.
Finally, the effect of atmospheric pressure (ω 3 tr ) indicates that for each 1-unit increase in atmospheric pressure, the concentration of PM10 decreases by 0.18 µg m-3, which may be related to meteorological conditions that promote the dispersion or removal of particles from the atmosphere.
A detailed analysis of the model residuals is presented in Figure 8. The residuals do not exhibit clear patterns and remain randomly distributed, indicating that the model adequately captures the main dynamics of the series. The residual histogram suggests a degree of normality, while the absence of autocorrelation in the ACF plots confirms the independence of the errors, thereby validating the fitted model.
This analysis demonstrates that the fitted ARIMA model is robust and suitable for capturing temporal dynamics and interactions with meteorological variables (Cao, 2024), providing an effective tool for the analysis and prediction of PM10.
The transfer function model (Eq. [11]) produced results in terms of confidence intervals comparable to those reported by Alvarado et al. (2010), as they capture a significant portion of the actual values, as shown in Figure 9. Although extreme events are not adequately captured, the model successfully describes the average behavior of the data.

Fig. 9 Prediction on the test set with confidence intervals for PM10 for the average data of the test portion.
A key advantage of this approach is its simplicity compared to complex methods like MARS, which segment data and fit polynomials to model nonlinear relationships. The performance metrics show that the MSE was 1.0259 µg m-3, reflecting moderate variability in the model’s errors, while the RMSE indicates that the predictions deviate, on average, 1.0129 µg m-3 from the actual values. The MAE, which measures the average absolute errors without being influenced by extreme values, is estimated at 0.8083 µg m-3. This metric provides a more stable and reliable interpretation of the model’s average error magnitude, making it particularly useful in contexts where extreme values do not dominate the analysis (Table IX).
Table IX Performance metrics of the transfer function model.
| Metric | Value |
| AIC | 8320.02 |
| AICc | 8320.05 |
| BIC | 8363.71 |
| Ljung-Box (p-value) | 0.957 |
| Augmented Dickey-Fuller (p-value) | > 0.01 |
| RMSE | 1.0129 |
| MSE | 1.0259 |
| MAE | 0.8083 |
| MAPE | 992.54 |
AIC: Akaike information criterion; AICc: Akaike information criterion corrected version; BIC: Bayesian information criterion; RMSE: root mean squared error; MSE: mean squared error; MAE: mean absolute error; MAPE: mean absolute percentage error.
The model is suitable for capturing general trends and making average projections, but it requires additional adjustments to improve its performance in predicting extreme events, especially in contexts where these peaks have a significant impact (Fig. 10).
3.8 Outliers in the residuals of the models
The detection of outliers is based on the decomposition of the time series into trend, seasonal, and residual components, following Hyndman and Athanasopoulos (2021). The residuals of the univariate model and of the transfer function model show a low proportion of outliers, 0.89% and 0.5%, respectively. However, the transfer function model exhibits fewer additive outliers (AO) and temporary changes (TC), whereas the univariate model captures more level shifts (LS) (Table X). This suggests that the meteorological variables in the transfer function model influence the propagation of changes over time, while the univariate model better explains the individual behavior of the PM10 pollutant (Fig. 11).
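An outlier screen on decomposition residuals can be sketched with the usual interquartile-range rule suggested in Hyndman and Athanasopoulos (2021). The residual values and the 3×IQR threshold below are illustrative assumptions, not the study's series:

```python
import statistics

def flag_outliers(residuals, k=3.0):
    """Flag indices whose residual falls outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(residuals, n=4, method="inclusive")
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [i for i, r in enumerate(residuals) if r < lo or r > hi]

# Hypothetical decomposition residuals with one injected spike at index 9
resid = [0.1, -0.2, 0.05, 0.3, -0.1, 0.2, -0.3, 0.15, -0.05, 8.0]
print(flag_outliers(resid))  # [9]
```

Classifying the flagged points into AO, LS, and TC types, as in Table X, additionally requires fitting each outlier pattern against the ARIMA structure (e.g., with R's `tsoutliers` package).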
Table X Types of outliers by model.
| Model | AO | LS | TC |
| Univariate | 17 | 10 | 7 |
| Transfer | 14 | 0 | 5 |
AO: additive outlier; LS: level shift; TC: temporary change.
3.9 GARCH models
Two GARCH models were applied to the residuals of the ARIMA models: one univariate and one with transfer functions. In both cases, the skewed generalized error distribution (sGED) was used. The two models share similar characteristics in terms of asymmetry (0.9399 and 0.8640, respectively) and kurtosis (1.7341 and 1.7108, respectively), which are typical properties of the sGED distribution.
For the conditional variance of the residuals from the univariate ARIMA model, a GARCH (2,2) process was applied (Fig. 12). This means that two lags were used for both the conditional variance and the shock term, providing a more detailed fit to the fluctuations in the residuals’ volatility, as described in Eq. (16).
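The GARCH(p,q) recursion behind Eq. (16) can be sketched directly: the conditional variance is an intercept plus weighted past squared shocks (α terms) and past variances (β terms). The α and β values below are the estimates quoted later in this section; the intercept ω = 0.028 is a hypothetical value chosen so that the implied unconditional variance equals 1:

```python
def garch_variance(shocks_sq, omega, alphas, betas, sigma2_0):
    """Conditional variance recursion:
    sigma2_t = omega + sum_i alpha_i*eps2_{t-i} + sum_j beta_j*sigma2_{t-j}
    """
    m = max(len(alphas), len(betas))
    sigma2 = [sigma2_0] * m                      # initial values for first lags
    eps2 = [sigma2_0] * m + list(shocks_sq)
    for t in range(m, len(eps2)):
        arch = sum(a * eps2[t - i - 1] for i, a in enumerate(alphas))
        garch = sum(b * sigma2[t - j - 1] for j, b in enumerate(betas))
        sigma2.append(omega + arch + garch)
    return sigma2

# Estimates quoted in the text (alpha1, alpha2, beta1, beta2); omega is assumed
alphas, betas = (0.037, 0.045), (0.20, 0.69)
sig = garch_variance([1.0] * 50, omega=0.028, alphas=alphas, betas=betas,
                     sigma2_0=1.0)
print(sig[-1])  # stays at the fixed point: omega / (1 - 0.972) = 1.0
```

Feeding the recursion constant unit shocks shows the fixed point ω/(1 − α₁ − α₂ − β₁ − β₂), i.e., the unconditional variance implied by the persistence discussed below.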

Fig. 12 GARCH (2,2): (a) series with two conditional standard deviations overlaid; (b) autocorrelation function of standardized residuals; (c) autocorrelation function of squared standardized residuals, and (d) empirical density of standardized residuals.
For the transfer function model, the conditional variance follows a GARCH (1,2) process (Fig. 13). This means that one lag was used for the conditional variance and two lags for the shock terms, allowing for an efficient capture of fluctuations in the residuals’ volatility, as described in equation (17).

Fig. 13 GARCH (1,2): (a) series with two conditional standard deviations overlaid; (b) autocorrelation function of standardized residuals; (c) autocorrelation function of squared standardized residuals, and (d) empirical density of standardized residuals.
The estimated parameters of the GARCH (2,2) model are α₁ ≈ 0.037, α₂ ≈ 0.045, and β₁ + β₂ ≈ 0.20 + 0.69 = 0.89, which gives a total sum of α₁ + α₂ + β₁ + β₂ ≈ 0.97. This value, close to 1, indicates very high persistence in volatility, suggesting that the effects of past shocks on the conditional variance diminish slowly over time.
Similarly, the GARCH (1,2) model associated with the transfer function model yields α₁ ≈ 0.058 and β₁ + β₂ ≈ 0.44 + 0.47 = 0.91, for a total persistence of α₁ + β₁ + β₂ ≈ 0.97. This high persistence implies that future volatility strongly depends on past volatility, maintaining its effect over extended periods.
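One standard way to read this persistence (a conventional GARCH interpretation, not a figure reported in the study) is the volatility half-life, the number of periods it takes a variance shock to decay to half its size, given by ln(0.5)/ln(α + β):

```python
import math

def volatility_half_life(persistence):
    """Periods until a variance shock decays to half its size (alpha+beta < 1)."""
    return math.log(0.5) / math.log(persistence)

# Both models report persistence near 0.97, implying roughly a three-week decay
print(round(volatility_half_life(0.97), 1))  # 22.8
```

A persistence of 0.97 thus implies that a volatility shock takes about 23 days to lose half its effect, consistent with the slow decay described above.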
The joint interpretation of both models indicates high persistence in volatility over time, with past volatilities (β) exerting a stronger influence than recent shocks (α). This reflects a regime in which episodes of high or low volatility tend to persist, so changes in conditional variance are smooth and sustained rather than abrupt responses to new shocks. Such persistent fluctuations in volatility may be driven by extreme environmental events, as demonstrated by Alexis et al. (2022), such as sudden storms or abrupt shifts in wind patterns, which can affect PM10 concentrations.
Another significant difference between the two models is that the residuals of the univariate ARIMA model depend more on the volatility associated with the magnitude of squared errors from two periods ago, reflecting a longer memory regarding past shocks. In contrast, the residuals of the transfer function model exhibit a relatively shorter memory concerning the impact of past shocks, indicating that the effects of recent errors dissipate more quickly. This behavior can be explained by the fact that, in the transfer function model, the mean effect has been captured more efficiently through meteorological variables, which better reflect the deterministic and exogenous dynamics of the system. As a result, the number of unexplained residuals is reduced, allowing for a more precise fit and decreasing the need for a complex conditional variance model.
By contrast, in the GARCH(2,2) model associated with the univariate ARIMA, a higher proportion of the residuals remains unexplained in the mean. This necessitates a more flexible conditional variance model to capture the remaining fluctuations, and that flexibility is reflected in the GARCH(2,2) structure, which represents prolonged fluctuations and the effects of past shocks on volatility more effectively. In both cases, the standardized residuals of the combined ARIMA + transfer + GARCH models are approximately normal (Figs. 12 and 13, respectively).
The GARCH component captures conditional heteroscedasticity, periods of high or low volatility, associated with abrupt changes in external variables, modeling how the residual variance evolves over time in response to climatic or environmental phenomena. By adding transfer functions to the ARIMA + GARCH framework, exogenous variables are explicitly included in the equation, which not only explains PM10 dynamics based on its own history but also directly links volatility to factors such as precipitation or wind speed, attributing a portion of the observed volatility to these external drivers. In contrast, a purely ARFIMA + GARCH (or SARFIMA + GARCH) model captures long-memory behavior and fractional seasonality in the PM10 series, modeling volatility through GARCH (Reisen et al., 2014); however, without exogenous variables, it cannot distinguish how much of the variability in the mean or volatility is caused by specific external events like a precipitation spike or a strong wind episode.
4. Conclusion
The ARIMA (1,1,1)(0,0,0)[365] model is more effective in explaining the behavior of the PM10 variable than in forecasting it. Additionally, the selected meteorological variables have a significant impact on PM10 levels, confirming their influence on the dynamics of pollutant concentrations. While all meteorological variables included in the model were statistically significant, precipitation stood out due to its pronounced variability, reflected in the recurrent episodes of intense rainfall and extended droughts characteristic of this region. Consequently, changes in precipitation behavior notably influence PM10 concentrations, highlighting its critical role relative to the other statistically significant meteorological variables. The pollutant data show moderate noise, with a few outliers that do not significantly affect the analysis. Finally, the high persistence found in the GARCH models suggests that changes in volatility are driven more by past trends than by recent events, indicating that pollutant dynamics are shaped by structural factors and long-term patterns, which reinforces the stability and predictability of the time series.
It is recommended to use the model for short-term forecasts with a maximum horizon of seven days, as no significant peaks in PM10 concentrations are observed within this period. However, caution should be exercised when interpreting the results, considering that the use of confidence intervals allows for defining a more precise range in which PM10 concentrations may fluctuate, providing a more robust assessment of the forecasts.
As a proposal for future work, it would be beneficial to increase the number of monitoring stations that record additional meteorological variables, such as temperature, relative humidity, and traffic flow, along with information on industrial activities. Expanding the set of variables would allow for a more precise validation of the impact of anthropogenic activities on PM10 concentrations.
Additionally, it is recommended to use more detailed monitoring frequencies and to address outlier treatment more rigorously at the hourly frequency. This approach could enhance the accuracy and reliability of predictive models.


















