1 Introduction
Principal Components Analysis (PCA) and Factor Analysis (FA) have been the classic techniques used for extracting the underlying systematic risk factors of the generative multifactor model of returns in the statistical approach to the Arbitrage Pricing Theory (APT). Both techniques make a strong assumption about the multivariate Gaussianity of the observed variables; however, real-life data sets, especially financial time series, are normally distributed neither in the univariate nor in the multivariate sense, and this causes the application of PCA or FA to yield unreliable results.
A solution to this problem is to extract the components by means of Independent Component Analysis (ICA), which is capable of extracting statistically independent components from a set of non-Gaussian data. In addition, the underlying risk factors extracted by ICA represent better estimations than those extracted by PCA and FA, because the former are statistically independent, whereas the latter are only linearly uncorrelated.
Nevertheless, both the PCA-FA and the ICA make another strong assumption: the linearity of the model. In the present research we use a novel extraction technique which deals with the nonlinearity problem: the Nonlinear Principal Components Analysis (NLPCA). This technique has been used in many fields of science as a dimensionality reduction or as a feature extraction technique1.
For example, in [25] the authors use NLPCA to detect nonlinearities, extract features and classify spectral data from a set of stars, showing that the nonlinear principal components perform better than those of a standard PCA. They also apply it in the field of physiology, analyzing data from electromyographic recordings of muscle activity, and obtain similar results. In biochemistry and bioinformatics, the authors of [26,27,21,23] apply NLPCA to analyze molecular data from the metabolite levels of a plant and from the reproductive cycle of a parasite.
Their findings demonstrate that the nonlinear components extracted by a NLPCA are more suitable for interpreting this kind of large multidimensional biological data as well.
Other fields with an extensive list of studies include oceanography and atmospheric sciences, for extracting features from different atmospheric phenomena; chemical and industrial engineering, for detecting faults in nonlinear industrial and chemical separation processes; psychology, for dealing with nonlinear relationships applied to categorical data; and robotics, for characterizing humanoid motion and for transferring human skills to robots.
In the field of finance, the application of NLPCA has been little developed. In [6] the authors used a NLPCA to determine the nonlinear principal components driving the variations of the implied volatility smile derived from FTSE-100 stock index options; in [18] a NLPCA is employed for bankruptcy prediction in banks, and in [30] it is used to analyze and predict the trend of withdrawals from an employment time guarantee fund.
On the other hand, some works have used related techniques to extract nonlinear components in the field of finance, e.g., [8] and [14], where the authors employ a Kernel PCA (KPCA) and a Curvilinear Component Analysis (CLCA), respectively, to reduce the dimension of a set of technical analysis indicators that they use for predicting stock prices and a market index. In addition, in [28] the authors used a KPCA to extract features from a set of stock prices, also for predictive purposes.
Applications in other related areas such as economics and business are limited too. In [13] the authors used NLPCA to evaluate the nonlinear relationship between budget rules and fiscal performance, and in [5] it is used as a dimensionality reduction technique to measure the perception of consumers about the quality of services.
To the best of our knowledge, there is no reference using NLPCA to extract the underlying systematic risk factors affecting the returns on equities in stock markets, nor any study applying NLPCA to Mexico; consequently, the main objective of this research is to fill this gap in the financial literature2.
The structure of this paper is as follows: Section 2 presents a brief review of the NLPCA, Section 3 explains the empirical study and Section 4 draws the conclusions. Finally, the last section presents the references consulted in this research.
2 Nonlinear Principal Component Analysis
The objective of a NLPCA is to extract nonlinear components from a data set.
The NLPCA is a nonlinear generalization of the standard PCA in which the estimated principal components are generalized from straight lines to curves, making it capable of handling and discovering nonlinear relationships among variables and between components and variables. In other words, the subspace produced in the original data space is curved.
On the other hand, continuous financial time series, such as returns on equities, might present a nonlinear behavior, implying that possibly they might be better explained by curved lines rather than straight lines.
The relationship between the underlying systematic risk factors and the returns on equities might be nonlinear, too; thus, it could be better explained by a nonlinear model as well3.
2.1 Auto-Associative Neural Network
In this study we will focus on one approach to perform NLPCA based on artificial neural networks (ANN) 4.
This approach, known as Neural Networks Principal Component Analysis (NNPCA) or Principal Component Neural Networks (PCNN), is commonly performed via an auto-associative neural network architecture named autoencoder, replicator network, bottleneck or sandglass type network5. This neural network (NN) is a multilayer perceptron6, where the output layer of the network is required to be identical to the input layer (identity mapping) by minimizing the square error:
In the middle of the network there is a layer (bottleneck) where the reduction of dimension is done and represents the values of the principal components or scores. Figure 1 shows a diagram of this kind of NN.
The first part of the process is the extraction of the principal components (third layer) from the original data (first layer). The NN estimates a first matrix of weights to generate the second hidden layer, which will represent a previous layer before the one of nonlinear principal components (NLPC); then, the NN estimates a second matrix of weights, which will generate the third layer or principal components (Z).
The second part of the process is the reconstruction of the variables from the NLPC. The NN computes a third matrix of weights to produce a fourth hidden layer as a previous step to the reconstructed variables; this layer is then used, together with the fourth matrix of weights, to reproduce the original variables. In fact, the second and fourth hidden layers are the ones that perform the nonlinear mapping.
The formal expressions of the extraction and generation functions are:
Extraction function:
Generation function:
where z represents the scores or principal components; W 1 and W 2, the matrices of weights in the extraction process;
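As an illustration of the two functions described above, the following minimal sketch assumes the usual autoencoder form, with a tangent sigmoid on the mapping and demapping layers and linear bottleneck and output layers. The function names, the bias vectors and the layer sizes are hypothetical choices made for this example and are not taken from the original formulation.

```python
import numpy as np

def extract(x, W1, b1, W2, b2):
    """Extraction: nonlinear mapping layer (tanh), then linear bottleneck."""
    return W2 @ np.tanh(W1 @ x + b1) + b2

def generate(z, W3, b3, W4, b4):
    """Generation: nonlinear demapping layer (tanh), then linear output."""
    return W4 @ np.tanh(W3 @ z + b3) + b4

rng = np.random.default_rng(0)
n_vars, n_hidden, n_comp = 20, 9, 3   # illustrative sizes only (assumption)
W1, b1 = rng.normal(size=(n_hidden, n_vars)), np.zeros(n_hidden)
W2, b2 = rng.normal(size=(n_comp, n_hidden)), np.zeros(n_comp)
W3, b3 = rng.normal(size=(n_hidden, n_comp)), np.zeros(n_hidden)
W4, b4 = rng.normal(size=(n_vars, n_hidden)), np.zeros(n_vars)

x = rng.normal(size=n_vars)           # one made-up observation of the variables
z = extract(x, W1, b1, W2, b2)        # scores of the nonlinear components
x_hat = generate(z, W3, b3, W4, b4)   # reconstructed variables
squared_error = float(np.sum((x - x_hat) ** 2))
```

With random weights the reconstruction is of course poor; training consists of adjusting the four matrices so that `squared_error` becomes small over the whole sample.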
There are several architectures for the auto-associative neural network approach, such as the standard, the hierarchical, the circular and the inverse model. The standard NLPCA is the naive model, where both the extraction and the generation processes are included and no additional constraint regarding the order of the components is imposed. This version is recommended for non-periodic or non-cyclic data when the main interest is only the reduction of dimensionality and not the extraction of meaningful features. In the hierarchical NLPCA, the order of the nonlinear components is forced to respect the hierarchical ranking obtained in the linear PCA, thus yielding more meaningful features for the analysis. The circular version allows the extraction of circular components, which describe a closed curve instead of a standard curve with an open interval, and is more suitable for periodic or cyclic phenomena. Finally, the inverse definition models only the generation process.
This version is more efficient since we only train the second part of the neural network and not the two processes. It produces results more suited for real processes, since it models the natural process generating the observed data. In addition, it allows dealing with missing data because it does not need the sample data as an input. All the former extensions can be used in combination or separately7.
2.2 Dealing with Nonlinearity
In many studies a NLPCA has been used as a successful alternative to deal with the nonlinear relations among variables existent in different kinds of real data. Nevertheless, the use of NLPCA can be justified under a different perspective independently of the linear or nonlinear relation among the data set. Whereas PCA, FA and ICA represent linear models, a NLPCA has the attribute of being a nonlinear system8. In other words, PCA, FA and ICA express the variables in the model as linear combinations, while a NLPCA does it as a nonlinear mixing.
In NLPCA performed via an autoencoder neural network, the nonlinear hidden layers enable, first, a nonlinear mapping from the observed variables in order to estimate the principal components, and then another nonlinear transformation (demapping) from the estimated components in order to approximate the reconstructed variables. As a nonlinear system characterized by the non-proportionality between its inputs and outputs, a NLPCA will produce different insights into the studied phenomena. Particularly in the finance field, it could be assumed that simple variations in the underlying systematic risk factors may generate complex effects in the returns on equities; i.e., the relation between the stock returns and the underlying systematic risk factors may be nonlinear.
2.3 Estimation of the Parameters of the Model
The generation function gives the inverse function, from a set in the latent space z, as shown in equation (3) above. In order to estimate the parameters of the model, W 3 and W 4, that allow for the estimation of the reconstructed variables, the squared reconstruction error is minimized.
In the implementation used in this paper, the parameters wij of each of the matrices W 1, W 2, W 3 and W 4 were estimated by means of gradient search.
Following [26] in order to compute each of the weight values wij the above equation can be expressed more specifically as:
and the partial derivatives are the following (according to the Matlab® notation, and including the bias term corresponding to j = 0):
The above partial derivatives are used for updating the estimation of the weights wij iteratively as:
The μ parameter is selected heuristically, taking convergence considerations into account.
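The iterative update described above can be sketched as plain batch gradient descent on the squared reconstruction error of a five-layer autoencoder. This is only an illustrative sketch under stated assumptions: the function name `train_autoencoder`, the initialization, the error scaling and the toy data are hypothetical, and the optimizer actually used by the Matlab® code may differ.

```python
import numpy as np

def train_autoencoder(X, n_hidden, n_comp, mu=0.05, n_iter=300, seed=0):
    """Fit a 5-layer autoencoder by gradient descent on the squared error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    sizes = [d, n_hidden, n_comp, n_hidden, d]
    W = [rng.normal(scale=0.1, size=(sizes[k + 1], sizes[k])) for k in range(4)]
    b = [np.zeros(sizes[k + 1]) for k in range(4)]
    errors = []
    for _ in range(n_iter):
        # Forward pass: tanh on mapping/demapping layers, linear elsewhere.
        h1 = np.tanh(X @ W[0].T + b[0])
        z = h1 @ W[1].T + b[1]
        h3 = np.tanh(z @ W[2].T + b[2])
        x_hat = h3 @ W[3].T + b[3]
        resid = x_hat - X
        errors.append(0.5 * np.sum(resid ** 2) / n)
        # Backward pass: chain rule applied layer by layer.
        d4 = resid / n
        d3 = (d4 @ W[3]) * (1.0 - h3 ** 2)
        d2 = d3 @ W[2]
        d1 = (d2 @ W[1]) * (1.0 - h1 ** 2)
        grads_W = [d1.T @ X, d2.T @ h1, d3.T @ z, d4.T @ h3]
        grads_b = [d1.sum(axis=0), d2.sum(axis=0), d3.sum(axis=0), d4.sum(axis=0)]
        # Update rule: w <- w - mu * dE/dw (biases updated the same way).
        for k in range(4):
            W[k] -= mu * grads_W[k]
            b[k] -= mu * grads_b[k]
    return W, b, errors

# Toy data lying close to a curve (made up for the example).
rng = np.random.default_rng(1)
t = rng.uniform(-1, 1, size=200)
X = np.column_stack([t, t ** 2, np.sin(2 * t)]) + 0.01 * rng.normal(size=(200, 3))
W, b, errors = train_autoencoder(X, n_hidden=4, n_comp=1)
```

The list `errors` records the error at each iteration, which is one simple way of checking whether a given value of μ leads to convergence.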
3 Empirical Study
3.1 The Data
In conformity with the availability of information, and for the sake of a comparison to former studies9, the sample used in this research contains the log returns of the stocks that have been part of the main index of the Mexican Stock Exchange, the Price and Quotations Index (IPC), during all the periods considered.
Because of their importance in the Mexican Economy and their characteristics of liquidity and market value, these companies can be considered as representative of the Mexican stock market. Table 1 shows the names and sectors of these shares.
No. | TICKER | Name of the Company | Industrial Sector |
---|---|---|---|
1 | ALFAA | Grupo Alfa | Holding |
2 | ARA* | Consorcio Ara | Construction: Housing |
3 | BIMBOA | Grupo Bimbo | Food processing |
4 | CEMEXCP | Cemex | Cement |
5 | CIEB | Corporación Interamericana de Entretenimiento | Holding |
6 | COMERUBC | Controladora Comercial Mexicana | Commerce: retailing and wholesale |
7 | CONTAL* | Grupo Continental | Food and beverage processing |
8 | ELEKTRA* | Grupo Elektra | Commercial firms |
9 | FEMSAUBD | Fomento Económico Mexicano | Beer |
10 | GCARSOA1 | Grupo Carso | Holding |
11 | GEOB | Corporación GEO | Construction: Housing |
12 | GFINBURO | Grupo Financiero Inbursa | Financial services |
13 | GFNORTEO | Grupo Financiero Banorte | Financial services |
14 | GMODELOC | Grupo Modelo | Food, tobacco and beverages |
15 | KIMBERA | Kimberly-Clark de México | Cellulose and paper |
16 | PE&OLES* | Industrias Peñoles | Ferrous minerals |
17 | SORIANAB | Organización Soriana | Commerce: retailing and wholesale |
18 | TELECOA1 | Carso Global Telecom | Communications |
19 | TELMEXL | Teléfonos de México | Communications |
20 | TLEVICPO | Grupo Televisa | Communications |
21 | TVAZTCPO | TV Azteca | Communications |
22 | WALMEXV | Wal-Mart de México | Commerce: retailing and wholesale |
We carried out our study on four different databases structured as follows: two databases of 20 stocks ranging from July 7, 2000 to January 27, 2006, one expressed in weekly returns (DBWR) and the other in weekly returns in excess of the riskless interest rate (DBWE); and two databases of 22 stocks covering the period from July 3, 2000 to January 31, 2006, the first expressed in daily returns (DBDR) and the second in daily excesses (DBDE) 10.
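For reference, the two kinds of series used in these databases can be computed as in the following minimal sketch. The function names and the price and rate values are made up for illustration, and the riskless rate is assumed to be already expressed per period (weekly or daily).

```python
import numpy as np

def log_returns(prices):
    """Log returns r_t = ln(P_t / P_{t-1}), computed column by column."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)

def excess_returns(returns, riskless):
    """Subtract the per-period riskless rate from every stock's return."""
    riskless = np.asarray(riskless, dtype=float).reshape(-1, 1)
    return returns - riskless

prices = np.array([[10.0, 50.0], [10.5, 49.0], [10.2, 51.0]])  # made-up prices
rf = np.array([0.001, 0.001])                                  # made-up rate
r = log_returns(prices)
e = excess_returns(r, rf)
```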
The period analyzed in this study (2000-2006) was considered according to the following criteria:
a) This article represents the third part of a larger research project, in which we are testing different techniques for extracting the underlying systematic risk factors in the context of the Mexican Stock Exchange. The Neural Networks Principal Component Analysis represents the third approach we have used to perform that extraction, under a statistical approach to the latent systematic risk factors.
b) The technique used in this article, and the other techniques used in the previous stages of our research, have both an explanatory and a predictive character. We first carry out the explanatory approach, which makes us divide our dataset into two blocks: the first period is used for the explanation or training of the model, and the second one will be used for testing the predictive power of the estimated generative model of returns. Consequently, the data from 2000 to 2006 were used to extract the generative underlying structure of returns, which explains the behavior of the returns in the training period. This estimation will help us in the next stage of prediction, where the model will be tested in subsequent periods (from 2006 on).
c) The other techniques that we have employed in our research are the Independent Component Analysis, Factor Analysis and Principal Component Analysis; our objective is to be able to compare the results of the four techniques, concerning both their explanatory and predictive power. Therefore, we are using the same training and prediction periods for the four of them.
d) Additionally, another reason for using this period is to be able to compare, in further studies, the effects of the 2008-2009 financial crisis on the estimation of the underlying structure of systematic risk, by extracting the generative model of returns during the crisis and post-crisis periods using the four techniques11.
3.2.1 Extraction of Underlying Systematic Risk Factors Via NNPCA
The Arbitrage Pricing Theory (APT) assumes the following generative multifactor model of Returns12:
From the statistical approach, neither the factors nor their sensitivities are given13 and we must estimate them simultaneously by means of statistical or feature extraction techniques such as, in this case, the NNPCA. Although the NNPCA is capable of extracting the scores of the components (the Fs), it is very difficult or even impossible to obtain a single matrix containing the equivalent to the sensitivities for each factor (betas), because there are two matrices of weights and a nonlinear transformation involved in the process of reproducing the variables14. Consequently, we used the NNPCA for extracting only the scores of the underlying systematic risk (the Fs) in the expression 10.
In order to estimate the NNPCA model, we used its hierarchical extension (h-NLPCA) performed by an auto-associative neural network, which respects the ranking of the principal components in the linear PCA15. This hierarchy implies the fulfillment of two important properties: scalability and stability. Scalability means that the first n components must explain, as much as possible, the variance in the n-dimensional subspace. Stability denotes that the i-th component of an n-dimensional solution must be identical to the i-th component of an m-dimensional result, where n ≠ m.
According to [25], the hierarchy constraints are based on searching in the original data space for the smallest mean square reconstruction error while using the first i components, according to the following expression:
where, x and
Therefore, the h-NLPCA can be interpreted as a search for a k-dimensional subspace of minimal mean square error (MSE), such that the (k-1)-dimensional subspace is also of minimal MSE. Consequently, all the subspaces of dimension 1, ..., k are of minimal MSE and represent their dimensionality in the best way16. For the sake of comparison with our former studies, we estimated 8 different Neural Networks (NNs) to extract from 2 to 9 nonlinear principal components in each database.
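The idea of evaluating the reconstruction error with the first 1, 2, ..., k components can be illustrated with the sketch below. It assumes that the hierarchical criterion is the sum of the partial errors, and that dropping a component amounts to setting its score to zero; both are assumptions of this example. A linear PCA decoder is used as a stand-in for the generation function, and the names `hierarchical_error` and `generate` are hypothetical.

```python
import numpy as np

def hierarchical_error(X, Z, generate):
    """Sum of reconstruction MSEs using the first 1, 2, ..., k components."""
    k = Z.shape[1]
    partial = []
    for i in range(1, k + 1):
        Z_i = np.zeros_like(Z)
        Z_i[:, :i] = Z[:, :i]          # keep the first i components only
        partial.append(float(np.mean((X - generate(Z_i)) ** 2)))
    return sum(partial), partial

# Stand-in example: linear PCA scores and decoder on made-up data.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) @ rng.normal(size=(5, 5))
X = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X, full_matrices=False)
V = Vt[:3].T                           # first three principal directions
Z = X @ V                              # component scores
total, partial = hierarchical_error(X, Z, lambda Zi: Zi @ V.T)
```

In this linear stand-in the partial errors decrease as components are added; in the nonlinear case the same quantity would be evaluated with the trained generation network in place of the lambda.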
In order to generate a loading matrix that makes possible a first attempt to interpret the extracted latent risk factors, we used a five-layer architecture with 20 neurons in the input layer for the weekly databases and 22 for the daily ones; from 2 to 9 (footnote 17) in the mapping layer, the bottleneck layer and the demapping layer; and, finally, 20 and 22 in the output layer, respectively. In terms of the NN notation, the architectures used were: [20:2-9:2-9:2-9:20] and [22:2-9:2-9:2-9:22].
Concerning the nonlinear transfer functions, following the recommendations of [16] for an autoencoder NN performing NLPCA, we used a tangent sigmoid function from layer one to layer two and from layer three to layer four, and a linear function from layer two to layer three and from layer four to layer five.
Using the Matlab® code by [22] for the performance of the NLPCA on our four databases, we obtained the scores of the principal components hierarchically ordered, the four matrices of weights and the reproduced variables. We emphasize that the objective of such estimation is to achieve a nonlinear transformation, first, from the observed variables to the principal components, and then to realize another nonlinear transformation capable of reproducing the observed variables from the extracted components.
The results in the reconstruction of the observed returns or excesses were suitable for all the stocks in the four databases; this implies that the estimation of the generative multifactor model in the statistical approach to the APT, performed by NLPCA, was successful18.
The only problem detected was in the reproduction of some observations in a few stocks presenting very high levels of volatility, where the reconstruction was not able to reach all the peaks completely. Nevertheless, if we add more components to the extraction, the reproduction of all the series improves greatly, covering almost all the peaks of high volatility19.
To save space, in Figure 2 we show only the line plots of the observed and reproduced returns and excesses for the first 5 stocks in each database, from the experiment where we extracted nine components. We can observe that the reconstruction is suitable in nearly all cases, except for the observations with very high volatility, as stated above.
Note: Logarithmic returns of the first five stocks observed in each database and their respective reconstructions using the estimated NNPCA model. The symbol of each stock appears above its line plot.
In addition, for visualization purposes, in Figure 3 we present the plots generated by the software used for the extraction, where the first three principal components of the NLPCA are plotted as a grid in the original data space.
Note: The first three principal components of NNPCA plotted as grid in the original data space. The grids represent the new coordinates in the space of the components and give a nonlinear or curved description of the data.
In this case the grids represent the new coordinates of the component space, thus giving a nonlinear or curved description of the data.
Although it is not completely conclusive, the four plots show that the data could be described sufficiently well by nonlinear behaviors.
3.2.2 Interpretation of the Extracted Factors
Although this study focuses mainly on the extraction of the systematic risk factors of the Mexican Stock Exchange, and not on the risk attribution stage of the statistical approach to the Arbitrage Pricing Theory, in this section we make a first attempt to propose an interpretation of the meaning or nature of the extracted systematic risk factors. We follow a methodology analogous to the classical approach used when Principal Component Analysis (PCA) and Factor Analysis (FA) are employed to reduce dimensionality or to extract features from a multifactor dataset.
This approach is based on the use of the factor loading matrix estimated in the extraction process in order to identify the loading of each variable in each component or factor; high factor loadings in absolute terms indicate a strong relation between the variables and the factor. In our context, the factors will be saturated with loadings of one stock or a group of stocks that may help us in the identification of those factors with certain economic sectors, as a first approach to the interpretation of each component or factor.
In the case of NNPCA, that factor loading matrix is not clearly defined, since the demixing process involves the combined effect of two loading matrices (W 1 and W 2) and a nonlinear transfer function. However, in order to use one of these matrices as an analogue of those used in techniques such as PCA and ICA, as a first approach to giving meaning to the extracted factors, we can argue as follows, considering the role that each matrix plays in the demixing process.
Following the network architecture displayed in Figure 1, matrix W 1 makes a projection into the space where we have an internal representation in the form of the hidden units; thus, it would be equivalent to a mixing matrix such as those used in PCA and FA. In other words, from a structural point of view NNPCA makes a nonlinear transformation given by W 1. To that end, it is necessary to subtract the mean value by means of the bias involved in the estimation and to scale the inputs somehow, so that the nonlinearity compresses the margin properly.
This makes the function of the first layer of the network different from that of other methods such as PCA and FA. On the other hand, matrix W 2 changes the dimensionality of the representation given by the output of the first layer.
Its function is to make a linear transformation that rotates and scales the output, in such a way that the intermediate representation can be transformed by the second part of the network.
Furthermore, from a structural standpoint, the product (W 1*x) in expression 1, generates the representation that will pass through the nonlinearity later.
The function of the nonlinearity is to compress the space in order to ease the task of the subsequent part of the neural network. From this standpoint, the projection given by (W 1*x) informs about the intermediate representation of the information and could be compared with the latent factors estimated by PCA and FA, although it is important to remark that they are different things, since they are obtained through different criteria.
According to the above stated, in this research we will use matrix W 1 as a loading matrix to propose preliminary meanings for the extracted latent factors.
In the interest of saving space, we present only the loading matrix plots from the database of weekly returns for the experiment where we extracted nine underlying factors; nevertheless, plots of this kind were produced for all the cases under the same methodology. Figure 4 presents these results.
Additionally, we constructed some tables summarizing the results derived from the analysis of the factor loading matrices and plots, where we propose a certain economic sector that may be related to each factor. We grouped together the stocks with the highest loading in each factor according to the official classification of the economic sectors used in the Mexican Stock Exchange.
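The grouping of stocks by highest loading can be sketched as follows. The example assumes that each row of W 1 corresponds to one factor and each column to one stock; the weights shown are made up, and the function name `top_loadings` is hypothetical.

```python
import numpy as np

def top_loadings(W1, tickers, n_top=3):
    """For each row (factor) of W1, list the tickers with the largest |loading|."""
    result = []
    for row in W1:
        order = np.argsort(-np.abs(row))[:n_top]
        result.append([(tickers[j], float(row[j])) for j in order])
    return result

tickers = ["ALFAA", "ARA*", "BIMBOA", "CEMEXCP"]
W1 = np.array([[0.10, -0.90, 0.30, 0.20],
               [0.70, 0.05, -0.60, 0.10]])   # made-up weights: 2 factors x 4 stocks
top = top_loadings(W1, tickers, n_top=2)
```

Keeping the sign of each loading alongside the ticker makes it possible to see when two sectors enter the same factor with opposite signs.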
Table 2 presents this summary. There is no clear interpretation of the factors using the matrix W 1; however, we find that in this case most of the factors are formed by a mixture of stocks from different industrial sectors rather than a combination of shares from the same sector.
Component | Database of Weekly Returns | Component | Database of Weekly Excesses |
---|---|---|---|
NLPC1 | Beverages and Leisure / Mining sectors factor. | NLPC1 | Mining / Food products and beverages, Consumer staples and Communication media sectors factor. |
NLPC2 | Mining and Telecommunications / Holding sectors factor. | NLPC2 | Mining / House building sectors factor. |
NLPC3 | Holding / Mining sectors factor. | NLPC3 | House building, Mining and Holdings sectors factor. |
NLPC4 | Home Furnishing and Beverages sectors factor. | NLPC4 | Beverages, Leisure and Home furnishing sectors factor. |
NLPC5 | Salinas Group Factor. | NLPC5 | Consumer sector factor. |
NLPC6 | House building and Beverages / Consumer staples, Communication media and Mining sector factors. | NLPC6 | Construction sector factor (Geo Factor). |
NLPC7 | Holdings / Food products sector factors. | NLPC7 | Financial and House building / Consumer staples sectors factors. |
NLPC8 | Food products / Construction sector factors. | NLPC8 | Food and beverages sector factor. |
NLPC9 | Food products, Beverages and Construction sector factors. | NLPC9 | House building, communication media and consumer staples sector factor. |

Component | Database of Daily Returns | Component | Database of Daily Excesses |
---|---|---|---|
NLPC1 | Construction sector factor (Geo factor). | NLPC1 | Salinas Group / Mining sector factor. |
NLPC2 | Mining sector factor (Peñoles factor). | NLPC2 | Beverages / Home furnishing and Financial services sectors factor. |
NLPC3 | Consumer staples, Financial services, Home furnishing and Mining sector factors. | NLPC3 | Salinas Group, Holdings and Mining / Leisure sectors factor. |
NLPC4 | Communication media and Beverage sector factor. | NLPC4 | Holdings / Leisure sectors factors. |
NLPC5 | Beverages and mining / Home furnishing and house building sectors factor. | NLPC5 | Beverages and House building / Mining sector factors. |
NLPC6 | Beverages, Communication media, House building and Home furnishing sectors factor. | NLPC6 | House building and Holdings / Leisure sectors factor. |
NLPC7 | Leisure and Financial services sectors / Salinas Group factor. | NLPC7 | Communication media / Financial services sectors factor. |
NLPC8 | House building and Holdings / Home furnishing and Consumer staples sector factor. | NLPC8 | Mining sector factor (Peñoles factor). |
NLPC9 | Holdings and House building / Mining and Home furnishing sector factors. | NLPC9 | Mining and Beverages sector factor. |
In other words, only a few factors could be identified clearly: number five (Salinas Group factor) in the database of weekly returns; number five (Consumer sector factor), number six (Construction sector factor or Geo factor) and number eight (Food and beverages sector factor) in the database of weekly excesses; number one (Construction sector factor or Geo factor) and number two (Mining factor or Peñoles factor) in the database of daily returns; and number eight (Mining factor or Peñoles factor) in the database of daily excesses. The rest of the factors represent a combination of sectors that in many cases have opposite signs.
In addition, we can distinguish the strong and constant contribution of some sectors or stocks in many factors in the four databases; e.g., mining sector with PEÑOLES (DBWR:4 + DBWE:3 + DBDR:4 + DBDE:5 = 16), beverage sector with CONTAL (DBWR:4 + DBWE:3 +DBDR:3 + DBDE:3 = 13), construction sector with GEO (DBWR:3 + DBWE:5 + DBDR:5 = 13), home furnishing sector with ELEKTRA (DBDR:6 + DBDE:3 = 9 ), holding sector with ALFA (DBWR:3 + DBDE:3 = 6), food products sector with BIMBO (DBWR:3), consumer staples with WALMEX (DBWE: 3) and SORIANA (DBWE:3), communication media sector with TVAZTECA (DBDR:3) and leisure sector with CIEB (DBDE:3).
In this case, none of the components in any database is clearly related to a market factor. Likewise, there is no homogeneous interpretation of the factors across the databases. Nevertheless, there are two factors that could have the same interpretation in the different databases but are ranked in a different order, namely the mining and the construction factors, as can be observed in the referred table.
3.2.3 Econometric Contrast
As a complement for our research, we carried out an econometric contrast of the APT, using the underlying systematic risk factors extracted via the NNPCA, in order to test its validity as a suitable pricing model for the sample and periods considered. This methodology of contrast represents only a first approach to the econometric validation of the APT using NNPCA, so the result should be viewed in that light.
After applying the second assumption of the APT (the principle of arbitrage) to its first assumption (the generative multifactor model of returns) of expression 10, we get the APT fundamental pricing equation20:
where the betas are the sensitivities to the systematic risk factors and the lambdas are the risk premia paid by the market for exposure to each class of systematic risk.
The former equation can be tested by means of an average cross-section methodology for estimating the ordinary least squares (OLS) coefficients of the following regression model:
The direct methodology for contrasting the APT under the statistical approach would use the loadings or betas estimated in expression 10 directly in the former regression model [5], since both factors and sensitivities are computed simultaneously by the extraction techniques usually employed [1].
Nevertheless, as remarked in [15,17], this methodology could present some econometric problems such as heteroskedasticity and autocorrelation in the residuals, in addition to error in variables, which would yield inefficient OLS estimators with biased variances. Besides, the NNPCA estimation does not generate a single matrix equivalent to the loadings in the generative multifactor model of returns; hence, we cannot use this methodology. One possible solution to the foregoing problems is to employ a two-stage methodology widely used in the fundamental and macroeconomic approach to the APT, where in a first stage we estimate the betas to use in expression 14, then in a second stage we estimate the lambdas.
Following [4]21, in the first stage we estimated the betas to be used in expression 13, by regressing the factor scores obtained by the NNPCA as a cross-section on the returns and excesses. In order to improve the efficiency of the parameter estimates and to eliminate autocorrelation in the error terms of the regressions, we used a seemingly unrelated regression (SUR) to estimate simultaneously the entire system of equations.
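A first-stage estimation of this kind can be sketched as below, under the assumption that the betas are the slopes of a regression of each return series on the factor scores. Equation-by-equation OLS is used here in place of SUR; when every equation has the same regressors, as in this sketch, the SUR point estimates coincide with the OLS ones, although the standard errors are not addressed here. The function name and the simulated data are hypothetical.

```python
import numpy as np

def estimate_betas(R, F):
    """OLS of each return series on the factor scores.

    R: (T, N) returns or excesses; F: (T, k) factor scores.
    Returns the intercepts (N,) and the betas (N, k).
    """
    X = np.column_stack([np.ones(len(F)), F])
    coef, *_ = np.linalg.lstsq(X, R, rcond=None)   # shape (k + 1, N)
    return coef[0], coef[1:].T

# Simulated example (made-up data, not the paper's sample).
rng = np.random.default_rng(0)
T, N, k = 300, 4, 2
F = rng.normal(size=(T, k))
true_betas = rng.normal(size=(N, k))
R = 0.001 + F @ true_betas.T + 0.01 * rng.normal(size=(T, N))
alphas, betas = estimate_betas(R, F)
```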
The results of the regressions in the four databases were very good, producing, in almost all cases, statistically significant parameters, high R2 coefficients and statistics from the Durbin-Watson test of autocorrelation, all of which led us to the non-rejection of the null hypothesis of no-autocorrelation22.
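For reference, the Durbin-Watson statistic mentioned above is the ratio of the sum of squared first differences of the residuals to their sum of squares; values close to 2 are consistent with no first-order autocorrelation. The following minimal sketch uses a hypothetical function name and made-up residuals.

```python
import numpy as np

def durbin_watson(resid):
    """Durbin-Watson statistic: sum of squared differences over sum of squares."""
    resid = np.asarray(resid, dtype=float)
    return float(np.sum(np.diff(resid) ** 2) / np.sum(resid ** 2))

dw_alternating = durbin_watson([1.0, -1.0, 1.0, -1.0])  # strong negative autocorrelation
dw_constant = durbin_watson([1.0, 1.0, 1.0, 1.0])       # strong positive autocorrelation
```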
Following [9]23, in the second stage we estimated the lambdas in expression 14 by regressing the betas obtained in the first stage as a cross-section on the returns and excesses, using OLS. In order to avoid the econometric problems of heteroskedasticity and autocorrelation in the residuals of the model estimated through OLS, we corrected the ordinary least squares estimates for heteroskedasticity and autocorrelation by means of the Newey-West heteroskedasticity and autocorrelation consistent (HAC) covariance estimates. Additionally, we verified the normality of the residuals by carrying out the Jarque-Bera test.
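As an illustration of the second stage, the following sketch runs a cross-sectional OLS regression of average returns on the betas, computes Newey-West standard errors with Bartlett kernel weights, and evaluates the Jarque-Bera statistic on the residuals. It is only a sketch under stated assumptions: the data are synthetic, the lag truncation `lags` is an assumed parameter, no small-sample correction is applied, and the helper names `ols_newey_west` and `jarque_bera` are hypothetical.

```python
import math
import numpy as np

def ols_newey_west(y, X, lags=1):
    """OLS coefficients with Newey-West (HAC) standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    coef = XtX_inv @ X.T @ y
    resid = y - X @ coef
    Xu = X * resid[:, None]                  # rows are x_t * u_t
    S = Xu.T @ Xu                            # lag-0 (heteroskedasticity) term
    for lag in range(1, lags + 1):
        w = 1.0 - lag / (lags + 1.0)         # Bartlett kernel weight
        G = Xu[lag:].T @ Xu[:-lag]           # lag-th autocovariance term
        S += w * (G + G.T)
    cov = XtX_inv @ S @ XtX_inv              # sandwich covariance estimate
    return coef, np.sqrt(np.diag(cov)), resid

def jarque_bera(resid):
    """Jarque-Bera statistic and its asymptotic chi-square(2) p-value."""
    n = resid.size
    d = resid - resid.mean()
    s2 = np.mean(d**2)
    skew = np.mean(d**3) / s2**1.5
    kurt = np.mean(d**4) / s2**2
    jb = n / 6.0 * (skew**2 + (kurt - 3.0)**2 / 4.0)
    return jb, math.exp(-jb / 2.0)           # chi-square(2) survival function

# Synthetic illustration (not the paper's data)
rng = np.random.default_rng(1)
N, K = 40, 3
betas = rng.normal(size=(N, K))
lam_true = np.array([0.005, 0.01, 0.02, 0.0])   # lambda_0 .. lambda_3
X = np.column_stack([np.ones(N), betas])
mean_returns = X @ lam_true + 0.001 * rng.normal(size=N)

lambdas, std_err, resid = ols_newey_west(mean_returns, X, lags=1)
jb_stat, jb_p = jarque_bera(resid)
```

A lambda would be judged significant at the 5% level when the ratio `lambdas / std_err` is large enough in absolute value, and residual normality would not be rejected when `jb_p` exceeds 0.05.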
In order to accept the APT pricing model, we require the statistical significance of at least one parameter lambda different from λ024, and the equality of the independent term to its theoretical value: the average riskless interest rate in the models expressed in returns, and zero in the models expressed in excesses of the riskless interest rate.
We used Wald's test to confirm these equalities.
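For a single restriction on the independent term, the Wald statistic reduces to the squared distance between the estimate and its hypothesized value, scaled by the estimated variance, and is asymptotically chi-square with one degree of freedom. The sketch below assumes that setting; the function name `wald_single_restriction`, the coefficient vector, the covariance matrix and the riskless rate are all hypothetical values chosen for illustration.

```python
import math
import numpy as np

def wald_single_restriction(coef, cov, index, value):
    """Wald test of H0: coef[index] == value (one linear restriction)."""
    diff = coef[index] - value
    stat = diff**2 / cov[index, index]
    p_value = math.erfc(math.sqrt(stat / 2.0))   # chi-square(1) survival function
    return stat, p_value

# Hypothetical estimates and covariance matrix (not taken from Table 3)
coef = np.array([0.0051, 0.0103, 0.0217])
cov = np.diag([0.0010, 0.0040, 0.0060]) ** 2     # squared standard errors

# Model in returns: H0 is lambda_0 = average riskless rate (assumed 0.0045)
stat_ret, p_ret = wald_single_restriction(coef, cov, index=0, value=0.0045)

# Model in excesses: H0 is lambda_0 = 0
stat_exc, p_exc = wald_single_restriction(coef, cov, index=0, value=0.0)
```

Non-rejection of the null hypothesis (a p-value above 0.05) is the outcome required for the independent term to be considered equal to its theoretical value.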
In Table 3, we present a summary of the results of the econometric contrast. In general, the results for the explanatory power (R2), the statistical significance of the multivariate test (F), and the residual test are very good in all the contrasted models, except in the cases where only two factors were extracted.
Model | λ0 | λ1 | λ2 | λ3 | λ4 | λ5 | λ6 | λ7 | λ8 | λ9 | R2* | λsig/λtot | F | WALD | J-B |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Database of weekly returns | |||||||||||||||
Model with 2 betas | ● | ● | 9.45% | 0.00% | ● | ||||||||||
Model with 3 betas | 0.005078 | 0.01034 | 0.02173 | 51.89% | 66.67% | ● | |||||||||
Model with 4 betas | 0.005582 | 0.00193 | 0.01002 | ● | 48.58% | 50.00% | ● | ||||||||
Model with 5 betas | 0.005411 | 0.00892 | 0.02423 | ● | 0.00348 | 50.84% | 60.00% | ● | |||||||
Model with 6 betas | 0.004886 | 0.00378 | 0.00997 | ● | ● | ● | 47.96% | 33.33% | ○ | ||||||
Model with 7 betas | 0.005458 | 0.00362 | 0.01168 | ● | ● | ● | 55.59% | 28.57% | ○ | ||||||
Model with 8 betas | 0.005605 | 0.00303 | 0.02117 | ● | ● | ● | ● | ● | 50.58% | 25.00% | ○ | ||||
Model with 9 betas | 0.005782 | ● | 0.02016 | ● | ● | ● | ● | ● | 46.35% | 11.11% | ○ | ||||
Database of weekly excesses | |||||||||||||||
Model with 2 betas | ● | ● | 6.61% | 0.00% | ○ | ||||||||||
Model with 3 betas | 0.003488 | 0.00195 | 0.02129 | 47.35% | 66.67% | ● | |||||||||
Model with 4 betas | 0.003945 | 0.00237 | 0.00481 | ● | 49.04% | 50.00% | ● | ||||||||
Model with 5 betas | ● | 0.00505 | 0.03206 | ● | ● | 48.14% | 40.00% | ○ | |||||||
Model with 6 betas | ● | 0.00404 | 0.00882 | ● | 0.00147 | ● | 52.74% | 50.00% | ○ | ||||||
Model with 7 betas | ● | 0.00218 | 0.00650 | ● | 0.00168 | ● | 51.67% | 42.86% | ○ | ||||||
Model with 8 betas | ● | 0.00439 | 0.02272 | ● | ● | ● | ● | 53.% | 25.00% | ○ | |||||
Model with 9 betas | 0.0433 | 0.00613 | 0.02391 | ● | ● | ● | ● | 0.00040 | 57.13% | 33.33% | ● | ||||
Database of daily returns | |||||||||||||||
Model with 2 betas | ● | ● | 0.00% | 0.00% | ○ | ||||||||||
Model with 3 betas | 0.00047 | 0.00113 | -0.00104 | 38.93% | 66.67% | ○ | |||||||||
Model with 4 betas | ● | 0.00090 | -0.00184 | ● | 38.11% | 50.00% | ○ | ||||||||
Model with 5 betas | ● | ● | -0.00229 | ● | ● | 44.15% | 20.00% | ○ | |||||||
Model with 6 betas | 0.001226 | ● | 0.00401 | ● | ● | ● | 56.56% | 16.67% | ○ | ||||||
Model with 7 betas | ● | ● | 0.00211 | ● | ● | ● | 50.05% | 14.29% | ○ | ||||||
Model with 8 betas | ● | ● | -0.00163 | ● | ● | ● | ● | 49.49% | 12.50% | ○ | |||||
Model with 9 betas | ● | ● | -0.00361 | ● | ● | 0.00058 | ● | ● | 61.79% | 22.22% | ○ | ||||
Database of daily excesses | |||||||||||||||
Model with 2 betas | ● | 0.00046 | 1.36% | 50.00% | ○ | ||||||||||
Model with 3 betas | ● | 0.00085 | 0.00162 | 41.19% | 66.67% | ● | ● | ||||||||
Model with 4 betas | 0.000636 | 0.00043 | 0.00140 | 50.91% | 50.00% | ● | ● | ||||||||
Model with 5 betas | ● | 0.00080 | 0.00174 | ● | 36.09% | 40.00% | ○ | ● | |||||||
Model with 6 betas | ● | ● | 0.00402 | ● | 48.02% | 16.67% | ○ | ○ | |||||||
Model with 7 betas | ● | ● | 0.00146 | -0.00065 | ● | 44.49% | 28.57% | ○ | ○ | ||||||
Model with 8 betas | ● | ● | 0.00284 | -0.00069 | 0.00028 | 62.43% | 37.50% | ○ | ○ | ||||||
Model with 9 betas | ● | ● | 0.00281 | ● | ● | ● | 55.92% | 11.11% | ○ | ○ |
Notes: (1) The level of statistical significance used in all the tests was 5%. (2) Empty circles (○) mean that the required results in the different tests were fulfilled, whereas filled circles (●) mean that those tests were not passed according to the null hypothesis posed in each one of them. (3) λj: estimated coefficients. H0: λj = 0. Numeric value of the coefficient = rejection of H0; parameter significant. ● = non-rejection of H0; parameter not significant. (4) R2*: adjusted R-squared = explanatory capacity of the model. (5) λsig/λtot: ratio of the number of significant lambdas to the total number of lambdas in the model. (6) F: global statistical significance of the model. H0: λ1 = λ2 = … = λk = 0. ○ = rejection of H0; model globally significant. ● = non-rejection of H0; model globally not significant. (7) Wald: Wald's test for coefficient restrictions. Databases in returns: H0: λ0 = average riskless interest rate. Databases in excesses: H0: λ0 = 0. ○ = non-rejection of H0; the independent term is equal to its theoretical value. ● = rejection of H0; the independent term is not equal to its theoretical value. (8) J-B: Jarque-Bera test for normality of the residuals. H0: normality. ○ = non-rejection of H0; the residuals are normally distributed. ● = rejection of H0; the residuals are not normally distributed.
The univariate tests for the individual statistical significance of the parameters25 priced from one to three factors different from λ0, thus giving evidence in favor of the APT in 29 models26. Nevertheless, only four models fulfilled both the statistical significance and the equality of the independent term to its theoretical value, in addition to the requirements imposed by the residual test.
These four models were those expressed in weekly returns when six, seven and eight factors were extracted, and the one expressed in daily returns when three components were estimated. Moreover, there are twelve other models which fulfil all the conditions for accepting the APT as a pricing model, except for the statistical significance of the independent term, and eight models that fail only in the equality of the independent term to its theoretical value, which provides some additional evidence in favor of this asset-pricing model.
Cross-validating these results with the interpretation of the factors proposed in section 3.2.2, the meanings of the significant factors in the fully accepted models are as follows. In the four models, the statistically significant factors were numbers two and three.
Regarding the database of weekly returns, factor number two contrasts the Mining and Telecommunications sectors with the Holding sector, and number three contrasts the Holding sector with the Mining one. Concerning the model in the database of daily returns, factor number two was related to the Mining sector factor (Peñoles Factor), and number three corresponds to a factor that mixes stocks of the Consumer staples, Financial services, Home furnishing and Mining sectors. Interestingly, datasets expressed in excesses did not produce any fully accepted model.
Further research will be needed regarding this issue, as well as the significance of the undersized values and signs of the estimated individual parameters. To summarize, for the sample and periods considered, we can accept only partially the validity of the NNPCA-APT as a pricing model explaining the average returns (and returns in excesses) on equities of the Mexican Stock Exchange. On the other hand, the evidence showed that the APT is sensitive to the number of factors extracted and to the periodicity and expression of the models.
4 Conclusions
The theoretical attributes of the NLPCA are desirable when we extract the underlying systematic factors via this alternative technique, since the factors it produces are nonlinearly uncorrelated and not only linearly uncorrelated. The NLPCA performed via NNPCA is capable of uncovering both linear and nonlinear correlations, whereas PCA identifies only linear ones. In that sense, we may conclude that the factors obtained in this study represent a more desirable estimation of the underlying systematic risk factors under a statistical approach to the APT27. In our case, we believe that the extracted factors should be better estimations28 to be used in a statistical approach to the APT because, first, they represent factors that have eliminated both linear and nonlinear correlations among variables and, second, they are the result of a nonlinear transformation, not only a linear mapping, which deals with any nonlinear effect of the systematic risk factors on the returns on equities.
In addition, it is important to point out that, because of the non-Gaussian nature of financial data, the techniques generally used for extracting underlying risk factors, such as PCA or FA, may generate estimations that are not completely reliable. This suggests that estimating the generative underlying multifactor model of returns on equities by means of NNPCA could be a more reliable option to this end. See [11,12,17,9,4,5].
We would like to remark that our main goal in this paper has been the estimation of the generative multifactor model of returns of the APT by means of the NNPCA, that is, the risk extraction stage of a statistical approach to the APT. Therefore, the interpretation of the components extracted represents only a first attempt to give meaning to the latent factors; however, further research will be needed about the risk attribution process of this statistical approach.
In the same way, the econometric contrast corresponds only to a first approach to the validation of the APT as a pricing model using the systematic risk factors estimated via this extraction technique; therefore, its results should be seen from this perspective. For the moment, we could attribute the not completely satisfactory results of the econometric contrast to two possible reasons: a) The methodology used for the contrast might not be the most suitable for a statistical approach to the APT, and perhaps it would be necessary to use time series moving regressions to estimate the sensitivities to the risk factors or betas [17,19], or mimicking portfolios as proxies of the underlying factors [15,29]. b) The origin of the problem might not be in the first assumption of the APT, the generative multifactor model of returns, but in the second, the principle of absence of arbitrage [10], an aspect that we have not yet investigated. Further research would be needed concerning these two possible causes of the results in the econometric contrast.