Estimation of the underlying structure of systematic risk with the use of principal component analysis and factor analysis

Ladrón de Guevara Cortés, Rogelio; Torra Porras, Salvador

Contaduría y administración

Print version ISSN 0186-1042

Contad. Adm vol.59 n.3 Ciudad de México Jul./Sep. 2014

Estimation of the underlying structure of systematic risk with the use of principal component analysis and factor analysis

Estimación de la estructura subyacente de riesgo sistemático usando análisis de componentes principales y análisis factorial

Rogelio Ladrón de Guevara Cortés* and Salvador Torra Porras**

* Universidad Veracruzana roladron@uv.mx.

** Universidad de Barcelona storra@ub.edu.

Received: 16.08.2013
Accepted: 10.10.2013

Abstract

We present an improved methodology to estimate the underlying structure of systematic risk in the Mexican Stock Exchange using Principal Component Analysis and Factor Analysis. We consider the estimation of risk factors in an Arbitrage Pricing Theory (APT) framework under a statistical approach, where the systematic risk factors are extracted directly from the observed returns on equities, and there are two differentiated stages, namely, the risk extraction and the risk attribution processes. Our empirical study focuses only on the former; it tests our models in two versions, returns and returns in excess of the riskless interest rate, for weekly and daily databases, and uses a two-stage methodology for the econometric contrast. First, we extract the underlying systematic risk factors by means of both the standard linear version of Principal Component Analysis and Maximum Likelihood Factor Analysis. Then, we estimate the sensitivities to the systematic risk factors (betas) simultaneously for the whole system of equations by weighted least squares. Finally, we test the pricing model with an average cross-section methodology via ordinary least squares, corrected with heteroskedasticity and autocorrelation consistent covariance estimation. Our results show that although the APT is very sensitive to the extraction technique utilized and to the number of components or factors retained, the evidence found partially supports the APT according to the methodology presented and the sample studied.

Keywords: arbitrage pricing theory, principal component analysis, factor analysis, systematic risk factors, Mexican Stock Exchange.

Resumen

Presentamos una metodología mejorada para estimar la estructura subyacente del riesgo sistemático en el mercado accionario mexicano, usando Análisis de Componentes Principales y Análisis Factorial. Consideramos la estimación de factores de riesgo en el marco de la Teoría de Valoración por Arbitraje (APT) bajo un enfoque estadístico, donde los factores de riesgo sistemático son extraídos directamente de los rendimientos accionarios observados y existen dos etapas diferenciadas conocidas como proceso de extracción de riesgo y proceso de atribución de riesgo. Nuestro estudio se enfoca solamente en el primero de estos dos procesos; incluye la contrastación de nuestros modelos en dos versiones: rendimientos y rendimientos en exceso sobre la tasa de interés libre de riesgo para bases de datos semanales y diarias, así como una metodología de dos etapas para el contraste econométrico. Primero, extraemos los factores de riesgo sistemático mediante la versión lineal estándar del Análisis de Componentes Principales y la estimación por Máxima Verosimilitud del Análisis Factorial. Después, estimamos simultáneamente, para todo el sistema de ecuaciones, las sensibilidades a los factores de riesgo sistemático (betas) mediante mínimos cuadrados ponderados. Finalmente, contrastamos el modelo de valoración usando una metodología transversal promedio a través de mínimos cuadrados, corregida por una estimación de heteroscedasticidad y autocorrelación consistente de covarianza. Nuestros resultados muestran que aunque el APT es muy sensible a la técnica de extracción utilizada y al número de componentes o factores retenidos, la evidencia encontrada apoya parcialmente al APT de acuerdo con la metodología presentada y la muestra estudiada.

Palabras clave: teoría de valoración por arbitraje, análisis de componentes principales, análisis factorial, factores de riesgo sistemático, Bolsa Mexicana de Valores.

Introduction and review of literature

Following a generative multifactor model of returns and an arbitrage argument, the Arbitrage Pricing Theory (APT) prices an equity by considering a set of common systematic risk factors assumed to influence the return produced. Empirical studies, mainly of developed markets such as the New York (NYSE), American (AMEX), London (LSE) and Tokyo (TSE) Stock Exchanges, have proposed different approaches to identify the types of systematic risk factors considered by multifactor models. Zangari (2003) presents a classification of risk factors based on whether their value is observable or not, dividing them into market, macroeconomic, fundamental, sector, technical and statistical factors. In general, the empirical evidence provided is contradictory, both supporting and rejecting the APT, especially when statistical factors are used. The market factor approach is practically an interpretation of the Capital Asset Pricing Model (CAPM), where there is only one common factor and it is observable. Both macroeconomic and fundamental models have been widely discussed in the literature; many empirical papers examine sets of predefined variables, procedures and methodologies for different countries.¹ Overall, findings have been favourable for both approaches, although there is no generalized consensus about the nature of factors. The macroeconomic approach seeks to identify, a priori, a set of observable macroeconomic time series as proxies of the value of the systematic risk factors. According to Yip et al. (2000), the macroeconomic variables can be classified into four categories: inflation, industrial production, investor confidence and interest rates. On the other hand, in the fundamental approach, the systematic risk factors are approximated by means of predefined financial and accounting variables that reflect the exposure to unobservable factors, such as size, leverage, cash flow, price-earnings ratio (PER) and book-to-market ratio.
As in the macroeconomic models, there is no general agreement among the different studies on the nature of factors. The main difference between the macroeconomic and the fundamental standpoints is the elements they consider as given in a multifactor model. The former considers the risk premiums for each kind of systematic risk as given and estimates the exposures or sensitivity to each kind of systematic risk, and the latter, vice versa. The other two security-specific approaches use technical and sector variables as proxies of the effects of unobserved factors, although very little empirical investigation has been carried out exclusively under these perspectives. The statistical approach focuses mainly on uncovering a suitable number of pervasive factors, regardless of their nature,² through latent variable analysis techniques such as Principal Component Analysis (PCA) and Factor Analysis (FA). In this case, both the risk premiums and the exposure to them are usually estimated simultaneously. Roll et al. (1980), Brown et al. (1983), Chen (1983), Bower et al. (1984), Cho et al. (1984), Connor et al. (1988), Lehmann et al. (1988) and Hasbrouck et al. (2001) obtained favourable results, revealing between three and five priced factors in the American stock market; Beenstock et al. (1986) identified twenty priced factors in the UK stock exchange and Elton et al. (1988) found four factors in the Japanese market. Nevertheless, Reinganum (1981) rejected statistical APT as a means of explaining stock price variations for the NYSE and AMEX, as did Gómez-Bezares et al. (1994), Nieto (2001), and Carbonell et al. (2003) for the Spanish Stock Exchange (SSE). Moreover, Abeysekera et al. (1987) obtained mixed results for the London Stock Exchange, as did Jordán et al. (2003) for the Spanish Mutual Funds Market.

There is no clear supremacy of one approach over the others. Among the theoretical and empirical comparative studies made, Maringer (2004) presents a good summary of the advantages, disadvantages and recommended uses of macroeconomic, fundamental and statistical models; Connor (1995) shows that statistical and fundamental models outperform macroeconomic models in terms of explanatory power, and that fundamental models slightly outperform statistical ones for the US market; Chan et al. (1998) found evidence that fundamental factors perform better than macroeconomic, technical, statistical and market factors in the UK and Japanese markets; on the other hand, Teker et al. (1998) showed that the statistical model outperforms the macroeconomic one for the US market; and Cauchie et al. (2004) demonstrated that statistical factors yield a better representation of the determinants of Swiss stock market returns than the macroeconomic ones. In addition, Miller (2006a) makes a new comparison, complementing that of Connor's classic study. Consequently, three well-known risk analysis and portfolio management firms, MSCI-BARRA³, FTSE-BIRR⁴ and SUNGARD-APT⁵, have opted mostly for the fundamental, macroeconomic and statistical approaches, respectively, for constructing their worldwide multifactor risk models, portfolio analytics and risk reporting commercial products.

More recent studies have attempted to combine the different approaches. Miller (2006b) proposed a hybrid version of a multifactor model, combining fundamental and statistical factors, in which the latter are used to explain the fundamental model's residual part, obtaining modest results on the Japanese market. Liu et al. (2007) proposed that fundamental models can be used as an approach to extract the effect of the macroeconomic factors, by dividing the model's common fundamental factors into two sub-parts: one explained by macroeconomic factors and the other by non-macroeconomic factors.

Empirical investigation of multivariate asset-pricing models in emerging stock markets has been relatively scarce. Most studies have been based on a macroeconomic perspective, finding two or three priced factors. Results have been mixed concerning priced factors across the markets.⁶ Of relevance to the present study, only two studies have used the statistical definition of the APT: Ch'ng et al. (2001) on the Malaysia Stock Market and Dhankar et al. (2005) on the Indian Stock Exchange, revealing two and five priced factors, respectively.

Little research has been carried out regarding the application of the APT to the Mexican Stock Exchange. To the best of our knowledge, the only references are Calle (1991), Navarro and Santillán (2001), López-Herrera and Vázquez (2002a and b) and Valdivieso (2004), all of whom used the macroeconomic approach. Although these authors found evidence of around four priced factors, there is a problem of low explanatory power in some cases. Recently, Saldaña et al. (2007) used a combined macroeconomic and fundamental approach to the APT applied to the telecommunication sector of the Mexican Stock Exchange, finding favourable evidence for this asset pricing model. Conversely, Treviño (2011) presents a more robust econometric methodology for a longer period of time, finding little evidence in favour of a macroeconomic APT applied to the Mexican stock market. Additionally, López-Herrera and Ortiz (2011) apply a multifactor beta model to explain the relationship between macroeconomic factors and asset pricing in Mexico, the United States and Canada, in order to analyze the integration of each market with global macroeconomic variables.

Regarding studies focused on Latin America where the APT has been used under different approaches, we can mention the following. Arango et al. (2013) apply the APT under the macroeconomic approach to the Colombian Stock Exchange, using principal component analysis to summarise the set of macroeconomic factors and financial variables utilized in the study. They find that risk perception is the most important variable to explain stock returns. Kristjanpoller and Morales (2011) apply the APT to the Chilean stock market under the macroeconomic approach as well; they find some evidence regarding the impact of some macroeconomic variables on the returns on equities. Londoño et al. (2010) test the APT on the Colombian market under two approaches: a) a macroeconomic one and b) a macroeconomic one plus international stock market indicators. Furthermore, they use a multilayer neural network to relate the main index of the Colombian Stock Exchange to the factors considered. Their findings show that the neural network approach is more effective than a traditional statistical one.⁷ Da Costa and Soares (2009) utilize a fundamental version of the APT applied to the Brazilian banking sector, finding weak evidence supporting this model. Oliveira (2011) presents a comparative study using both the macroeconomic and the statistical approaches to the APT, applied to three groups of countries composed of developed and emerging markets, where some Latin American countries, such as Argentina, Chile and Mexico, are included. In this case the statistical factors are extracted by means of principal component analysis. Finally, Tabak and Staub (2007) use the APT to infer the probability of financial institution failure for banks in Brazil.

The aim of the present study is to fill a gap in the financial literature by testing a statistical definition of the APT on an important emerging financial market, the Mexican Stock Exchange. We shall extract the pervasive systematic risk factors by means of two different techniques: Principal Component Analysis and Factor Analysis through Maximum Likelihood. The structure of the present paper is as follows: first, we present the fundamentals of the APT and of PCA and FA, respectively; second, we describe the empirical study; third, we draw some conclusions; finally, we present the references, figures and tables.

Arbitrage Pricing Theory (APT)

The APT has been proposed as an alternative to the CAPM, but it does not provide a complete solution. The APT has some advantages over the CAPM since it represents a more generalized model; it considers risk factors other than the market, it does not need restrictive assumptions such as normality in the distributions of returns and the investors' utility functions, and the market portfolio does not play any role; however, it shares some of the CAPM's weaknesses, like the linearity of its specification and the requirement of using historical data. Whereas the CAPM begins with the market model, the APT starts with a generative multifactor model of returns defined by the following expression:⁸
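A standard way of writing this generative multifactor model, given here as a sketch consistent with the symbols used in the surrounding text (F for the common factors, β for the betas, ε for the specific component), is shown below. The asset index i, the time index t and the number of factors k are notational assumptions.

```latex
\[
R_{it} = E(R_i) + \beta_{i1} F_{1t} + \beta_{i2} F_{2t} + \dots + \beta_{ik} F_{kt} + \varepsilon_{it}
\]
```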

The statistical approach to the APT assumes that the return on equity depends on a set of unobservable factors common to all stocks (F's) and on one specific component (ε).⁹ The problem here is that the values of the factors are unobservable, and so the betas cannot be estimated through a regression model, as is done in the market model. Consequently, we need to use extraction techniques, such as Principal Component Analysis or Factor Analysis, to estimate the preceding equation for all the assets simultaneously, and to be able to extract the values of the factors (F's) and calculate their loadings or betas (β's).

The arbitrage argument, or principle of arbitrage absence, is based on the following reasoning. According to the law of one price, two identical assets in the same market should have the same price; otherwise it would be possible to carry out an arbitrage transaction and obtain a differential profit. At the heart of the APT and its pricing model lies the analysis of arbitrage opportunities, since only in their absence can we define a linear relation between the expected returns and the systematic risks. In order to avoid arbitrage possibilities, the return on equity must be equal to the expected return on the portfolio that combines the factor portfolios¹⁰ and the riskless asset (the mimicking portfolio, or the arbitrage portfolio). An arbitrage portfolio is any portfolio constructed with no capital invested and no risk taken that yields a null return on average.

By applying the arbitrage argument to the multifactor generative model, we arrive at the fundamental APT pricing equation:¹¹
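In the usual textbook form, and using the symbols defined in the sentence that follows (λ₀, λ_k and β_k), the pricing equation can be sketched as below. The asset index i is an assumed subscript added for clarity.

```latex
\[
E(R_i) = \lambda_0 + \lambda_1 \beta_{i1} + \lambda_2 \beta_{i2} + \dots + \lambda_k \beta_{ik}
\]
```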

where λ₀ represents the riskless interest rate, λ_k the risk premium for each kind of systematic risk factor, and β_k the sensitivities or exposures to each type of systematic risk.

Statistical risk factors

Our investigation is based upon the statistical approach to multivariate asset-pricing models; consequently, we assume that the values of systematic risk factors are unobservable and that they must be extracted by means of statistical techniques. This approach presents certain advantages over others: gathering the required information is less expensive and more accessible than in macroeconomic or fundamental models; it is less subjectively biased because it does not predefine either the number or the nature of factors, so it is less exposed to an econometric specification error; and finally, the factors extracted are directly supported by a strong asset-pricing theory: the Ross (1976) APT. In addition, it involves two differentiated processes, namely risk extraction and risk attribution, which make it more objective. Conversely, statistical factors do not have a direct economic or financial interpretation, although in a second phase they can be correlated or decomposed with the help of explicit variables.¹² In other words, from this standpoint, risk measurement and risk attribution are different steps of the process.¹³

The two most commonly used multivariate analysis techniques for extracting risk factors are Principal Component Analysis and Factor Analysis, but there is still no firm view as to which one is the ideal technique. Classical studies have utilized both. For example, Roll et al. (1980), in their seminal work, carried out Factor Analysis through Maximum Likelihood (MLFA), suggesting that returns on equities are determined by the factor loadings or betas; however, Chamberlain et al. (1983) and Connor et al. (1988) claimed that eigenvectors obtained by PCA could also be used as factor loadings. In opposition to these views, Shukla et al. (1990) asserted that PCA is only equivalent to FA when the idiosyncratic risk for every asset is the same, since PCA does not consider the specific risks. We could say that FA is closer to the underlying spirit of the APT than PCA is; nevertheless, the latter presents the advantage of offering a unique mathematical solution and making no assumptions about the normality of the returns.

Principal Component Analysis

Strictly speaking, PCA is not a model, as it merely represents a geometric transformation and projection of data in order to facilitate their interpretation. PCA seeks to obtain a smaller number of artificial variables, the principal components, via a linear combination of the original ones, assuming two basic restrictions: the principal components must be orthogonal to each other, and they must have decreasing variances. Each original variable contributes with a different weight to the principal component formation. In other words, we want to project the original data onto a smaller dimension where the components will be mutually uncorrelated and at the same time retain the maximal possible variance, i.e. the risk. The mathematical expression of the idea behind PCA is as follows:
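Written out for p original variables, the usual system of linear combinations behind PCA can be sketched as follows. The double subscript on α (component first, variable second) is an assumed convention.

```latex
\[
\begin{aligned}
y_1 &= \alpha_{11} x_1 + \alpha_{12} x_2 + \dots + \alpha_{1p} x_p \\
y_2 &= \alpha_{21} x_1 + \alpha_{22} x_2 + \dots + \alpha_{2p} x_p \\
    &\;\;\vdots \\
y_p &= \alpha_{p1} x_1 + \alpha_{p2} x_2 + \dots + \alpha_{pp} x_p
\end{aligned}
\]
```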

where: y denotes the principal components; α, the coefficients or loadings for each variable in each component construction, and x, the original variables. Generalizing in abbreviated matrix notation for the generic principal component h we have:
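One common way to write the generic component, assuming that the vector a_h collects the coefficients α_h1, ..., α_hp and that x is the vector of the p original variables, is:

```latex
\[
y_h = \mathbf{a}_h^{\top} \mathbf{x}, \qquad
\mathbf{a}_h = (\alpha_{h1}, \alpha_{h2}, \dots, \alpha_{hp})^{\top}
\]
```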

and considering all the equations together for all the observations:
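In matrix form this is usually written as below, where X is assumed to be the n × p matrix of observations, A the p × p matrix whose columns are the vectors a_h, and Y the n × p matrix of component scores.

```latex
\[
\mathbf{Y} = \mathbf{X}\mathbf{A}
\]
```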

In order to estimate the vector a_h we have to decompose the covariance matrix by way of the linear algebra concept of eigenvalue decomposition (EVD)¹⁴, where a_h will be the eigenvector associated with the h-th eigenvalue (λ_h) of the covariance or correlation matrix, after the eigenvalues have been ranked from highest to lowest. In the classic version for the econometric contrast of the APT, the loadings a will represent the exposures to the pervasive systematic risk factors, i.e., the betas of the APT model, on which the asset returns will be regressed to obtain the factor returns or factor risk premiums (lambdas in the APT pricing equation).¹⁵ These betas or factor loadings, which together form the factor matrix, are the correlations between each variable and the principal components. According to Uriel et al. (2005), we can also compute them by using the correlation coefficient r_hj between the h-th component and the j-th variable.
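The extraction just described can be illustrated with a minimal NumPy sketch. It is not the authors' Matlab code: the function name `pca_evd`, the simulated data and the choice of the correlation (rather than covariance) matrix are assumptions made for illustration. For standardized variables the loading of variable j on component h equals the eigenvector element scaled by the square root of the eigenvalue, which is also the correlation r_hj mentioned above.

```python
import numpy as np

def pca_evd(returns):
    """PCA through the eigenvalue decomposition of the correlation matrix."""
    X = np.asarray(returns, dtype=float)
    # Standardize each series so that the decomposed matrix is the correlation matrix.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]         # rank from highest to lowest
    eigvals = eigvals[order]
    A = eigvecs[:, order]                     # columns are the eigenvectors a_h
    Y = Z @ A                                 # component scores
    loadings = A * np.sqrt(eigvals)           # correlations between variables and components
    return eigvals, A, Y, loadings

# Simulated returns (hypothetical data): 300 observations of 6 assets driven by 2 factors.
rng = np.random.default_rng(0)
common = rng.normal(size=(300, 2))
X = common @ rng.normal(size=(2, 6)) + 0.5 * rng.normal(size=(300, 6))

eigvals, A, Y, loadings = pca_evd(X)
print(np.round(eigvals, 3))
print(np.round(loadings[:, :2], 3))
```

The scores in `Y` are mutually uncorrelated and their variances equal the ranked eigenvalues, which is the pair of restrictions stated for the principal components.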

Finally, in PCA we can obtain as many principal components as there are variables, because the covariance matrix (S) to be decomposed will contain in its main diagonal the total amount of variance, represented by the value of one.¹⁶ In other words, we will try to explain the total amount of variance of the observed variables.

Factor Analysis (FA)

Factor Analysis represents an explicit model with its own hypotheses, assuming that the original variables are a linear combination of the underlying factors. Although FA seeks to obtain a smaller number of factors, like PCA, its philosophy is completely different. In FA, we construct the p variables¹⁷ through a linear combination of their m pervasive common factors¹⁸ (with m<p), their particular weights or exposures (betas), and a specific error term. In order to construct those factors, it is necessary to estimate the communality, or proportion of the variance explained by the common factors. Then, we have to split the variance-covariance matrix into two parts, one explained by the common factors and the other by the error term. The fundamental idea of FA can be expressed in formal terms as follows:
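In the notation defined in the text that follows (means μ_j, common factors f_h, loadings λ_jh and specific factors u_j), the standard form of the factor model can be sketched as:

```latex
\[
\begin{aligned}
x_1 &= \mu_1 + \lambda_{11} f_1 + \lambda_{12} f_2 + \dots + \lambda_{1m} f_m + u_1 \\
x_2 &= \mu_2 + \lambda_{21} f_1 + \lambda_{22} f_2 + \dots + \lambda_{2m} f_m + u_2 \\
    &\;\;\vdots \\
x_p &= \mu_p + \lambda_{p1} f_1 + \lambda_{p2} f_2 + \dots + \lambda_{pm} f_m + u_p
\end{aligned}
\]
```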

where μ₁, μ₂, ..., μ_j, ..., μ_p denote the means of the variables; x₁, x₂, ..., x_j, ..., x_p, the observable variables; f₁, f₂, ..., f_h, ..., f_m, the common factors; λ_jh, the loading of factor h on variable j; and u₁, u₂, ..., u_p, the specific factors. Generalizing for the generic variable j, we can express the value of a row of the former equations in condensed vector notation as follows:
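A usual condensed form for the generic variable j, assuming that λ_j denotes the vector of the m loadings of variable j and f the vector of the m common factors, is:

```latex
\[
x_j = \mu_j + \boldsymbol{\lambda}_j^{\top} \mathbf{f} + u_j
\]
```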

and gathering all the equations for all the observations:
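For n observations this is commonly written as below. The dimensions are assumptions: X is n × p, 1 is an n-vector of ones, μ is the p-vector of means, F is the n × m matrix of factor scores, Λ is the p × m matrix of loadings and U is the n × p matrix of specific factors.

```latex
\[
\mathbf{X} = \mathbf{1}\boldsymbol{\mu}^{\top} + \mathbf{F}\boldsymbol{\Lambda}^{\top} + \mathbf{U}
\]
```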

In FA the elements of matrix Λ (the λ coefficients) are the factor loadings applied to the common factors. They constitute the elements of the factor matrix and can also be computed by the correlation coefficient r_hj in expression 6. There are many techniques to estimate the parameters of the factor model. We can divide them into two approaches: a) those based on the eigenvalue decomposition and b) those based on the estimation of equations to reconstruct the correlation matrix. In FA, the number of factors (m) is smaller than the number of variables (p) because the correlation matrix of returns to be decomposed contains in its main diagonal an estimation of the initial communality,¹⁹ depending on the estimation technique utilized. In other words, we will explain only the amount of variance explained by the common factors, i.e., the covariances or correlations among the variables.
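To illustrate the role of the communalities on the main diagonal, the following sketch implements iterated principal-axis factoring, an estimator of the eigenvalue-decomposition family (approach a). It is not the maximum likelihood estimator used in this study; the function name, the number of iterations and the example loadings are assumptions chosen for illustration, and the initial communalities are taken as squared multiple correlations.

```python
import numpy as np

def principal_axis_factoring(R, m, n_iter=50):
    """Iterated principal-axis factoring of a correlation matrix R with m factors."""
    R = np.asarray(R, dtype=float)
    # Initial communalities: squared multiple correlation of each variable with the rest.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        R_reduced = R.copy()
        np.fill_diagonal(R_reduced, h2)              # communalities replace the ones
        eigvals, eigvecs = np.linalg.eigh(R_reduced)
        order = np.argsort(eigvals)[::-1][:m]        # keep the m largest eigenvalues
        lam = np.clip(eigvals[order], 0.0, None)     # guard against small negative values
        loadings = eigvecs[:, order] * np.sqrt(lam)
        h2 = (loadings ** 2).sum(axis=1)             # updated communalities
    uniqueness = 1.0 - h2                            # variance left to the specific factors
    return loadings, h2, uniqueness

# Hypothetical two-factor structure used to build an example correlation matrix.
true_loadings = np.array([[0.8, 0.0], [0.7, 0.1], [0.6, 0.2],
                          [0.1, 0.7], [0.2, 0.6], [0.0, 0.8]])
R = true_loadings @ true_loadings.T
np.fill_diagonal(R, 1.0)

loadings, h2, uniqueness = principal_axis_factoring(R, m=2)
print(np.round(h2, 3))
```

Because the diagonal holds communalities instead of ones, only the variance shared among the variables is decomposed, which is why fewer factors than variables are obtained.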

To summarize, the main difference between these techniques is that in PCA the components are constructed as a linear combination of the observable variables, whereas in FA the observable variables are explained by the common factors. Thus, although in PCA we can express the variables in terms of the principal components by way of an algebraic transformation, the two methods will not be equivalent unless the error term in FA tends to zero, since in FA we assume that the specific factors are uncorrelated with each other and with the common factors.

Empirical Study

In line with the above, we take the Arbitrage Pricing Theory as our theoretical framework, which poses, on the one hand, a generative multifactor model of returns and, on the other hand, an arbitrage absence principle, which together produce an asset pricing model. Nevertheless, the scope and limitations of our research are set precisely by the statistical approach to the APT. Our study focuses on the risk extraction process, whose main objective is to uncover the underlying multifactor structure of systematic risk driving the returns on equities, independently of the number and nature of the factors. The risk attribution process is largely outside the scope of the present study; however, in this section we will attempt to provide a first approach to the meaning of the extracted systematic risk factors in order to be able to identify them. Likewise, the test of the arbitrage principle is outside the scope of the current study.²⁰

In other words, the main objective of our empirical study is to uncover the underlying generative multifactor structure of the returns in our sample, using classic dimension reduction or feature extraction techniques such as PCA and FA. The results will show that the generative multifactor model of returns performs very well; however, the systematic risk factors extracted and the betas estimated must be tested in order to verify whether or not they are priced according to the APT pricing model. In a second stage of our methodology, we run an econometric contrast in order to determine which of them are statistically significant and, consequently, whether or not the APT is accepted as an asset pricing model in the context of our study.

The data

The empirical study was carried out on the Mexican Stock Exchange (BMV); for this, two aspects were taken into account: first, that very little research has been done concerning this institution; second, its relevance as an emerging financial market. The stocks selected for this study form part of the IPC (Índice de Precios y Cotizaciones, the exchange's main index) and represent leading companies in the industrial sectors to which they belong; thus, we can consider them to be characteristic securities of the BMV and the Mexican economy. Both the period analysed and the shares selected reflected the availability of data among the diverse information sources consulted. Our basic aim was to build a homogeneous and sufficiently broad database, capable of being processed with the multivariate and econometric techniques involved in the APT model. First, we chose the IPC sample used from February 2005 to January 2006; then, we constructed two return databases taking into account, as the main criterion, that the equities chosen had remained in the IPC sample during all the considered periods for which information was available.²¹ In accordance with these considerations, we prepared a database made up of 20 companies and 291 weekly quotations (DBWR) ranging from July 7, 2000 to January 27, 2006; in addition, we prepared one with 22 shares that included 1,410 observations (DBDR) from July 3, 2000²² to January 27, 2006.²³ We calculated the logarithmic weekly returns considering the assets' closing prices²⁴ for each Friday, in accordance with the following expression:²⁵
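The standard definition of a logarithmic return, which is presumably the one meant here, is sketched below. The symbols r_it (return of stock i in period t) and P_it (closing price) are assumed names.

```latex
\[
r_{it} = \ln\!\left(\frac{P_{it}}{P_{i,t-1}}\right) = \ln P_{it} - \ln P_{i,t-1}
\]
```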

We also built two other databases considering the returns in excess of the riskless interest rate. The interest rates considered as the riskless interest rate were the average weekly and daily funding interest rates using government securities, published by the Bank of Mexico. For the weekly databases, it was necessary to convert the rates into their weekly equivalent to make them comparable with our returns on equities. After that, we subtracted the weekly and daily riskless interest rates from the weekly and daily returns on equities, respectively, in the two databases described above. Thus, we produced two further databases, including the same stocks and observations as the former, but expressed as returns in excess of the riskless interest rate (DBWE and DBDE). Consequently, our study was applied to the four resulting databases, i.e., we tested the two model specifications for the two different databases.
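The construction of the return and excess-return series can be sketched as follows. The prices and the 7.8% annual rate are made-up numbers, and the simple division of the annualized rate by the number of periods per year is an assumed conversion; the exact convention used to obtain the weekly equivalent is not stated in the text.

```python
import numpy as np

def log_returns(prices):
    """Logarithmic returns from a (periods x stocks) array of closing prices."""
    prices = np.asarray(prices, dtype=float)
    return np.diff(np.log(prices), axis=0)

def periodic_rate(annual_rate_pct, periods_per_year):
    """Per-period riskless rate from an annualized percentage rate.

    Assumption: simple (non-compounded) conversion.
    """
    return np.asarray(annual_rate_pct, dtype=float) / 100.0 / periods_per_year

# Hypothetical Friday closing prices for two stocks over three weeks.
prices = np.array([[100.0, 50.0],
                   [102.0, 49.0],
                   [101.0, 51.0]])
# Hypothetical annualized funding rate (in %) prevailing in each return period.
annual_rate_pct = np.array([7.8, 7.8])

r = log_returns(prices)                                   # weekly returns
excess = r - periodic_rate(annual_rate_pct, 52)[:, None]  # returns in excess of the riskless rate
print(np.round(r, 4))
print(np.round(excess, 4))
```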

The period analyzed in this study (2000-2006) was considered according to the following criteria:

This article represents the first part of a larger research project, in which we are testing different techniques for extracting the underlying systematic risk factors in the context of the Mexican Stock Exchange. Principal Component Analysis and Factor Analysis represent the classic techniques to perform that extraction under a statistical approach to the systematic risk factors.

1. Both the techniques used in this article and the other techniques utilized in the next stages of our research have an explanatory and a predictive character. We are first carrying out the explanatory approach, which leads us to divide our dataset into two blocks: one for explanation or training and another for prediction; i.e., the first period is used for the explanation or training of the model, and the second one will be used for testing the predictive power of the estimated generative model of returns. Consequently, the data from 2000 to 2006 were used to extract the generative underlying structure of returns, which explains the behavior of the returns in the training period. This estimation will help us in the next stage of prediction, where the model will be tested in subsequent periods of time (from 2006 on).

2. The other techniques that we are employing in our research are the Independent Component Analysis and the Neural Networks Principal Component Analysis; our objective is to be able to compare the results of the four techniques, concerning both their explanatory and predictive power. Therefore, we are using the same training and prediction periods for the four of them.

3. Additionally, another reason for using this period of the dataset is to be able to compare, in further studies, the effects of the 2008-2009 financial crisis on the estimation of the underlying structure of systematic risk, by way of the extraction of the generative model of returns during the crisis and the post-crisis periods, using the four techniques.²⁶

Preliminary tests

First of all, the following tests were carried out to establish the adequacy of the sample to be treated with multivariate techniques.²⁷ The number of observations in all the databases was suitable.²⁸ The correlation matrix structure ensured the existence of a sufficient correlation level among the variables, according to the results of the following tests. Visual inspection of the correlation matrix revealed that a large number of correlation coefficients exceeded the generally accepted parameters.²⁹ Bartlett's sphericity test verified that the correlation matrix was significantly different from the identity matrix.³⁰ The Kaiser-Meyer-Olkin index, in all four databases, was also very good.³¹ Finally, the anti-image correlation matrix³² and the Measures of Sampling Adequacy (MSA)³³ also produced excellent results. Thus, on the basis of the evidence produced, we were able to proceed with confidence to extract the risk factors using PCA and FA.
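Two of these adequacy checks can be sketched with NumPy using their textbook formulas: Bartlett's sphericity statistic, -[(n - 1) - (2p + 5)/6] ln|R| with p(p - 1)/2 degrees of freedom, and the Kaiser-Meyer-Olkin index built from the anti-image (partial) correlations. The function names and the small example matrix are assumptions; the p-value of Bartlett's test would be read from a chi-square distribution, which is omitted here.

```python
import numpy as np

def bartlett_sphericity(R, n_obs):
    """Chi-square statistic and degrees of freedom of Bartlett's sphericity test."""
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    chi2 = -(n_obs - 1 - (2 * p + 5) / 6.0) * logdet
    dof = p * (p - 1) // 2
    return chi2, dof

def kmo(R):
    """Overall Kaiser-Meyer-Olkin index and per-variable measures of sampling adequacy."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)              # partial correlations given all other variables
    off = ~np.eye(R.shape[0], dtype=bool)        # mask selecting off-diagonal elements
    r2 = np.where(off, R ** 2, 0.0)
    q2 = np.where(off, partial ** 2, 0.0)
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + q2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + q2.sum())
    return overall, msa

# Hypothetical correlation matrix of three stocks; 291 is the weekly sample size reported above.
R = np.array([[1.0, 0.5, 0.5],
              [0.5, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
chi2, dof = bartlett_sphericity(R, n_obs=291)
overall, msa = kmo(R)
print(round(chi2, 2), dof, round(overall, 4))
```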

Extraction of underlying systematic risk factors via PCA and MLFA

In this study, we first obtained the generative multifactor model of returns in expression 1, using the classic multivariate techniques to extract the underlying factors: Principal Component Analysis (PCA) and Maximum Likelihood Factor Analysis (MLFA). Using Matlab® code programmed to perform the PCA and MLFA on our four databases, we obtained the scores of the principal components (Y) and the common factors (F), hierarchically ordered, as well as the matrices of weights for PCA and FA (A and Λ, respectively).

Since there is no single widely accepted criterion to define the best number of components to extract in PCA and in FA, we used nine different criteria usually accepted in the PCA and FA literature. These criteria were: the arithmetic mean of the eigenvalues, the percentage of explained variance, the exclusion of the components or factors explaining a small amount of variance, the scree plot, the unretained eigenvalue contrast (Q statistic), the likelihood ratio contrast, Akaike's information criterion (AIC), the Bayesian information criterion (BIC), and the maximum number of components feasible to estimate in each technique. Considering that each criterion indicated a different number of factors to extract in each database, for the sake of comparison among techniques and pursuing the main objective of extracting a smaller number of risk factors than the number of stocks, we chose a test window for all the databases ranging from two to nine factors, according to the results presented in table 1. Subsequently, we estimated eight different multifactor models to extract from 2 to 9 principal components and common factors for each one of our four databases.³⁴ Then, we proceeded to reconstruct the original variables according to the generation process of each technique by computing the following expression in PCA:³⁵
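With k retained components, a usual way to write the PCA reconstruction is given below as a sketch. The hat and the subscript k are assumed notation, with Y_k the n × k matrix of retained scores and A_k the p × k matrix of the corresponding eigenvectors, applied to centred or standardized data.

```latex
\[
\hat{\mathbf{X}} = \mathbf{Y}_k \mathbf{A}_k^{\top}
\]
```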

and the following expression in FA:³⁶
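Assuming that expression 5 has the usual form Y = XA with orthonormal weights A, and that expression 9 has the usual form X = FΛ' + U, the two reconstructions with k retained components or factors can be written as follows. The subscripts k and r and the superscripts are notational assumptions introduced here.

```latex
% Assumed forms: expression 5 as Y = XA (A orthonormal); expression 9 as X = F\Lambda' + U
\[
X_r^{\mathrm{PCA}} = Y_k A_k' ,
\qquad
X_r^{\mathrm{FA}} = F_k \Lambda_k' ,
\qquad
U = X - X_r
\]
```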

The reconstruction of the observed returns or excesses was outstanding for almost all the stocks in the four databases, which implies that the estimation of the generative multifactor model in the statistical approach of the APT performed by both PCA and FA was successful. Nevertheless, the highest and lowest peaks in some stocks were not very well reconstructed. To save space, we present only the line and stem plots of the observed and reproduced returns and excesses of the first 5 stocks of each database, for the experiment in which we extracted nine underlying factors.³⁷ Figures 1 and 2 show the results of PCA and FA, respectively. The reconstruction is visibly very good in almost all cases.
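The PCA side of this reconstruction can be illustrated numerically. The sketch below uses synthetic returns, not the authors' data or Matlab code, and assumes standardized variables so that the eigendecomposition is taken on the correlation matrix; the names `A`, `Y`, `X_r` and `U` follow the notation of the text, while everything else is hypothetical.

```python
import numpy as np

# Synthetic returns (assumption): 291 observations, 20 stocks, 3 true common factors.
rng = np.random.default_rng(0)
n_obs, n_stocks, k = 291, 20, 9
common = rng.normal(size=(n_obs, 3))
true_loadings = rng.normal(size=(3, n_stocks))
returns = common @ true_loadings + 0.3 * rng.normal(size=(n_obs, n_stocks))

# Standardize, so that the sample covariance of X equals the correlation matrix R.
X = (returns - returns.mean(axis=0)) / returns.std(axis=0, ddof=1)
R = np.corrcoef(X, rowvar=False)

# Eigendecomposition R = U L U', sorted by decreasing eigenvalue.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

A = eigvecs[:, :k]    # weights of the k retained components
Y = X @ A             # component scores
X_r = Y @ A.T         # reconstructed variables
U = X - X_r           # reproduction error

share = eigvals[:k].sum() / eigvals.sum()
rel_error = np.linalg.norm(U) ** 2 / np.linalg.norm(X) ** 2
print(f"explained variance = {share:.3f}, relative squared error = {rel_error:.3f}")
```

With standardized data the relative squared reconstruction error equals one minus the share of variance explained by the retained components, which is why a good reconstruction and a high accumulated explained variance go together.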

Explanation of the variability by the extracted components or factors

The amount of variance explained by each extracted component or factor, as well as the accumulated amount, is presented in table 2. In all cases the first three components and factors explain between 66% and 84% of the variability, which gives some evidence of their importance. Factor analysis outperforms principal component analysis in this respect, since it produces a higher percentage of accumulated explained variance in all four databases. Moreover, in almost all cases the factors extracted by FA explain higher amounts of variance than the components estimated by PCA.
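The explained-variance shares also underlie two of the retention criteria listed earlier: the arithmetic mean of the eigenvalues and the percentage of explained variance. The following sketch shows how those two rules could be applied to a vector of eigenvalues; the function names, the 75% threshold and the eigenvalues themselves are illustrative assumptions, not values from table 1 or table 2.

```python
def n_by_mean_eigenvalue(eigvals):
    """Retain the components whose eigenvalue exceeds the arithmetic mean."""
    mean = sum(eigvals) / len(eigvals)
    return sum(1 for v in eigvals if v > mean)

def n_by_explained_variance(eigvals, threshold=0.80):
    """Smallest number of components whose accumulated share reaches the threshold."""
    ordered = sorted(eigvals, reverse=True)
    total = sum(ordered)
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running / total >= threshold:
            return k
    return len(ordered)

# Hypothetical eigenvalues of a 9-variable correlation matrix.
eigvals = [5.0, 2.0, 1.0, 0.5, 0.5, 0.4, 0.3, 0.2, 0.1]
k_mean = n_by_mean_eigenvalue(eigvals)
k_var = n_by_explained_variance(eigvals, threshold=0.75)
print(k_mean, k_var)
```

The two rules give different answers even on this small example, which mirrors the situation described in the text, where each criterion indicated a different number of factors.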

Interpretation of the extracted factors

Although the second process of the statistical approach to the APT, i.e., the risk attribution stage, is outside the scope of this study, in this section we make a first attempt to propose an interpretation of the meaning or nature of the systematic risk factors extracted, following a classic approach that has been widely used when PCA and FA are employed to reduce dimensionality or to extract features from a multifactor dataset. This approach uses the factor loading matrix estimated in the extraction process to identify the loading of each variable on each component or factor; high factor loadings in absolute terms indicate a strong relation between the variable and the factor. In our context, the factors will be saturated with the loadings of one stock or a group of stocks, which may help us to identify those factors with certain economic sectors, as a first approach to the interpretation of each component or factor.
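The loading-based reading described above can be sketched as a small routine that, for each factor, lists the stocks with high absolute loadings and counts the sectors they belong to. The stock labels, sector assignments, loadings and the 0.5 cutoff below are all hypothetical and serve only to illustrate the mechanics; they are not taken from table 3.

```python
from collections import Counter

def dominant_sectors(loadings, stocks, sector_of, cutoff=0.5):
    """For each factor, return the high-loading stocks and their sector counts.

    loadings[i][j] is the loading of stock i on factor j.
    """
    n_factors = len(loadings[0])
    summary = []
    for j in range(n_factors):
        high = [stocks[i] for i in range(len(stocks))
                if abs(loadings[i][j]) >= cutoff]
        counts = Counter(sector_of[s] for s in high)
        summary.append((high, counts.most_common()))
    return summary

# Hypothetical inputs (not the authors' data).
stocks = ["S1", "S2", "S3", "S4"]
sector_of = {"S1": "mining", "S2": "mining",
             "S3": "construction", "S4": "construction"}
loadings = [
    [0.80, 0.60, 0.10],
    [0.70, 0.55, -0.20],
    [0.75, -0.10, 0.70],
    [0.90, 0.05, -0.65],
]

summary = dominant_sectors(loadings, stocks, sector_of)
for j, (high, sectors) in enumerate(summary, start=1):
    print(f"factor {j}: stocks={high} sectors={sectors}")
```

In this toy case the first factor loads on every stock, which is the pattern one would associate with a market factor, while the second and third factors each single out one sector.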

In line with the previously reported results, we present only the factor loading matrix plots of each database for the experiment in which we extracted nine underlying factors; figure 3 presents the results of PCA and figure 4 those of FA. We constructed tables summarizing the results derived from the analysis of the factor loading matrices and plots, in which we propose an economic sector that may be related to each factor. We grouped together the stocks with the highest loadings on each factor according to the official economic sector classification used in the Mexican Stock Exchange; table 3 presents this summary. In general, as expected by theory, in both techniques and for the four datasets, the first component or factor is clearly related to the market factor. In addition, the interpretation of the factors does not differ between the models expressed in returns and those specified in excesses, with the exception of factor seven extracted via factor analysis in the daily databases. Concerning PCA, the second and third components are identified with the mining and construction sectors, respectively, in the four databases; however, from the fourth to the ninth components, the interpretation differs between the components extracted from the weekly and the daily databases. Regarding FA, there is a marked difference between the interpretation of the factors that affect the weekly and the daily returns, as can be observed in table 3. Comparing both techniques, in addition to the first factor, only the third factor might be identified as the same factor for almost all the datasets and expressions of the model; it corresponds to the construction sector. We can also identify two factors related to two important business groups in Mexico, which we may explain as market movers in the Mexican Stock Exchange. These components or factors are PC5, extracted by PCA from the weekly databases, which may be understood as the Salinas Group factor; and F2, extracted by FA from the weekly datasets, which we may associate with the Slim Group.

Finally, considering the explained variance of each component or factor extracted (see table 2), we could select the first three in each dataset as the main factors. This leads us to think that the market factor (for all datasets and both techniques), the mining factor (for PCA), the Slim Group and communication/commercial factors (for the weekly and daily datasets using FA, respectively), the construction sector factor (for PCA and for the weekly databases in FA), and the radio and television sector factor (for the daily databases in FA) could be the most important factors explaining the returns on equities in the Mexican Stock Exchange.

Econometric contrast

As a complement to our research, we carried out an econometric contrast of the APT using the underlying systematic risk factors extracted via PCA and FA, in order to test its validity as a suitable pricing model for the sample and periods considered. This contrast represents only a first approach to the econometric validation of the APT using PCA and FA, so the result should be viewed in that light. The APT's pricing equation in expression 2 can be tested by way of an average cross-section methodology estimating the ordinary least squares (OLS) coefficients of the following regression model:
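Under the usual form of an average cross-section test, and consistent with the lambdas, betas and independent term discussed in this section, the regression would read as follows. The symbols for the average return of equity i, the number of factors k, the number of equities N and the error term are assumptions introduced here.

```latex
% Assumed notation: \bar{r}_i average return of equity i, \beta_{ij} sensitivity of
% equity i to factor j, \lambda_0 independent term, \lambda_j risk premium of factor j
\[
\bar{r}_i = \lambda_0 + \sum_{j=1}^{k} \lambda_j \, \beta_{ij} + \varepsilon_i ,
\qquad i = 1, \dots, N
\]
```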

Since both factors and sensitivities are computed simultaneously by the multivariate techniques usually employed (Amenc and Le Sourd, 2003), the direct methodology for contrasting the APT under the statistical approach uses the loadings estimated in expression 1 directly as the betas in the preceding regression model (Gómez-Bezares et al., 1994). Nevertheless, as Marin and Rubio (2001) and Nieto (2001) remark, this methodology could present some econometric problems, such as heteroskedasticity and autocorrelation in the residuals in addition to errors in variables, which would yield inefficient OLS estimators with biased variances. One possible solution to these problems is to employ a two-stage methodology widely used in the fundamental and macroeconomic approaches to the APT: in the first stage we estimate the betas to use in expression 13 from the scores of the extracted factors, and in the second stage we estimate the lambdas.

Following Bruno et al. (2002)³⁸, in the first stage we estimated the betas, or sensitivities to the underlying risk factors, to use in expression 13, by regressing the factor scores obtained by the PCA or MLFA as a cross-section on the returns and excesses. In order to improve the efficiency of the parameter estimates and to eliminate autocorrelation in the error terms of the regressions, we used weighted least squares (WLS)³⁹ to estimate the entire system of equations at the same time.⁴⁰ The results of the regressions in the four databases were very good, producing, in almost all cases, statistically significant parameters, high values of the R² coefficients, and Durbin-Watson statistics that led us not to reject the null hypothesis of no autocorrelation.⁴¹
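A simplified version of this first stage can be sketched as equation-by-equation least squares of each stock's returns on the factor scores, followed by the computation of equation weights as the inverse of the estimated equation variances. This is an assumption-laden illustration on synthetic data, not the EViews system estimation used in the study; the function name `first_stage_betas` is hypothetical. When every equation has the same regressors and no cross-equation restrictions, the weighted system coefficients coincide with the unweighted ones, so the weights affect inference rather than the point estimates.

```python
import numpy as np

def first_stage_betas(returns, scores):
    """Estimate betas by regressing each return series on the factor scores.

    returns: (T, N) array of returns; scores: (T, k) array of factor scores.
    Yields betas (N, k), intercepts (N,) and equation weights (N,).
    """
    T, k = scores.shape
    Z = np.column_stack([np.ones(T), scores])              # intercept + scores
    coef, *_ = np.linalg.lstsq(Z, returns, rcond=None)     # shape (k + 1, N)
    resid = returns - Z @ coef
    eq_var = (resid ** 2).sum(axis=0) / (T - k - 1)        # residual variance per equation
    weights = 1.0 / eq_var                                 # inverse-variance equation weights
    return coef[1:].T, coef[0], weights

# Synthetic example (assumption): 291 observations, 20 stocks, 3 factors.
rng = np.random.default_rng(1)
T, N, k = 291, 20, 3
scores = rng.normal(size=(T, k))
true_betas = rng.normal(size=(N, k))
returns = scores @ true_betas.T + 0.01 * rng.normal(size=(T, N))

betas, alphas, weights = first_stage_betas(returns, scores)
print(betas.shape, weights.min() > 0)
```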

In accordance with Jordán and García (2003)⁴², in the second stage we estimated the lambdas, or risk premia, in expression 13 by regressing the betas obtained in the first stage as a cross-section on the returns and excesses, using ordinary least squares. In order to avoid the econometric problems of heteroskedasticity and autocorrelation in the residuals of the model estimated through OLS, we corrected the OLS estimates by means of the Newey-West heteroskedasticity and autocorrelation consistent (HAC) covariance estimates. Additionally, we verified the normality of the residuals by carrying out the Jarque-Bera test. In order to accept the APT pricing model, we require the statistical significance of at least one parameter lambda different from λ₀,⁴³ and the equality of the independent term to its theoretical value, i.e., the average returns, in the models expressed in returns:

and zero, in the models expressed in excesses of the riskless interest rate:

We used Wald's test to confirm these equalities.
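The second stage and the Wald check on the independent term can be sketched as follows. The sketch runs on synthetic betas and average returns and, as a deliberate simplification, uses the classical OLS covariance matrix instead of the Newey-West HAC estimates employed in the study; the function names `second_stage` and `wald_intercept` are hypothetical.

```python
import numpy as np

def second_stage(mean_returns, betas):
    """Cross-sectional OLS of average returns on betas.

    Returns the lambdas (intercept first), their classical OLS covariance
    matrix (not HAC-corrected) and the corresponding t statistics.
    """
    N, k = betas.shape
    B = np.column_stack([np.ones(N), betas])
    lam, *_ = np.linalg.lstsq(B, mean_returns, rcond=None)
    resid = mean_returns - B @ lam
    sigma2 = resid @ resid / (N - k - 1)
    cov = sigma2 * np.linalg.inv(B.T @ B)
    t_stats = lam / np.sqrt(np.diag(cov))
    return lam, cov, t_stats

def wald_intercept(lam, cov, value):
    """Wald statistic for H0: lambda_0 = value (chi-square with 1 d.f.)."""
    return (lam[0] - value) ** 2 / cov[0, 0]

# Synthetic example (assumption): 20 stocks, 3 priced factors.
rng = np.random.default_rng(2)
N, k = 20, 3
betas = rng.normal(size=(N, k))
true_lam = np.array([0.001, 0.004, -0.002, 0.003])
mean_returns = true_lam[0] + betas @ true_lam[1:] + 1e-4 * rng.normal(size=N)

lam, cov, t_stats = second_stage(mean_returns, betas)
wald = wald_intercept(lam, cov, value=0.0)   # zero is the benchmark for models in excesses
print(np.round(lam, 4), round(float(wald), 2))
```

For the models expressed in returns, the same `wald_intercept` call would take the average return as `value` instead of zero.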

In table 4, we present a summary of the results of the econometric contrast for PCA, and in table 5 for MLFA. In general, the results for the explanatory power (the adjusted R-squared, R²*), the statistical significance of the multivariate test (F), and the Jarque-Bera normality test of the residuals are very good in all the contrasted models, except where only two factors were extracted using PCA; with FA, however, more models fail to produce a good level of explanation and are not statistically significant in multivariate terms. The univariate tests for the individual statistical significance of the parameters⁴⁴ priced from one to six factors different from λ₀ in PCA and from one to eight in MLFA, thus giving evidence in favour of the APT in 30 models using PCA and 27 using FA.⁴⁵ Nevertheless, only four models in PCA and three in MLFA fulfilled both the statistical significance of the parameters and the equality of the independent term to its theoretical value, in addition to normality in the residuals. For PCA, these models were the one expressed in weekly returns when seven components were extracted, those expressed in daily returns when three and nine components were retained, and the model expressed in daily excesses with six components. For MLFA, they were the model using five factors in the weekly returns database, and those using eight and nine factors in the daily returns dataset.

Cross-validating the accepted models against the interpretation of factors proposed above, we can state the following. The significant components that affect the accepted weekly model in PCA are the mining and construction sector factors. For the accepted daily models expressed in returns, the previous components are significant as well, in the models that consider two and nine factors; additionally, the model with nine factors is affected by the entertainment consum, the holding-beverage-Salinas Group, and the infrastructure-financial sector factors. The model with 6 betas considers almost the same components, in addition to the market sector factor. Concerning the accepted models in FA, in the weekly database of returns the significant factors would be the market factor and the communication-commercial sector factor. Finally, for the daily databases of returns, they would be the holding-beverage-Salinas Group and the holding-food and beverage sector factors, in the case of the model using 8 betas; and the entertainment consum sector factor and a miscellaneous sector factor not clearly identified, in the case of the model with 9 betas.

Interestingly, the market factor was statistically significant in only two of the accepted models; further research is needed on this issue, as well as on the meaning of the small magnitude and the sign of the estimated individual parameters.

To summarize, for the sample and periods considered, we can accept only partially the validity of the APT using PCA and FA as a pricing model explaining the average returns (and returns in excesses) on equities of the Mexican Stock Exchange. On the other hand, the evidence showed that the APT is very sensitive to the number of factors extracted and to the periodicity and expression of the models.

Conclusions

In general, and in accordance with the scope and limitations of this study, the estimation of the generative multifactor model of returns by means of PCA and FA reproduced the observed returns on equities of our sample very well; thus we can state that both techniques performed an outstanding extraction of the underlying systematic risk factors driving the returns on equities of our sample, under a statistical approach to the APT.

Regarding the interpretation, according to the basic approach carried out in this study, we found that the factors or components driving the returns are sensitive to the technique used and to the periodicity and expression of the returns used in the model.

Conversely, for the sample and periods considered, we can accept only partially the validity of the APT using PCA and FA, as a pricing model explaining the average returns (and returns in excesses) on equities of the Mexican Stock Exchange. On the other hand, the evidence showed that the APT is very sensitive to the number of factors extracted and to the periodicity and expression of the models. The APT model, as applied in this study, did not produce a clear correspondence with the behavior of the returns in the Mexican Stock Market; nevertheless, we have detected some evidence favourable to the APT revealing the presence of priced pervasive statistical risk factors in a large number of models, as well as seven models that fulfilled completely all the requirements for accepting the APT pricing model.

Consequently, we conclude that the performance of the APT statistical approach with respect to the Mexican Stock Exchange presents some inconsistencies that make it unstable and sensitive to the different techniques used for extracting risk factors. Further research will be required to examine alternative approaches for underlying factor extraction, such as Independent Component Analysis (ICA) and Neural Networks Principal Component Analysis (NNPCA), in order to uncover the true generative structure of returns on equities in this emerging market. Finally, our results are consistent with earlier studies in which this statistical approach was applied to other markets and with the number of priced factors found in Mexico through studies in which a macroeconomic approach was used.⁴⁶

References

Abeysekera, S. P. and A. Mahajan (1987). A test of the APT in pricing UK stocks. Journal of Business Finance & Accounting 14 (3): 377-391.

Amenc, N. and V. Le Sourd (2003). Portfolio theory and performance analysis. Great Britain: Wiley.

Aquino, R. Q. (2005). Exchange rate risk and Philippine stock returns: Before and after the Asian financial crisis. Applied Financial Economics 15 (11): 765-771.

Arango, C., G. González, D. Peláez and H. Velásquez (2013). Arbitrage Pricing Theory: Evidencia empírica para el mercado accionario Colombiano, 2005-2012, Working paper, Universidad EAFIT. Available at: http://repository.eafit.edu.co/handle/10784/629

BARRA (1998). United States equity. Version 3. (E3). Risk model handbook. USA: BARRA.

Beenstock, M. and K. F. Chan (1988). Economic forces in the London Stock Market. Oxford Bulletin of Economics & Statistics 50 (1): 27-39.

Bisquerra, R. (1989). Introducción conceptual al análisis multivariante. Un enfoque informático con los paquetes SPSS-X, BMDP, LISREL y SPAD. Vol. I. Barcelona: Promociones y Publicaciones Universitarias.

Bank of Mexico. (2006). Statistical information. Available at: http://www.banxico.org.mx

Bower, D. H., R. S. Bower and D. E. Logue (1984). Arbitrage Pricing Theory and utility stock returns. The Journal of Finance 39 (4): 1041-1054.

Brown, S. J. and M. I. Weinstein (1983). A new approach to testing asset pricing models: The bilinear paradigm. The Journal of Finance 38 (3): 711-743.

Burmeister, E., R. Roll and S. A. Ross (2003). Using macroeconomic factors to control portfolio risk, Working Paper, FTSE-BIRR, New York.

Bruno, N., U. Medina and S. Morini (2002). Contraste factorial del Arbitrage Pricing Theory en el Mercado Bursátil Español. Análisis Financiero (88): 58-63.

Calle, L. F. de la (1991). Diversification of Macroeconomic risk and international integration of capital markets: The case of Mexico. The World Bank Economic Review 5 (3): 415-436.

Cauchie, S., M. Hoesli and D. Isakov (2004). The determinants of stock returns in a small open Economy. International Review of Economics and Finance 13 (2): 167-185.

Carbonell, J. and S. Torra (2003). Contrastación empírica de los modelos de valoración CAPM y APT: Aplicación a los índices de la Bolsa de Valores de Barcelona, Working Paper, Servicio de Estudios, Bolsa de Barcelona, Barcelona.

Chamberlain, G. and M. Rothschild (1983). Arbitrage, factor structure, and mean-variance analysis on large asset markets. Econometrica 51 (5): 1281-1304.

Chan, L. K. C., J. Karceski and J. Lakonishok (1998). The risk and return from factors. Journal of Financial & Quantitative Analysis 33 (2): 159-188.

Chen, N. F. (1983). Some empirical tests of the theory of Arbitrage Pricing. The Journal of Finance 38 (5): 1393-1414.

----------(1991). Financial investment opportunities and the Macroeconomy. The Journal of Finance 46 (2): 529-554.

Ch'ng, H. K. and G. S. Gupta (2001). A test of Arbitrage Pricing Theory: Evidence from Malaysia. Asia Pacific Journal of Economics and Business 5 (1): 76-96.

Cho, D. C., E. J. Elton and M. J. Gruber (1984). On the robustness of the Roll and Ross Arbitrage Pricing Theory. The Journal of Financial and Quantitative Analysis 19 (1): 1-10.

Connor, G. (1995). The three types of factor models: A comparison of their explanatory power. Financial Analysts Journal 51 (3): 42-46.

----------and R. A. Korajczyk (1988). Risk and return in an equilibrium APT: Application of a new test methodology. Journal of Financial Economics 21 (2): 255-289.

Costa, M. da and M. Soares (2009). Teoria de Precificação por Arbitragem: Um estudo empírico no setor bancário Brasileiro. Enfoque: Reflexão Contábil 28 (1): 70-82.

Dhankar, R. S. and R. Singh (2005). Arbitrage Pricing Theory and the Capital Asset Pricing Model-evidence from the Indian Stock Market. Journal of Financial Management & Analysis 18 (1): 14-27.

Elton, E. J. and M. J. Gruber (1988). A multi-index risk model of the Japanese Stock Market. Japan and the World Economy 1 (1): 21-44.

Eviews® (2002). Help of Eviews 4.1. Quantitative Micro Software.

Fuentes, R., J. Gregoire and S. Zurita (2006). Macroeconomic factors in Chilean work performances. Trimestre Económico 73 (289): 125-138.

Gómez-Bezares, F., J. A. Madariaga and J. Santibañez (1994). Valoración de acciones en la bolsa española. Bilbao: Editorial Desclee de Brouwer.

Hair, J. F. Jr., R. E. Anderson, R. L. Tatham and W. C. Black (1999). Análisis multivariante, 5a ed. Madrid: Pearson-Prentice Hall.

Hasbrouck, J. and D. J. Seppi (2001). Common factors in prices, order flows, and liquidity. Journal of Financial Economics 59 (3): 383-411.

Iqbal, J. and A. Haider (2005). Arbitrage Pricing Theory: Evidence from an emerging stock market. Lahore Journal of Economics 10 (1): 123-139.

Jordán, L. and J. García (2003). Estimación y contraste del modelo APT en los fondos de inversión mobiliaria españoles. Análisis Financiero (89): 22-35.

Kristjanpoller, W. and M. Morales (2011). Teoría de la asignación del precio por arbitraje aplicada al mercado accionario chileno. Lecturas de Economía 74: 37-59.

Lehmann, B. N. and D. M. Modest (1988). The empirical foundations of the Arbitrage Pricing Theory. Journal of Financial Economics 21 (2): 213-254.

Liu, Y. and D. Melas (2007). Macroeconomic factors in a fundamental world. MSCI-BARRA Research Insights March: 1-14.

Londoño, C., M. Lopera and S. Restrepo (2010). Teoría de precios de arbitraje. Evidencia empírica para Colombia a través de redes neuronales. Revista de Economía del Rosario 13 (1): 41-73.

López-Herrera, F. and E. Ortiz (2011). Dynamic multibeta macroeconomic asset pricing model at NAFTA stock markets. International Journal of Economics and Finance 3 (1): 55-68.

----------and F. J. Vázquez (2002a). Un modelo de la APT en la selección de portafolios accionarios en el mercado mexicano. Contaduría y Administración (206): 9-30.

----------(2002b). Variables económicas y un modelo multifactorial para la Bolsa Mexicana de Valores: Análisis empírico en una muestra de activos. Revista Latinoamericana de Administración (29): 5-28.

Luque, T. (2000). Técnicas de análisis de datos en investigación de mercados. Madrid: Pirámide.

Marin, J. and G. Rubio (2001). Economía financiera. Barcelona: Antoni Bosch.

Maringer, D. G. (2004). Finding the relevant risk factors for asset pricing. Computational Statistics & Data Analysis 47 (2): 339-352.

Miller, G. (2006a). Equity risk modeling: A comparison of factor models. Horizon. The MSCI-BARRA Newsletter (181): 2-17.

----------(2006b). Needles, haystacks, and hidden factors. Journal of Portfolio Management 32 (2): 25-32.

Navarro, C. M. and R. Santillán (2001). A test of the APT in the Mexican Stock Market, Research Paper, BALAS Conference, University of San Diego, San Diego.

Nieto, B. (2001). Los modelos multifactoriales de valoración de activos: Un análisis empírico comparativo, Working Paper, Instituto Valenciano de Investigaciones Económicas, Alicante.

Oliveira, B. (2011). Arbitrage Pricing Theory in international markets, Master Dissertation, Faculdade de Economia, Administração e Contabilidade, Universidade de São Paulo.

Peña, D. (2002). Análisis de datos multivariantes. Madrid: McGraw-Hill.

Rensburg, P. van (2000). Macroeconomic variables and the cross-section of Johannesburg Stock Exchange returns. South African Journal of Business Management 31 (1): 31-43.

Reinganum, M. R. (1981). The Arbitrage Pricing Theory: Some empirical results. The Journal of Finance 36 (2): 313-321.

Roll, R. and S. A. Ross (1980). An empirical investigation of the Arbitrage Pricing Theory. The Journal of Finance 35 (5): 1073-1103.

Ross, S. A. (1976). The Arbitrage Theory of Capital Asset Pricing. Journal of Economic Theory 13 (3): 341-360.

Saldaña, J., M. Palomo and M. Blanco (2007). Los modelos CAPM y APT para la valuación de empresas de telecomunicaciones con parámetros operativos. Innovaciones de Negocios 4 (2): 331-355.

Sheikh, A. (1996). BARRA's risk models. Barra Research Insights: 1-24.

Shukla, R. and C. Trzcinka (1990). Sequential tests of the Arbitrage Pricing Theory: A comparison of principal components and maximum likelihood factors. The Journal of Finance 45 (5): 1541-1564.

Shum, W. C. and G. Y. N. Tang (2005). Common risk factors in returns in Asian emerging stock markets. International Business Review 14 (6): 695-717.

SUNGARD-APT. (2010). The APT Approach. Available at: http://www.sungard.com/campaigns/fs/alternativeinvestments/apt/insights.aspx.

Tabak, B. and R. Staub (2007). Assessing financial instability: The case of Brazil. Research in International Business and Finance 21: 188-202.

Teker, S. and O. Varela (1998). A comparative analysis of security pricing using factor, macrovariable and Arbitrage Pricing models. Journal of Economics & Finance 22 (2-3): 21-41.

Treviño, M. (2011). Time varying Arbitrage Pricing factors in the Mexican Stock Market, Working paper, Universidad Autónoma de Nuevo León. Available at SSRN: http://dx.doi.org/10.2139/ssrn.1929141

Twerefou, D. K. and M. K. Nimo (2005). The impact of macroeconomic risk on asset prices in Ghana, 1997-2002. African Development Review 17 (1): 168-192.

Uriel, E. and J. Aldas (2005). Análisis multivariante aplicado. Madrid: Thomson.

Valdivieso, R. (2004). Validación de la eficiencia y modelos de fijación de precios en el Mercado Mexicano de Valores, Doctoral Dissertation, Universidad Nacional Autónoma de México, Mexico, D.F.

Visauta, B. and J. C. Martori (2003). Análisis estadístico con SPSS para Windows: Estadística multivariante. Madrid: McGraw-Hill.

Yip, F. and L. Xu (2000). An application of Independent Component Analysis in the Arbitrage Pricing Theory. In S. I. Amari, C. L. Giles, M. Gori and V. Piuri (Eds.). Proceedings of the International Joint Conference on Neural Networks. California: IEEE: 279-284.

Zangari, P. (2003). Equity risk factor models. In B. Litterman (Ed.). Modern Investment Management. New J: John Wiley & Sons: 334-395.

Notes

¹ A revision of empirical studies using approaches other than the statistical one is beyond the scope of this paper; however, interested readers can easily find many references in the financial literature.

² In a second stage, it is possible to identify the pervasive factors with some financial or macroeconomic variables by means of correlation procedures or other kinds of methodologies.

³ For a more extensive study of the MSCI-BARRA model see Amenc et al. (2003), Sheikh (1996), BARRA (1998).

⁴ For more information about BIRR model see Burmeister et al. (2003).

⁵ For more details about the Advanced Portfolio Technology (APT) model see Amenc et al. (2003) and SUNGARD-APT (2010).

⁶ Some references are van Rensburg (2000) on Johannesburg; Ch'ng et al. (2001) on Malaysia; Aquino (2005) on the Philippines; Dhankar et al. (2005) on India; Twerefou et al. (2005) on Ghana; Iqbal et al. (2005) on Karachi; Shum et al. (2005) on Hong Kong, Singapore, and Taiwan; and Fuentes et al. (2006) on Chile.

⁷ The better results may be explained by the non-linear specification of the APT, which is out of the scope of this paper but represents a future line of research of the authors as a continuation of the present work.

⁸ Where β_ji represents the sensitivity of equity i to factor j; F_jt, the value of the systematic risk factor j in time t, common to all the stocks; and the remaining term, the idiosyncratic risk affecting only equity i.

⁹ It is assumed that the factors are uncorrelated with each other, as are the model's residual terms, both with each other and with the factors.

¹⁰ Portfolios which mimic the systematic risk factors in the economy.

¹¹ A mathematical demonstration for obtaining the fundamental pricing equation from the generative multifactor model of returns by the application of the arbitrage argument can be found in Amenc et al. (2003).

¹² See Amenc et al. (2003) and SUNGARD-APT (2010).

¹³ On the other hand, the rest of the approaches usually mix these two differentiated processes in one step.

¹⁴ The eigenvalue decomposition implies: S=ULU'; where S is the covariance matrix; U, the eigenvector matrix; L, the eigenvalue matrix, and U' the matrix U transposed. When we use normalized data the matrix S is equal to the correlation matrix R.

¹⁵ In this study we carry out a two-stage version for the econometric contrast explained in the empirical study section.

¹⁶ The value is one when the correlation matrix (R) is used.

¹⁷ In our case, returns on equities.

¹⁸ In our case, systematic risk factors.

¹⁹ A number always less than one.

²⁰ Forthcoming research will focus on the risk attribution process of the statistical approach as well as on the test of the arbitrage principle of the APT.

²¹ Survival bias: Equities that did not remain in the IPC sample throughout the entire study period, because they were unlisted, substituted, or only present for some periods, were excluded. The purpose of this criterion was to work with a strong database (from a financial point of view), considering only stocks that had survived as part of the IPC sample throughout this period of time, satisfying all the listing and maintenance requirements established by the BMV. See Gomez-Bezares et al. (1994).

²² In this case, we started in July because, until 2000, the IPC sample validity was half-yearly, with the new half-yearly sample beginning in July. From 2001 to 2010, the sample validity was yearly, changing each February.

²³ The number of assets and the periods considered were defined by the available information in accordance with the above-stated criteria. Unfortunately, since there are many gaps in the observations of several stocks in the Mexican market, it is very difficult to build a dataset of quotations that contains both a large number of observations and a large number of stocks. In our case, the 20 and 22 stocks considered represent the maximum number of shares for which we could obtain a sufficient number of observations, allowing us to build complete and homogeneous datasets for both periodicities (without missing values). This fact constitutes a very important aspect for the correct application of the extraction techniques presented. In addition, we decided to use two differently structured databases in order to test the case of weekly and daily returns as well as a larger and a smaller number of observations, according to the different studies found in the literature.

²⁴ Although other studies have included other elements such as dividends and application rights to calculate the return on equities in addition to price variation, we could not incorporate them, as this sort of data was not available to us.

²⁵ Where r_it is the return on equity i in time t; ln, the natural logarithm; P_it, the price of equity i in time t; and P_it-1, the price of equity i in time t minus 1.

²⁶ As stated in the introduction of this article, this study focuses only on the estimation of the explanatory model using PCA and FA. The estimation of the explanatory models using the other techniques referred to, the testing of the predictive power of the estimated models, and the comparison of the results in the crisis and post-crisis periods are outside the scope of this article, and represent other stages of the research conducted by the authors of the present document.

²⁷ Strictly speaking, the first preliminary test would consist in verifying the univariate normal distribution of the returns on equities. We used the Jarque-Bera test on the four databases, finding that in most cases the stocks of our sample did not follow a univariate normal distribution. However, the effects of this condition on our results are beyond the scope of this study.

²⁸ There were 291 observations in two databases and 1 410 in the other two. Luque (2000) recommends having at least 100 cases and no fewer than 50. Hair et al. (1999) considered it necessary to have five times more observations than variables. In our case, those figures would represent 100 and 110, respectively.

²⁹ While some authors believe that a suitable correlation level must be higher than 0.3, many others think it must be at least 0.5.

³⁰ In the four databases we obtained high values in this respect, of 2 162.23 and 2 176.19 in the weekly databases and 9 707.33 and 9 723.98 in the daily databases, with a significance level of zero in all four cases; we therefore reject the null hypothesis that the correlation matrix is an identity matrix, and conclude that the variables were mutually correlated. The higher the value of the statistic and the smaller the significance level, the lower the probability that the correlation matrix is an identity matrix. For more details about Bartlett's sphericity test, see Luque (2000).

³¹ The results for this statistic in all four databases reached levels higher than 0.90. Its feasible values range from 0 to 1; values over 0.80 are considered to be good to excellent. The objective of this test is to compare the magnitudes of the observed correlation and the partial correlation coefficients among variables. For details, see Visauta et al. (2003).

³² This test requires small values for the coefficients. The anti-image matrix is formed with the negatives of the partial correlation coefficients for each pair of variables, neutralizing the effect of the others.

³³ The levels obtained were over 0.90 in almost all cases. The MSA values are found in the main diagonal of the anti-image correlation matrix. They are equivalent to the KMO, but computed for each variable individually, so their parameters and interpretation are the same as for the KMO. See Visauta et al. (2003).
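
The relation among the KMO statistic, the anti-image correlations, and the individual MSA values can be sketched as follows. This is an illustrative implementation under the standard definitions, in which the partial correlations are obtained from the inverse of the correlation matrix; the function name `kmo_and_msa` and the simulated data are hypothetical.

```python
import numpy as np


def kmo_and_msa(data):
    """Return the overall KMO and the per-variable MSA values.

    data: array of shape (n observations, p variables).
    """
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlation of each pair, controlling for all other variables
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    # Keep only off-diagonal elements
    off_diagonal = ~np.eye(corr.shape[0], dtype=bool)
    r2 = np.where(off_diagonal, corr ** 2, 0.0)
    a2 = np.where(off_diagonal, partial ** 2, 0.0)
    # Overall KMO: all pairs pooled together
    kmo = r2.sum() / (r2.sum() + a2.sum())
    # MSA: the same ratio computed variable by variable
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + a2.sum(axis=0))
    return kmo, msa


# Hypothetical example: five series sharing one common factor
rng = np.random.default_rng(1)
common = rng.normal(size=(500, 1))
series = common + 0.5 * rng.normal(size=(500, 5))
kmo, msa = kmo_and_msa(series)
print(kmo, msa)
```

When the partial correlations are small relative to the observed correlations, both the KMO and the MSA values approach 1, which is the situation described in notes 31 to 33.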

³⁴ The total number of estimated multifactor models was 32 for PCA and 32 for MLFA.

³⁵ This expression represents an algebraic transformation of expression 5, taken from Peña (2002).

³⁶ This expression is the same as expression 9 but without the matrix of specific factors U, because this matrix represents the error in the reproduction of the original variables, which will be known only after the reconstruction process and is computed as U = X - Xr, where Xr is the reconstructed matrix X.
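
The reconstruction step described in this note can be sketched for the PCA case as follows. This is a minimal illustration, assuming that the components are extracted from the covariance matrix of the centered data; the function name `reconstruct_with_pca`, the number of retained components `k`, and the simulated data are hypothetical and do not reproduce the exact expressions of the article.

```python
import numpy as np


def reconstruct_with_pca(X, k):
    """Reconstruct X from its first k principal components.

    Returns the reconstructed matrix Xr and the residual U = X - Xr.
    """
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    cov = np.cov(centered, rowvar=False)
    # eigh returns eigenvalues in ascending order; keep the k largest
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1][:k]
    loadings = eigenvectors[:, order]
    scores = centered @ loadings
    Xr = scores @ loadings.T + mean
    U = X - Xr
    return Xr, U


# Hypothetical example: 291 observations of 6 series
rng = np.random.default_rng(2)
X = rng.normal(size=(291, 2)) @ rng.normal(size=(2, 6))
X = X + 0.1 * rng.normal(size=(291, 6))
Xr, U = reconstruct_with_pca(X, k=2)
print(np.abs(U).max())
```

Retaining more components reduces the reproduction error, and retaining all of them reproduces X exactly up to rounding.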

³⁷ In this paper we show results only for this experiment; nevertheless, the rest of the estimations, in which 2, 3, 4, 5, 6, 7 and 8 components or factors were extracted, present similar behavior.

³⁸ In their work, the authors use principal component analysis to extract the underlying risk factors from a set of macroeconomic variables in the Spanish market.

³⁹ According to this methodology, as stated in the EViews® help (2002): "The equation weights are the inverses of the estimated equation variances, and are derived from unweighted estimation of the parameters of the system".
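
The weighting scheme quoted in this note can be illustrated with the following sketch, which is not the EViews® implementation. It assumes that every equation regresses one asset's returns on an intercept and the same extracted factors; the function name `system_wls_betas` and the simulated data are hypothetical.

```python
import numpy as np


def system_wls_betas(returns, factors):
    """Two-step weighted least squares for a system of equations.

    returns: array (T, N), one column per asset (one equation each).
    factors: array (T, k), common regressors for all equations.
    """
    R = np.asarray(returns, dtype=float)
    F = np.asarray(factors, dtype=float)
    T = R.shape[0]
    X = np.column_stack([np.ones(T), F])  # intercept plus factors
    # Step 1: unweighted estimation of every equation
    B_ols, *_ = np.linalg.lstsq(X, R, rcond=None)
    residuals = R - X @ B_ols
    equation_variances = (residuals ** 2).sum(axis=0) / (T - X.shape[1])
    weights = 1.0 / equation_variances
    # Step 2: re-estimate with each equation scaled by sqrt(weight)
    B_wls = np.empty_like(B_ols)
    for i, w in enumerate(weights):
        scale = np.sqrt(w)
        B_wls[:, i], *_ = np.linalg.lstsq(X * scale, R[:, i] * scale, rcond=None)
    return B_wls, weights


# Hypothetical example: three assets and two factors
rng = np.random.default_rng(0)
F = rng.normal(size=(300, 2))
true_betas = np.array([[0.5, -1.0, 0.2], [1.5, 0.3, -0.7]])
R = F @ true_betas + 0.1 * rng.normal(size=(300, 3))
betas, weights = system_wls_betas(R, F)
print(betas[1:])
```

Because no coefficient is shared across equations in this sketch, the weighted point estimates coincide with the unweighted ones; the weights affect the relative contribution of each equation to system-wide statistics.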

⁴⁰ Our first attempt to estimate all the betas in the system of equations was a seemingly unrelated regression (SUR). However, the estimation was not possible, since the SUR methodology requires computing the inverse of the residual matrix and our data produced a nearly singular residual matrix; consequently, it was not feasible to compute its inverse.

⁴¹ To save space, these results are not presented.

⁴² In their study, the authors use factor analysis to extract the underlying risk factors from a set of returns on mutual funds in the Spanish market.

⁴³ Ideally, more than one parameter other than λ₀ should be statistically significant, since the APT assumes that there are multiple underlying risk factors in the economy affecting the returns on equities, not only one.

⁴⁴ t-statistic.

⁴⁵ The total number of tested models was 32.

⁴⁶ See references in the introduction of this paper.