1. Introduction

The objective of the present study is to analyse trends in Mexico’s growing, better-educated labour force and to appraise how the labour market has valued the increasing accumulation of human capital. This objective is pursued by estimating the returns to education, specifically the effect of a rising education level on wages. The level of education chosen and the market rewards in the form of wages may have been affected by several structural changes that occurred in Mexico in recent decades, including an increase in compulsory education years, greater openness to trade, and the privatization of government-owned firms. The Mexican economy has faced sluggish economic growth in recent years; on average, its Gross Domestic Product (GDP) grew only about 2.35 percent per year between 2000 and 2013, which is low compared with other developing countries. To enhance growth, increase flexibility in the labour market, increase tax revenue, and encourage credit, several reforms were approved during 2012 and 2013. It is too soon, however, to study the effects of these reforms on the labour market.

The data used to examine the relationship between wages and education are Mexico’s National Urban Employment Survey (ENEU) for the period 1988-2004 and the National Survey of Occupation and Employment (ENOE) for the subsequent period 2005-2013. A standard way to estimate the effect of education on workers’ income is to apply the structural model of the returns to education proposed by ^{Mincer (1958)}. In this study a nonparametric technique is used to deal with the unobservable characteristics across individuals that arise from the association between education and ability, career choice, or type of occupation over time. Quantile regression, proposed by ^{Koenker and Bassett (1978)}, will be used to account for this bias by separating the unobserved heterogeneity. To preserve comparability over time, the instrumental variables method is not recommended because of the difficulty of finding a convenient instrument with the required properties.

A common problem in household surveys is that individuals do not report their wages, which can enlarge the bias in the estimates. To deal with the selectivity bias potentially present in the sample, several correction methods will be applied: the parametric method developed by ^{Heckman (1979)}, and semi-parametric and semi-nonparametric methods following ^{Gallant and Nychka (1987)}, ^{De Luca (2008)} and ^{De Luca and Perotti (2010)}, which are more flexible than the parametric self-selection correction.

The contribution of the paper is to document a robust declining trend in the returns to schooling in Mexico after 1997. In particular, if the returns to schooling decline in the upper quantiles but not in the lower quantiles, a reduction of inequality between the top and lower quantiles is predicted. These results are consistent with the reduction in income and wage inequality estimated after the North American Free Trade Agreement (NAFTA) by ^{Esquivel (2011)} and ^{Campos (2013b)}.

The paper is structured as follows: section 2 provides a brief literature review of the empirical evidence regarding the trend of the returns to education over time; section 3 presents a brief summary of the schooling and wage trends in Mexico and descriptive statistics for the sample used; section 4 discusses the main issues regarding the estimation of the returns to schooling; section 5 describes the empirical strategy of the study; section 6 presents the results for male workers; and section 7 concludes. The Appendix is presented at the end of the document.

2. Literature review

The estimation of returns to schooling is the main topic of ^{Mincer (1958)}, ^{Schultz (1961)}, ^{Becker (1962)}, ^{Becker (1964)}, ^{Ben-Porath (1967)}, and ^{Mincer (1974)}. In these studies, the authors asserted that schooling raises wages because it directly enhances productivity. In contrast, ^{Spence (1973)} proposed that schooling is related to higher wages through a signalling effect of ability. ^{Weiss (1995)} claimed that the main distinction between the two approaches is this: the first assumes that education is the cause of workers’ productivity differences, and the second assumes that workers’ differences exist prior to the education choice. ^{Regan, Oaxaca, and Burghardt (2007)} developed a neoclassical model of optimal schooling, arguing not only that the Mincerian schooling model overstates the returns to education because it lacks an ability control variable, but also that in a simple schooling model with a linear schooling specification, returns to education have identification problems and cannot be considered an internal rate of return.

The empirical evidence regarding the returns to schooling has accumulated over time. Cross-sectional studies have attempted to disentangle the endogeneity of education and wages. ^{Card (1999)} reviewed literature including studies by ^{Griliches (1977)}, ^{Angrist and Krueger (1991)}, and ^{Ashenfelter and Krueger (1994)}, among others, that use instrumental variables to account for this endogeneity. However, using the instrumental variables methodology to estimate the returns to education over time may also present issues regarding the choice of a convenient instrument, as ^{Heckman and Vytlacil (2005)} pointed out, because different instruments define different parameters. Other studies have estimated returns to schooling by looking at the whole conditional earnings distribution, using the quantile approach of ^{Koenker and Bassett (1978)}. Its robustness properties in the presence of heterogeneity make quantile regression a suitable technique. Many researchers have also studied the evolution of the returns to education. Country case studies show different patterns of the returns to schooling over time. Various studies reflect an increasing, stable, or declining trend; the direction of the trend depends on the characteristics of each country, the time horizon considered, structural reforms, changes in the political system, and the like. For example, ^{Buchinsky (2001)} estimated an increasing trend of the returns to schooling in the US between 1963 and 1980 at all quantiles of the wage distribution. His results also showed that returns to education and experience differ across quantiles, even if the change over time follows the same pattern for all of them.

^{Machado and Mata (2001)} provided evidence of increasing returns to schooling over the period 1982-1994 for Portugal, claiming that education is more valued for highly paid jobs because the impact of education at the tails of the distribution was distinct: the return at the 90th quantile increased by 3 percent, while the returns at the low quantiles decreased by 1.5 percent. Returns to nine-year mandatory schooling decreased over the entire wage distribution, which they attributed to the fall of the returns associated with the elementary education categories. The returns to having a college degree on average increased from 1982 to 1994 for the median and upper quantiles. They concluded that education pays off only beyond a certain level, and that when it does, education is more valued for highly paid jobs.

An increasing trend of returns to education over time is also estimated for transition economies; ^{Flabbi, Paternostro, and Tiongson (2007)} used comparable data for eight countries^{1} from the early transition period up to 2002. They suggested that this tendency was caused by the institutional and structural factors present over this period.

Some authors have compared different countries using similar control variables. ^{Martins and Pereira (2004)} presented quantile estimates for the returns to schooling for 16 European countries for the mid-1990s and compared trends over time across countries. They contrasted specific cross-country returns to schooling according to the accessibility of country data and accounted for differences in the data observations; diverse sources of information such as household, employee, or employer surveys; and different wage measures, gross or net. They found a robust stylised fact that returns to schooling are larger for more skilled individuals, and thus returns to schooling increased across deciles for the set of countries analysed. Greece showed a rather decreasing trend across deciles, but they claimed this was because the available wage data were net of taxes, which makes the comparison with other countries difficult. Italy and Austria also reported net wages, and their trend across deciles was opposite to that of Greece.

Estimates for Germany have shown stable returns to schooling. ^{Fitzenberger and Kurz (2003)} provided an empirical analysis of the structure of earnings in West Germany across skill groups and industries for the period 1984-1994. They used panel data with a block bootstrap procedure to account for heterogeneity and autocorrelation in the error term. They found a uniform trend over time, as well as different effects of human capital and industry variables on earnings across quantiles.

Conversely, other studies have found declining trends over time. ^{Naticchioni, Ricci, and Rustichelli (2010)} compared public and private sector workers in Italy and found a deeper decline in the private sector, which was the result of institutional factors such as stronger unions, higher wage compression, and other labour market conditions that did not affect the public sector. Unlike other countries where the trend of Educational Wage Premia (EWP) has been stable or slightly increasing for some groups, Austria has shown a decrease of the EWP for all educational attainments at all quantiles.

^{Fersterer and Winter-Ebmer (2003)} estimated returns to education in Austria over the period 1981-1997 using cross-sectional data. They estimated that returns to secondary and tertiary education dropped for all quantiles and that the spread of returns is lower for women. In addition, they offered an explanation of this decline, which is consistent with a rise in the number of highly educated workers over the last two decades.

In general, they found higher estimated returns to education at higher quantiles, and all the coefficients are statistically different from each other. Furthermore, the fall in returns over time is relatively similar across quantiles, although women’s returns to schooling fell disproportionately in the lowest decile.

According to some studies for the case of Mexico, the tendency of the returns to schooling has not followed a linear development over 1987-2002. ^{Rodriguez (2005)} found that the highest rates of returns to education were present in 1991-1992 and the lowest, which coincide with the Mexican peso crisis, in 1994-1995. Moreover, the dispersion of the returns to education among regions increased, first because of the peso crisis and second because of the effects of the North American Free Trade Agreement (NAFTA). However, as he mentioned, it is striking to observe that returns to schooling have decreased in recent years. ^{Lopez (2006)} found that returns to schooling for upper secondary rose sharply in the late 1980s and early 1990s and then fell after 1993. However, returns to tertiary education continued to rise until 1996 before falling to levels that remained superior to those observed in the early 1990s. Moreover, she explained this drop as a cyclical fall in education premium in recession times, which was observed in other Latin American countries.

^{Robertson (2004)} found that, after NAFTA, the relative price of skill-intensive goods fell and, as a result, the relative wage of skilled workers decreased. ^{Esquivel (2011)} also found a reduction in income inequality after NAFTA, explained by a reduction in labour income and a rise in remittances and public transfers. ^{Campos (2013b)} found decreasing wage inequality at the top of the wage distribution after 1996; he attributed this trend to a decrease in returns to education, slower demand growth, and an increase in the supply of college-educated workers.

3. Schooling and wage trends in Mexico

The information used in this study is obtained from Mexico’s National Urban Employment Survey (ENEU) for the period 1988-2004 and the National Survey of Occupation and Employment (ENOE) for the subsequent period 2005-2013. The two surveys are equivalent in terms of the questions addressed in this research, even though the questionnaire changed. Another important change is the sampling strategy; nonetheless, it is possible to make comparable inferences. The ENEU-ENOE surveys present quarterly data and form a five-quarter panel, which means that each family is followed for five consecutive quarters. The estimation will consider only men who responded to the questionnaire in the third quarter of each year. Selecting the third quarter not only avoids the overrepresentation of individuals in the sample but also provides a constant measure of income that is not distorted by additional payments such as profit sharing, vacation pay, and Christmas bonuses, among other items that are usually paid in the first two and last quarters of the year. The analysis will be based on men aged between 20 and 55 years. The upper limit is a conservative bound because, according to the Social Security Law^{2} prevailing in Mexico during the studied period, workers could retire before they turned 60 years old if they had worked at least 1 250 weeks, among other requirements.^{3}
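The sample restrictions described above can be sketched as a simple filter. This is only an illustration: the record layout and the field names (`sex`, `quarter`, `age`) are hypothetical and do not correspond to the variable names in the ENEU or ENOE files.

```python
# Hypothetical record layout: field names are illustrative assumptions,
# not the variable names used in the ENEU or ENOE microdata.
def in_estimation_sample(record):
    """Return True for men aged 20 to 55 interviewed in the third quarter."""
    return (
        record["sex"] == "male"
        and record["quarter"] == 3
        and 20 <= record["age"] <= 55
    )


records = [
    {"sex": "male", "quarter": 3, "age": 34},    # kept
    {"sex": "male", "quarter": 1, "age": 34},    # dropped: not the third quarter
    {"sex": "female", "quarter": 3, "age": 41},  # dropped: sample is men only
    {"sex": "male", "quarter": 3, "age": 58},    # dropped: above the age bound
]

sample = [r for r in records if in_estimation_sample(r)]
```

Keeping one interview per year for each respondent is what prevents the same individual from being counted several times within a year.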

The ENEU-ENOE total sample from 1988 to 2013 comprises 2 123 527 men, among whom two main groups can be identified: 1) individuals who worked the week previous to the interview, who represented 84 percent, equivalent to 1 783 692 men; and 2) individuals out of the labour force or unemployed, who represented the remaining 16 percent of the total sample. To determine whether the individuals are regular workers, not seasonally employed, the survey asks whether they have worked the whole year. The information from this question reveals that 98.32 percent of the workers were active the whole year; workers who report a positive wage represented 73.6 percent of the total sample. The majority of workers (72.69 percent) are employees, 25.41 percent are self-employed, and the remaining 1.90 percent of the workers do not receive payment (see Table A1 in the Appendix).

A common problem in household surveys is unreported wages. Individuals, especially those with high wages, may intentionally not report wages or may deliberately report lower-than-actual wages. People with high wages may be cautious because of the high insecurity levels perceived in Mexico in recent years. For example, the rate of extortion per 100 000 inhabitants grew from 1.16 to 6.79 between 2000 and 2013. This crime increased about five times more than any other type according to official accusation reports from the police offices of the 32 Mexican states.^{4} At the same time, people earning low wages might report even lower wages so that they can be considered beneficiaries of a social program. The percentage of workers who report wages is 87.5 percent, while a slightly lower percentage, 84 percent, of the whole-year workers reported wages. It seems that a small percentage of workers are not reporting wages: 12.5 percent of all workers and 16.3 percent of the whole-year workers. The relevant issue is to analyse the trend of the unreported wages. Using quarterly information from the ENOE survey, ^{Campos (2013a)} found an increasing proportion of workers with missing wages; in fact, he showed that the individuals with the largest proportion of unobserved wages have higher levels of education, such as high school and university.

A simple regression of missing wages on schooling years yields a coefficient of 0.005 that is statistically significant at the 99 percent confidence level; thus there is a small but positive correlation between unreported wages and higher levels of education. This suggests that, although the sample is still random, there could be a potential endogeneity issue in the estimation of the returns to education if unreported wages are ignored.
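As an illustration of what such a regression measures, the sketch below fits a one-regressor least squares line to a 0/1 indicator of a missing wage. The data are constructed, not taken from the surveys: the share of missing wages is assumed to rise by exactly 0.005 per schooling year from a hypothetical base of 0.08, so that the fitted slope can be checked by hand.

```python
def ols_slope_intercept(x, y):
    """Closed-form least squares fit of y on a constant and a single regressor x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Constructed data (assumption): 1 000 men at each schooling level from 0 to 20
# years; 80 + 5 * years of them have a missing wage, i.e. a share of
# 0.08 + 0.005 * years.
schooling, missing_wage = [], []
for years in range(21):
    n_missing = 80 + 5 * years
    for i in range(1000):
        schooling.append(years)
        missing_wage.append(1 if i < n_missing else 0)

slope, intercept = ols_slope_intercept(schooling, missing_wage)
```

Because the dependent variable is binary, the slope reads as the change in the probability of a missing wage associated with one more year of schooling.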

The proportion of individuals acquiring education has been rising. The total men’s sample shows that 53 percent of the population holds basic education (primary and secondary). People with upper levels of education (high school, and college or more) represent 44 percent of the total. Education levels follow a sustained increasing trend over the period, which is consistent with the shrinking share of the labour force with no approved schooling years, which represents only 3 percent.

Figure 1 shows the percentages of the population with secondary, high school, and college and more levels of education over time; the percentages shown are calculated with respect to each year’s observations. The secondary level refers to the population that attained between 7 and 9 schooling years, individuals with high school obtained between 10 and 12 years of education, and individuals with college or more attained at least 13 schooling years.

Source: Own calculations from ENEU 1988-2004 and ENOE 2005-2013. Proportions are obtained from the total observations per year.
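The schooling-year bands that define the education levels in Figure 1 can be written as a small classification rule. The bands for secondary, high school, and college or more follow the text; the split of the lower levels into "primary" (1 to 6 years) and "no schooling" (0 years) is an assumption made here for completeness.

```python
def education_level(schooling_years):
    """Map approved schooling years to an education level.

    Bands for secondary (7-9), high school (10-12) and college or more (13+)
    follow the text; the two lowest bands are assumptions.
    """
    if schooling_years < 0:
        raise ValueError("schooling years cannot be negative")
    if schooling_years >= 13:
        return "college or more"
    if schooling_years >= 10:
        return "high school"
    if schooling_years >= 7:
        return "secondary"
    if schooling_years >= 1:
        return "primary"      # assumed band: 1 to 6 years
    return "no schooling"     # assumed band: 0 approved years


levels = [education_level(years) for years in (0, 6, 9, 12, 16)]
```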

The figure also shows that the proportion of the population with secondary education grew steadily and exceeded the other education levels over time; in 1988 it was around 21 percent, and it increased to 30 percent in 2013. As shown in the figure, the proportion of the population with college or more education reached the same level as the population with secondary education only during 1991-1997. After 1997 the percentage of individuals with college education or more fell to about the same level as the high school education percentage. The percentage of the population with college or higher education dropped from 23.87 percent in 1997 to 22.26 percent in 1998, and for individuals with high school education the reduction was smaller, from 22.81 to 22.76 percent.

Even though the evolution of educational achievement has not been linear and the education choice is influenced by labour market outcomes, there is a clear increase in education levels over the last 26 years. The lower levels of education dropped by 43 percent; the secondary and high school levels increased by about the same amount, 40 percent; and college and more education rose 20.5 percent from 1988 to 2013. The resulting trend indicates that the Mexican population is increasing its schooling years when entering the labour market. Because the data are repeated cross-sections, the evolution provides evidence of the presence of more educated workers, which is relevant for quantifying how the returns to education are affected by this change in the composition of the labour force.

Figure 2 shows the evolution of the real hourly wage over the analysed period of 1988-2013; the largest reduction in the average real hourly wage coincides with the economic crisis of 1994-1995, and other drops are observed after 2001 and in 2008. The implementation of NAFTA in 1994 was an important change that reduced wage inequality; as stated by ^{Robertson (2004)}, wage inequality increased between 1988 and 1994 and fell after NAFTA.

Source: Own calculations from ENEU 1988-2004 and ENOE 2005-2013. Note: Average real hourly wage is expressed in Mexican Pesos at constant prices of 2013 and calculated for the objective sample of workers aged between 20 and 55 years who reported a positive wage.

Figure 2 shows an estimated reduction of 18 percent in the average real hourly wage between 1988 and 2013. Although the trend captures only the evolution of the sample analysed, it hints at an association between an increasingly better-educated labour force and a reduction in real hourly wages.

4. Estimation of returns to schooling

The human capital earnings function proposed by ^{Mincer (1974)} has commonly been used to estimate returns to schooling. This model focuses on the life-cycle dynamics of earnings, both observed and potential, and of human capital investments. It relies on assumptions about the functional form, such as linearity in schooling,^{5} a quadratic specification for experience, and independence between schooling and experience.

Mincer’s model approximates an equation that can be estimated linearly, where returns to schooling estimates are the same for any level of education.
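For reference, the textbook form of this earnings function is sketched below. The symbols are assumptions introduced here for illustration: S_i is years of schooling, E_i is labour market experience, and ρ is the return to one additional year of schooling, which is constant across education levels because schooling enters linearly.

```latex
\ln(\text{wage}_i) = \alpha + \rho\, S_i + \beta_1 E_i + \beta_2 E_i^{2} + \varepsilon_i
```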

The estimation of the returns to education is affected by the bias that arises from the correlation between wages, education, occupation choice, participation in the labour market, and latent unobservable characteristics. The following subsections present two of the most problematic biases, ability and selectivity, along with the strategy to deal with them.

*4.1. Ability bias*

In general, the studies that estimate the returns to education have acknowledged the correlation between earnings, education, and occupation choice. Researchers have been cautious with the inferences of potential causal effects because it is complex to identify whether higher income is due to higher education, or whether individuals with higher earnings have chosen to attain more schooling or a certain type of occupation. The causality could be established using instrumental variables; ^{Card (1999)} presents a review of papers that have tried to measure the causal effect of education on earnings by using instruments on the supply side of the education system as determinants of education outcomes. Other papers, such as ^{Griliches (1977)}, ^{Angrist and Krueger (1991)}, and ^{Ashenfelter and Krueger (1994)}, have found that returns to schooling using instrumental variables are as big as or even greater than the corresponding OLS estimates, and they claim this as evidence of an ability bias in the OLS estimates.

The empirical strategy does not use the instrumental variables method to deal with the endogeneity caused by the ability bias, because the instrument may also change over time, affecting the dynamic comparison of the estimates on the wage distribution (^{Imbens and Angrist, 1994}). Instead, it applies quantile regression, which employs information near the specified section of the distribution without relying on any distributional assumptions. Quantile regression therefore provides an association between wages and schooling that corresponds to particular sections of the distribution, offering a more complete view of the relationship between the variables under study. However, the endogeneity issue may still be present.

The specification and interpretation of quantile regression are similar to those of ordinary least squares (OLS). On the one hand, quantile estimation minimises a weighted sum of absolute residuals, which can be seen as an optimal point estimator under asymmetric loss, with the median as the symmetric case, while OLS minimises the sum of squared residuals. On the other hand, each quantile coefficient measures the effect of schooling on wages at a particular section of the distribution, while the OLS estimate represents the average effect. Unlike the OLS objective, the quantile objective function is not differentiable, so quantile estimators cannot be obtained in closed form. Nonparametric techniques such as the bootstrap can therefore be applied to obtain the variance-covariance matrix of the estimates and the standard errors needed for inference on the significance of the estimated quantile coefficients.
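The bootstrap idea can be illustrated in its simplest setting, the standard error of a sample median. This is a minimal sketch under stated assumptions: the log wages are made-up values, and the same resampling logic would be applied to full quantile regression coefficients rather than to a single quantile.

```python
import random
import statistics


def bootstrap_se(sample, statistic, replications=500, seed=0):
    """Nonparametric bootstrap standard error of `statistic`.

    Resamples the data with replacement, recomputes the statistic each time,
    and returns the standard deviation of the recomputed values.
    """
    rng = random.Random(seed)
    n = len(sample)
    draws = []
    for _ in range(replications):
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        draws.append(statistic(resample))
    return statistics.stdev(draws)


# Illustrative log hourly wages (assumed values, not survey data).
log_wages = [2.1, 2.4, 2.5, 2.7, 2.9, 3.0, 3.2, 3.3, 3.6, 4.1, 4.8]
se_median = bootstrap_se(log_wages, statistics.median)
```

Fixing the seed makes the result reproducible; in applied work the number of replications would normally be chosen larger than in this toy example.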

^{Koenker and Bassett (1978)} formally set out the quantile regression estimator. This approach estimates the local effect that education has on wages at any point of the distribution, thus accounting for existing unobserved heterogeneity without affecting the temporal comparison of returns to schooling (^{Naticchioni, Ricci, and Rustichelli, 2010}). The standard approach, OLS, estimates the mean regression of the distribution, while quantile regression provides a more complete picture of the returns to schooling because it computes several regressions for different points within the wage distribution. For quantile 50, the median, symmetry implies that minimising the sum of absolute residuals equates the number of positive and negative residuals, which ensures that the same number of observations lies above and below the median. Other quantiles are obtained by minimising asymmetrically weighted absolute residuals. Formally, the quantile estimator is obtained by solving the following optimization problem:
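In the standard Koenker and Bassett formulation, this problem can be written as shown here. The symbols are notation assumed for illustration and may differ from the paper's own display: w_i is the log wage of individual i, x_i is the vector of characteristics, and β_θ is the coefficient vector at quantile θ.

```latex
\hat{\beta}_{\theta} = \arg\min_{\beta}
\left[
\sum_{i:\, w_i \ge x_i'\beta} \theta \,\lvert w_i - x_i'\beta \rvert
\;+\;
\sum_{i:\, w_i < x_i'\beta} (1-\theta)\,\lvert w_i - x_i'\beta \rvert
\right]
```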

The definition stands for the minimum point at which the cumulative probability given *x* is larger than or equal to *θ*, that is, the *θ−th* quantile. The resulting coefficient is also called the “quantile estimated coefficient”. Quantile coefficients arise from an optimization problem, where sample observations are sorted by a bootstrap procedure. Finding the *θ−th* sample quantile requires ranking the observations to split the residual vector into its positive and negative parts. More specifically, the quantile estimator minimises the sum of asymmetrically weighted absolute residuals to characterise the entire distribution; the median case is the Least-Absolute Deviation (LAD) estimator.

*4.2. Selectivity bias*

The estimation of the returns to education can be biased if only those who report a positive wage are considered. According to ^{Heckman (1979)}, there is a specification error or omitted variables problem resulting in biased estimated coefficients of the returns to education. The bias arises because individuals may not report a wage either because they are not working or because they choose not to report their wage. ^{Gronau (1974)} also established the presence of selectivity bias because of different job search strategies, which in turn affects the participation decision and the distribution of wage offers. ^{Keane and Wolpin (1997)} argued that school attendance is a choice that depends on endowments and financing constraints; if individuals have different characteristics then the returns to education may be mixed with these effects. Therefore, self-selection issues may also be present because schooling, work, and occupational choice are correlated.

Let *w* denote the natural log of the real hourly wage over the entire wage distribution. Define *w*_{1} as missing wages from the surveys, which are unobserved not only because of lack of information regarding the reservation wage but also because some workers decide not to report a wage. Define *w*_{2} as positive wages that workers report in the surveys. ^{Heckman (1979)} proposed a parametric method to correct the estimation using a selection equation to account for factors that influence the decision to participate in the labour market, such as number of children, education, age, marital status, partner’s wage, and seasonal work, among others; these factors are included in the vector of characteristics, *x*_{1}. The coefficient that measures the relationship between the observed attributes and the decision to participate is *α*_{0}. Equation (2) will be referred to as the participation equation:

The idiosyncratic error term *v* represents unobservable characteristics that affect the decision to participate in the labour market, for example preferences for work, family, schooling, ability, and occupation choice, among others. By assumption, *E(v|x*_{1}*)* = 0.

The wage equation relates the observed wage to socioeconomic characteristics, such as education, age, marital status, and social security coverage. Equation (3) defines the relationship between the observed wage and personal characteristics; the estimated coefficients are in the vector β_{0}. The wage equation is also called the outcome equation:

However, the conditional mean of the disturbances in Equation (3) is not zero, E(u|x_{2}) ≠ 0, because of the unobserved variables that play a role in wage determination and that are related to education, the participation decision, occupational choice, and ability.

The empirical strategy is to account for the ability and selectivity biases. Even though the ability bias may still be present, quantile regression allows separating the endogenous relationship between wages and education within the wage distribution. Following ^{Buchinsky (1998)}, Equation (3) in the quantile context is:

where *θ* represents the selected quantiles: 10, 25, 50, 75, and 90. Let *Q*_{θ}*(w|x*_{2}*)* represent the estimated quantile of wages conditional on the characteristics defined in *x*_{2} for each quantile *θ*:

Also, the presence of self-selection or selectivity bias may affect the decision to participate in the labour force, making the estimated relationship between wages and education biased and inconsistent. The wage is observed only for those who are working, that is, for those who obtained a wage higher than their reservation wage; therefore:

In the presence of the selectivity bias, the disturbances in the quantile regression form:

This means that the disturbances causing the selectivity bias in the outcome equation have a relationship with the disturbances in the participation equation, and therefore both disturbances *u* and *v* are related. ^{Buchinsky (1998)} used a semi-parametric technique to estimate a function known as index *g* to relate the disturbances *u* and *v* with a common error.

The estimation of the outcome wage equation in the quantile context, accounting for the selectivity bias, is:

^{Buchinsky (1998)} defined *h*_{θ}*(g)* ≡ *Q*_{θ}*(u*_{θ}*|x*_{1}*, w*_{2} *> w*_{1}*)*, implying that the relevant variables for the estimation of the index *g* depend on the attributes from the reservation wage Equation (2), which states the probability to work:

The empirical strategy to deal with the selectivity bias focuses on how to estimate *h*_{θ}*(g)*. ^{Heckman (1979)} was the first to suggest a correction using a parametric specification, which assumes a bivariate normal density for the disturbances to obtain the inverse Mills ratio. The strategy is to obtain a proxy for the unobserved omitted variables and include it in the outcome equation. ^{Klein and Spady (1993)} pointed out the risks of parametric misspecification of the density because it can bias the estimated coefficients even more. In fact, ^{Heckman (1979)} proposed estimating the outcome equation in the second step using OLS which, as he claimed, provides unbiased but inefficient estimates.
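Under the normality assumption, the correction term for an individual observed working is the ratio of the standard normal density to the standard normal distribution function, evaluated at the participation index. The sketch below computes that ratio with the standard library only; the index values are hypothetical and would, in practice, come from a first-step probit estimate of the participation equation.

```python
import math


def standard_normal_pdf(z):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def standard_normal_cdf(z):
    """Distribution function of the standard normal, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(index):
    """pdf(index) / cdf(index) for a participation index."""
    return standard_normal_pdf(index) / standard_normal_cdf(index)


# Hypothetical participation indices (assumed values for illustration).
for z in (-1.0, 0.0, 1.0):
    print(z, round(inverse_mills_ratio(z), 4))
```

The ratio falls as the index rises: individuals who are very likely to participate receive a small correction, while those with a low participation index receive a large one.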

The estimated coefficients are inconsistent when the assumption about the distribution of the disturbances is incorrect (^{Andrews and Schafgans, 1998}). ^{Newey (1988)}, ^{Ahn and Powell (1993)}, and others used Heckman’s approach to develop semiparametric estimators relaxing the normality assumption or any other functional form of error distribution.

^{Buchinsky (1998)} estimated the selectivity bias using a two-step procedure similar to ^{Heckman (1979)}, ^{Newey and Powell (1990)}, and ^{Klein and Spady (1993)}. In the first step, α_{0} and β_{0} are estimated to obtain the function *g*, using the Semiparametric Least Squares (SLS) estimator suggested by ^{Ichimura (1993)}. In the second step, the estimated bias is used to obtain a nonparametric correction term, approximated by a polynomial *h(g)*, which is included in the quantile regression together with the observable characteristics defined in *x*_{2}, as expressed in Equation (8).

5. Empirical estimation

The motivation for studying the returns to schooling is the rise of a better-educated labour supply in Mexico together with a decrease in real wages in recent years. However, there are issues related to the estimation of the returns to schooling. Since quantile regression allows separating unobserved characteristics, such as ability, within the conditional distribution of wages and is a semi-nonparametric estimation, the preferred method to deal with the selectivity bias should be one that does not assume a disturbance distribution a priori. The approach to correct for selectivity bias proposed by ^{Heckman (1979)} is a parametric estimation that assumes normality of the disturbance distribution. Semi-nonparametric methods are alternatives that deal with the selectivity bias more flexibly because they do not assume any particular functional form for the disturbance distribution.

Assuming that endogeneity and selectivity bias do not change over time, the quantile coeﬃcients without any correction are still valid because they will describe the trend over time, even though the levels may be underestimated. In other words, the long-run trend is more important than the level return values for this study. Quantile regression can estimate robust returns to schooling not only for the average but also for every wage percentile in the conditional distribution, making possible the comparison of the inter-quantile estimates over time without making any inference regarding causality, but finding an association between wages and schooling.
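To make the quantile logic concrete, the following minimal sketch uses the check function of ^{Koenker and Bassett (1978)} in the simplest setting, a sample quantile with no regressors. The function names and the wage values are illustrative assumptions and are not taken from the ENEU or ENOE data.

```python
def check_loss(residual, theta):
    """Check function rho_theta(u) = u * (theta - 1[u < 0])."""
    return residual * (theta - (1.0 if residual < 0 else 0.0))


def sample_quantile(values, theta):
    """Return the observed value that minimises the summed check loss."""
    return min(values, key=lambda q: sum(check_loss(v - q, theta) for v in values))


# Hypothetical log hourly wages (illustrative numbers only).
log_wages = [2.1, 2.4, 2.6, 3.0, 4.5]
median = sample_quantile(log_wages, 0.5)
upper = sample_quantile(log_wages, 0.9)
```

Quantile regression replaces the constant `q` with a linear function of the regressors, so a separate schooling coefficient is obtained for each θ.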

The empirical strategy in this paper is estimating the returns to education in two steps, following ^{Buchinsky (1998)}. In the first step a measure of the selectivity bias is estimated, and in the second step the quantile regression will be estimated after accounting for the selectivity bias obtained from the first step.

In the first step, the selectivity correction will consider parametric, semi-parametric, and semi-nonparametric correction techniques. The parametric correction is estimated using ^{Heckman’s (1979)} approach to obtain the inverse Mills ratio. The selection equation in the Heckman model is:
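A probit selection equation consistent with the variables defined next can be sketched as follows. The coefficient symbols γ and δ are assumed labels for illustration, and the exact specification estimated in the study may differ.

```latex
\Pr(worker_{it} = 1) = \Phi\left(\gamma_{0} + \gamma_{1}\, way_{it} + \gamma_{2}\, child_{it} + X_{it}'\delta\right)
```

Here Φ denotes the standard normal distribution function.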

The binary variable *worker*_{it} takes the value of 1 if the individual worked last week; *way*_{it} is a binary variable that indicates whether an individual worked the whole year; otherwise, the worker is a seasonal worker. The variable *child*_{it} refers to the number of children. A peculiarity of these surveys is that the question on the number of children is asked only of the women in the household. This could create measurement error in the sample because only men are considered in the estimation of the returns to education, so the number of children reported by the wife is matched to her husband to obtain each man’s number of children. This issue is controlled by using information gathered from nuclear families, which are composed of a father, mother, and children; however, the measurement error would be large in households with more than one family if the children are not correctly attributed to their corresponding fathers. The expression *X*_{it} represents a set of variables that includes an indicator of whether the person works the whole year, the number of approved schooling years, binary variables indicating whether the person has medical services, a polynomial specification in age, and binary variables indicating marital status. For summary statistics of these variables for the total sample, see Table A1 in the Appendix.

Following ^{Buchinsky (1998)}, after the estimation of the Heckman selection equation the inverse Mills ratio is obtained and expanded in a quadratic polynomial; this will be the estimated bias, *h(g)*, to be included in the outcome equation.
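The construction of the parametric correction term can be sketched as follows. This is a minimal illustration that assumes the probit index has already been estimated; the function names, the fitted index values, and the choice to carry only the linear and squared terms of the inverse Mills ratio are assumptions made for the example.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(index):
    """Inverse Mills ratio lambda(z) = phi(z) / Phi(z) for a fitted probit index z."""
    return normal_pdf(index) / normal_cdf(index)


def correction_terms(index):
    """Terms of a quadratic expansion in the inverse Mills ratio: (lambda, lambda^2)."""
    lam = inverse_mills(index)
    return lam, lam * lam


# Hypothetical fitted probit indices for three men (illustrative values only).
fitted_indices = [-0.5, 0.0, 1.2]
terms = [correction_terms(z) for z in fitted_indices]
```

Both terms would then enter the outcome equation as additional regressors, and their estimated coefficients define *h(g)*.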

An alternative method is to estimate the unknown distribution of the disturbances as a joint bivariate distribution using semi-parametric and semi-nonparametric methods. Following ^{De Luca (2008)}, it is more efficient to estimate Equations (8) and (10) jointly by maximizing the log-likelihood function. For this purpose, the dependent variables are set to be binary: the dependent variable of the selectivity equation is equal to 1 if the individual is working and 0 otherwise. In the outcome equation, the dependent variable is a binary variable that takes the value of 1 when the individual reports a positive wage and 0 otherwise.

The bivariate binary-choice model estimates the predicted joint probability of the event that an individual reports wages and works:

The difference between the semi-parametric and semi-nonparametric techniques lies in the method used to approximate the unknown distribution of the disturbances.^{6} Neither approach makes any parametric assumption about the distribution of the disturbances, in contrast to the parametric approach. ^{Buchinsky (1998)}, ^{Martins (2001)}, and others have used kernel functions to estimate the unknown disturbance distribution, which is a semiparametric approach because it implies choosing a parametric specification of the index function. If the semi-parametric method to correct selectivity bias assumes that the disturbances follow a bivariate Gaussian distribution, the model is the Seemingly Unrelated Bivariate Probit (Biprobit). The estimated probability allows estimating an index *g*, which in turn is used to approximate the selectivity bias with a quadratic polynomial. As in the Heckman case, the same variables are used to estimate the selectivity equation.

If the Gaussian assumption is relaxed, the methodology becomes a semi-nonparametric correction (^{De Luca, 2008}). The semi-nonparametric approach allows a more flexible functional form for the unknown distribution of the disturbances, because it uses a joint bivariate density function of the disturbances approximated by a Hermite Polynomial Expansion (HPE). According to ^{Gallant and Nychka (1987)}, the HPE form is computationally easier than the traditional multivariate normal. In the first stage, the unknown joint density of the disturbances, defined in Equation (11), is approximated by a Hermite polynomial expansion following ^{De Luca (2008)} and ^{De Luca and Perotti (2010)}, which in turn are based on the work of ^{Gallant and Nychka (1987)}. The difference between these studies is that ^{De Luca (2008)} allows the Hermite polynomial order to differ. The empirical estimation considers four different specifications of the degree of the HPE. A linear specification in both the participation Equation (2) and the wage reporting Equation (3) is denoted as (r1r1); a linear specification in the participation equation and a quadratic specification in the wage reporting equation is denoted as (r1r2), and vice versa (r2r1); and a quadratic specification in both equations is denoted as (r2r2). To calculate the probability in the participation equation, only the schooling variable and the binary variables for medical services are included; otherwise, there is collinearity in the second stage.
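For reference, a Hermite polynomial approximation of a bivariate density in the spirit of ^{Gallant and Nychka (1987)} is commonly written in the following general form. The notation is assumed for illustration and is not necessarily that of Equation (11).

```latex
h^{*}(u_{1}, u_{2}) = \frac{1}{\psi_{R}} \, \tau_{R}(u_{1}, u_{2})^{2} \, \phi(u_{1}) \, \phi(u_{2}),
\qquad
\tau_{R}(u_{1}, u_{2}) = \sum_{j=0}^{R_{1}} \sum_{k=0}^{R_{2}} \tau_{jk} \, u_{1}^{j} u_{2}^{k}
```

Here φ is the standard normal density and ψ_{R} is the constant that makes the density integrate to one. Under this assumed notation, the label r1r2 would correspond to R_{1} = 1 and R_{2} = 2.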

In the second step, the returns to education are estimated by including the estimated selectivity bias from the first stage as a regressor in the quantile regression (^{Arias, Hallock, and Sosa, 2001}).

The model follows a simple Mincer specification of the earnings equation, estimated by OLS and quantile regression using the information of every male worker (i) at any time (t) over the period 1988-2013. The outcome equation is defined as:
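A specification consistent with the variable definitions that follow can be sketched as below. Apart from β_{t}, the coefficient symbols (α_{t}, δ_{1t}, δ_{2t}) are assumed labels, and the exact form estimated in the study may differ.

```latex
\ln(wage)_{it} = \alpha_{t} + \beta_{t}\, educ_{it} + \delta_{1t}\, form_{it} + \delta_{2t}\, marital_{it} + f(age)_{it} + h(g)_{it} + u_{it}
```

In the quantile regression, all coefficients are additionally indexed by the quantile θ.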

Where ln*(wage)*_{it} is the natural logarithm of the hourly real wage at 2013 prices. Following ^{Buchinsky (1998)}, the set of variables *X*_{it} included in the selection equation is also included in the outcome equation. The set includes a continuous variable of attained education, *educ*_{it}, whose coefficient β_{t}, the parameter of interest in this study, is the estimated return to education in year *t*. A dummy variable that serves as a proxy for formality, *form*_{it}, takes the value of 1 if the individual has medical services provided by the Mexican Social Security Institute (IMSS); 39.86 percent of workers have access to this service. The marital status variable, *marital*_{it}, is a dummy that takes the value of 1 if the individual is married and 0 otherwise. Tenure, or any other variable that measures current experience^{7} in the labour force, is not available in the dataset. Thus, to avoid the specification error associated with the potential experience measure proposed by ^{Mincer (1958)}, age is used as a control variable in this study, where *f (age)*_{it} is a fourth-order^{8} polynomial in age:
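Written out with assumed coefficient labels κ_{1} to κ_{4} (the symbols are illustrative), a fourth-order polynomial in age takes the form:

```latex
f(age)_{it} = \kappa_{1}\, age_{it} + \kappa_{2}\, age_{it}^{2} + \kappa_{3}\, age_{it}^{3} + \kappa_{4}\, age_{it}^{4}
```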

^{Regan and Oaxaca (2009)} proposed a measure of experience based on the actual hours of work over several years using panel data information.^{9} ^{Card (1999)} mentioned that the estimates could be lower in comparison with those obtained when the experience variable is explicitly included. The estimated selectivity bias obtained from step one is represented by *h(g)*, and *u*_{it} is the error term.

6. Results

The dataset allows capturing the labour market conditions over time. It is a repeated cross-section dataset, which provides information on different workers at different points in time. The estimation quantifies the effect of education for men over the period 1988-2013. Quantile estimation provides a closer look at the overall wage distribution; results are calculated for each year and for the 10th, 25th, 50th, 75th, and 90th quantiles (θ). The results are presented graphically because this makes it easier to compare trends in the quantile returns to education under the different estimated selectivity bias corrections (parametric, semi-parametric, and semi-nonparametric). The standard errors are obtained from a bootstrapped variance-covariance matrix that also includes the between-quantile blocks, which allows coefficients to be compared across quantiles.

The results are shown in two graphs. Figure 3 shows the schooling coeﬃcient including the selectivity bias using a parametric (Heckman) and a semi-parametric (Biprobit) correction. Figure 4 shows the schooling coeﬃcients with selectivity bias using diﬀerent specifications of the Hermite Polynomial Expansion (HPE) approximations of the semi-nonparametric (SNP) technique.

Source (Figures 3 and 4): Own calculations from ENEU 1988-2000, ENE 2000-2005, and ENOE 2005-2013. The estimated coefficients are shown in the Appendix, Table A2.

There are 156 returns to education coefficients estimated for each of the selectivity bias correction specifications: five quantile coefficients and one OLS coefficient for each of the 26 years in the period 1988-2013. Quantile regression is a location measure of the effect of education on wages. In general, the estimated quantile coefficients do not cross one another. Tests of quantile crossing are rejected for quantiles 50, 75, and 90 in every year under any of the selectivity bias correction models. However, for the lowest quantiles (10 and 25), the null hypothesis of quantile crossing is not rejected in some years.

The estimates of the uncorrected model^{10} for the lowest quantile (Q10) are 4.6 percent in 1988, 6.4 percent in 1997, and 5.1 percent in 2013. The Heckman and Biprobit selectivity correction models provided higher estimates than the uncorrected specification: the Heckman estimates range from 0.048 to 0.113, and the Biprobit estimates are larger, ranging from 0.06 to 0.146.

Figure 3 shows the same quantile specification defined in Equation (12); the difference between the estimates is the inclusion of the selectivity bias correction by the Heckman and Biprobit methods. The choice of selectivity correction makes the estimated returns to education vary. The parametric and semi-parametric corrections assume either a normal disturbance distribution or a particular distribution of the index *g*. These assumptions can lead to a misspecification of the true model. The figure shows that the semi-parametric Biprobit estimates exhibit abrupt jumps compared with the uncorrected returns to education and the parametric Heckman correction. In fact, the parametric selectivity bias correction stays close to the uncorrected estimates, although at the end of the period the Heckman method implies lower returns to education.

The currency crisis of 1994-1995 reduced wages dramatically; it also reduced the returns to education for the lowest quantile in 1995. For the upper quantiles, the effect of the crisis is captured only by the Biprobit model. However, later estimated drops in the returns to education do not coincide with the crises of 2001 and 2008. The uncorrected and Heckman-corrected returns to education are notably higher in the first part of the analysed period than at the end.

Figure 4 presents the quantile coefficients using the semi-nonparametric selectivity correction with four different specifications of the HPE order. The joint probability of reporting a wage and participating in the labour market is obtained in the first stage. The semi-nonparametric selectivity-corrected estimates lie around the uncorrected estimates. They appear to be shifts of the uncorrected model: although they differ in magnitude, they follow the same trend. As the order of the Hermite Polynomial Expansion increases, the estimated returns to education tend to be lower. The linear specification (r1r1) provided the highest estimates, in the range of 0.32 to 0.155. The combined linear and quadratic specifications (r1r2 and r2r1) were similar in magnitude, and the quadratic specification in both equations (r2r2) provided the smallest coefficients.

In order to choose among the semi-nonparametric specifications, ^{De Luca (2008)} and ^{De Luca and Perotti (2010)} suggested using the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC) to select the polynomial degree of the HPE.^{11}

The r1r2 specification provides low AIC and BIC values. Both measures fall as the polynomial order increases, and the fully quadratic specification (r2r2) provides the lowest AIC and BIC. However, the semi-nonparametric model (r2r2) may suffer from misspecification because its estimated returns to education tended to zero. This issue is also stated by ^{Gallant and Nychka (1987)}; even if the semi-nonparametric methods do not make any assumptions regarding the distribution of the disturbances, they do make assumptions regarding the polynomial orders that have to be chosen.
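The information criteria comparison can be sketched as follows, using the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The log-likelihoods, parameter counts, and sample size below are hypothetical placeholders, not the values obtained in this study.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical first-stage fits: (log-likelihood, number of parameters).
fits = {
    "r1r1": (-5210.0, 12),
    "r1r2": (-5195.0, 13),
    "r2r1": (-5198.0, 13),
    "r2r2": (-5180.0, 15),
}
n_obs = 10000  # hypothetical sample size

scores = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

As the discussion above notes, the specification with the lowest criteria is not adopted mechanically when its estimates look misspecified.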

Results presented in Figures 3 and 4 show a declining trend over time, which is evident for most of the model specifications used. However, it is more appropriate to apply a formal test to determine whether the returns to schooling follow a linear or nonlinear trend over time, if any. A simple model is applied to each quantile, regressing the estimated returns to education on a time trend for every selectivity bias correction model. The linear coefficients are shown in Table A4 in the Appendix. In general, the slope coefficients are negative, indicating a linear declining trend, except for the Biprobit selectivity bias specification, in which the coefficients are mostly positive and statistically significant only for the top quantile. The nonlinear trends (quadratic or cubic) are also shown, but only for the semi-nonparametric specification (r1r2), in Table A5 in the Appendix. The quadratic coefficients were negative and statistically significant for most of the quantiles, indicating a concave trend. The nonlinear trend was rejected for the lowest quantile. The preferred correction is the semi-nonparametric method because of the flexibility of not assuming a disturbance distribution prior to estimation; although there is still a choice of the degree of the polynomial specification in the first step, this choice can be guided by comparing the AIC and BIC measures.
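The linear trend test can be illustrated with a minimal sketch that regresses a series of estimated returns on a time index by ordinary least squares. The years and coefficient values are hypothetical and are not taken from Table A2; the sketch covers only the linear case and omits standard errors.

```python
def ols_slope_intercept(x, y):
    """Slope and intercept of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical median (Q50) returns to education for five survey years.
years = [1988, 1994, 2000, 2006, 2013]
returns_q50 = [0.095, 0.100, 0.092, 0.085, 0.078]

# Time trend measured in years since the first observation.
trend = [year - years[0] for year in years]
slope, intercept = ols_slope_intercept(trend, returns_q50)
```

A negative `slope` corresponds to the declining linear trend described above; adding squared and cubic trend terms would give the nonlinear versions.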

After finding a concave declining trend over time, it is useful to compare how the within-conditional distribution changed from 1988 to 2013. For comparison purposes, this exercise includes not only the preferred semi-nonparametric specification (r1r2) but also the uncorrected estimates of the returns to education and the coefficients obtained with the Heckman and Biprobit selectivity corrections.

The comparisons are based on the ratio of the top quantile estimate, Q90, to each of the lower quantiles, Q75, Q50, Q25, and Q10. This comparison provides a proxy for the variation in the inequality of the returns to schooling among the quantiles at two points in time, 1988 and 2013. Consistent with empirical evidence for other countries, such as that of ^{Buchinsky (1998)}, ^{Martins and Pereira (2004)}, and ^{Machado and Mata (2001)}, top quantiles obtained larger returns to education than the lower quantiles. Because the results showed that the estimated returns to education declined over time, the question to be addressed is how the inequality among the quantiles changed.

Table 1 shows that without any selectivity correction the ratio of the top to the lowest quantile falls by an estimated 0.31 percent, while the reduction is 39.91 percent under the semi-nonparametric model and 16 percent under the parametric model; only the semi-parametric model showed an increase in the inequality between the top and lowest quantiles. A similar pattern is obtained between the top quantile and Q25, although the uncorrected model estimates an increase in the ratio of 16.78 percent.

Selectivity bias correction | Q90-Q10 | Q90-Q25 | Q90-Q50 | Q90-Q75 |
---|---|---|---|---|
Heckman | -16.00 | -2.29 | 8.71 | 17.92 |
Biprobit | 106.31 | 86.78 | 51.79 | 22.26 |
Uncorrected | -0.31 | 16.78 | 8.83 | 3.25 |
Semi-nonparametric (r1r2) | -39.91 | -21.07 | -16.25 | -8.46 |

Source: Own calculations from table A1. Note: Q90-Q10 denotes the ratio of the estimated coefficient for Q90 to that for Q10; entries are the percent change in each ratio between 1988 and 2013.
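The inter-quantile comparison can be sketched as follows, assuming that each entry is the percent change between the two years in the ratio of the top-quantile coefficient to a lower-quantile coefficient. The coefficient values are hypothetical and chosen only to keep the arithmetic transparent.

```python
def ratio_change_percent(top_start, low_start, top_end, low_end):
    """Percent change in the ratio top/low between the start and end years."""
    ratio_start = top_start / low_start
    ratio_end = top_end / low_end
    return 100.0 * (ratio_end - ratio_start) / ratio_start


# Hypothetical returns to education: Q90 and Q10 in 1988 and in 2013.
change = ratio_change_percent(top_start=0.12, low_start=0.06, top_end=0.09, low_end=0.06)
```

In this example the ratio falls from 2.0 to 1.5, a reduction of 25 percent, which would be read as a narrowing of the gap between the top and lowest quantiles.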

The changes between 1988 and 2013 in the ratios involving the upper quantiles, Q75 and Q50, are in the same direction. There is an increase in the inequality of the returns to education for the two ratios, Q90-Q50 and Q90-Q75; the Biprobit selectivity correction model estimated the largest rises, 51.79 and 22.26 percent respectively. The parametric selectivity correction and the uncorrected coefficients showed increases from 1988 to 2013 of 8.71 and 8.83 percent, respectively, in the ratio of the top to the median quantile, Q90 and Q50. Completely different estimates of the inequality measurement are obtained between the top two quantiles.

In general, the preferred semi-nonparametric estimates show a reduction in the ratio of the top quantile relative to the lower quantiles, a difference that increases in absolute value moving downward in the conditional wage distribution. However, this reduction in inequality between quantiles arises not only because education in the lowest quantile is more valued but also because education in the top quantile is less valued in 2013 than it was in 1988.

7. Conclusion

This study found the relationship between wages and education to be strongly positive. The labour market coordinates the supply of workers and the salaries that employers are willing to pay them according to their qualifications; the result should therefore be evident in the workers' wage premium, depending on their education level. The Mexican labour market has been changing dramatically in recent decades because of several structural reforms. It is relevant to analyse how the market has valued additional human capital investments made not only at different parts of the wage distribution but also at different points in time.

The paper provided evidence of a declining trend in the estimated quantile coefficients over time. The empirical strategy tried to control for ability and selectivity biases. To deal with the ability bias, quantile regression allowed comparing returns to education in a particular section of the wage distribution. Even though the problem of endogeneity may still be present, the results reveal a positive association between education and wages. It is also found that upper quantiles obtained larger returns to education in every year considered under any model specification. To deal with the selectivity bias, parametric, semi-parametric, and semi-nonparametric corrections were applied. The Biprobit specification provided estimates that are out of the range of the rest of the estimates; the uncorrected and Heckman specifications provided similar returns to education, but at the end of the period the Heckman correction showed an even steeper declining trend. The semi-nonparametric specifications showed that the uncorrected estimates are shifted down as the order of the Hermite Polynomial Expansion increases. The preferred method to deal with the selectivity bias is the semi-nonparametric one because it is more flexible in its assumption about the disturbances, although the order of the polynomial specification still has to be chosen. In this study the AIC and BIC statistics are used to select a linear specification in the worker equation and a quadratic specification in the wage equation, that is, the r1r2 specification.

The return to schooling increased moving from lower to upper quantiles, although the marginal increase across quantiles became smaller. The 1994 currency crisis sharply reduced wages; the effect is evident for workers in the lowest quantiles, where it was more negative than for workers in the upper quantiles. The estimated coefficients became smaller by the end of the period analysed. The inter-quantile analysis, which estimates the inequality of the return to education between the top quantile and each of the lower quantiles, Q10, Q25, Q50, and Q75, reveals a reduction under the semi-nonparametric selectivity bias correction; the uncorrected estimates show a reduction in the gap between the top and lowest quantiles, while the Heckman correction estimates a reduction in the inequality between the top quantile and the lower quantiles, Q10 and Q25. These results imply a reduction in inequality between quantiles that arises because education in the top quantile is less valued in 2013 than it was in 1988. The results obtained do not promote an optimistic view of future educational investment. If returns to education are indicators of the efficiency of education in raising wages, these results may discourage individuals from investing in their own human capital; it should therefore be a concern for policy makers to promote job creation to absorb a growing, better educated labour force.