Evaluation of polynomial regression models for the Student t and Fisher F critical values, the best interpolation equations from double and triple natural logarithm transformation of degrees of freedom up to 1000, and their applications to quality control in science and engineering

Verma, Surendra P.

Revista mexicana de ciencias geológicas

On-line version ISSN 2007-2902; print version ISSN 1026-8774

Rev. mex. cienc. geol, vol. 26, no. 1, Ciudad de México, April 2009

Evaluación de modelos polinomiales de regresión para los valores críticos de t de Student y F de Fisher, las mejores ecuaciones de interpolación con transformación logarítmica natural de tipo doble y triple de grados de libertad hasta 1000, y su aplicación en el control de calidad en ciencias e ingenierías

Surendra P. Verma

Centro de Investigación en Energía, Universidad Nacional Autónoma de México, Priv. Xochicalco s/no., Apartado Postal 34, Temixco, Mor. 62580, Mexico. spv@cie.unam.mx

Manuscript received: June 17, 2008
Corrected manuscript received: September 4, 2008
Manuscript accepted: September 4, 2008

ABSTRACT

Serious gaps exist in the present critical value tables for the Student t and Fisher F or ANOVA significance tests. Statistically correct applications of these tests to the experimental data therefore become difficult. A total of 18 different regression models were evaluated for the Student t and 24 for the Fisher F critical values. These models varied from simple polynomial (quadratic to 7th order) to the combined single (ln), double (lnln), or triple (lnlnln) natural–logarithm– (ln–) transformed polynomial models. The advantage of ln–, lnln– or lnlnln–transformations of the degrees of freedom for interpolating the Student t and Fisher F critical values has been documented for the first time in the published literature. The use of critical value equations applicable in the range of 1–1000 degrees of freedom for ln–transformation, 2–1000 for lnln–transformation, or 3–1000 for lnlnln–transformation, instead of the tables, is proposed as a 21st century innovation for the computer programming of these significance tests. A number of application examples are pointed out to illustrate the usefulness of this work.

Key words: F–ratio, ANOVA, critical value, degrees of freedom, reference material, significance tests.

RESUMEN

Las tablas de valores críticos para las pruebas de significado de t de Student y F de Fisher o ANOVA, se caracterizan por serias deficiencias. Aplicaciones estadísticamente correctas de estas pruebas a los datos experimentales, por lo tanto, se hacen difíciles. Se evaluaron un total de 18 modelos de regresión para los valores críticos de t de Student y 24 modelos para los de F de Fisher. Estos modelos varían de modelos polinomiales (de tipo cuadrático hasta la 7ª. potencia) sencillos hasta los polinomiales combinados con la transformación logarítmica natural (ln) de tipo sencillo (ln), doble (lnln), o triple (lnlnln). Las ventajas de la transformaciones ln, lnln o lnlnln de los grados de libertad para la interpolación de los valores críticos de t de Student y F de Fisher se demuestran por primera vez en la literatura publicada. Para la programación computacional de dichas pruebas de significado se propone, en lugar de las tablas, como la innovación del siglo XXI, el uso de las ecuaciones de valores críticos aplicables en el intervalo de 1–1000 grados de libertad para la transformación ln, de 2–1000 para la transformación lnln, o de 3–1000 para la transformación lnlnln. Se presentan una serie de ejemplos de aplicación con el fin de ilustrar la utilidad de este trabajo.

Palabras clave: Relación F, ANOVA, valor crítico, grados de libertad, materiales de referencia, pruebas de significado o significancia.

INTRODUCTION

Quality control in all branches of science and engineering demands the application of significance tests, such as the Student t, Fisher F or F–ratio, and ANOVA or analysis of variance (e.g., Anderson, 1987; Ebdon, 1988; Otto, 1999; Jensen et al., 2000; Miller and Miller, 2000; Bevington and Robinson, 2003; Verma, 2005; Walker and Maddan, 2005). It is customary to apply these tests at a given pre–determined confidence level (CL) such as 95% (e.g., Miller and Miller, 2000) or 99% (e.g., Verma, 1998, 2005; Verma and Quiroz–Ruiz, 2008). Such critical values or percentage points should thus be available for all degrees of freedom (dF or v) required for their statistically correct application. An examination of the published literature readily reveals that this is not the case.

The critical values for the Student t test are available in most literature sources (e.g., Verma, 2005) as a total of 42 critical values corresponding to v = 1(1)30(5)50(10)100(100)200(300)500(500)1000 (each set being for seven two–sided CL of 60% to 99.8%, or equivalently one–sided CL of 80% to 99.9%), and for the Fisher F test as 20 values of horizontal dF (HdF) v₁ = 1(1)12(3)15(5)30(10)50(50)100(900)1000 and 39 values of vertical dF (VdF) v₂ = 1(1)30(5)40(10)60(20)100(100)200(300)500(500)1000 (the Fisher F values are generally available for 95% and 99% CL). In this notation, a(s)b denotes the values from a to b in steps of s. This shows that serious gaps exist in the critical value tables for these very frequently used significance tests, e.g., within the dF range of 1–1000 and for any given CL, 958 values out of 1000 are missing for the Student t and a total of 980×961 values for the Fisher F. Note that the critical value corresponding to the dF of ∞ is not considered here because ∞ is undefined as a number in mathematical terms and refers to the population (and not to a statistical sample).
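The counts quoted above can be checked with a short script. This is a minimal sketch: the function names `expand_grid` and `missing_count` are illustrative and not part of the original work, and the grid strings are copied from the paragraph above.

```python
import re

def expand_grid(spec):
    """Expand table shorthand such as '1(1)30(5)50' into the tabulated dF values."""
    # The digits alternate as: start, step, stop, step, stop, ...
    nums = [int(n) for n in re.findall(r"\d+", spec)]
    values = [nums[0]]
    for step, stop in zip(nums[1::2], nums[2::2]):
        values.extend(range(values[-1] + step, stop + 1, step))
    return values

def missing_count(grid, upper=1000):
    """Number of integer dF in 1..upper that are absent from the grid."""
    return upper - len([v for v in grid if v <= upper])

T_GRID = expand_grid("1(1)30(5)50(10)100(100)200(300)500(500)1000")
HDF_GRID = expand_grid("1(1)12(3)15(5)30(10)50(50)100(900)1000")
VDF_GRID = expand_grid("1(1)30(5)40(10)60(20)100(100)200(300)500(500)1000")

print(len(T_GRID), missing_count(T_GRID))      # 42 958
print(len(HDF_GRID), missing_count(HDF_GRID))  # 20 980
print(len(VDF_GRID), missing_count(VDF_GRID))  # 39 961
```

The product 980×961 therefore counts the (v₁, v₂) combinations for which neither dF is tabulated.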

I present a new methodology for the interpolation of the existing critical values, evaluate 18 and 24 different regression models for the Student t and Fisher F, respectively, and propose the new best–fitted polynomial double or triple natural logarithm–transformed equations (defined as lnln and lnlnln functions, respectively) that allow us to extend the availability of critical values for all dF (v for the Student t; v₁ and v₂ for the Fisher F) from 1 up to 1000, i.e., 1(1)1000.

REGRESSION PROCEDURE AND THE INTERPOLATION OF CRITICAL VALUES

For the manipulation of critical values, linear to cubic regressions have been used in the literature (e.g., Bugner and Rutledge, 1990; Rorabacher, 1991; Verma et al., 1998). In fact, I tried several polynomial fits (from quadratic up to 7th order) to obtain new equations for the Student t and Fisher F critical values, but to my surprise none of them performed satisfactorily for interpolation purposes (see Figure 1 and the explanation below in this section). The failure of the polynomial fits motivated me to transform the data before undertaking the polynomial regressions. Single natural–logarithm (ln) transformation for statistically correct handling of compositional data was proposed long ago by Aitchison (1986) and was used more recently by Verma et al. (2006) for proposing new discriminant function diagrams. To my pleasant surprise, this methodology has also recently provided excellent interpolations of critical values of discordancy tests (Verma and Quiroz–Ruiz, 2008).

Although representing an incomplete treatment of compositional data, log–transformation has been traditionally used for fitting a linear function to a log–transformed compositional ratio variable (Na/K) in geothermal fluid geothermometry (Fournier, 1979; Verma and Santoyo, 1997; Díaz–González et al., 2008). For this example of geothermics to be comparable to my present work, quadratic and higher–order regression fits should have been evaluated. Furthermore, a correct log–ratio transformation would be to use more than two compositional variables and a common denominator for log–ratios as suggested by Aitchison (1986, 1989), Verma et al. (2006), and Agrawal et al. (2008); dealing with just one such ratio (Na/K) is not sufficient to recognize the multivariate nature of the compositional data (Aitchison, 1989; Agrawal and Verma, 2007).

Prior to the polynomial regressions, three types of natural–logarithm transformations of the dF (v, v₁, and v₂), called here the ln, lnln, and lnlnln functions, were carried out and evaluated for the first time in the published literature. These three ln–transformations mean that one uses the ln(v), ln(ln(v)), and ln(ln(ln(v))) variables, respectively, instead of the raw v for the Student t, or the raw v₁ or v₂ for the Fisher F, in the theoretical regression function. The results of the evaluation of the quadratic to 7th order fits are graphically presented in Figure 1.
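The three transformations, and the reason their ranges of validity start at 1, 2, and 3 degrees of freedom respectively, can be illustrated with a small sketch. The helper names `nested_ln` and `smallest_valid_df` are hypothetical and used only for illustration.

```python
import math

def nested_ln(v, depth):
    """Apply the natural logarithm `depth` times: 1 -> ln, 2 -> lnln, 3 -> lnlnln."""
    x = float(v)
    for _ in range(depth):
        if x <= 0.0:
            # ln is undefined for zero or negative arguments
            raise ValueError(f"v={v} is too small for depth={depth}")
        x = math.log(x)
    return x

def smallest_valid_df(depth):
    """Smallest integer dF for which the nested logarithm is defined."""
    v = 1
    while True:
        try:
            nested_ln(v, depth)
            return v
        except ValueError:
            v += 1

for depth in (1, 2, 3):
    print(depth, smallest_valid_df(depth))
```

Because ln(1) = 0 and ln(ln(2)) < 0, the lnln function cannot be evaluated at v = 1 and the lnlnln function cannot be evaluated at v = 1 or 2.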

First, the quality parameter R², called the multiple–correlation coefficient (Bevington and Robinson, 2003), was used (Figures 1a, 1c, 1e). R² is simply an extension of the well–known concept of the linear–correlation coefficient r, which characterizes the correlation between two variables at a time, to include multiple correlations, such as polynomial correlations, between groups of variables taken simultaneously. The parameter r is useful for testing whether one particular variable should be included in the theoretical function that is fitted to the data, whereas the parameter R² characterizes the fit of the data to the entire function (Bevington and Robinson, 2003; Verma and Quiroz–Ruiz, 2008; Verma et al., 2009). Thus, a comparison of the R² for different functions is useful in optimizing the theoretical functional forms such as those evaluated in the present work (Figures 1a, 1c, 1e).
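As a rough illustration of how such a fit–quality parameter can be computed, the sketch below uses the common coefficient–of–determination form, R² = 1 − SS_res/SS_tot. This form is an assumption made here for illustration; the exact definition used in the paper follows Bevington and Robinson (2003) and may differ in detail. The function name `r_squared` is hypothetical.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

# A perfect fit gives 1; a fit no better than the mean gives 0.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```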

Secondly, the sum of the squared residuals SSR = {∑(cv_table – cv_calc)²}_int was investigated as the other quality parameter (Figures 1b, 1d, 1f), where cv_table is the value listed in a table for the Student t or Fisher F test for any given CL, and cv_calc is the value calculated from the corresponding regression equation; the subscript int emphasizes that the regression equations are for interpolation purposes only and should not generally be used for extrapolation of the data, although Verma and Quiroz–Ruiz (2008) have shown that such ln–transformed equations may also be useful for extrapolation. No attempt was made to normalize this quality parameter (SSR) with respect to the number of tabulated critical values for a given case, nor with respect to some other variable such as the mean critical value, because the main interest was to use it for the visual comparison of the different (18 for the Student t and 24 for the Fisher F) regression models (Figures 1b, 1d, 1f). Nevertheless, the use of normalized SSR values would only change the vertical scale in Figure 1, without any modification in the observed trend.
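The residual sums are straightforward to compute. The sketch below implements SSR as defined above, together with its absolute–residual counterpart SAR, which the paper uses among its best–fit criteria. The numbers in the example are hypothetical and are not taken from any critical value table.

```python
def ssr(cv_table, cv_calc):
    """Sum of squared residuals between tabulated and calculated critical values."""
    return sum((t - c) ** 2 for t, c in zip(cv_table, cv_calc))

def sar(cv_table, cv_calc):
    """Sum of absolute residuals between tabulated and calculated critical values."""
    return sum(abs(c - t) for t, c in zip(cv_table, cv_calc))

# Hypothetical values, for illustration only
table_values = [2.0, 3.0, 5.0]
calc_values = [2.5, 3.0, 4.0]
print(ssr(table_values, calc_values))  # 1.25
print(sar(table_values, calc_values))  # 1.5
```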

It is readily seen that in all cases for 99% CL (Figure 1), the R² parameter for purely polynomial fits from the quadratic (q) to the 7th order (p7) is consistently small (0.08044–0.44374 in Figure 1a; 0.36301–0.55106 in Figure 1c; and 0.37629–0.59237 in Figure 1e) and the corresponding SSR parameter is too large (60–34 in Figure 1b; 10,500–8,600 in Figure 1d; and 5.7–4.0 in Figure 1f) to be of any use in such interpolations. The improvement from any of the ln–, lnln–, and lnlnln–transformations preceding the polynomial fits of the q to 7th order is highly significant because the fitting quality parameter R² varied, respectively, from 0.71096 to 0.99845, 0.97063 to 1.00000, and not reported (because lnln–transformation already reaches the theoretical maximum value of 1) in Figure 1a; from 0.94694 to 0.99761, 0.97097 to 0.99424, and 0.93975 to 0.98627 in Figure 1c; and from 0.95395 to 1.00000, 0.9380 to 0.99999, and 0.99365 to 0.99999 in Figure 1e. When the R² for a polynomial model involving ln–transformation practically reaches the theoretical maximum value of 1, any further improvement in the fitting quality is impossible to attain even if one uses a higher–order polynomial or a more complex ln–transformation. The SSR parameter for the ln–transformed models is correspondingly extremely small (17.7–0.1, 1.8–2.8×10⁻⁶, and not reported in Figure 1b; 2,100–150, 700–150, and 880–200 in Figure 1d; and 0.9–9×10⁻⁵, 0.07–8×10⁻⁵, and 0.05–8×10⁻⁵ in Figure 1f) as compared to the respective simpler polynomial models (see above). The relatively large squared residuals in Figure 1d as compared to those in Figure 1f are due to the fact that the 99% CL critical values for the Fisher F corresponding to VdF=1 and HdF=1–1,000 are much greater (405–637) than those for VdF=1,000 and HdF=1–1,000 (6.66–1.16; for critical values see any standard textbook on the subject; e.g., Anderson, 1987; Urbina–Medal and Valencia–Ramírez, 1987; Verma, 2005).

Similarly, also for purely polynomial fits for the Student t and Fisher F tests corresponding to 95% CL critical values (plots not shown) the R² parameter was consistently very small (0.1070–0.5272 and 0.0180–0.5053, respectively) and the SSR parameter was consistently unreasonably large (about 840–6 and 19,900–1,000, respectively). At this CL (95%) the ln–, lnln–, and lnlnln–transformations prior to the polynomial fits provided much greater R² values, respectively, from 0.4937–1.0000, 0.9659–1.0000, and 0.9768–1.0000 (for HdF=1 and VdF=1–1000, or VdF=1 and HdF=1–1000). The SSR parameter correspondingly was extremely small for the ln–, lnln–, and lnlnln– transformed models (as low as 0.024, 0.008, and 0.021, respectively).

These kinds of results and trends were shown by all other critical value sets as well, i.e., the superiority of the ln–transformed polynomial models as compared to the simpler polynomial models (without ln–transformation) has been demonstrated beyond any doubt.

BEST–FIT EQUATIONS

I defined the best–fit equation as the one that: (i) provided R² close to 1 (in fact, practically equal to 1); (ii) showed a small sum of absolute residuals (SAR = ∑|cv_calc – cv_table|) or of squared residuals (SSR, defined above); and (iii) was based on the smallest number of regression terms and the least complex ln–transformation, i.e., under similar circumstances, the ln function was preferred over the lnln function and the latter over the lnlnln function.

For the Student t critical values, the lnln–transformed 5th order best–fit equation for v = 2–1000 is:

In equation (1) of the 5th order polynomial regression involving lnln–transformation, I is the intercept term and F₁, F₂, F₃, F₄, and F₅ are the coefficients of the linear, quadratic, 3rd, 4th, and 5th order terms, respectively. All these coefficients of the best–fit equation (1) have been summarized in Table 1. This equation can be used to compute any critical value for v = 2–1000 and for any desired CL (see Table 1).
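Written out from this description, equation (1) presumably has the following form. The left–hand symbol t_cv for the interpolated critical value is a notation assumed here; the intercept I and the coefficients F₁ to F₅ are those of Table 1.

```latex
t_{cv} = I + F_1 \ln(\ln \nu) + F_2 \left[\ln(\ln \nu)\right]^2 + F_3 \left[\ln(\ln \nu)\right]^3
       + F_4 \left[\ln(\ln \nu)\right]^4 + F_5 \left[\ln(\ln \nu)\right]^5,
\qquad 2 \le \nu \le 1000 \tag{1}
```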

For the Fisher F tables, two different sets of best–fit equations had to be proposed for any given CL, one for the interpolation of the vertical dF (VdF v₂) for a given horizontal dF (HdF v₁), and the other for the HdF (v₁) for a given VdF (v₂) and CL. For example, the double ln–transformed 6th order best–fit equation applicable for v₂ = 2–1000 for a given v₁ is:

Similarly, the double ln–transformed 5th order best–fit equation applicable for v₁ = 2–1000 for a given v₂ is:

The values of the coefficients for the 99% CL are listed in Tables 2 and 3, respectively.

Finally, not all combinations of HdF and VdF are covered by equations 2 and 3. For example, critical values will be missing for the combination of v₁=13 and v₂=31–34, 36–39, 41–49, 51–59, 61–79, 81–99, 101–199, 201–499, and 501–999. First, I tried to evaluate more complex polynomial models involving different kinds of ln–transformations of simultaneously both v₁ and v₂ in a single equation, but failed to obtain any acceptable solution.
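The missing v₂ ranges listed above are simply the gaps between consecutive tabulated VdF values. Presumably they remain uncovered for v₁ = 13 because equation 2 requires a tabulated v₁ and equation 3 requires a tabulated v₂. A minimal sketch, in which the helper name `gap_ranges` is hypothetical and the grid is the tabulated VdF set given in the Introduction:

```python
# Tabulated VdF values: v2 = 1(1)30(5)40(10)60(20)100(100)200(300)500(500)1000
VDF_TABULATED = [*range(1, 31), 35, 40, 50, 60, 80, 100, 200, 500, 1000]

def gap_ranges(grid):
    """Return (first, last) pairs of the integer runs missing between grid values."""
    gaps = []
    for lo, hi in zip(grid, grid[1:]):
        if hi - lo > 1:
            gaps.append((lo + 1, hi - 1))
    return gaps

for first, last in gap_ranges(VDF_TABULATED):
    print(f"{first}-{last}")
```

The printed ranges run from 31-34 to 501-999, matching the list in the text.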

Therefore, a "second–round" of equations had to be proposed to complete the missing Fisher F values. As an example, the triple ln–transformed 6th order best–fit "second–round" equation applicable for v₂ = 3–1000 for a given v₁ is:

The values of the coefficients for the 99% CL are listed in Table 4 for v₁=13–29, thus completing the Fisher F critical values for v₁=1(1)30. Because for the F–ratio and ANOVA tests v₁ (HdF) refers to the dF that correspond to the number of classes or groups, the above equations and Tables 2, 3, and 4 cover most, if not all, applications of these significance tests. However, if greater values of v₁ are needed, the present method of the triple ln–transformed 6th order fit can be easily extended to include any missing cases.

Another example includes the following double ln–transformed 5th order best–fit "second–round" equation applicable for v₁ = 2–1000 for a given v₂ (note that equation 5 is identical to equation 3):

The values of the coefficients for the 99% CL (Table 5) are for v₂=31–49. They complete the Fisher F critical values for v₂=1(1)50.
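From the verbal descriptions above (order of the polynomial, type of ln–transformation, and range of validity), equations (2) to (5) presumably take the following general forms. The symbols F_cv, a_k, b_k, and c_k are placeholders assumed here and are not the notation of the original equations; the actual coefficient values are those listed in Tables 2 to 5.

```latex
% Equation (2): lnln-transformed 6th order, interpolation over \nu_2 for a given \nu_1
F_{cv}(\nu_2 \mid \nu_1) = a_0 + \sum_{k=1}^{6} a_k \left[\ln(\ln \nu_2)\right]^k,
\qquad 2 \le \nu_2 \le 1000 \tag{2}

% Equations (3) and (5): lnln-transformed 5th order, interpolation over \nu_1 for a given \nu_2
F_{cv}(\nu_1 \mid \nu_2) = b_0 + \sum_{k=1}^{5} b_k \left[\ln(\ln \nu_1)\right]^k,
\qquad 2 \le \nu_1 \le 1000 \tag{3, 5}

% Equation (4): lnlnln-transformed 6th order, "second-round" interpolation over \nu_2
F_{cv}(\nu_2 \mid \nu_1) = c_0 + \sum_{k=1}^{6} c_k \left[\ln(\ln(\ln \nu_2))\right]^k,
\qquad 3 \le \nu_2 \le 1000 \tag{4}
```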

As stated earlier, this new methodology involving lnln– or lnlnln–transformations can be easily extended to calculate any other critical value for the F–ratio or ANOVA test. Similar equations and Tables were also generated for the Fisher F 95% CL, but are not included here; these are available by email request to the author.

This methodology of ln–, lnln–, or lnlnln–transformation should be useful to handle all other types of critical value tables if one is interested in precisely estimating interpolated values for their application in significance tests (this work), discordancy tests (Verma and Quiroz–Ruiz, 2008), or any other type of statistical tests. On the other hand, because the best interpolation equations based on these innovative natural logarithm–transformations along with polynomial fits provide ideal solutions with R² values of practically 1 (maximum theoretical value attainable) and extremely small SSR values, I see no need to try any other fitting methods such as numerical computational methods or artificial neural network (ANN) methodology (Verma et al., 2008; Díaz–González et al., 2008).

It would be a good idea to abandon the use of the critical value tables; instead, the new critical value equations can be easily programmed in spreadsheets as well as in new computer software. Thus, the use of critical value equations applicable in the range of 1–1000 degrees of freedom for ln–transformation, 2–1000 for lnln–transformation, or 3–1000 for lnlnln–transformation, instead of the tables, can be advantageously proposed as a 21st century innovation for the computer programming of these significance tests. This computer programming work is currently in progress.
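As an indication of how little code such an equation requires, the sketch below evaluates a polynomial in the ln–, lnln–, or lnlnln–transformed degrees of freedom. It is only an illustration under stated assumptions: the function name `critical_value` is hypothetical, and the coefficients in the example are placeholders, not values from Tables 1 to 5, which must be supplied by the user.

```python
import math

def critical_value(df, coeffs, depth=2):
    """Evaluate I + F1*x + F2*x**2 + ... where x is ln applied `depth` times to df.

    coeffs = [I, F1, F2, ...]; depth = 1 (ln), 2 (lnln), or 3 (lnlnln).
    """
    x = float(df)
    for _ in range(depth):
        x = math.log(x)  # raises ValueError when df is below the valid range
    result = 0.0
    for c in reversed(coeffs):  # Horner's scheme, starting from the highest order
        result = result * x + c
    return result

# Placeholder coefficients, for illustration only (not from the paper's tables).
placeholder = [2.0, 1.0, 0.5]
# At df = e**e the lnln-transformed variable equals 1, so the result is about 3.5.
print(critical_value(math.exp(math.e), placeholder, depth=2))
```

In a spreadsheet, the same calculation is a single cell formula built from nested LN calls and the tabulated coefficients.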

APPLICATIONS IN SCIENCE AND ENGINEERING

I first suggest a number of literature references from different areas of science and engineering, which deal with the kind of research for which the new critical value equations will be useful. Then I provide a few actual application examples from a reference material (RM) in geochemistry.

The new equations presented in this work will be useful for the statistical analysis of data in several different fields. A few of them are: agricultural and food science and technology (Pellegrini et al., 2003; Sauvage et al., 2007); biotechnology (Gonzalez et al., 2002); clay mineralogy of sediment cores (Pandarinath, in press); energy and fuels (Bansal et al., 2008); environmental science and technology (Wang et al., 2006); fluid geothermometry (Palabiyik and Serpen, 2008; Díaz–González et al., 2008); instrumentation (Goodman, 1998); medical science and technology (Zacheis et al., 1999; Cooper et al., 2006); proteomic research (Maurer et al., 2005; Verhoeckx et al., 2005; Xia et al., 2006; Verma and Quiroz–Ruiz, 2008); water–rock interaction (Verma et al., 2005); and zoology (Harcourt et al., 2005).

Quality control (assurance and assessment programs) using inter–laboratory data on RM is another important research area where these new critical value equations will be of much use. A few of these research areas on the study of RMs are: biology and biomedicine (Ihnat, 2000; Patriarca et al., 2005); cement industry (Sieber et al., 2002); environmental and pollution research (Dybczynski et al., 1998; Gill et al., 2004); food science and technology (Langton et al., 2002; Morabito et al., 2004); organochlorinated compounds and petroleum hydrocarbons in sediments (Villeneuve et al., 2002, 2004); rock chemistry (Verma, 1998; Marroquín–Guerra et al., in press); and water research (Holcombe et al., 2004; Verma, 2004).

The example cases were chosen from the RM database recently used to evaluate the performance of single discordant–outlier tests (Verma et al., 2009). It is important to remember that the Student t and Fisher F–ratio tests are sensitive to the presence of discordant outliers (Jensen et al., 2000). For the ANOVA test also, the data to be examined should ideally be free from discordant outliers. Therefore, in order to correctly apply these tests, the individual datasets should first be processed for the possible presence of such outliers using appropriate discordancy tests (Barnett and Lewis, 1994) along with the new, precise, and accurate critical values (Verma and Quiroz–Ruiz, 2006a, 2006b, 2008; Verma et al., 2008).

A large number of examples in this extensive RM database fall into actual gaps in the critical value tables. For the Student t, these are dF (v) with 30 < v < 1000 but distinct from v = 30(5)50(10)100(100)200(300)500(500)1000, i.e., different from the tabulated v of 35, 40, 45, 50, 60, 70, 80, 90, 100, 200, 500, or 1,000. Similarly, for the Fisher F tables, such gaps exist for horizontal dF (HdF) v₁ > 12, i.e., v₁ different from 12(3)15(5)30(10)50(50)100(900)1000, and for vertical dF (VdF) v₂ > 30, i.e., v₂ different from 30(5)40(10)60(20)100(100)200(300)500(500)1000.

The examples (Table 6) include the major element data in the RM diabase W–1 from the United States Geological Survey (U.S.A.) compiled by Verma et al. (2009) that require the use of newly interpolated critical values. The extensive footnote of Table 6 provides more information on the application of the ANOVA, F–ratio and Student t tests.

For TiO₂, Al₂O₃, MgO, CaO, K₂O, and P₂O₅, the ANOVA test, and for H₂O⁺ and H₂O⁻, the F–ratio and Student t tests, showed that the data from the different analytical method groups can be combined into a single group and that an overall mean and standard deviation as well as confidence limits of the mean can be calculated. For the remaining major elements or oxides (SiO₂, Fe₂O₃t, MnO and Na₂O), statistically significant differences were observed among the method groups, and the combination of all methods into a single group was therefore not recommended. For the latter cases, the overall statistical parameters were calculated for only those method groups that showed no significant differences among them.

For these significance tests (ANOVA, F–ratio, and Student t), the new equations provided precise interpolated critical values as documented in the earlier section. If we were to calculate the 95% or 99% confidence limits of the mean (not included in Table 6), we would also need precise critical values for the Student t test corresponding to the appropriate degrees of freedom (v).
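The confidence limits of the mean mentioned above are usually computed as mean ± t·s/√n, where t is the two–sided Student t critical value for v = n – 1 degrees of freedom. The sketch below assumes this standard formula; the function name `confidence_limits` and the data are hypothetical, and the critical value must be supplied by the user (from a table or from an interpolation equation).

```python
import math
import statistics

def confidence_limits(data, t_crit):
    """Confidence limits of the mean: mean -/+ t_crit * s / sqrt(n).

    t_crit is the two-sided Student t critical value for n - 1 degrees of freedom.
    """
    n = len(data)
    mean = statistics.mean(data)
    half_width = t_crit * statistics.stdev(data) / math.sqrt(n)
    return mean - half_width, mean + half_width

# Hypothetical data; 2.776 is the tabulated two-sided 95% value for v = 4.
lower, upper = confidence_limits([1.0, 2.0, 3.0, 4.0, 5.0], 2.776)
print(round(lower, 3), round(upper, 3))
```

For a dataset with, say, n = 44 measurements, v = 43 is not tabulated, which is where an interpolated critical value is needed.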

CONCLUSIONS

The criteria of the multiple–correlation coefficient (R²) and the interpolation residuals (SAR and SSR) clearly suggest that simple polynomial regressions are not appropriate for the interpolation of the Student t and Fisher F critical values. The ln–transformation is a required operation to achieve better regression models. The ln–, lnln–, and lnlnln–transformations combined with the polynomial regression models, were shown to perform better than the simple polynomial models. The best interpolation models involved lnln– and lnlnln–transformations prior to the polynomial fits. The use of these new best interpolation equations in spreadsheet calculations or computer programs is recommended for all applications in science and engineering involving these significance tests. Finally, the new interpolated critical values for the Student t test would be useful to calculate more precisely the 95% or 99% confidence limits of the mean.

ACKNOWLEDGEMENTS

The writing of a single–author book on statistics during 2004–2005 and the collaboration with A. Quiroz–Ruiz on a simulation procedure to generate new, precise critical values for discordancy tests published in four papers during 2006–2008 permitted me to identify this gap in the published literature and motivated me to fill it through the present paper. I am very grateful to two anonymous reviewers for their highly positive evaluation of my work and to the co–editor in chief Carlos González León for efficiently handling this manuscript.

REFERENCES

Agrawal, S., Verma, S.P., 2007, Comment on "Tectonic classification of basalts with classification trees" by Pieter Vermeesch (2006): Geochimica et Cosmochimica Acta, 71(13), 3388–3390. [ Links ]

Agrawal, S., Guevara, M., Verma, S.P., 2008, Tectonic discrimination of basic and ultrabasic volcanic rocks through log–transformed ratios of immobile trace elements: International Geology Review, 50(12), 1057–1079. [ Links ]

Aitchison, J., 1986, The Statistical Analysis of Compositional Data: London, Chapman and Hall, 416 p. [ Links ]

Aitchison, J., 1989, Measures of location of compositional data sets: Mathematical Geology, 21, 787–790. [ Links ]

Anderson, R.L., 1987, Practical Statistics for Analytical Chemists: New York, Van Nostrand Reinhold, 316 p. [ Links ]

Bansal, V., Krishna, G.J., Singh, A.P., Gupta, A.K., Sarpal, A.S., 2008, Determination of hydrocarbons types and oxygenates in motor gasoline: A comparative study of different analytical techniques: Energy and Fuels, 22(1), 410–415. [ Links ]

Barnett, V., Lewis, T., 1994, Outliers in Statistical Data: Chichester, UK, John Wiley & Sons, Third edition, 584 p. [ Links ]

Bevington, P.R., Robinson, D.K., 2003, Data Reduction and Error Analysis for the Physical Sciences: Boston, MA, USA, McGraw–Hill, 320 p. [ Links ]

Bugner, E., Rutledge, D.N., 1990, Modelling of statistical tables for outlier tests: Chemometrics and Intelligent Laboratory Systems, 9(3), 257–259. [ Links ]

Cooper S.J., Trinklein N.D., Anton E.D., Nguyen L., Myres R.M., 2006, Comprehensive analysis of transcriptional promoter structure and function in 1% of the human genome: Genome Research, 16(1), 1–10. [ Links ]

Díaz–González, L., Santoyo, E., Reyes–Reyes, J., 2008, Tres nuevos geotermómetros mejorados de Na/K usando herramientas computacionales y geoquimiométricas: aplicación a la predicción de temperaturas de sistemas geotérmicos: Revista Mexicana de Ciencias Geológicas, 25(3), 465–482. [ Links ]

Dybczynski, R., Polkowska–Motrenko, H., Samczynski, Z., Szopa, Z., 1998, Virginia tobacco leaves (CTA–VTL–2) – new Polish CRM for inorganic trace analysis including microanalysis: Fresenius Journal of Analytical Chemistry, 360(3–4), 384–387. [ Links ]

Ebdon, D., 1988, Statistics in Geography: Oxford, Basic Blackwell, 232 p. [ Links ]

Fournier, R.O., 1979. A revised equation for the Na/K geothermometer: Geothermal Resources Council Transactions, 3, 221–224. [ Links ]

Gill, U., Covaci, A., Ryan, J.J., Emond, A., 2004, Determination of persistent organohelogenated pollutants in human hair reference material (BCR 397); an interlaboratory study: Analytical and Bioanalytical Chemistry, 380(7–8), 924–929. [ Links ]

Gonzalez, R., Tao, H., Shanmugam, K.T., York, S.W., Ingram, L.O., 2002, Global gene expression difference associated with changes in glycolytic flux and growth rate in escherichia coli during the fermentation of glucose and xylose: Biotechnology Progress, 18(1), 6–20. [ Links ]

Goodman, K.J., 1998, Hardware modifications to an isotope ratio mass spectrometer continuous–flow interface yielding improved signal, resolution, and maintainance: Analytical Chemistry, 70(5), 833–837. [ Links ]

Harcourt, A.H., Coppeto, S.A., Parks, S.A., 2005, The distribution–abundance (density) relationship; its form and causes in a tropical mammal order, Primates: Journal of Biogeography, 32(4), 565–579. [ Links ]

Holcombe, G., Lawn, R., Sargent, M., 2004, Improvements in efficiency of production and traceability for certification of reference materials: Accreditation and Quality Assurance, 9(4–5), 198–204. [ Links ]

Ihnat, M., 2000, Performance of NAA methods in an international interlaboratory reference material characterization campaign: Journal of Radioanalytical and Nuclear Chemistry, 245(1), 73–80. [ Links ]

Jensen, J.L., Lake, L.W., Corbett, P.W.N., Goggin, D.J., 2000, Statistics for Petroleum Engineers and Geoscientists: Amsterdam, The Netherlands, Elsevier, Second edition, 338 p. [ Links ]

Langton, S.D., Chevennement, R., Nagelkerke, N., Lombard, B., 2002, Analysing collaborative trials for qualitative microbiological methods; accordance and concordance: International Journal of Food Microbiology, 79(3), 175–181. [ Links ]

Marroquín–Guerra, S.G., Velasco–Tapia, F., Díaz–González, L., in press, Evaluación estadística de Materiales de Referencia Geoquímica del Centre de Recherches Pétrographiques et Géochimiques (Francia) aplicando un esquema de detección y eliminación de valores desviados y su posible aplicación en el control de calidad de datos geoquímicos: Revista Mexicana de Ciencias Geológicas. [ Links ]

Maurer, M.H., Feldmann, J., R.E., Brömme, J.O., Kalenka, A., 2005, Comparison of statistical approaches for the analysis of proteome expression data of differentiating neural stem cells: Journal of Proteome Research, 4(1), 96–100. [ Links ]

Miller, J.N., Miller, J.C., 2000, Statistics and Chemometrics for Analytical Chemistry: Reading, UK, Pearson Education Ltd., Prentice Hall, Fourth edition, 271 p. [ Links ]

Morabito, R., Massanisso, P., Cámara, C., Larsson, T., Frech, W., Kramer, K.J.M., Bianchi, M., Muntau, H., Donard, O.F.X., Lobinski, R., McSheehy, S., Pannier, F., Potin–Gautier, M., Gawlik, B.M., Bowadt, S., Quevauviller, P., 2004, Towards a new certified reference material for butyltins, methylmercury and arsenobetaine in oyster tissue: Trends in Analytical Chemistry, 23(9), 664–676. [ Links ]

Otto, M., 1999, Chemometrics. Statistics and Computer Application in Analytical Chemistry: Weinheim, Wiley–VCH, 314 p. [ Links ]

Palabiyik, Y., Serpen, U., 2008, Geochemical Assessment of Simav Geothermal Field, Turkey: Revista Mexicana de Ciencias Geológicas, 25(3), 408–425. [ Links ]

Pandarinath, K., in press, Clay minerals in SW Indian continental shelf sediments cores as indicators of provenance and paleomonsoonal conditions: a statistical approach: International Geology Review. [ Links ]

Patriarca, M., Chiodo, F., Castelli, M., Corsetti, F., Menditto, A., 2005, Twenty years of the Me.Tos. project; an Italian national external quality assessment scheme for trace elements in biological fluids: Microchemical Journal, 79(1–2), 337–340. [ Links ]

Pellegrini, N., Del Rio, D., Colombi, B., Bianchi, M., Brighenti, F., 2003, Application of the 2,2'–Azinobis(3–ethylbenzothiazoline–6–sulfonic acid) radical cation assay to a flow injection system for the evaluation of antioxidant activity of some pure compounds and beverages: Journal of Agricultural and Food Chemistry, 51(1), 260–264. [ Links ]

Rorabacher, D.B., 1991, Statistical treatment for rejection of deviant values: critical values of Dixon's "Q" parameter and related subrange ratios at the 95% confidence level: Analytical Chemistry, 63(2), 139–146. [ Links ]

Sauvage, F.–X., Pradal, M., Chatelet, P., Tesniere, C., 2007, Proteome changes in leaves from grapevine (vitis vinifera L.) transformed for alcohol dehydrogenase activity: Journal of Agricultural and Food Chemistry, 55(7), 2597–2603. [ Links ]

Sieber, J., Broton, D., Fales, C., Leigh, S., MacDonald, B., Marlow, A., Nettles, S., Yen, J., 2002, Standard reference materials for cements: Cement and Concrete Research, 32(12), 1899–1906.

Urbina–Medal, E.G., Valencia–Ramírez, G.J., 1987, Probabilidad y Estadística. Aplicaciones y Métodos (translation of the book by Canavos, G.C.): México, McGraw Hill, 651 p.

Velasco–Tapia, F., Guevara, M., Verma, S.P., 2001, Evaluation of concentration data in geochemical reference materials: Chemie der Erde, 61(1), 69–91.

Verhoeckx, K.C.M., Gaspari, M., Bijlsma, S., van der Greef, J., Witkamp, R.F., Doornbos, R.P., Rodenburg, R.J.T., 2005, In search of secreted protein biomarkers for the anti–inflammatory effect of β–2–Adrenergic receptor agonists: application of DIGE technology in combination with multivariate and univariate data analysis tools: Journal of Proteome Research, 4(6), 2015–2023.

Verma, M.P., 2004, A revised analytical method for HCO₃– and CO₃²– determinations in geothermal waters: an assessment of IAGC and IAEA interlaboratory comparisons: Geostandards and Geoanalytical Research, 28(3), 391–409.

Verma, S.P., 1998, Improved concentration data in two international geochemical reference materials (USGS basalt BIR–1 and GSJ peridotite JP–1) by outlier rejection: Geofísica Internacional, 37(3), 215–250.

Verma, S.P., 2005, Estadística Básica para el Manejo de Datos Experimentales: Aplicación en la Geoquímica (Geoquimiometría): Mexico City, Mexico, Universidad Nacional Autónoma de México, 186 p.

Verma, S.P., Quiroz–Ruiz, A., 2006a, Critical values for six Dixon tests for outliers in normal samples up to sizes 100, and applications in science and engineering: Revista Mexicana de Ciencias Geológicas, 23(2), 133–161.

Verma, S.P., Quiroz–Ruiz, A., 2006b, Critical values for 22 discordancy test variants for outliers in normal samples up to sizes 100, and applications in science and engineering: Revista Mexicana de Ciencias Geológicas, 23(3), 302–319.

Verma, S.P., Quiroz–Ruiz, A., 2008, Critical values for 33 discordancy test variants for outliers in normal samples of very large sizes from 1,000 to 30,000 and evaluation of different regression models for the interpolation and extrapolation of critical values: Revista Mexicana de Ciencias Geológicas, 25(3), 369–381.

Verma, S.P., Santoyo, E., 1997, New improved equations for Na/K, Na/Li and SiO₂ geothermometers by outlier detection and rejection: Journal of Volcanology and Geothermal Research, 79(1), 9–23.

Verma, S.P., Orduña–Galván, L.J., Guevara, M., 1998, SIPVADE: A new computer programme with seventeen statistical tests for outlier detection in evaluation of international geochemical reference materials and its application to Whin Sill dolerite WS–E from England and Soil–5 from Peru: Geostandards Newsletter: Journal of Geostandards and Geoanalysis, 22(2), 209–234.

Verma, S.P., Torres–Alvarado, I.S., Satir, M., Dobson, P.F., 2005, Hydrothermal alteration effects in geochemistry and Sr, Nd, Pb, and O isotopes of magmas from the Los Azufres geothermal field (Mexico): a statistical approach: Geochemical Journal, 39(2), 141–163.

Verma, S.P., Guevara, M., Agrawal, S., 2006, Discriminating four tectonic settings: five new geochemical diagrams for basic and ultrabasic volcanic rocks based on log–ratio transformation of major–element data: Journal of Earth System Science, 115(5), 485–528.

Verma, S.P., Quiroz–Ruiz, A., Díaz–González, L., 2008, Critical values for 33 discordancy test variants for outliers in normal samples up to sizes 1000, and applications in quality control in Earth Sciences: Revista Mexicana de Ciencias Geológicas, 25(1), 82–96.

Verma, S.P., Díaz–González, L., González–Ramírez, R., 2009, Relative efficiency of single–outlier discordancy tests for processing geochemical data on reference materials and application to instrumental calibrations by a weighted least–squares linear regression model: Geostandards and Geoanalytical Research, 33(1), 29–49.

Villeneuve, J.–P., de Mora, S.J., Cattini, C., 2002, World–wide and regional intercomparison for the determination of organochlorine compounds and petroleum hydrocarbons in sediment sample IAEA–417: Vienna, Austria, International Atomic Energy Agency, Analytical Quality Control Services, 136 p.

Villeneuve, J.–P., de Mora, S., Cattini, C., 2004, Determination of organochlorinated compounds and petroleum in fish–homogenate sample IAEA–406; results from a worldwide interlaboratory study: Trends in Analytical Chemistry, 23(7), 501–510.

Walker, J.T., Maddan, S., 2005, Statistics in Criminology and Criminal Justice. Analysis and Interpretation: Sudbury, Mass., USA, Jones and Bartlett Publishers, Second edition, 427 p.

Wang, Z., Yang, C., Hollebone, B., Fingas, M., 2006, Forensic fingerprinting of diamondoids for correlation and differentiation of spilled oil and petroleum products: Environmental Science and Technology, 40(18), 5636–5646.

Xia, Q.W., Hendrickson, E.L., Zhang, Y., Wang, T.S., Taub, F., Moore, B.C., Porat, I., Whitman, W.B., Hackett, M., Leigh, J.A., 2006, Quantitative proteomics of the Archaeon Methanococcus maripaludis validated by microarray analysis and real time PCR: Molecular and Cellular Proteomics, 5(5), 868–881.

Zacheis, D., Dhar, A., Lu, S., Madler, M.M., Klucik, J., Brown, C.W., Liu, S., Clement, F., Subramanian, S., Weerasekare, G.M., Berlin, K.D., Gold, M.A., Houck, Jr., J.R., Fountain, K.R., Benbrook, D.M., 1999, Heteroarotinoids inhibit head and neck cancer cell lines in vitro and in vivo through both RAR and RXR retinoic acid receptors: Journal of Medicinal Chemistry, 42, 4434–4445.