SciELO - Scientific Electronic Library Online

 

Revista mexicana de ciencias geológicas

On-line version ISSN 2007-2902
Print version ISSN 1026-8774

Abstract

VERMA, Surendra P. and QUIROZ-RUIZ, Alfredo. Critical values for 33 discordancy test variants for outliers in normal samples of very large sizes from 1,000 to 30,000 and evaluation of different regression models for the interpolation and extrapolation of critical values. Rev. mex. cienc. geol. [online]. 2008, vol.25, n.3, pp.369-381. ISSN 2007-2902.

In this final paper of a series of four, we use our well-tested simulation procedure to report new, precise, and accurate critical values or percentage points (with four to eight decimal places) for 15 discordancy tests with 33 test variants, each at seven significance levels (α = 0.30, 0.20, 0.10, 0.05, 0.02, 0.01, and 0.005), for normal samples of very large sizes n from 1,000 to 30,000, viz., 1,000(50)1,500(100)2,000(500)5,000(1,000)10,000(10,000)30,000, i.e., from 1,000 to 1,500 in steps of 50, from 1,500 to 2,000 in steps of 100, from 2,000 to 5,000 in steps of 500, from 5,000 to 10,000 in steps of 1,000, and from 10,000 to 30,000 in steps of 10,000. The standard error of the mean is also reported explicitly and individually for each critical value. As a result, the applicability of these discordancy tests is now extended to practically all sample sizes (up to 30,000 observations or even greater). This final set of critical values for very large sample sizes would cover any present or future needs for the application of these discordancy tests in all fields of science and engineering. Because the critical values were simulated for only a few sample sizes between 1,000 and 30,000, six different regression models were evaluated for interpolation and extrapolation, and a combined natural logarithm-cubic model was shown to be the most appropriate. This is the first time in the literature that a log-transformation of the sample size n before a polynomial fit has been shown to perform better than the conventional linear to polynomial regressions used hitherto. We also use 1,402 unpublished datasets from quantitative proteomics to show that our multiple-test method works more efficiently than the MAD_Z robust outlier method used for processing these data, and thus to illustrate the usefulness of our final work along these lines.
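The workflow summarized in the abstract (expand the sample-size grid, simulate critical values with a standard error, then fit a cubic in ln n) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the statistic `grubbs_upper` is a generic Grubbs-type maximum studentized deviation chosen for illustration, the function names, the `reps` and `runs` settings, and the reference size `N0` are all hypothetical, and the quantile rule is a simple empirical one. The abstract does not state the actual simulation settings or the exact form of the fitted equation.

```python
import math
import random
import statistics

def expand_grid(segments):
    """Expand (start, step, stop) segments into a sorted list of unique sizes."""
    sizes = set()
    for start, step, stop in segments:
        sizes.update(range(start, stop + 1, step))
    return sorted(sizes)

# One reading of 1,000(50)1,500(100)2,000(500)5,000(1,000)10,000(10,000)30,000.
GRID = expand_grid([
    (1000, 50, 1500), (1500, 100, 2000), (2000, 500, 5000),
    (5000, 1000, 10000), (10000, 10000, 30000),
])

def grubbs_upper(sample):
    """Illustrative statistic (assumption): (x_max - mean) / s."""
    mean = statistics.fmean(sample)
    return (max(sample) - mean) / statistics.stdev(sample)

def simulate_critical_value(n, alpha, reps, runs, rng):
    """Return (mean, standard error) of the upper-alpha quantile over `runs` >= 2
    independent simulations, each using `reps` standard normal samples of size n."""
    estimates = []
    for _ in range(runs):
        stats = sorted(
            grubbs_upper([rng.gauss(0.0, 1.0) for _ in range(n)])
            for _ in range(reps)
        )
        # Empirical (1 - alpha) quantile of the simulated null distribution.
        index = max(0, min(reps - 1, math.ceil((1.0 - alpha) * reps) - 1))
        estimates.append(stats[index])
    sem = statistics.stdev(estimates) / math.sqrt(runs)
    return statistics.fmean(estimates), sem

# Reference size (assumption): ln(n / N0) is ln(n) shifted by a constant, which
# gives the same fitted curve as a cubic in ln(n) but better-conditioned equations.
N0 = 1000.0

def fit_ln_cubic(sizes, values):
    """Least-squares fit of value = b0 + b1*u + b2*u**2 + b3*u**3, u = ln(n / N0)."""
    rows = [[math.log(n / N0) ** k for k in range(4)] for n in sizes]
    # Normal equations (X^T X) b = X^T y as an augmented 4x5 matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(4)]
        + [sum(r[i] * y for r, y in zip(rows, values))]
        for i in range(4)
    ]
    for col in range(4):  # Gauss-Jordan elimination with partial pivoting
        pivot = max(range(col, 4), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(4):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][4] / aug[i][i] for i in range(4)]

def predict_ln_cubic(coeffs, n):
    """Evaluate the fitted cubic at sample size n (interpolation or extrapolation)."""
    u = math.log(n / N0)
    return sum(c * u ** k for k, c in enumerate(coeffs))
```

In use, one would call `simulate_critical_value` for each size in `GRID` at a given significance level, pass the resulting means to `fit_ln_cubic`, and then evaluate `predict_ln_cubic` at sizes that were not simulated. Pure-Python simulation at sizes near 30,000 is slow, so small `reps` and `runs` values are only suitable for trying the sketch out, not for producing tabulated values.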

Keywords: outlier methods; normal sample; Monte Carlo simulations; critical value tables; Dixon tests; Grubbs tests; skewness; kurtosis; statistics; regression equations; log-transformation; proteomics.


 

Creative Commons License: All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.