Deciding on Null Hypotheses using P-values or Bayesian alternatives: A simulation study

  1. Ana María Ruiz-Ruano García 1
  2. Jorge López Puga 1
  1. 1 Universidad Católica San Antonio, Murcia, Spain (ROR: https://ror.org/05b1rsv17)

Journal:
Psicothema

ISSN: 0214-9915

Year of publication: 2018

Volume: 30

Issue: 1

Pages: 110-115

Type: Article


Abstract

Background: Despite the criticism it has received, the p-value remains one of the key elements of hypothesis testing. Bayesian statistics and Bayes factors have been proposed as alternatives to improve it. This study compares the performance of two Bayes factors with that of the p-value when the null hypothesis is the most plausible one. Method: One million pairs of independent data sets were simulated from normal populations, and different sample sizes were considered. For each pair of samples, p-values for comparing the sample means were computed, along with the Bayesian alternatives. Results: The Bayes factors perform better than the p-value, favouring the null hypothesis over the alternative. The BIC-based Bayes factor performs better than the p-value calibration under the simulated conditions, and its behaviour improves as the sample size increases. Conclusions: Our results show that Bayes factors are good complements for hypothesis testing. Using them can help researchers avoid statistical false discoveries, and we suggest the joint use of classical and Bayesian statistics.
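The kind of simulation described in the abstract can be sketched in a few lines of Python. This is an illustrative reconstruction, not the authors' code: the record does not give the formulas, so the sketch assumes the BIC approximation to the Bayes factor associated with Wagenmakers (2007) and Masson (2011) and the p-value calibration bound of Sellke, Bayarri, and Berger (2001), both of which appear in the reference list. The function names, the seed, the sample size, and the number of pairs (far fewer than one million) are hypothetical choices made for this example.

```python
import math
import random
import statistics


def t_two_sided_p(t, df, steps=400):
    """Two-sided p-value for Student's t, by Simpson integration of the density."""
    a = abs(t)
    if a == 0.0:
        return 1.0
    # Log of the normalising constant of the t density with df degrees of freedom.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central_mass = 2.0 * total * h / 3.0  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central_mass)


def pooled_t(x, y):
    """Independent-samples Student's t statistic (equal variances) and its df."""
    n1, n2 = len(x), len(y)
    df = n1 + n2 - 2
    sp2 = ((n1 - 1) * statistics.variance(x)
           + (n2 - 1) * statistics.variance(y)) / df
    t = (statistics.fmean(x) - statistics.fmean(y)) / math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return t, df


def bf01_bic(t, df):
    """BIC approximation to BF01 for a two-group comparison (assumed formula).

    BF01 = exp((BIC_H1 - BIC_H0) / 2) = sqrt(N) * (1 + t^2 / df) ** (-N / 2),
    where N = df + 2 is the total sample size.
    """
    n_total = df + 2
    return math.sqrt(n_total) * (1 + t * t / df) ** (-n_total / 2)


def bf01_vovk_sellke(p):
    """Calibration bound: the smallest BF01 compatible with p (assumed formula)."""
    if p <= 0.0:
        return 0.0
    return -math.e * p * math.log(p) if p < 1 / math.e else 1.0


def simulate(n_pairs=2000, n=30, seed=1):
    """Draw pairs of samples from the same normal population (H0 is true)."""
    rng = random.Random(seed)
    n_sig = n_bic_null = 0
    vs_bounds = []
    for _ in range(n_pairs):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [rng.gauss(0.0, 1.0) for _ in range(n)]
        t, df = pooled_t(x, y)
        p = t_two_sided_p(t, df)
        n_sig += p < 0.05                    # nominal false positives
        n_bic_null += bf01_bic(t, df) > 1.0  # BIC Bayes factor favours H0
        vs_bounds.append(bf01_vovk_sellke(p))
    return {
        "prop_p_below_05": n_sig / n_pairs,
        "prop_bic_bf01_above_1": n_bic_null / n_pairs,
        "median_vs_bound": statistics.median(vs_bounds),
    }


if __name__ == "__main__":
    print(simulate())
```

Because both samples come from the same population, the proportion of p-values below .05 should sit near the nominal 5%, while the BIC-based factor can be read directly as evidence for or against the null. The calibration is only a lower bound on BF01, so it can never favour the null by more than 1.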

Funding information

JLP was supported by grant PSI2014-53427-P from the Ministry of Economy and Competitiveness (FEDER funding) and grant 19267/PI/14 from Fundación Séneca.

References

  • Altman, N., & Krzywinski, M. (2017a). Points of significance: P values and the search for significance. Nature Methods, 14, 3-4. doi: 10.1038/nmeth.4120
  • Altman, N., & Krzywinski, M. (2017b). Points of significance: Interpreting P values. Nature Methods, 14, 213-214. doi: 10.1038/nmeth.4210
  • Anscombe, F. J. (1961). Bayesian statistics. The American Statistician, 15, 21-24. doi: 10.2307/2682504
  • Bakan, D. (1966). The test of significance in psychological research. Psychological Bulletin, 66, 423-437.
  • Baker, M. (2016, May 25). Is there a reproducibility crisis? Nature, 533, 452-454. doi: 10.1038/533452a
  • Balluerka, N., Vergara, A. I., & Arnau, J. (2009). Calculating the main alternatives to null-hypothesis-significance testing in between-subjects experimental designs. Psicothema, 21, 141-151.
  • Benjamin, D. J., Berger, J., Johannesson, M., Nosek, B. A., Wagenmakers, E., Berk, R., …, Johnson, V. (2017, September). Redefine statistical significance. Nature Human Behaviour. doi: 10.1038/s41562-017-0189-z
  • Bolstad, W. M. (2007). Introduction to Bayesian statistics (2nd ed.). Hoboken, NJ: Wiley.
  • Box, G. E. P. (1976). Science and statistics. Journal of the American Statistical Association, 71, 791-799.
  • Cohen, J. (1994). The Earth is round (p < .05). American Psychologist, 49, 997-1003.
  • Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioural sciences (3rd ed.). New York: Routledge.
  • Dar, R., Serlin, R. C., & Omer, H. (1994). Misuse of statistical tests in three decades of psychotherapy research. Journal of Consulting and Clinical Psychology, 62, 75-82. doi: 10.1037/0022-006X.62.1.75
  • Edwards, W., Lindman, H., & Savage, L. J. (1963). Bayesian statistical inference for psychological research. Psychological Review, 70, 193-242.
  • Fienberg, S. E. (2006). When did Bayesian inference become “Bayesian”? Bayesian Analysis, 1, 1-40. doi: 10.1214/0-BA101
  • Gallistel, C. R. (2009). The importance of proving the null. Psychological Review, 116, 439-453. doi: 10.1037/a0015251
  • Gigerenzer, G. (1998). We need statistical thinking, not statistical rituals. Behavioral and Brain Sciences, 21, 199-200.
  • Gigerenzer, G. (2004). Mindless statistics. The Journal of Socio-Economics, 33, 587-606. doi: 10.1016/j.socec.2004.09.033
  • Haller, H., & Krauss, S. (2002). Misinterpretations of significance: A problem students share with their teachers? Methods of Psychological Research Online, 7(1).
  • Halsey, L. G., Curran-Everett, D., Vowler, S. L., & Drummond, G. B. (2015). The fickle P value generates irreproducible results. Nature Methods, 12, 179-185. doi: 10.1038/nmeth.3288
  • Held, L., & Ott, M. (2018). On p-values and Bayes factors. Annual Review of Statistics and Its Application, 5. doi: 10.1146/annurev-statistics-031017-100307
  • Hoijtink, H., van Kooten, P., & Hulsker, K. (2016a). Why Bayesian psychologists should change the way they use the Bayes Factor. Multivariate Behavioral Research, 51, 2-10. doi: 10.1080/00273171.2014.969364
  • Hoijtink, H., van Kooten, P., & Hulsker, K. (2016b). Bayes factors have frequency properties-This should not be ignored: A rejoinder to Morey, Wagenmakers, and Rouder. Multivariate Behavioral Research, 51, 20-22. doi: 10.1080/00273171.2015.1071705
  • Jarosz, A., & Wiley, J. (2014). What are the odds? A practical guide to computing and reporting Bayes factors. Journal of Problem Solving, 7, 2-9. doi: 10.7771/1932-6246.1167
  • JASP Team (2017). JASP (Version 0.8.1.2) [Computer software].
  • Jeffreys, H. (1948). Theory of probability (2nd ed.). Oxford: Oxford University Press.
  • Krzywinski, M., & Altman, N. (2013). Importance of being uncertain. Nature Methods, 10, 809-810. doi: 10.1038/nmeth.2613
  • Leek, J. T., & Peng, R. D. (2015, April 28). P values are just the tip of the iceberg. Nature, 520, 612. doi: 10.1038/520612a
  • Marden, J. I. (2000). Hypothesis testing: From p values to Bayes factors. Journal of the American Statistical Association, 95, 1316-1320.
  • Masson, M. E. J. (2011). A tutorial on a practical Bayesian alternative to null-hypothesis significance testing. Behavior Research Methods, 43, 679-690. doi: 10.3758/s13428-010-0049-5
  • Morey, R. D., & Rouder, J. N. (2011). Bayes Factor approaches for testing interval null hypotheses. Psychological Methods, 16, 406-419. doi: 10.1037/a0024377
  • Morey, R. D., Wagenmakers, E., & Rouder, J. N. (2016). Calibrated Bayes factors should not be used: A reply to Hoijtink, van Kooten, and Hulsker. Multivariate Behavioral Research, 51, 11-19. doi: 10.1080/00273171.2015.1052710
  • Munafò, M., Noble, S., Browne, W. J., Brunner, D., Button, K., Ferreira, J., …, Blumenstein, R. (2014). Scientific rigor and the art of motorcycle maintenance. Nature Biotechnology, 32, 871-873. doi: 10.1038/nbt.3004
  • Munafò, M., Nosek, B. A., Bishop, D. V. M., Button, K. S., Chambers, C. D., Percie du Sert, N., …, Ioannidis, J. P. A. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, Article Number 21. doi: 10.1038/s41562-016-0021
  • Nuzzo, R. (2014, February 12). Statistical errors: P values, the ‘gold standard’ of statistical validity, are not as reliable as many scientists assume. Nature, 506, 150-152. doi: 10.1038/506150a
  • Nuzzo, R. (2015, October 7). Fooling ourselves. Nature, 526, 182-185. doi: 10.1038/526182a
  • Ord, A. S., Ripley, J. S., Hook, J., & Erspamer, T. (2016). Teaching statistics in APA-accredited doctoral programs in clinical and counselling psychology: A syllabi review. Teaching of Psychology, 43, 221-226. doi: 10.1177/0098628316649478
  • Orlitzky, M. (2012). How can significance tests be deinstitutionalized? Organizational Research Methods, 15, 199-228.
  • Puga, J. L., Krzywinski, M., & Altman, N. (2015). Points of Significance: Bayesian statistics. Nature Methods, 12, 377-378. doi: 10.1038/nmeth.3368
  • Puga, J. L., & Ruiz-Ruano, A. M. (2017, October 30). Bayes Factor and P-value Comparison: A simulation study. http://doi.org/10.17605/OSF.IO/2E56P
  • R Core Team (2017). R: A language and environment for statistical computing. Vienna: R Foundation for Statistical Computing.
  • Rosnow, R. L., & Rosenthal, R. (1989). Statistical procedures and the justification of knowledge in psychological science. American Psychologist, 44, 1276-1284.
  • Sellke, T., Bayarri, M. J., & Berger, J. O. (2001). Calibration of p values for testing precise null hypotheses. The American Statistician, 55, 62-71.
  • Stern, H. S. (2016). A test by any other name: P-values, Bayes Factors and statistical inference. Multivariate Behavioral Research, 51, 23-39. doi: 10.1080/00273171.2015.1099032
  • Trafimow, D. (2014). Editorial. Basic and Applied Social Psychology, 36, 1-2. doi: 10.1080/01973533.2014.865505
  • Trafimow, D., & Earp, B. D. (2017). Null hypothesis significance testing and Type I error: The domain problem. New Ideas in Psychology, 45, 19-27. doi: 10.1016/j.newideapsych.2017.01.002
  • Trafimow, D., & Marks, M. (2015). Editorial. Basic and Applied Social Psychology, 37, 1-2. doi: 10.1080/01973533.2015.1012991
  • Wagenmakers, E. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14, 779-804. doi: 10.3758/BF03194105
  • Wasserstein, R. L., & Lazar, N. A. (2016). The ASA’s statement on p-values: Context, process, and purpose. The American Statistician, 70, 129-133. doi: 10.1080/00031305.2016.1154108
  • Wilkinson, L., & Task Force on Statistical Inference (1999). Statistical methods in psychology journals. American Psychologist, 54, 594-604.