Efficiency of propensity score adjustment and calibration on the estimation from non-probabilistic online surveys

Ramón Ferri-García¹ and María del Mar Rueda¹

¹ Department of Statistics and Operations Research, University of Granada
Journal: Sort: Statistics and Operations Research Transactions
ISSN: 1696-2281
Year of publication: 2018
Volume: 42
Issue: 2
Pages: 159-182
Type: Article


Abstract

One of the main sources of inaccuracy in modern survey techniques, such as online and smartphone surveys, is the absence of an adequate sampling frame that could provide a probabilistic sample. This kind of data collection leads to large amounts of bias in the final estimates of the survey, especially if the estimated variables (also known as target variables) have some influence on the respondent's decision to participate in the survey. Various correction techniques, such as calibration and propensity score adjustment (PSA), can be applied to remove the bias. This study analyses the efficiency of these correction techniques in multiple situations, applying a combination of propensity score adjustment and calibration on both types of variables (correlated and not correlated with the missing-data mechanism) and testing the use of a reference survey to obtain the population totals for the calibration variables. The study was performed using a simulation of a fictitious population of potential voters and a real volunteer survey aimed at a population for which a complete census was available. Results showed that PSA combined with calibration removes considerably more bias than calibration with no prior adjustment. Results also showed that using population totals estimated from a reference survey instead of the available population data does not affect the accuracy of the estimates, although it can slightly increase the variance of the estimator.
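The two-step correction described above can be sketched in code: propensity scores are estimated by logistic regression on a combined reference-plus-volunteer sample, inverse propensities give the PSA weights, and those weights are then calibrated so that a known population total is reproduced exactly. All data, variable names, and the one-variable post-stratification step below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Combined sample: reference (probability) survey coded 0, volunteer web survey coded 1.
n_ref, n_web, N = 500, 500, 1000.0
x = np.vstack([rng.normal(0.0, 1.0, (n_ref, 2)),
               rng.normal(0.5, 1.0, (n_web, 2))])   # two covariates per unit
z = np.r_[np.zeros(n_ref), np.ones(n_web)]          # sample-membership indicator

# Step 1: propensity score adjustment via logistic regression (plain gradient ascent
# on the log-likelihood; a library fit would serve equally well).
X = np.c_[np.ones(len(z)), x]
beta = np.zeros(X.shape[1])
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-X @ beta))
    beta += 0.1 * X.T @ (z - p) / len(z)
prop = 1.0 / (1.0 + np.exp(-X[z == 1] @ beta))
w_psa = 1.0 / prop                                   # inverse-propensity weights

# Step 2: calibrate the PSA weights to an assumed known population total for one
# binary auxiliary variable (here, units with x1 > 0; assumed count 600 out of 1000).
member = (x[z == 1, 0] > 0).astype(float)
pop_total = 600.0
g = np.where(member == 1,
             pop_total / np.sum(w_psa * member),
             (N - pop_total) / np.sum(w_psa * (1.0 - member)))
w_cal = w_psa * g

# The calibrated weights now reproduce the population total of the auxiliary variable.
print(round(float(np.sum(w_cal * member)), 6))       # → 600.0
```

With several auxiliary variables, the closed-form ratio adjustment above would be replaced by an iterative raking or generalized calibration step, but the structure (PSA weights first, calibration second) is the same as the combination studied in the paper.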

References

  • Bethlehem, J. (2010). Selection bias in web surveys. International Statistical Review, 78, 161–188.
  • Cassel, C. M., Särndal, C. E. and Wretman, J. H. (1976). Some results on generalized difference estimation and generalized regression estimation for finite populations. Biometrika, 63, 615–620.
  • Cochran, W. G. (1968). The effectiveness of adjustment by subclassification in removing bias in observational studies. Biometrics, 24, 295–313.
  • Couper, M. (2000). Web surveys: A review of issues and approaches. Public Opinion Quarterly, 64, 464–494.
  • Couper, M. (2017). Developments in survey collection. Annual Review of Sociology, 43, 121–145.
  • Couper, M., Kapteyn, A., Schonlau, M. and Winter, J. (2007). Noncoverage and non-response in an internet survey. Social Science Research, 36, 131–148.
  • Couper, M. and Peterson, G. (2017). Why do web surveys take longer on smartphones? Social Science Computer Review, 35, 357–377.
  • Dever, J. A., Rafferty, A. and Valliant, R. (2008). Internet surveys: can statistical adjustments eliminate coverage bias? Survey Research Methods, 2, 47–62.
  • Deville, J. C. and Särndal, C. E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87, 376–382.
  • Díaz de Rada, V. (2012). Ventajas e inconvenientes de la encuesta por internet. Papers, 97, 193–223.
  • Díaz de Rada, V. and Domínguez, J. A. (2015). The quality of responses to grid questions as used in Web questionnaires (compared with paper questionnaires). International Journal of Social Research Methodology, 18, 337–348.
  • Díaz de Rada, V. and Domínguez, J. A. (2016). Mail survey abroad with an alternative web survey. Quality and Quantity, 50, 1153–1164.
  • Elliott, M. R. and Valliant, R. (2017). Inference for nonprobability samples. Statistical Science, 32, 249–264.
  • Heerwegh, D. (2009). Mode differences between face-to-face and web surveys: an experimental investigation of data quality and social desirability effects. International Journal of Public Opinion Research, 21, 111–121.
  • Kim, J. K. and Park, M. (2009). Calibration estimation in survey sampling. International Statistical Review, 78, 21–39.
  • Lee, S. (2006). Propensity score adjustment as a weighting scheme for volunteer panel web surveys. Journal of Official Statistics, 22, 329–349.
  • Lee, S. and Valliant, R. (2009). Estimation for volunteer panel web surveys using propensity score adjustment and calibration adjustment. Sociological Methods & Research, 37, 319–343.
  • Little, R. J. and Rubin, D. B. (2002). Statistical Analysis with Missing Data. Wiley, New York.
  • Manfreda, K. L., Berzelak, J., Vehovar, V., Bosnjak, M. and Haas, I. (2008). Web surveys versus other survey modes: A meta-analysis comparing response rates. International Journal of Market Research, 50, 79–104.
  • Martínez, S., Rueda, M., Arcos, A. and Martínez, H. (2010). Optimum calibration points estimating distribution functions. Journal of Computational and Applied Mathematics, 233, 2265–2277.
  • Mei, B. and Brown, G. (2017). Conducting online surveys in China. Social Science Computer Review, 0894439317729340.
  • National Institute of Statistics (2016). Población (españoles/extranjeros) por edad (grupos quinquenales), sexo y año. Retrieved from http://www.ine.es/jaxi/Tabla.htm?path=/t20/e245/p08/l0/&file=02002.px (Accessed 20 March 2018).
  • National Institute of Statistics (2017a). Encuesta sobre Equipamiento y Uso de Tecnologías de Información y Comunicación en los Hogares. Retrieved from http://www.ine.es/prensa/tich2017.pdf (Accessed 20 March 2018).
  • National Institute of Statistics (2017b). España en Cifras 2017. Retrieved from http://www.ine.es/prodyser/espacifras/2017/index.html (Accessed 20 March 2018).
  • National Institute of Statistics (2017c). Nivel de formación de la población adulta (de 25 a 64 años). Retrieved from http://www.ine.es/ss/Satellite?c=INESeccionC&p=1254735110672&pagename=ProductosYServicios%2FPYSLayout&cid=1259925481659&L=0l (Accessed 20 March 2018).
  • Pasadas-del-Amo, S. (2018). Cell phone-only population and election forecasting in Spain: The 2012 regional election in Andalusia. Revista Española de Investigaciones Sociológicas (REIS), 162, 55–72.
  • Pew Research Center (2017). Demographics of Internet and Home Broadband Usage in the United States. Retrieved from http://www.pewinternet.org/fact-sheet/internet-broadband/ (Accessed 20 March 2018).
  • Rosenbaum, P. R. and Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70, 41–55.
  • Rubin, D. B. (1986). Statistical matching using file concatenation with adjusted weights and multiple imputations. Journal of Business & Economic Statistics, 4, 87–94.
  • Rueda, M., Sánchez-Borrego, I., Arcos, A. and Martínez, S. (2010). Model-calibration estimation of the distribution function using nonparametric regression. Metrika, 71, 33–44.
  • Särndal, C. E. (2007). The calibration approach in survey theory and practice. Survey Methodology, 33, 99–119.
  • Schonlau, M. and Couper, M. (2017). Options for conducting web surveys. Statistical Science, 32, 279–292.
  • Schonlau, M., van Soest, A., Kapteyn, A. and Couper, M. (2009). Selection bias in web surveys and the use of propensity scores. Sociological Methods & Research, 37, 291–318.
  • Taylor, H. (2000). Does internet research work? International Journal of Market Research, 42, 51–63.
  • Taylor, H., Bremer, J., Overmeyer, C., Siegel, J. W. and Terhanian, G. (2001). The record of internet-based opinion polls in predicting the results of 72 races in the November 2000 US elections. International Journal of Market Research, 43, 127–135.
  • Valliant, R. and Dever, J. A. (2011). Estimating propensity adjustments for volunteer web surveys. Sociological Methods & Research, 40, 105–137.