Google Scholar as a source for scholarly evaluation: a bibliographic review of database errors

  1. Enrique Orduna-Malea 1
  2. Alberto Martín-Martín 2
  3. Emilio Delgado López-Cózar 2
  1. 1 Universitat Politècnica de València, Spain
  2. 2 Universidad de Granada, Spain
Journal:
Revista española de documentación científica

ISSN: 0210-0614 1988-4621

Year of publication: 2017

Volume: 40

Issue: 4

Type: Article

DOI: 10.3989/REDC.2017.4.1500 (open access via the publisher)


Abstract

Google Scholar is an academic search engine and discovery tool launched by Google (now Alphabet) in November 2004. Because each bibliographic record is accompanied by the number of citations it has received from the other records indexed in the system (regardless of their source), it has come to be used in bibliometric analyses and in the evaluation of academic activity, especially in the Social Sciences and Humanities. However, the existence of errors, sometimes of great magnitude, has led part of the scientific community to reject and criticize it. This work aims to carry out an exhaustive bibliographic review of all studies that, either monographically or collaterally, provide empirical evidence on the errors made by Google Scholar (and derived products such as Google Scholar Metrics and Google Scholar Citations). The results indicate that the body of literature devoted to errors in Google Scholar is still small (n = 49), excessively fragmented, and dispersed, with results obtained without systematic methodologies and in units that are not comparable with one another, so that their quantification and real effect cannot be characterized precisely. Certain limitations of the search engine itself (the time required for data cleaning, the limit on citations per record, and the limit on results per query) could be the cause of this lack of empirical work.

Funding information

Orduna-Malea holds a postdoctoral fellowship (PAID-10-14), from the Polytechnic University of Valencia (Spain). This manuscript has been translated by professional native translator Charles Balfour.
