Deep learning methods for the extraction of metaphorical names of flowers and plants

  1. Ranasinghe, Tharindu
  2. Mitkov, Ruslan
  3. Haddad, Amal Haddad
  4. Premasiri, Damith
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2023

Issue: 71

Pages: 261-271

Type: Article

Abstract

The domain of Botany is rich in metaphorical terms. These terms play an important role in the description and identification of flowers and plants. However, identifying such terms in discourse is a difficult task, which can lead to errors in translation and other lexicographic work. The task is harder still in machine translation, for both single-word and multiword units. One of the challenges facing Natural Language Processing applications and Machine Translation technologies is the identification of metaphor-based terms through deep learning methods. In this study, we aim to fill this gap by using thirteen popular transformer-based models, as well as ChatGPT. We also show that discriminative models yield better results than GPT-3.5 models. The best result reached an F1 score of 92.2349% on the task of identifying metaphorical flower and plant names.
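
As an illustration of the F1 metric reported in the abstract, the following minimal sketch computes F1 for a binary token-labelling task, where 1 marks a token that belongs to a metaphorical plant name. The function name `f1_score`, the label scheme, and the example sentence are assumptions made for illustration only; the abstract does not state whether the paper scores at token or span level, or how scores are averaged.

```python
from typing import Sequence


def f1_score(gold: Sequence[int], pred: Sequence[int]) -> float:
    """F1 for the positive class (1 = token is part of a metaphorical name).

    Hypothetical helper: the evaluation protocol used in the paper
    is not described in the abstract.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # Count true positives, false positives and false negatives.
    tp = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 1)
    fp = sum(1 for g, p in zip(gold, pred) if g == 0 and p == 1)
    fn = sum(1 for g, p in zip(gold, pred) if g == 1 and p == 0)

    if tp == 0:
        return 0.0  # no correct positive prediction, so F1 is zero

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Illustrative sentence (not taken from the paper's data).
tokens = ["the", "lady's", "slipper", "grows", "here"]
gold = [0, 1, 1, 0, 0]  # "lady's slipper" is the metaphorical name
pred = [0, 1, 0, 0, 0]  # a model that finds only the first token
score = f1_score(gold, pred)
print(f"F1 = {score:.4f}")
```

Here precision is 1.0 and recall is 0.5, so the harmonic mean penalises the missed token more than a simple average would.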
