Extracción automática de colocacións e modismos (Automatic extraction of collocations and idioms)

  1. Pamies Bertrán, Antonio
  2. Pazos Granada, José Manuel
Journal:
Cadernos de fraseoloxía galega

ISSN: 1698-7861

Year of publication: 2004

Issue: 6

Pages: 191-204

Type: Article

Abstract

Statistical definitions describe collocations as combinations of words that co-occur more often than their individual frequencies and the length of the text would predict. Since Sinclair’s work (1970) proposed this assumption, many experimental studies, using different methods and criteria, have been carried out on electronic corpora, with differing results (e.g. Berry-Rogghe 1973; Church & Hanks 1989; Clear 1993; Dunning 1993). Our work applies different methods to a small literary corpus in Spanish in order to evaluate, on the same text and with the same criteria, each methodological tool that could be used for the automatic detection of collocations on the basis of strictly quantitative data, a task that should also cover idioms and even proverbs.
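
The statistical definition above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes adjacent word pairs only and uses pointwise mutual information, a measure in the spirit of the association ratio of Church & Hanks (1989), to compare the observed frequency of a pair with the frequency expected from the individual word counts and the text length. The function name `bigram_pmi`, the `min_count` threshold and the sample sentence are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def bigram_pmi(tokens, min_count=2):
    """Score adjacent word pairs by pointwise mutual information (PMI).

    A positive score means the pair occurs more often than the counts of
    its two words and the length of the text would predict.
    """
    n_tokens = len(tokens)
    n_bigrams = n_tokens - 1
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))

    scores = {}
    for (w1, w2), observed in bigrams.items():
        # Rare pairs give unstable scores, so skip them (illustrative cutoff).
        if observed < min_count:
            continue
        p_observed = observed / n_bigrams
        # Probability of the pair if the two words were independent.
        p_expected = (unigrams[w1] / n_tokens) * (unigrams[w2] / n_tokens)
        scores[(w1, w2)] = math.log2(p_observed / p_expected)
    return scores


# Toy Spanish input; a real experiment would use a tokenized corpus.
sample = "tomar el pelo y tomar el pelo o el libro".split()
for pair, score in sorted(bigram_pmi(sample).items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

On a corpus as small as the one described in the abstract, scores of this kind are sensitive to low counts, which is one reason the measures cited (for example the log-likelihood ratio of Dunning 1993) can rank the same candidates differently.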