Definició d'una metodologia experimental per a l'estudi de resultats en sistemes d'aprenentatge artificial

Martorell Rodon, Josep Maria

Supervised by:

Josep Maria Garrell Guiu Director

University of defense: Universitat Ramon Llull

Date of defense: 5 December 2007

Examination committee:

Francisco Herrera Triguero Chair
Núria Agell Jané Secretary
Beatriz López Ibáñez Member
Susana Puig Sardá Member
Xavier Vilasís-Cardona Member

Type: Thesis

Teseo: 139599

Abstract

This work belongs to the research field of the Research Group in Intelligent Systems: machine learning. The group's main areas are evolutionary computation and case-based reasoning, with research focused on classification, diagnosis and prediction problems. In all of these fields, large datasets are studied, and different techniques are applied to them to extract knowledge and apply it to those problems. Major advances in these areas (often in the form of new algorithms) coexist with only very partial work on suitable methodologies for evaluating these new proposals. Given this situation, the thesis proposes a new general approach for assessing the behaviour of a set of M algorithms that are tested over N datasets. It maintains that the analysis usually made of such results is clearly insufficient and that, consequently, the conclusions put forward in published works are very often partial and in some cases even erroneous.
The work begins with an introductory study of the measures that express the performance of an algorithm when it is tested over a collection of datasets. At this point it becomes evident that a prior study of the inherent properties of these problems (for instance, based on complexity metrics) is needed to ensure the reliability of the conclusions drawn. Next, the scope of application of a set of well-known statistical inference techniques is defined, and the factors to be taken into account when deciding whether they apply are analysed. The thesis proposes a general protocol for studying, from a statistical point of view, the behaviour of a set of algorithms, including new graphical patterns that facilitate the analysis, as well as a detailed study of the inherent properties of the test problems used.
This protocol determines the application domains of the methodologies for comparing the results obtained on each problem. The thesis further demonstrates how this domain is directly related to the capability of a methodology to detect significant differences, as well as to its replicability. Finally, a set of case studies on previously published results is presented, drawn from new algorithms developed by our Research Group, particularly in the application of case-based reasoning. In all these cases the application of the methodologies developed in the previous chapters is shown to be correct, and the errors that are repeatedly incurred, leading to unreliable conclusions, are highlighted.