Computación y Sistemas
Print version ISSN 1405-5546
Given a large hierarchical concept dictionary (a thesaurus or ontology), we consider the task of selecting the concepts that describe the contents of a given document. A statistical method of document indexing driven by such a dictionary is proposed. The method is insensitive to inaccuracies in the dictionary, which allows for semi-automatic translation of the hierarchy into different languages. The problem of handling non-terminal and especially top-level nodes in the hierarchy is discussed. Common-sense-compliant methods of automatically assigning weights to the nodes and links in the hierarchy are presented. The application of the method in the Classifier system is discussed.
Keywords: Document Characterization; Document Comparison; Ontology; Statistical Methods.