SciELO RSS <![CDATA[Computación y Sistemas]]> vol. 19 num. 4 lang. en
<![CDATA[Editorial]]>
<![CDATA[ALICE Chatbot: Trials and Outputs]]> Abstract. A chatbot is a conversational agent that interacts with users in natural language. Many chatbots are available to serve different domains. However, the knowledge base of a chatbot is hand-coded in its "brain". This paper presents an overview of the ALICE chatbot, its AIML format, and our experiments in generating different prototypes of ALICE automatically using a corpus-based approach. We describe the software we developed to convert readable text (a corpus) into AIML format, along with the different corpora we used. Our trials revealed that useful prototypes can be generated without sophisticated natural language processing or complex machine learning techniques. These prototypes were used as tools to practice different languages, to visualize a corpus, and to answer questions.
<![CDATA[Query Topic Classification and Sociology of Web Query Logs]]> Abstract. This paper describes the objects, tasks, and general procedure of the sociological analysis of Web search engine query logs, and illustrates them with a methodologically complete study of cross-nation search image changes based on query logs of the national search audience collected two years apart.
<![CDATA[Questions, Answers, and Presuppositions]]> Abstract. The paper deals with empirical questions that come with a presupposition. If the presupposition is not true, there is no unambiguous direct answer. In such a case, an adequate complete answer is the negated presupposition. Yet these simple ideas raise a number of problems. First, we must distinguish between a pragmatic and a semantic presupposition, and thus also between a presupposition and mere entailment.
Second, we show that the common definition of a presupposition of a question, as a proposition entailed by every possible answer to the question, is not precise. We follow Frege and Strawson in treating survival under negation as the most important test for presupposition. But a negative answer to a question is often ambiguous. The ambiguity lies in not distinguishing between two kinds of negative answers, namely answers applying narrow-scope negation and answers applying wide-scope negation. While the former preserves the presupposition, the latter seems to deny it. We show that, for a negative answer to be unambiguous, the adequate answer is just the negated presupposition rather than a wide-scope negation that presumably denies the presupposition. Having defined the presupposition of a question more precisely, we then examine Yes-No questions, Wh-questions, and exclusive-or questions with respect to several kinds of presupposition triggers. These include, inter alia, topic-focus articulation, verbs expressing termination of an activity, factive verbs, the "whys and how comes", and past or future tense with a reference time interval. Our background theory is Transparent Intensional Logic (TIL) with its procedural semantics. TIL is an expressive logic well suited to the analysis of questions and presuppositions, because within TIL we work with partial functions, in particular with propositions that have truth-value gaps. These features enabled us to define a general analytic schema of sentences associated with a presupposition. Our results are applicable in linguistics and artificial intelligence, in particular in systems whose behavior is controlled by the communication and reasoning of intelligent social agents.
<![CDATA[A Rule-Based Meronymy Extraction Module for Portuguese]]> Abstract.
In this article, we improve the extraction of semantic relations between textual elements as currently performed by STRING, a hybrid statistical and rule-based Natural Language Processing (NLP) chain for Portuguese, by targeting the whole-part relation (meronymy), that is, a semantic relation between two entities of which one is perceived as a constituent part of the other, or between a set and its member. Here we focus on the type of meronymy involving human entities and body-part nouns (Nbp) (e.g., O Pedro partiu uma perna 'Pedro broke a leg': WHOLE-PART(Pedro, perna) 'WHOLE-PART(Pedro, leg)'). In order to extract this type of whole-part relation, a rule-based meronymy extraction module has been built and integrated into the grammar of the STRING system. The module was evaluated with promising results.
<![CDATA[Recognizing Textual Entailment by Soft Dependency Tree Matching]]> Abstract. We present a rule-based method for recognizing the entailment relation between a pair of text fragments by comparing their dependency tree structures. We used a dependency parser to generate the dependency triples of the text-hypothesis pairs. A dependency triple is an arc in the dependency parse tree. Each triple in the hypothesis is checked against all the triples in the text to find a matching pair. We developed a number of matching rules after a detailed analysis of the PETE dataset, which we used for the experiments. A successful match satisfying any of these rules assigns a matching score of 1 to the child node of that particular arc in the hypothesis dependency tree. The dependency parse tree is then traversed in post-order to obtain the final entailment score at the root node. The scores of the leaf nodes are propagated from the bottom of the tree through the non-leaf nodes up to the root node. The entailment score of the root node is compared against a predefined threshold value to make the entailment decision.
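The matching-and-propagation procedure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the matching rules or the aggregation formula, so the sketch assumes (a) a single rule, exact equality of (head, relation, child) triples, and (b) that a node's score is the average of its own match score and its children's scores. The names `Node`, `triples`, `node_score`, `entails`, and the threshold 0.8 are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    word: str
    relation: str = "root"  # label of the arc from the parent to this node
    children: List["Node"] = field(default_factory=list)


def triples(node):
    """Yield a (head, relation, child) triple for every arc below `node`."""
    for child in node.children:
        yield (node.word, child.relation, child.word)
        yield from triples(child)


def node_score(node, head, text_triples):
    """Score a hypothesis node in post-order (children first, then the node).

    Assumption: the only matching rule is exact triple equality, and scores
    are combined by averaging; the paper's actual rules may differ.
    """
    child_scores = [node_score(c, node.word, text_triples) for c in node.children]
    if head is None:  # the root has no incoming arc, so it only aggregates
        return sum(child_scores) / len(child_scores) if child_scores else 0.0
    own = 1.0 if (head, node.relation, node.word) in text_triples else 0.0
    return (own + sum(child_scores)) / (1 + len(child_scores))


def entails(text_root, hyp_root, threshold=0.8):
    """Compare the root score of the hypothesis tree against a threshold."""
    text_triples = set(triples(text_root))
    return node_score(hyp_root, None, text_triples) >= threshold


if __name__ == "__main__":
    text = Node("saw", children=[Node("John", "nsubj"), Node("Mary", "dobj"),
                                 Node("yesterday", "advmod")])
    hypothesis = Node("saw", children=[Node("John", "nsubj"), Node("Mary", "dobj")])
    print(entails(text, hypothesis))
```

Averaging keeps every score in the range 0 to 1, so a single threshold can be applied at the root regardless of tree size; other aggregation schemes would fit the same post-order traversal.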
Experimental results on the PETE dataset show an accuracy of 87.69% on the development set and 73.75% on the test set, which outperforms the state-of-the-art results reported on this dataset so far. We did not use any other NLP tools or knowledge sources, in order to emphasize the role of dependency parsing in recognizing textual entailment.
<![CDATA[Improved Statistical Machine Translation by Cross-Linguistic Projection of Named Entities Recognition and Translation]]> Abstract. One of the existing difficulties in natural language processing applications is the lack of appropriate tools for the recognition, translation, and/or transliteration of named entities (NEs), specifically for less-resourced languages. In this paper, we propose a new method to automatically label multilingual parallel data for the Arabic-French language pair with named entity tags and to build lexicons of those named entities with their transliteration and/or translation in the target language. For this purpose, we bring in a third, well-resourced language, English, to serve as a pivot in building an Arabic-French NE translation lexicon. Evaluations on the Arabic-French language pair using English as the pivot in the transitive model showed the effectiveness of the proposed method for mining Arabic-French named entities and their translations. Moreover, integrating this component into statistical machine translation outperformed the baseline system.
<![CDATA[Identification of Verbal Phraseological Units in Mexican News Stories]]> Abstract. Verbal Phraseological Units are phrases made up of two or more words in which at least one of the words is a verb that plays the role of the predicate. One characteristic of this type of expression is that its global meaning can rarely be deduced from the meaning of its components. The automatic recognition of this type of linguistic structure is a very important task, since such units are a standard way of expressing a concept or idea.
In this paper we present the results obtained when different supervised machine learning methods are employed to determine whether or not a verbal phraseological unit is present in a given newspaper story. The experiments were carried out using a supervised corpus of news stories written in Mexican Spanish. Besides the results of these experiments, we provide access to a new lexicon with phrases as entries (instead of single words), in which each entry is associated with a real value (normalized between zero and one) indicating its probability of being a verbal phraseological unit.
<![CDATA[Natural Language Generation: Revision of the State of the Art]]> Summary: Human beings communicate and express themselves through language. To do so, they must develop a set of high-level cognitive skills whose complexity becomes evident when attempting to automate the process, whether producing language or interpreting it. When the communicative act takes place between a person and a computer, and the computer is the recipient, computational languages are used that, as a general rule, are subject to a set of strongly typed, bounded, and unambiguous rules. However, when communication flows in the opposite direction and the machine must convey information to the person in natural language, the procedure for generating the message must cope with the flexibility and ambiguity that characterize natural language, which makes the task highly complex. For machines to handle human language, Computational Linguistics techniques are needed. Within this discipline, the field concerned with creating natural language texts is called Natural Language Generation (NLG). This article offers a comprehensive overview of the field. It describes the stages into which NLG systems are usually decomposed, together with the techniques applied in each, and analyzes in detail the current state of this research area and its open problems, as well as the most relevant resources and the techniques being used to evaluate system quality.
Abstract. Language is one of the highest cognitive skills developed by human beings and, therefore, one of the most complex tasks to be faced from the computational perspective. Human-computer communication processes involve two different degrees of difficulty depending on the nature of that communication. If the language used is oriented towards the domain of the machine, there is no place for ambiguity, since the language is restricted by rules. However, when the communication is in natural language, its flexibility and ambiguity become unavoidable. Computational Linguistics techniques are required for machines to process human language. Among them, the area of Natural Language Generation aims to develop techniques for automatically producing human utterances, both text and speech. This paper presents an in-depth survey of this research area, taking into account different points of view on its theories, methodologies, architectures, techniques, and evaluation approaches, thus providing a review of the current situation and possible future research in the field.
<![CDATA[Output Regulation and Consensus of a Class of Multi-agent Systems under Switching Communication Topologies]]> Abstract. This paper presents the design of a distributed control law for the output regulation and output consensus of a set of N agents. In this approach, each agent's dynamics is represented by a switched linear system. The agents' representations are not constrained to be identical or to have the same state dimension, and communication among agents is considered to be switching.
We also assume that some agents obtain the reference to be followed from the output of a virtual agent, and that every agent receives the output information of its neighbors. Using this information, every agent computes the exosystem state to solve its individual regulation problem. The proposed approach employs a local switched stabilizing feedback for each agent based on a common Lyapunov function. A numerical example is provided to illustrate the proposed control law.
<![CDATA[Observability and Observer Design for Continuous-Time Perturbed Switched Linear Systems under Unknown Switchings]]> Abstract. In this work we address the observability and observer design problem for perturbed switched linear systems (SLS) subject to an unknown switching signal, where the continuous state and the evolving linear system (LS) are estimated from the continuous output in spite of the unknown disturbance. The proposed observer is composed of a collection of finite-time observers, one for each LS composing the SLS. Based on the observability results derived here and on the observers' output estimation error, the evolving LS and its continuous state are inferred. Illustrative examples are presented in detail.
<![CDATA[Hybrid Heuristic for Dynamic Location-Allocation on Micro-Credit Territory Design]]> Abstract. This paper presents a two-phase mixed integer program for the commercial territory design problem of a micro-financing institution. After the locations of the territory centers are determined, customers are allocated according to planning criteria such as total workload, amount of loans, and profit allocation. To solve this model for large instances, we propose a hybrid heuristic that includes variable fixing, perturbation analysis, and dynamic relocation of territory centers. We perform a comprehensive statistical analysis that provides novel insights into the interplay of the heuristics in a large-scale mixed integer program.
The efficiency of the hybrid heuristic is tested, and its effectiveness in finding near-optimal solutions with reasonably small computational effort is discussed.
<![CDATA[Sharpening Minimum-Phase Interpolated Finite Impulse Response Filters]]> Abstract. This paper presents a novel, simple method for the direct design of low-pass minimum-phase (MP) filters. The method is based on a linear-phase (LP) finite impulse response (FIR) prototype together with sharpening and interpolated finite impulse response (IFIR) techniques, which are usually used for the design of LP filters. As a result, a more complex minimum-phase filter can be designed using less complex filters. The paper presents the rules and methodology of the design and illustrates them with an example. The advantage of the method is demonstrated by comparisons with some existing designs.