Scielo RSS <![CDATA[Polibits]]> vol. num. 49 lang. es <![CDATA[<b>Editorial</b>]]> <![CDATA[<b>Process for Unattended Execution of Test Components</b>]]> We describe a process for performing software tests. In an enterprise that produces a product line, the products may share the same goal yet differ in development platform, programming language, layer architecture, or communication strategy. The process standardizes, coordinates, and controls test execution across all workgroups, regardless of their individual characteristics. We present roles, phases, activities, and artifacts that address the centralization, reuse, and publication of test scripts and of the results of their execution. The process also uses virtualization to create test environments and defines steps for managing and publishing them. We also present a tool that supports the process and allows the unattended execution of test components. Finally, we describe two pilot projects that demonstrate the applicability of the proposed solution. <![CDATA[<b>Reliable Web Services Composition</b>: <b>An MDD Approach</b>]]> This paper presents an approach for modeling policies and associating them with service-based applications. It proposes to extend the SOD-M model-driven method with (i) the n-SCM, a policy-based service composition meta-model for representing non-functional constraints associated with service-based applications; (ii) the n-PEWS meta-model, which provides guidelines for expressing the composition and the policies; and (iii) model-to-model and model-to-text transformation rules for semi-automating the implementation of reliable service compositions. 
As will be shown within our environment implementing these meta models and rules, one may represent both systems' cross-cutting aspects (e.g., exception handling for describing what to do when a service is not available, recovery, persistence aspects) and constraints associated to services, that must be respected for using them (e.g., the fact that a service requires an authentication protocol for executing a method). <![CDATA[<b>MultiSearchBP</b>: <b>Entorno para búsqueda y agrupación de modelos de procesos de negocio</b>]]> El artículo presenta un entorno para búsqueda y agrupación de procesos de negocio denominado MultiSearchBP. Es basado en una arquitectura de tres niveles, que comprende el nivel de presentación, nivel de negocios (análisis estructural, la indización, búsqueda y agrupación) y el nivel de almacenamiento. El proceso de búsqueda se realiza en un repositorio que contiene 146 modelos de procesos de negocio (BP). Los procesos de indización y de consulta son similares a los del modelo de espacio vectorial utilizado en la recuperación de información, y el proceso de agrupación utiliza dos algoritmos de agrupación (Lingo y STC). MultiSearchBP utiliza una representación multimodal de los BP. También se presenta un proceso de evaluación experimental para considerar los juicios de ocho expertos evaluadores a partir de un conjunto de los valores de similitud obtenidos de comparaciones manuales efectuados con anterioridad sobre los modelos de BP almacenados en el repositorio. Las medidas utilizadas fueron la precisión gradual y el recall gradual. Los resultados muestran una precisión alta.<hr/>This paper presents a Business Process Searching and Grouping Environment called MultiSearchBP. It is based on a three-level architecture comprising Presentation level, Business level (Structural Analysis, Indexing, Query, and Grouping) and Storage level. The search process is performed on a repository that contains 146 Business Process (BP) models. 
The indexing and query processes are similar to those of the vector space model used in information retrieval and the clustering process uses two clustering algorithms (Lingo and STC). MultiSearchBP uses a multimodal representation of BPs. It also presents an experimental evaluation process to consider the judgments of eight expert evaluators from a set of similarity scores obtained from previous manual comparisons made between the BP models stored in the repository. The measures used were graded precision and graded recall. The results show high accuracy. <![CDATA[<b>Combining Active and Ensemble Learning for Efficient Classification of Web Documents</b>]]> Classification of text remains a challenge. Most machine learning based approaches require many manually annotated training instances for a reasonable accuracy. In this article we present an approach that minimizes the human annotation effort by interactively incorporating human annotators into the training process via active learning of an ensemble learner. By passing only ambiguous instances to the human annotators the effort is reduced while maintaining a very good accuracy. Since the feedback is only used to train an additional classifier and not for re-training the whole ensemble, the computational complexity is kept relatively low. <![CDATA[<b>Una propuesta para incorporar más semántica de los modelos al código generado</b>]]> Actualmente hay un amplio uso del paradigma Model Driven Architecture (MDA) para la generación de código a partir de modelos, pues esto garantiza menores tiempos de desarrollo y de puesta a punto. Los modelos creados a partir de los diagramas del Lenguaje Unificado de Modelado (UML) son de amplia utilización teniendo en cuenta que se trata de un estándar y además, la gran cantidad de herramientas de modelado que existen para ello. 
Cada diagrama de UML es un punto de vista diferente del sistema modelado, pero cada uno de estos, tiene su sintaxis y su semántica y aporta información para el código resultante. La forma de intercambiar estos diagramas entre las diferentes herramientas es a través del uso de ficheros XMI (XML Metadata Interchange). XMI es un estándar, sin embargo, no todas las herramientas de modelado tienen las opciones de importar / exportar para este formato y las que lo hacen, no permiten la total interoperabilidad entre herramientas, debido a que usan sus propias estructuras. En este trabajo se aborda la semántica del diagrama de clases y cómo se refleja esta en el código generado por la herramienta AndroMDA, precisando los aspectos que pueden mejorarse en función de la semántica de UML, a partir de la modificación de sus cartuchos.<hr/>The Model Driven Architecture (MDA) paradigm is now widely used for generating code from models, because it ensures shorter development times. Models created from Unified Modeling Language (UML) diagrams are widely used, since UML is a standard and a large number of modeling tools exist for it. Each UML diagram is a different view of the modeled system, with its own syntax and semantics, and each contributes information to the resulting code. These diagrams are exchanged between tools using XMI (XML Metadata Interchange) files. XMI is a standard; however, not all modeling tools can import from or export to this format, and those that can do not allow full interoperability between tools, because they use their own structures. This paper addresses the semantics of the class diagram and how it is reflected in the code generated by the AndroMDA tool, identifying the aspects that can be improved with respect to UML semantics by modifying the tool's cartridges. 
<![CDATA[<b>Comparison of Different Graph Distance Metrics for Semantic Text Based Classification</b>]]> Semantic information is now widely used for text classification in place of bag-of-words approaches, because bag-of-words approaches have limitations in representing the text of certain kinds of documents appropriately. Semantic information can be represented through feature vectors or through graphs. Of the two, a graph is normally the better choice because it is a more powerful data structure than a traditional feature vector. However, very few methodologies exist in the literature for graph-based semantic representation. Error-tolerant graph matching techniques such as graph similarity measures can be utilised for text classification. However, computing the Maximum Common Subgraph (mcs) and the Minimum Common Supergraph (MCS), on which such similarity measures rely, is NP-hard. In the present paper, summarized texts are used during the extraction of semantic information to make the computation faster. The semantic information of the texts is represented through discourse representation structures, which are then transformed into graphs. Five different graph distance measures based on the mcs and the MCS are used with a k-NN classifier to evaluate the text classification task. The text documents are taken from the Reuters-21578 text database and are distributed over 20 classes. Ten documents of each class are used for training and for testing. The results show that the techniques have roughly equivalent potential for text classification and perform as well as traditional bag-of-words approaches. 
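To make the idea of mcs- and MCS-based distances concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not say which five measures were compared; the four functions below are commonly cited forms from the graph-matching literature and are given only as an illustration. The sketch also assumes that every node carries a unique label, in which case the mcs and the MCS reduce to set intersection and set union and the NP-hard search is avoided. The `LabeledGraph` type, the `size` definition (nodes plus edges), and all function names are hypothetical.

```python
from typing import FrozenSet, Iterable, NamedTuple, Tuple


class LabeledGraph(NamedTuple):
    """Directed graph with unique node labels (an assumption of this sketch)."""
    nodes: FrozenSet[str]
    edges: FrozenSet[Tuple[str, str]]


def make_graph(nodes: Iterable[str], edges: Iterable[Tuple[str, str]]) -> LabeledGraph:
    return LabeledGraph(frozenset(nodes), frozenset(edges))


def size(g: LabeledGraph) -> int:
    # Graph size taken as nodes + edges; counting nodes only is another common choice.
    return len(g.nodes) + len(g.edges)


def max_common_subgraph(g1: LabeledGraph, g2: LabeledGraph) -> LabeledGraph:
    # With unique labels the mcs is the intersection of node and edge sets.
    return LabeledGraph(g1.nodes & g2.nodes, g1.edges & g2.edges)


def min_common_supergraph(g1: LabeledGraph, g2: LabeledGraph) -> LabeledGraph:
    # With unique labels the MCS is the union of node and edge sets.
    return LabeledGraph(g1.nodes | g2.nodes, g1.edges | g2.edges)


def d_mcs_max(g1: LabeledGraph, g2: LabeledGraph) -> float:
    # 1 - |mcs| / max(|G1|, |G2|)
    denom = max(size(g1), size(g2))
    return 0.0 if denom == 0 else 1.0 - size(max_common_subgraph(g1, g2)) / denom


def d_mcs_union(g1: LabeledGraph, g2: LabeledGraph) -> float:
    # 1 - |mcs| / (|G1| + |G2| - |mcs|)
    common = size(max_common_subgraph(g1, g2))
    denom = size(g1) + size(g2) - common
    return 0.0 if denom == 0 else 1.0 - common / denom


def d_mcs_edit(g1: LabeledGraph, g2: LabeledGraph) -> int:
    # |G1| + |G2| - 2 * |mcs|
    return size(g1) + size(g2) - 2 * size(max_common_subgraph(g1, g2))


def d_super_minus_sub(g1: LabeledGraph, g2: LabeledGraph) -> int:
    # |MCS| - |mcs|
    return size(min_common_supergraph(g1, g2)) - size(max_common_subgraph(g1, g2))


if __name__ == "__main__":
    a = make_graph({"bank", "raise", "rate"}, {("bank", "raise"), ("raise", "rate")})
    b = make_graph({"raise", "rate", "market"}, {("raise", "rate"), ("rate", "market")})
    for fn in (d_mcs_max, d_mcs_union, d_mcs_edit, d_super_minus_sub):
        print(fn.__name__, fn(a, b))
```

Any of these functions could serve as the distance in a k-NN classifier; for graphs whose node labels are not unique, the intersection and union shortcuts no longer apply and a real subgraph search would be needed.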
<![CDATA[<b>Sistema de medición de distancia mediante imágenes para determinar la posición de una esfera utilizando el sensor Kinect XBOX</b>]]> En este documento se presenta un método para medir la distancia del centroide de un objeto segmentado en una imagen de color con respecto a un punto de referencia fijo. El algoritmo se probó mediante una secuencia de imágenes de color, analizando más de 100 posiciones verticales diferentes de una esfera alojada en el interior de una columna cilíndrica transparente de acrílico con diámetro y longitud constante. El algoritmo propuesto integra técnicas de corrección por balance de blancos y de calibración de la cámara con sus parámetros intrínsecos, además, se prueba un nuevo método de segmentación en color utilizado para calcular distancias del mundo real a partir de imágenes en color RGB. Los resultados obtenidos reflejan una alta confiabilidad ya que el 100% de las mediciones realizadas tuvo un error menor a 1.64% con un nivel de precisión más alto que el instrumento utilizado de referencia, en un rango de distancia de 0 a 1340 mm.<hr/>This paper presents a method for measuring the distance from the centroid of a segmented object in a color image to a fixed reference point in the image. The algorithm was tested on a sequence of color images by analyzing more than 100 different vertical positions of a ball housed inside a transparent acrylic cylindrical column of constant diameter and length. The proposed algorithm integrates white-balance correction and camera calibration using the camera's intrinsic parameters; in addition, a new color segmentation method is tested for calculating real-world distances from RGB color images. The results show high reliability: 100% of the measurements had a relative error below 1.64%, with a higher level of precision than the reference instrument, over a distance range from 0 to 1340 mm. 
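The core measurement step described above can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it assumes the segmentation has already produced a binary mask, it omits white-balance correction and lens-distortion handling, and it converts pixels to millimetres with a plain pinhole model in which the object lies in a plane at a known depth. The function names and the parameters `focal_px` and `depth_mm` are hypothetical.

```python
import math
from typing import Sequence, Tuple


def centroid(mask: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return the (x, y) centroid of the non-zero pixels of a binary mask."""
    count = 0
    sum_x = 0
    sum_y = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                count += 1
                sum_x += x
                sum_y += y
    if count == 0:
        raise ValueError("mask contains no segmented pixels")
    return sum_x / count, sum_y / count


def distance_mm(point: Tuple[float, float],
                reference: Tuple[float, float],
                focal_px: float,
                depth_mm: float) -> float:
    """Convert an image-plane distance to millimetres under a pinhole model.

    Assumes both points lie in a plane parallel to the sensor at depth_mm,
    and that focal_px is the focal length in pixels from the intrinsic
    calibration.
    """
    pixels = math.hypot(point[0] - reference[0], point[1] - reference[1])
    return pixels * depth_mm / focal_px


if __name__ == "__main__":
    # A 3x3 blob centred at pixel (2, 2) in a 5x5 image.
    mask = [
        [0, 0, 0, 0, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0],
        [0, 0, 0, 0, 0],
    ]
    c = centroid(mask)
    print(c, distance_mm(c, (2.0, 0.0), focal_px=500.0, depth_mm=1000.0))
```

In practice the reference point and the depth of the column would come from the experimental setup, and the focal length from the calibration procedure the abstract mentions.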
<![CDATA[<b>Information Extraction in Semantic, Highly-Structured, and Semi-Structured Web Sources</b>]]> The evolution of the Web from the original proposal made in 1989 can be considered one of the most revolutionary technological changes in centuries. During the past 25 years the Web has evolved from a static version to a fully dynamic and interoperable intelligent ecosystem. The amount of data produced during these few decades is enormous. New applications, developed by individual developers or small companies, can take advantage of both services and data already present on the Web. Data, produced by humans and machines, may be available in different formats and through different access interfaces. This paper analyses three different types of data available on the Web and presents mechanisms for accessing and extracting this information. The authors show several applications that leverage extracted information in two areas of research: recommendations of educational resources beyond content and interactive digital TV applications. <![CDATA[<b>Computing Polynomial Segmentation through Radial Surface Representation</b>]]> The Visual Information Retrieval (VIR) area requires robust implementations achieved through mathematical representations of images or data sets. Implementing a mathematical model involves the selection of the image corpus, an appropriate descriptor method, a segmentation approach, and the implementation of a similarity metric, all of which are treated as VIR elements. The goal of this research is to find an appropriate model that explains how these elements can be represented to achieve better performance in VIR implementations. A direct method is tested with a subspace arrangement approach. Generalized Principal Component Analysis (GPCA) is modified within its segmentation process. Initially, a corpus data sample is tested, and an RGB color descriptor is implemented to obtain a three-dimensional description of the image data. 
Then a radial basis function is selected to improve the implemented similarity metric. It is concluded that better performance can be achieved by applying powerful extraction methods in VIR designs based on a mathematical formulation. The results lead to the design of VIR systems with a high level of performance, based on radial basis functions and polynomial segmentation, for handling data sets. <![CDATA[<b>Mejoramiento de la consistencia entre la sintaxis textual y gráfica del lenguaje de <i>Semat</i></b>]]> Semat (Software Engineering Method and Theory) es una iniciativa que permite representar prácticas comunes de metodologías ya existentes mediante los elementos de su núcleo, los cuales se describen en términos de un lenguaje. Este lenguaje tiene una sintaxis gráfica y una textual. La sintaxis textual se describe mediante el metalenguaje EBNF (Extended Backus-Naur Form) que se utiliza como notación de gramáticas de libre contexto para describir un lenguaje formal. Sin embargo, la sintaxis textual de los elementos del núcleo en algunos casos presenta inconsistencia con la sintaxis gráfica. Por ello, en este artículo se propone la modificación del lenguaje textual mediante un análisis gramatical al lenguaje de Semat con el fin de lograr una relación consistente entre la sintaxis textual y gráfica de los elementos del núcleo de Semat.<hr/>Semat (Software Engineering Method and Theory) is an initiative that allows common practices of existing methodologies to be represented by means of its core elements, which are described in terms of a language. This language has a graphical and a textual syntax. The textual syntax is described using the EBNF (Extended Backus-Naur Form) metalanguage, a notation for context-free grammars used to describe formal languages. However, the textual syntax of the core elements is in some cases inconsistent with the graphical syntax. 
Therefore, in this paper we propose a modification of the textual language, based on a grammatical analysis of the Semat language, in order to achieve a consistent relationship between the textual and graphical syntax of the Semat core elements.