SciELO - Scientific Electronic Library Online

 


Polibits

On-line version ISSN 1870-9044

Abstract

JEBARI, Chaker. A Segment-based Weighting Technique for URL-based Genre Classification of Web Pages. Polibits [online]. 2016, n.53, pp.43-47. ISSN 1870-9044.  http://dx.doi.org/10.17562/PB-53-4.

We propose a segment-based weighting technique for genre classification of web pages. The technique uses character n-grams extracted from the URL of the web page rather than from its textual content. The main idea is to segment the URL and assign a weight to each segment. Experiments conducted on three known genre datasets show that our method achieves encouraging results.
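The idea of segmenting a URL and weighting the n-grams of each segment can be illustrated with a minimal Python sketch. The abstract does not specify how the URL is segmented or which weights are used, so the three segments (domain, path, query), the weight values in `SEGMENT_WEIGHTS`, and the function names below are all assumptions made for illustration, not the author's method.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical segment weights; the abstract does not give the actual values.
SEGMENT_WEIGHTS = {"domain": 1.0, "path": 2.0, "query": 0.5}


def char_ngrams(text, n):
    """Return all overlapping character n-grams of text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def segment_url(url):
    """Split a URL into named segments (an assumed segmentation)."""
    parts = urlsplit(url.lower())
    return {"domain": parts.netloc, "path": parts.path, "query": parts.query}


def weighted_ngram_features(url, n=3, weights=SEGMENT_WEIGHTS):
    """Accumulate n-gram counts, scaling each count by its segment's weight."""
    features = Counter()
    for name, segment in segment_url(url).items():
        for gram in char_ngrams(segment, n):
            features[gram] += weights[name]
    return dict(features)
```

Calling `weighted_ngram_features("http://example.com/faq")` yields a dictionary in which n-grams from the path (such as `faq`) carry a larger value than n-grams from the domain (such as `com`). A feature vector of this kind could then be passed to any standard classifier; which classifier the paper uses is not stated in the abstract.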

Keywords: URL; genre classification; web page; segment weight.
