AKEA: An Arabic keyphrase extraction algorithm

Eslam Amer*, Khaled Foad

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Keyphrase extraction is a critical step in many natural language processing and information retrieval applications. In this paper, we introduce AKEA, a keyphrase extraction algorithm for single Arabic documents. AKEA is unsupervised: it needs no training to perform its task. We rely on heuristics that combine linguistic patterns based on Part-Of-Speech (POS) tags, statistical knowledge, and the internal structural pattern of terms (i.e., word occurrence). We use Arabic Wikipedia to improve the ranking (or significance) of candidate keyphrases by adding a confidence score if the candidate exists as an indexed Wikipedia concept. Experimental results show that, on average, AKEA achieves the highest precision and the highest F-measure, which indicates that it produces more accurate results than the other algorithms it was compared with.
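The pipeline the abstract outlines (POS-pattern candidates, a statistical score, and a Wikipedia-based confidence bonus) can be illustrated with a minimal sketch. This is not the authors' implementation: the tag names, the noun-headed pattern, the use of raw frequency as the statistical score, the `wiki_bonus` value, and the function names `extract_candidates` and `rank_candidates` are all assumptions made for illustration. The input is assumed to be already tokenized and POS-tagged, and the Wikipedia index is assumed to be a plain set of titles.

```python
from collections import Counter

# Hypothetical POS pattern: a noun, optionally followed by nouns or adjectives.
HEAD_TAGS = {"NOUN"}
TAIL_TAGS = {"NOUN", "ADJ"}


def extract_candidates(tagged_tokens, max_len=3):
    """Return every span of up to max_len words that matches the assumed pattern."""
    candidates = []
    n = len(tagged_tokens)
    for start in range(n):
        word, tag = tagged_tokens[start]
        if tag not in HEAD_TAGS:
            continue
        span = [word]
        candidates.append(word)  # the single noun is itself a candidate
        for end in range(start + 1, min(start + max_len, n)):
            next_word, next_tag = tagged_tokens[end]
            if next_tag not in TAIL_TAGS:
                break  # the pattern is broken, stop extending this span
            span.append(next_word)
            candidates.append(" ".join(span))
    return candidates


def rank_candidates(tagged_tokens, wikipedia_titles, wiki_bonus=1.0):
    """Score candidates by frequency, plus a bonus if the phrase is a Wikipedia title."""
    counts = Counter(extract_candidates(tagged_tokens))
    scores = {}
    for phrase, freq in counts.items():
        score = float(freq)  # assumed statistical score: raw frequency
        if phrase in wikipedia_titles:
            score += wiki_bonus  # assumed form of the confidence score
        scores[phrase] = score
    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

In this sketch a candidate that appears as a Wikipedia title outranks an equally frequent candidate that does not, which is the effect the abstract attributes to the confidence score. A real system would substitute an Arabic POS tagger and an index built from Arabic Wikipedia.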

Original language: English
Title of host publication: Proceedings of the International Conference on Advanced Intelligent Systems and Informatics, 2016
Editors: Aboul Ella Hassanien, Khaled Shaalan, Ahmad Taher Azar, Tarek Gaber, Mohamed F. Tolba
Publisher: Springer Verlag
Pages: 137-146
Number of pages: 10
ISBN (Electronic): 9783319483085
ISBN (Print): 9783319483078
Publication status: Published - 18 Oct 2016
Event: 2nd International Conference on Advanced Intelligent Systems and Informatics, AISI 2016 - Cairo, Egypt
Duration: 24 Oct 2016 to 26 Oct 2016

Publication series

Name: Advances in Intelligent Systems and Computing
Volume: 533
ISSN (Print): 2194-5357

Conference

Conference: 2nd International Conference on Advanced Intelligent Systems and Informatics, AISI 2016
Country/Territory: Egypt
City: Cairo
Period: 24/10/16 to 26/10/16

Keywords

  • Keyphrase extraction
  • Natural language processing
