Amharic Text Summarization for News Items Posted on Social Media

Abaynew Guadie, Debela Tesfaye, Teferi Kebebew
2021 International Journal of Intelligent Information Systems  
This paper introduces Amharic text summarization for news items posted on social media. It summarizes Amharic-language posts collected over time from Twitter and Facebook. The main problem with social media text is that readers encounter many duplicate Amharic posts, and to find the information they are looking for they have to locate and read the important portions of those posts. Summarization addresses this information overload by presenting a condensed, current representation of the posted documents. Our proposed approach has three main components. First, it calculates the similarity between posted documents, comparing them pairwise at the sentence level. Second, it groups the documents with the K-means algorithm, based on the similarity results. Third, it summarizes each cluster of posted documents individually using TF-IDF, which ranks the documents by statistical weighting of their frequent terms. The technique is extractive: the highest-ranked sentences in the posted documents are extracted to form the summary, and the user can set the summary size. In the first experiment, the highest F-measure is 87.07% at a 30% extraction rate, for the cluster of protest posts. In the second experiment, the highest F-measure is 84% at a 30% extraction rate, for the cluster of drought posts. In the third experiment, the highest F-measure is 91.37% at a 30% extraction rate, for the cluster of sports posts. In the fourth experiment, the highest F-measure is 93.52% at a 30% extraction rate for the generated summaries. As the requested summary size increases, the extraction rate increases with it. The evaluation shows that the system produces very good summaries of texts posted on social media.
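The three components described in the abstract (pairwise similarity, K-means clustering, and TF-IDF ranking at a chosen extraction rate) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the whitespace tokenizer, the smoothed IDF formula, cosine similarity as the similarity measure, the farthest-first centroid seeding, and the treatment of each post as a single extractable unit are all hypothetical choices made here. Real Amharic text would need language-specific tokenization and normalization that this sketch omits, and the example uses English placeholder posts.

```python
import math
from collections import Counter


def tfidf_vectors(posts):
    """Return one sparse {term: weight} TF-IDF vector per post."""
    tokenized = [p.lower().split() for p in posts]  # assumption: whitespace tokens
    n = len(posts)
    df = Counter(t for toks in tokenized for t in set(toks))  # document frequency
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        # Length-normalized term frequency times a smoothed IDF (an assumed variant).
        vectors.append({t: (c / len(toks)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
                        for t, c in tf.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors (step 1: similarity)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm = (math.sqrt(sum(w * w for w in a.values()))
            * math.sqrt(sum(w * w for w in b.values())))
    return dot / norm if norm else 0.0


def kmeans(vectors, k, iterations=10):
    """Step 2: group posts with a small K-means loop; requires k <= len(vectors)."""
    # Deterministic farthest-first seeding, an assumption made for reproducibility.
    seeds = [0]
    while len(seeds) < k:
        rest = [i for i in range(len(vectors)) if i not in seeds]
        seeds.append(min(rest, key=lambda i: max(cosine(vectors[i], vectors[s])
                                                 for s in seeds)))
    centroids = [dict(vectors[s]) for s in seeds]
    labels = []
    for _ in range(iterations):
        # Assign each post to its most similar centroid.
        labels = [max(range(k), key=lambda c: cosine(v, centroids[c])) for v in vectors]
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                mean = {}
                for v in members:
                    for t, w in v.items():
                        mean[t] = mean.get(t, 0.0) + w / len(members)
                centroids[c] = mean
    return labels


def summarize(posts, k=2, rate=0.3):
    """Step 3: per cluster, keep the top-ranked posts at the given extraction rate."""
    vectors = tfidf_vectors(posts)
    labels = kmeans(vectors, k)
    summaries = {}
    for c in range(k):
        idx = [i for i, lab in enumerate(labels) if lab == c]
        if not idx:
            continue
        keep = max(1, round(rate * len(idx)))  # always keep at least one post
        ranked = sorted(idx, key=lambda i: sum(vectors[i].values()), reverse=True)
        summaries[c] = [posts[i] for i in sorted(ranked[:keep])]  # original order
    return summaries


posts = [
    "protest march in the city square",
    "football team wins the cup final",
    "protest leaders call march city",
    "football cup final goal team",
    "city protest march continues",
    "team football final celebration",
]
summaries = summarize(posts, k=2, rate=0.3)
```

Here `rate` plays the role of the extraction rate reported in the experiments (30% in the quoted results), and the summary size grows with it. Ranking by the sum of TF-IDF weights is only one plausible scoring rule; the abstract does not specify the exact formula.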
doi:10.11648/j.ijiis.20211006.14 fatcat:fwsrvepqnnav7iwq6yffcr6ntm