BAAC: Bangor Arabic Annotated Corpus

Ibrahim S Alkhazi, William J.
2018, International Journal of Advanced Computer Science and Applications (The Science and Information Organization)
This paper describes the creation of the new Bangor Arabic Annotated Corpus (BAAC), a Modern Standard Arabic (MSA) corpus comprising 50K words manually annotated with part-of-speech tags. To evaluate the quality of the corpus, the Kappa coefficient and a direct percent agreement for each tag were calculated; a Kappa value of 0.956 was obtained, with an average observed agreement of 94.25%. The corpus was used to evaluate the widely used Madamira Arabic
... ch tagger and to further investigate compression models for text compressed using part-of-speech tags. In addition, a new annotation tool was developed and employed for the annotation process of BAAC.
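The abstract reports a Kappa coefficient and an observed percent agreement but does not give the computation. As a rough illustration only, the sketch below assumes two annotators tagging the same tokens and uses Cohen's kappa; the paper may have used a different kappa variant or procedure. The function names and the toy tag sequences are hypothetical and are not taken from BAAC.

```python
from collections import Counter


def percent_agreement(tags_a, tags_b):
    """Fraction of tokens on which both annotators chose the same tag."""
    if len(tags_a) != len(tags_b) or not tags_a:
        raise ValueError("tag sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(tags_a, tags_b))
    return matches / len(tags_a)


def cohen_kappa(tags_a, tags_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(tags_a)
    p_observed = percent_agreement(tags_a, tags_b)
    counts_a, counts_b = Counter(tags_a), Counter(tags_b)
    # Chance agreement: probability that both annotators pick the same tag
    # if each assigns tags independently according to their own tag frequencies.
    p_expected = sum(
        counts_a[tag] * counts_b[tag] for tag in counts_a.keys() | counts_b.keys()
    ) / (n * n)
    if p_expected == 1.0:
        # Both annotators used one identical tag everywhere; agreement is total.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data (not from the corpus): eight tokens, two annotators.
annotator_1 = ["noun", "verb", "noun", "prep", "noun", "adj", "verb", "noun"]
annotator_2 = ["noun", "verb", "noun", "prep", "adj", "adj", "verb", "noun"]

if __name__ == "__main__":
    print(f"observed agreement: {percent_agreement(annotator_1, annotator_2):.4f}")
    print(f"Cohen's kappa:      {cohen_kappa(annotator_1, annotator_2):.4f}")
```

Kappa is lower than raw agreement whenever some agreement would be expected by chance, which is why both figures are usually reported together.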
doi:10.14569/ijacsa.2018.091120 fatcat:bbrxyukzbvahjbrkhjvmbpb7hm