AraDIC: Arabic Document Classification Using Image-Based Character Embeddings and Class-Balanced Loss
2020
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop
unpublished
Classical and some deep learning techniques for Arabic text classification often depend on complex morphological analysis, word segmentation, and hand-crafted feature engineering. These could be eliminated by using character-level features. We propose a novel end-to-end Arabic document classification framework, Arabic document image-based classifier (AraDIC), inspired by the work on image-based character embeddings. AraDIC consists of an image-based character encoder and a classifier. They are
doi:10.18653/v1/2020.acl-srw.29
fatcat:l5lgorfhhzf2rbwvlttui5hxky