Towards Machine-Readable Lexicons for South African Bantu languages * †

Sonja Bosch, Laurette Pretorius, Jackie Jones
2007 Nordic Journal of African Studies   unpublished
Lexical information for South African Bantu languages is not readily available in the form of machine-readable lexicons. At present the availability of lexical information is restricted to a variety of paper dictionaries. These dictionaries display considerable diversity in the organisation and representation of data. In order to proceed towards the development of reusable and suitably standardised machine-readable lexicons for these languages, a data model for lexical entries becomes a
more » ... site. In this study the general purpose model as developed by Bell and Bird (2000) is used as a point of departure. Firstly, the extent to which the Bell and Bird (2000) data model may be applied to and modified for the above-mentioned languages is investigated. Initial investigations indicate that modification of this data model is necessary to make provision for the specific requirements of lexical entries in these languages. Secondly, a data model in the form of an XML DTD for the languages in question, based on our findings regarding Bell and Bird (2000) and Weber (2002) is presented. Included in this model are additional particular requirements for complete and appropriate representation of linguistic information as identified in the study of available paper dictionaries.