Modern Standard Arabic (MSA) is the formal language of the Arab world. In Algeria, MSA and several informal Arabic dialects are used in everyday communication. These dialects are themselves subject to further regional variation: eastern, western, central, or southern. The Oranee dialect is the most important and most widely used one in the west of Algeria. However, it is an under-resourced language that lacks both audio and textual corpora. In this paper, we present the main particularities of this western Algerian dialect and apply natural language processing to an Oranee textual corpus. A transcribed MSA discourse may contain dialectal vocabulary, and vice versa. Therefore, we propose to interpolate dialectal language models with MSA ones with respect to a set of topics. The best interpolation weights were obtained for the Religion topic data.
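As an illustration of the interpolation idea, the following is a minimal sketch of linear interpolation between two language models, with the weight chosen by perplexity on held-out text. It is not the setup used in the paper: the add-one smoothed unigram models, the grid search, the function names, and the toy placeholder tokens are all assumptions made for the example.

```python
import math
from collections import Counter


def train_unigram(tokens, vocab):
    """Add-one smoothed unigram model over a fixed vocabulary (an assumption)."""
    counts = Counter(tokens)
    total = len(tokens) + len(vocab)
    return {w: (counts[w] + 1) / total for w in vocab}


def interpolate(p_dialect, p_msa, lam):
    """Linear interpolation: lam * P_dialect(w) + (1 - lam) * P_msa(w)."""
    return {w: lam * p_dialect[w] + (1 - lam) * p_msa[w] for w in p_dialect}


def perplexity(model, tokens):
    """Perplexity of a unigram model on a token sequence."""
    log_prob = sum(math.log(model[w]) for w in tokens)
    return math.exp(-log_prob / len(tokens))


def best_weight(p_dialect, p_msa, heldout, steps=10):
    """Grid search for the weight that minimises held-out perplexity."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(
        candidates,
        key=lambda lam: perplexity(interpolate(p_dialect, p_msa, lam), heldout),
    )


# Hypothetical placeholder tokens standing in for one topic's data.
dialect_tokens = ["d1", "d2", "d1", "shared"]
msa_tokens = ["m1", "m2", "shared", "m1"]
heldout_tokens = ["d1", "shared", "m1", "d2"]
vocab = sorted(set(dialect_tokens) | set(msa_tokens) | set(heldout_tokens))

p_dialect = train_unigram(dialect_tokens, vocab)
p_msa = train_unigram(msa_tokens, vocab)
lam = best_weight(p_dialect, p_msa, heldout_tokens)
mixed = interpolate(p_dialect, p_msa, lam)
```

Repeating the weight search once per topic, each with its own held-out text, would give one interpolation weight per topic. Because the grid includes 0 and 1, the mixed model is never worse on the held-out text than either model alone.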
Algerian dialect, Oranee dialect, Modern Standard Arabic, Natural Language Processing, Language Modelling, Speech Recognition, topics
Primary Language | English |
---|---|
Subjects | Software Engineering (Other) |
Journal Section | Articles |
Authors | |
Publication Date | December 30, 2019 |
Acceptance Date | January 29, 2020 |
Published in Issue | Year 2019 Volume: 2 Issue: 2 |
International Journal of Informatics and Applied Mathematics