The African continent is a multilingual mosaic, with as many as 3,000 languages in use. Despite this linguistic richness, African languages lag behind in being integrated into artificial intelligence (AI) systems. Natural Language Processing (NLP) tools such as ChatGPT have performed poorly with African languages, prompting calls to address the barriers that hinder the development of these tools for African users.
South African AI research and product lab Lelapa AI, established in 2022, has launched a large language model (LLM) focused on advancing five African languages. The InkubaLM NLP model is specifically tailored to support and elevate low-resource African languages: Hausa, Swahili, Zulu, Yoruba, and Xhosa, which together have approximately 364 million speakers.
The InkubaLM release aims to build capacity for African languages in two main ways. First, it offers tools for translation, transcription, and other NLP tasks. Second, its datasets are also available, so that existing models can be trained more efficiently. InkubaLM is a robust yet compact model, designed to serve African communities without the need for extensive resources.
The Ethiopian Artificial Intelligence Institute is dedicated to incorporating Ethiopian languages into AI technology. With a primary focus on NLP, the institute has integrated four widely spoken languages: Amharic, Afaan Oromo, Aff-Somali, and Tigrigna. This integration has led to the development of various solutions for public services, including the Smart Court system. Efforts to include additional local languages are ongoing. These advancements are pivotal to exploring the application of AI solutions through an African lens.