Experts aim to close the language gap

Although Africa is home to more than a quarter of the world’s languages, it is significantly underrepresented in the development of artificial intelligence (AI). This is attributed to both insufficient investment and limited access to data. Most AI tools currently in use, such as ChatGPT, are trained predominantly on English, other European languages, and Chinese, all of which have ample text available online. Many African languages, by contrast, are primarily oral and lack a large written corpus.

The absence of training data for these languages puts millions of people across the continent at a disadvantage. To address this gap, researchers recently released what is believed to be the largest dataset of African languages. The project, called African Next Voices, brought together linguists and computer scientists to create AI-ready datasets in 18 African languages. These represent only a fraction of the more than 2,000 languages estimated to be spoken in Africa, but the project aims to expand in the future.

Over two years, the team collected 9,000 hours of speech across Kenya, Nigeria, and South Africa. The data includes recordings in Kikuyu and Dholuo from Kenya, Hausa and Yoruba from Nigeria, and isiZulu and Tshivenda from South Africa. The researchers emphasize the importance of using local voices and capturing the nuances of how people live and communicate, which makes the data more representative of the continent’s diversity.

The initiative received a $2.2 million grant from the Gates Foundation and aims to provide open access to the datasets. This will enable developers to build tools that let people use services such as healthcare and banking in African languages. The project founders believe that integrating these languages into AI technology not only addresses practical challenges but also helps preserve cultural identity, because language is connected to history and knowledge.

Source: https://www.bbc.com/news/articles/crkzgkkpx0lo?at_medium=RSS&at_campaign=rss
