Maria Fasli, University of Essex, UNESCO Chair in Data Science and Analytics on developing AI solutions in Africa

Maria Fasli, University of Essex, UNESCO Chair in Data Science and Analytics, at the workshop “Toward a Network of Excellence in Artificial Intelligence for Development (AI4D) in sub-Saharan Africa”, Nairobi, Kenya, April 2019

What are you working on at the moment?

My name is Maria Fasli. I am a professor of computer science and my area of expertise is Artificial Intelligence. I work for the University of Essex in the UK. I work on a range of projects with both industry and public sector organizations, trying to help them understand the data that they have, their needs around data and how to make better use of their data.

How do you perceive development and Artificial Intelligence?

This is a really interesting question. I think AI has a really big role to play in development. We need to bring AI into developing and transitioning countries to make a difference on the ground. It is not about us making up solutions in the West; it is about developing solutions here, locally.

There is a whole area that we need to work on around developing capacity and helping people create the right networks here in Africa as well as in other areas in the world, South Africa, Southeast Asia, to make a difference.

There is big scope to use AI to support the Sustainable Development Goals and to help developing and transitioning countries develop into knowledge economies, so that they can be the ones that have the power to make a difference for their own citizens.

What is your blue sky project in Africa?

This is another really good question. In the West, we’ve been using surveys to collect data and we’ve been doing clinical trials; we’re always trying to learn in a very structured kind of way. What I would like to work on, if I had an unlimited budget, is techniques that can learn and reason from observational data.

That means that, instead of running a survey and collecting data about the population, where you can control what it is that you’re getting back, you learn from the kind of data that is already available. There is an abundance of data, but we currently lack the techniques to make sense of it.

How do you feel about the workshop?

I think it has been amazing. We’ve made a lot of progress, we’ve had concrete ideas come out as next steps, and I personally look forward to supporting the initiative going forward, if I’m needed, in whichever way is possible.

Do you have a one-liner for us? One line?

A slogan. AI for all!

December Review; AI4D- African Language Dataset Challenge // Bilan de decembre; Défi AI4D – Jeu de Données sur les Langues Africaines

The close of 2019 marked the second month of the AI4D African Dataset Challenge, an effort aimed at incentivizing the uncovering and creation of African language datasets for improved representation in NLP. This challenge is hosted on Zindi and has been ongoing since the 1st of November. Each month we take stock and award a total of USD 1000 to the two most outstanding submissions.

In December, these two were as follows:

  • A Yoruba dataset submitted by David Adelani. This submission was put together by three individuals, David, Damilola Adebonojo and Omo Yooba, the latter two of whom are major Yoruba contributors for Global Voices Lingua, a movement which aims to bridge worlds and amplify voices through translating stories into dozens of languages. Beyond including some of the news stories from the Global Voices website, they translated several chapters of a book, got parallel sentences from a Twitter account that posts Yoruba proverbs, translated part of a movie dialogue found on YouTube and supplemented these with multi-domain sentences containing scientific and medical terms to work towards a representative dataset.
  • A Fongbe submission composed of datasets prepared for two tasks:
    • Fongbe-French Machine Translation with data sourced from Bible translations, scraping a website and translating a book freely available online.
    • Automatic Speech Transcription data consisting of phoneme labels, single-speaker audio sentences as well as multi-speaker conversational audios.

We received 6 submissions in December, composed of data from 4 languages, Fongbe, Igbo, Swahili and Yoruba. This brings our overall language total, taking into consideration November and December submissions, to 6; Fongbe, Hausa, Igbo, Swahili, Wolof and Yoruba.

We observed one novel data collection process that involved first scanning text from a book containing a collection of folk-tales, then digitizing these using Google’s Text Recognition software for Optical Character Recognition (OCR). There was also a notable submission of Igbo names, a valuable resource that can be incorporated into the task of Named Entity Recognition. To learn more about other techniques being used to create datasets, be sure to check the November round-up here.

As we begin evaluation of the January submissions, we continue to be impressed by the calibre of datasets submitted and the effort put into their creation. 

This work actively challenges us to think deeper about the various copyright implications of some of these data collection sources and processes, and about how to finally make all this data open. It also challenges us to think about the choice of dataset to use for a Machine Learning task in the second phase of this challenge, as each month brings us closer to the end of the dataset creation phase.

Contribution by:
Kathleen Siminyu, AI4D-Africa Network Coordinator
Sackey Freshia, Jomo Kenyatta University of Agriculture and Technology
Daouda Tandiang Djiba, GalsenAI


The close of 2019 marked the second month of the AI4D African Dataset Challenge, an effort aimed at encouraging the discovery and creation of African language datasets for better representation in NLP. The challenge is hosted on Zindi and has been running since 1 November. Each month we take stock and award a total of USD 1000 to the two best submissions.

In December, these two were as follows:

  • A Yoruba dataset submitted by David Adelani. This submission was put together by three people, David, Damilola Adebonojo and Omo Yooba, the latter two being major Yoruba contributors to Global Voices Lingua, a movement that aims to bridge worlds and amplify voices by translating stories into dozens of languages. In addition to including some of the articles from the Global Voices website, they translated several chapters of a book, obtained parallel sentences from a Twitter account that posts Yoruba proverbs, translated part of a movie dialogue found on YouTube and supplemented these with multi-domain sentences containing scientific and medical terms, to work towards a representative dataset.
  • A Fongbe submission composed of datasets prepared for two tasks:
    • Fongbe-French Machine Translation, with data sourced from Bible translations, from scraping a website and from translating a book freely available online.
    • Automatic Speech Transcription data comprising phoneme labels, single-speaker audio sentences and multi-speaker conversational audio.

 

We received 6 submissions in December, composed of data from 4 languages: Fongbe, Igbo, Swahili and Yoruba. This brings the total number of languages, counting the November and December contributions, to 6: Fongbe, Hausa, Igbo, Swahili, Wolof and Yoruba.

We observed a novel data collection process that consisted of scanning the text of a book containing a collection of folk-tales, then digitizing these using Google’s text recognition software for Optical Character Recognition (OCR).

There was also a notable submission of Igbo names, a valuable resource that can be incorporated into the task of Named Entity Recognition. To learn more about the other dataset creation techniques, see the November round-up here.

As we begin evaluating the January submissions, we continue to be impressed by the quality of the datasets submitted and the effort put into their creation.

This work actively challenges us to think more deeply about the various copyright implications of some of these data collection sources and processes, and about how to finally make all this data open. It also challenges us to think about the choice of dataset to use for a Machine Learning task in the second phase of this challenge, as each month brings us closer to the end of the dataset creation phase.

Contribution by:
Kathleen Siminyu, AI4D-Africa Network Coordinator
Sackey Freshia, Jomo Kenyatta University of Agriculture and Technology
Daouda Tandiang Djiba, GalsenAI

Delmiro Fernandez-Reyes from UCL on how AI can deliver better medicines in Africa

Delmiro Fernandez-Reyes, University College London, at the workshop “Toward a Network of Excellence in Artificial Intelligence for Development (AI4D) in sub-Saharan Africa”, Nairobi, Kenya, April 2019

What are you working on at the moment?

I’m based at the Department of Computer Science, University College London, and also at the College of Medicine at the University of Ibadan. My work is related to solutions for global health challenges such as paediatric infections, malaria, and other communicable or noncommunicable diseases.

The work has basically been about harnessing the algorithms we develop to look at data that can improve diagnostics, improve clinical pathways, or make decisions faster, and therefore deliver savings for healthcare systems, which are stretched.

So basically, we can focus on challenges within these global health problems. What I do at the moment is develop the hardware that the AI is going to work on. We are developing a microscope that itself has a lot of AI components for diagnostics, such as navigation, detection of specific objects like malaria parasites, and all the haematological aspects of malaria screening.

Another important part of what we do concerns the role of AI. As a person who works on health challenges in the region, I see AI as more transformative in Africa because it creates opportunity. For example, the projects I’m talking about are already running; they are generating employment and building teams. They are now being developed to put technology to use on the frontline.

We have a tool that improves MRI resolution, and it is now being used by radiologists in Nigeria. Through those tools you can train people and professionals and increase interdisciplinarity, so it opens up opportunity. That is the opposite of what you see in the northern countries or in the West, where AI seems to be about taking jobs from people or taking over tasks. I think in Africa you can apply it to challenges in a way that will increase the development of the region.

How do you perceive development and Artificial Intelligence?

The way to facilitate development is to focus on the challenges the region has. The region has many challenges, from technological gaps to governance.

I want to focus on the ones closest to me, because of my background as a basic scientist in medicine and computer science. In those areas, we can clearly see that we can help by addressing the key drivers of the lack of development: inequality, neonatal mortality and maternal mortality. Those are the three axes that drive the region.

The region still has too many communicable diseases: HIV, tuberculosis and malaria. Those are the challenge now. Another challenge is that, as people get older in sub-Saharan African countries like Nigeria, where lifespan is increasing as GDP increases, noncommunicable diseases will have a bigger impact.

For those, I think we can contribute a lot to management, healthcare systems, policy-making and strategy. Of course, development cannot be done for health alone. You also have to develop power, infrastructure, and water and sanitation, so there needs to be a concerted effort. You cannot have the health people working alone; the infrastructure and telecommunications engineers have to be working at the same time.

What is your blue sky project in Africa?

The main project that we will focus on is what we are already doing. We would like to have an AI-driven platform for fast diagnosis of diseases in clinical labs. You can achieve that.

November Review; AI4D- African Language Dataset Challenge // Bilan de novembre ; Défi AI4D – Jeu de Données sur les Langues Africaines

 

 

On the 1st of November, we launched the AI4D-African Language Dataset Challenge on Zindi, an effort towards incentivizing the uncovering and creation of African language datasets for improved representation in NLP. This first phase of what is expected to be a two-phase challenge is taking place over 5 months, November 2019 to March 2020, with submissions evaluated on a monthly basis. Each month, the top 2 submissions will each receive a cash prize of USD 500.

We are well into December and excited to announce that the top two submissions for November came from:

  • Oshingbesan Adebayo, who submitted a dataset composed of three West African indigenous languages (Hausa, Igbo and Yoruba). The dataset was acquired from a wide variety of sources, ranging from transcriptions of songs, online news sites, excerpts from published books and websites in indigenous languages to blogs, Twitter, Facebook and more.
  • Thierno Diop, who submitted an Automatic Speech Recognition dataset for Wolof in the domain of transportation services. The data was prepared through a collaboration between BAAMTU Datamation, a Senegalese company focused on using data to help companies leverage AI and Big Data, and WeeGo, an app which helps passengers get information about urban transport in Senegal.

Overall, we received 9 submissions in the month of November, composed of data from a total of 4 unique languages. These are Hausa, Igbo, Wolof and Yoruba.

The majority of the data came from online sources. Scraping news sites such as BBC, DW and VOA, which curate news in several African languages, emerged as one of the top ways that participants went about creating datasets. A great strategy for putting together a sizeable dataset over the coming months would be to go back to the site(s) every so often and keep your dataset up to date as news is regularly published. Capturing a wide variety of news categories would go a long way towards ensuring the dataset is well balanced and representative of language variety. Wikipedia sites published in various languages also featured as a data source.

  • BBC publishes news in Afaan Oromoo, Amharic, Hausa, Igbo, Kirundi, Pidgin, Somali, Swahili, Tigrinya and Yoruba 
  • DW publishes news in Amharic, Hausa and Kiswahili 
  • VOA publishes news in Afaan Oromoo, Amharic, Bambara, Hausa, Kinyarwanda/Kirundi, Ndebele, Shona, Somali, Kiswahili and Tigrinya
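
The advice above (revisit a news site periodically, keep the dataset up to date, and check that news categories stay balanced) can be sketched as a small merge step. This is a minimal illustration and not the method any participant used: the record fields (`url`, `category`, `text`), the function names `text_key`, `merge_articles` and `category_shares`, and the example URLs and sentences are all assumptions made for the example, and fetching the pages themselves is left out.

```python
import hashlib
from collections import Counter


def text_key(text):
    """Return a stable fingerprint of an article body, ignoring case and spacing."""
    normalised = " ".join(text.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


def merge_articles(dataset, new_articles):
    """Append only unseen, non-empty articles to `dataset` and return how many were added.

    Each article is a dict with the hypothetical keys 'url', 'category' and 'text'.
    """
    seen = {text_key(a["text"]) for a in dataset}
    added = 0
    for article in new_articles:
        key = text_key(article["text"])
        if key in seen or not article["text"].strip():
            continue  # skip duplicates and empty pages
        seen.add(key)
        dataset.append(article)
        added += 1
    return added


def category_shares(dataset):
    """Return the fraction of articles per category, to spot an unbalanced corpus."""
    counts = Counter(a["category"] for a in dataset)
    total = sum(counts.values())
    return {cat: n / total for cat, n in counts.items()} if total else {}


# Example: two scraping runs against the same (made-up) site.
corpus = []
first_run = [
    {"url": "https://example.org/a", "category": "sport", "text": "Habari za michezo leo"},
    {"url": "https://example.org/b", "category": "health", "text": "Habari za afya leo"},
]
second_run = [
    # Same article as before, differing only in spacing and case.
    {"url": "https://example.org/a", "category": "sport", "text": "Habari  za michezo LEO"},
    {"url": "https://example.org/c", "category": "health", "text": "Taarifa mpya za afya"},
]
added_first = merge_articles(corpus, first_run)
added_second = merge_articles(corpus, second_run)
shares = category_shares(corpus)
```

Fingerprinting the normalised text rather than the URL means a story republished under a new address is still recognised as a duplicate. The per-category shares give a quick signal of when one kind of news is starting to dominate the corpus.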

A closely related online source is Twitter data, which we have seen particularly curated for the task of sentiment analysis. A good place to start would be the accompanying Twitter profiles of the above news sites. While we haven’t had any data sourced from Facebook yet, I imagine that the profiles maintained by these news outlets for various languages would also be a good place to start.  

Manual translation also emerged as a method. Some submissions were compiled by one or several individuals coming together to translate pieces of text, while others relied on custom applications, such as mobile apps, to crowdsource voice-overs for a dataset created for Automatic Speech Recognition.

I am also excited to announce that we will have a workshop at ICLR 2020, “AfricaNLP – Unlocking Local Languages”, which will be held in Addis Ababa in April of next year.
Part of the agenda of this workshop is set aside to showcase exceptional work and resulting datasets that will emerge as output from this exercise.

We will also use the workshop as an opportunity to launch the second phase of this challenge. If you have been following our thought process since the beginning, then you will know that the second phase of the challenge is largely dependent on the outcomes of this first phase. The one (or hopefully two) downstream NLP tasks that will be the object of the second phase will utilise datasets that result from this first phase.

Finally, we have a Call for Papers for the workshop, specifically for research work involving African languages. Feel free to start making your submissions on this page. Here are some key dates to keep in mind:

  • Submission deadline: 1st February, 2020
  • Notification to authors: 26th February, 2020
  • Workshop: 26th April, 2020

Happy Holidays!

Contribution by:
Kathleen Siminyu, AI4D-Africa Network Coordinator
Sackey Freshia, Jomo Kenyatta University of Agriculture and Technology
Daouda Tandiang Djiba, GalsenAI


On 1 November, we launched the AI4D – African Language Dataset Challenge on Zindi, an effort to encourage the discovery and creation of African language datasets for better representation in NLP. This first phase of what should be a two-phase challenge runs over 5 months, from November 2019 to March 2020, with submissions evaluated on a monthly basis. Each month, the two best submissions will each receive a cash prize of USD 500.

We are pleased to announce that the two best submissions for November came from:

  • Oshingbesan Adebayo, who submitted a dataset composed of three West African indigenous languages (Hausa, Igbo and Yoruba). The dataset was acquired from a wide variety of sources, ranging from transcriptions of songs, online news sites, excerpts from published books and websites in indigenous languages to blogs, Twitter, Facebook and others.
  • Thierno Diop, who submitted an Automatic Speech Recognition dataset for Wolof in the domain of transportation services. The data was prepared through a collaboration between BAAMTU Datamation, a Senegalese company specialising in using data to help companies leverage artificial intelligence and Big Data, and WeeGo, an application that helps passengers get information about urban transport in Senegal.

In total, we received 9 submissions in November, composed of data from 4 unique languages. These are Hausa, Igbo, Wolof and Yoruba.

The majority of the data came from online sources. Scraping news sites such as BBC, DW and VOA, which curate news in several African languages, emerged as one of the main ways participants created datasets. An excellent strategy for building a sizeable dataset over the coming months would be to return to the site(s) from time to time and keep the dataset up to date as news is regularly published. Capturing a wide variety of news categories would go a long way towards ensuring the dataset is well balanced and representative of language variety. Wikipedia sites published in various languages also featured as a data source.

  • BBC publishes news in Afaan Oromoo, Amharic, Hausa, Igbo, Kirundi, Pidgin, Somali, Swahili, Tigrinya and Yoruba
  • DW publishes news in Amharic, Hausa and Kiswahili
  • VOA publishes news in Afaan Oromoo, Amharic, Bambara, Hausa, Kinyarwanda/Kirundi, Ndebele, Shona, Somali, Kiswahili and Tigrinya

A closely related online source is Twitter data, which we have seen curated particularly for the task of sentiment analysis. A good starting point would be the Twitter profiles of the news sites above. While we have not yet had any data sourced from Facebook, I imagine that the profiles these news outlets maintain in various languages would also be a good place to start.

Manual translation also emerged as a method. Some submissions were compiled by one or several people working together to translate pieces of text, while others relied on custom applications, such as mobile apps, to crowdsource voice-overs for a dataset created for Automatic Speech Recognition.

I am also pleased to announce that we will have a workshop at the ICLR 2020 conference, “AfricaNLP – Unlocking Local Languages”, which will be held in Addis Ababa next April. Part of the agenda of this workshop is set aside to showcase the exceptional work and the datasets that will emerge from this exercise.

We will also use the workshop to launch the second phase of this challenge. If you have been following our thought process since the beginning, you know that the second phase of the challenge depends largely on the outcomes of this first phase. The one (or hopefully two) downstream NLP tasks that will be the object of the second phase will use the datasets that result from this first phase.

Finally, we have a Call for Papers for the workshop, specifically for research work involving African languages. Feel free to start making your submissions here.

  • Submission deadline: 1 February 2020
  • Notification of decision: 26 February 2020
  • Workshop: 26 April 2020

Happy Holidays!

Contribution by:
Kathleen Siminyu, AI4D-Africa Network Coordinator
Sackey Freshia, Jomo Kenyatta University of Agriculture and Technology
Daouda Tandiang Djiba, GalsenAI

Vukosi Marivate from University of Pretoria on Africa’s position in AI

Vukosi Marivate, University of Pretoria, CSIR, Deep Learning Indaba, at the workshop “Toward a Network of Excellence in Artificial Intelligence for Development (AI4D) in sub-Saharan Africa”, Nairobi, Kenya, April 2019

What are you working on at the moment?

I am Dr. Vukosi Marivate. I am a chair of data science at the University of Pretoria in South Africa, and I am also here representing the Deep Learning Indaba. My work mostly involves machine learning and natural language processing, as well as how we use data science for society.

How do you perceive development and Artificial Intelligence?

I see AI as a tool that we can use in society, so I would not restrict it to development or to the continent. I believe that we all have our own challenges, wherever we are. The question is how we can use AI as one of the tools to improve the lives of Africans. If we start from there, all the other things follow.

What is your blue sky project in Africa?

As Africa, I think we are in an interesting position when we look at AI and how it can be used. One of the things that becomes important is demystifying it for the public and decision-makers. I think the blue sky is how we get AI to be interpretable and transparent. That’s one big part, and there should be more work done on that. It’s great having very accurate models, with high accuracy and low error, but how does somebody else interpret what is going on and understand it? I think that is where a lot of the bias creeps in: things are used without an understanding of why they work the way they do.

How do you feel about the workshop?

The workshop has been great. It’s really been about meeting a lot of great minds from across the continent and beyond. I am looking forward to seeing what we do with the network after this.

Short one-liner if you have one?

Okay. For what? For the workshop? Just a slogan. Oh, we said we need to capacitate AI strength on the African continent through our communities.

 

Benjamin Rosman from University of the Witwatersrand on AI and development

Benjamin Rosman, University of the Witwatersrand / CSIR, at the workshop “Toward a Network of Excellence in Artificial Intelligence for Development (AI4D) in sub-Saharan Africa”, Nairobi, Kenya, April 2019

What are you working on at the moment?

I am Benjamin Rosman. I work at the University of the Witwatersrand in Johannesburg, South Africa. I also work at the CSIR, which is the Council for Scientific and Industrial Research in South Africa. And then, with another hat on, I am also one of the founders and organizers of the Deep Learning Indaba.

In my research lab, which is based mainly at the University of the Witwatersrand, we focus mainly on questions around machine learning and decision theory, so we work predominantly in reinforcement learning, deep learning and the areas around them, and we recently started working in applied areas as well.

How do you perceive development and Artificial Intelligence?

I think that the combination of AI and development is an interesting one. I think AI provides an opportunity to solve a lot of problems in the developing world, as it is currently doing around the world in general.

I think there are a lot of opportunities for students, and society more generally, to get involved in acquiring these tools, which can be used in a wide variety of industries and application areas. If we think about this right, there is an opportunity to make a large impact and train a lot of people in very impactful areas.

What is your blue sky project in Africa?

There are so many research topics that I would love to work on, but what I would really love to see is a pipeline from fundamental research in Africa to applied research, considering aspects of ethics and society, and finally running all the way through to commercialization. That way we can be training academics, educating society in general, starting start-ups and improving the way that large corporates and governments work across the continent.

 

AI4D – African Language Dataset Challenge // Défi AI4D – Jeu de Données sur les Langues Africaines

NLP Challenge

Getting started with programming is easy, a well-trodden path. Whether it be picking up the skill itself, a new programming language or venturing into a new domain, like Natural Language Processing (NLP), you can be sure that a variety of beginner tutorials exist to get you started. The ‘Hello World!’s, as you may know them. 

Where NLP is concerned, some paths tend to be better trodden than others. It is infinitely easier to accomplish an NLP task, say Sentiment Analysis, in English than it is to do the same in my mother tongue, Luhya. This reality is an extrapolation of the fact that the languages of the digital economy are major European languages.

The gap between languages with plenty of data available on the Internet and those without is ever increasing. Pre-trained language models in recent times have led to significant improvement in various NLP tasks and Transfer Learning is rapidly changing the field. While leading architectures for pre-training models for Transfer Learning in NLP are freely available for use, most are data-hungry. The GPT-2 model, for instance, was trained on millions, possibly billions, of pieces of text. (ref)

The only way I know how to begin closing this gap is by creating, uncovering and collating datasets for low resource languages. With the AI4D – African Language Dataset Challenge, we want to spur on some groundwork. While Deep Learning techniques now make it possible to dream of a future where NLP researchers and practitioners on the continent can easily innovate in the languages their communities speak, a future where literacy and mastery of a major European language is no longer a prerequisite to participation in the digital economy, these techniques require data. Data that can only be created by the communities that speak these languages, by individuals that have the technical skills, by those of us who understand the importance of this work and have the desire to undertake it.

The challenge will run for 5 months (November 2019 to March 2020), with cash prizes of USD 500 awarded as an incentive to each of the top 2 submissions each month. This is the first phase of a two-phase challenge, and it focuses on the creation of datasets. We would like to see some of these datasets developed for specific downstream tasks, but this is not necessary.

We have, however, earmarked four downstream NLP tasks and anticipate that one (or two) of these will frame the second phase of this challenge: Sentence Classification, Sentiment Analysis, Question Answering and Machine Translation. Other downstream tasks that participants may be interested in developing datasets for, or have already developed datasets for, are also eligible. Our intention is that the datasets are kept free and open for public use under a Creative Commons license once the challenge is complete.

The challenge is hosted on Zindi; head on over to this page for full details. The prize money is provided through a partnership between the International Development Research Centre (IDRC) and the Swedish International Development Cooperation Agency (SIDA). The challenge is facilitated through the combined efforts of the Artificial Intelligence for Development Network (AI4D-Africa) and the Knowledge 4 All Foundation (K4All). Finally, our expert panel, who have volunteered their time to undertake the difficult qualitative aspect of dataset assessment, is made up of Jade Abbott – RetroRabbit, John Quinn – Google AI/Makerere University, Kathleen Siminyu – AI4D-Africa, Veselin Stoyanov – Facebook AI and Vukosi Marivate – University of Pretoria.

The rest, we leave up to the community.  

Contribution by Kathleen Siminyu, AI4D-Africa Network Coordinator

Photo by Eva Blue on Unsplash.


Getting started with programming is easy; it is a well-marked path. Whether it is picking up the skill itself, learning a new programming language or venturing into a new domain, such as Natural Language Processing (NLP), you can be sure that a variety of beginner tutorials exist to help you get started. The “Hello World!”s, as you may know them.

 

Where NLP is concerned, some paths tend to be better marked than others. It is much easier to accomplish an NLP task, say Sentiment Analysis, in English than it is to do the same in my mother tongue, Luhya. This reality is an extrapolation of the fact that the languages of the digital economy are, for the most part, European languages.

The gap between languages with plenty of data available on the Internet and those without keeps widening. Pre-trained language models have in recent years led to significant improvement in various NLP tasks, and Transfer Learning is rapidly changing the field. While the leading architectures for pre-training models for Transfer Learning in NLP are freely usable, most need a lot of data. The GPT-2 model, for example, was trained on millions, possibly billions, of pieces of text. (ref)

The only way I know to begin closing this gap is to create, uncover and assemble datasets for low-resource languages. With the AI4D – African Language Dataset Challenge, we want to spur on the groundwork. While Deep Learning techniques now make it possible to dream of a future where NLP researchers and practitioners on the continent can easily innovate in the languages their communities speak, a future where literacy and mastery of a major European language is no longer a prerequisite for participation in the digital economy, these techniques require data. Data that can only be created by the communities that speak these languages, by people with the technical skills, by those of us who understand the importance of this work and wish to do it.

The challenge will run for 5 months (from November 2019 to March 2020), with cash prizes of USD 500 awarded as an incentive to each of the 2 best submissions each month. This is the first phase of a two-phase challenge, and it focuses on the creation of datasets. We would like to see some of these datasets developed for specific downstream tasks, but this is not necessary.

Nous avons toutefois réservé quatre tâches du NLP  en aval et prévoyons qu’une (ou deux) d’entre elles constitueront le cadre de référence de la deuxième phase de ce défi. Classification de textes , analyse des sentiments, réponses aux questions et traduction automatique. Les autres tâches en aval pour lesquelles les participants pourraient  être intéressés par le développement de jeux de données ou pour lesquels ils ont déjà développé des jeux de données sont également éligibles. Notre intention est que les jeux de données restent libres et ouverts au public sous une licence “Creative Commons” une fois le challenge terminé.

Le défi est hébergé sur Zindi, rendez-vous sur cette page pour obtenir tous les détails, l’argent du prix fourni grâce au partenariat entre le Centre de recherches pour le développement international (CRDI) et l’Agence suédoise de coopération pour le développement international (SIDA), la facilitation du défi par les efforts combinés du réseau de l’intelligence artificielle pour le développement (AI4D-Africa) et de la fondation Knowledge 4 All (K4All), et enfin de notre groupe d’experts qui ont offert de leur temps pour aborder le difficile aspect qualitatif de l’évaluation d’un jeu  de données; Jade Abbott – RetroRabbit, John Quinn – Google AI / Université Makerere, Kathleen Siminyu – AI4D-Africa, Veselin Stoyanov – Facebook AI et Vukosi Marivate – Université de Pretoria.

Le reste, nous laissons à la communauté.

Contribution de Kathleen Siminyu, Coordinatrice du réseau AI4D-Africa

Photo par Eva Blue sur Unsplash.

 

Isaac Rutenberg, Strathmore University on development of AI in Africa

Isaac Rutenberg, Strathmore University at workshop "Toward a Network of Excellence in Artificial Intelligence for Development (AI4D) in sub-Saharan Africa", Nairobi, Kenya, April 2019

What are you working on?

My name is Isaac Rutenberg. I am the director of the Centre for Intellectual Property and Information Technology Law (CIPIT) at the Strathmore Law School, here in Nairobi, Kenya. We are working at the intersection of intellectual property and IT, particularly in the ways that people utilize both of those for various reasons, including development.

How do you perceive development and Artificial Intelligence?

I think at the moment it is quite early. There are some very nascent projects in AI on the continent, and there are actually quite a lot of them. I think that the impact of those so far has been quite minimal.

I think that we are at an early stage of determining how we want to use AI. In some ways that is really good, because the rest of the world has shown us, or has allowed us to see, some of the pitfalls, some of the major problems that we are going to encounter as we develop AI. We will encounter those in everyday life on a regular basis. We do already in some instances, but it's only going to grow.

What is your blue sky project in Africa?

If I could have AI solve any problem, it would be getting products to international markets. A lot of agricultural produce in Africa is wasted for a variety of reasons. I know a lot of those are structural, and AI is obviously not going to solve all of them, but somehow if we could use AI to help the distribution systems and the analysis of all of the data that is required or generated, and that impacts how products are moved around, I think that would have a very big impact on people in their daily lives.

 

Kathleen Siminyu from Africa’s Talking on women in African AI

Play the video by Kathleen Siminyu, Africa’s Talking at the workshop “Toward a Network of Excellence in Artificial Intelligence for Development (AI4D) in sub-Saharan Africa”, Nairobi, Kenya, April 2019

What are you working on?

My name is Kathleen Siminyu. My background is in math and computer science, and from there I've moved into data science. I am a data scientist at a company called Africa's Talking. That is kind of the job that pays the bills, but I wear a couple of other hats. I do a lot of work building machine learning communities, so I run the Nairobi Women in Machine Learning and Data Science community here in Nairobi. I also work with Deep Learning Indaba, which is a wide organization that works with communities across the continent.

How do you perceive development and Artificial Intelligence?

I think particularly in Africa we have a lot of problems, so there is a lot of development to be done. There are the routes that have been set, like industrialization is how countries come up, then AI brings a whole other aspect, which is how we’ve ended up in this age of Artificial Intelligence.

I think it gives us opportunities to transform a lot of things, and not necessarily follow the path that is set out by how other societies and countries have come up. I am really excited about AI and development. I think the fact that there is a need for development makes AI even more exciting for us to be applying.

What is your blue sky project in Africa?

Well, my pet project at the moment is NLP. I am just going to go with that. The reason I think NLP for African languages is very important is that it gives you the ability to reach the individual. I could be here with all my English, but I am not the average African.

The average African is in a village somewhere and they speak their mother tongue and they can communicate and they can function in their life with that. But this average African is not able to participate in the digital economy. And it is not because they are stupid, they may be illiterate, but they can speak, they can understand, they can think.

If we could just talk to them then I think the opportunities are limitless. It’s not that the technology does not exist, because it does. We have Siri, and you give it a command, you ask it a question and it answers you. The technology exists, we just need it to be applied in this context. Once we have that, then we can go into healthcare, we can go into education, we can go into agriculture. So much opportunity.

I think language is the first thing which we need to crack. So, I’d say, let’s unlock language and then unlock Africa.

 

Prateek Sibal from UNESCO and policy in Artificial Intelligence

Prateek Sibal, UNESCO at workshop "Toward a Network of Excellence in Artificial Intelligence for Development (AI4D) in sub-Saharan Africa", Nairobi, Kenya, April 2019

What are you working on at the moment?

My name is Prateek and I am a policy researcher. I studied economics and public policy, and now I work at the intersection of technology, policy and society. Some of the things that we are trying to understand are how technology is influencing human rights, access to information and openness about information, and how the governance of AI and other emerging technologies is changing in the world.

How do you perceive development and Artificial Intelligence?

I think it’s rather interesting the way you put it how AI is powering development and how development is using AI, I think it’s both ways. But at the heart of the issue is people. We have to be cognizant that there is a significant digital divide in this world and there are a lot of people who are still not online.

Even as we talk about development in the discussions that we had today, there are so many issues that emerge, we talk about online learning, but the internet is so expensive in some countries, so they have to use WhatsApp.

There are very fundamental challenges in development that we need to address, along with communities and being informed by their way of doing things. I think that is super important as we go ahead framing the AI and development agenda.

What is your blue sky project in Africa?

I think human capacity is something which I believe we all need to focus on. There is so much need in developing countries to bridge that divide, to build capacity, to help governments shape policies and to support research centres, and it will have a ripple effect. This is something which cannot happen overnight, and hence building and developing capacities is the key, I think, if we are to go forward in this.

Announcing the #AI4D Africa Innovation 2019 Winners

The AI for Development (AI4D) Initiative is pleased to announce the winners of the AI4D-Africa Innovation Call for Proposals 2019.

Sign up and join us to celebrate the winners at Deep Learning Indaba 2019 at the #AI4D Network of Excellence Innovation Grant Award Ceremony:

    • Tuesday, 27th August 2019 at 7 PM (Nairobi Time)
    • (LOCATION UPDATE) Interaction Hall – KUCC, Kenyatta University, Nairobi, Kenya.

The first named individual is the Principal Investigator. Funding for these innovation seed grants is made available with the support of Canada's International Development Research Centre. To learn more about our Network of Excellence in Artificial Intelligence for Development in Sub-Saharan Africa click here.

Congratulations to all recipients. Follow us at @AI4Dev. 

 

Dr. Abdelhak Mahmoudi  
Mohammed V University of Rabat, Morocco
Arabic Speech-to-MSL Translator: ‘Learning for Deaf’
To develop an Arabic text to Moroccan Sign Language (MSL) translation product through building two corpora of data on Arabic texts for the use of translation into MSL. The collected corpora of data will train Deep Learning Models to analyze and map Arabic words and sentences against MSL encodings.

 

Dr. Adewale Akinfaderin, Olamilekan Wahab and Olubayo Adekanmbi
Data Duality Lab, Data Science Nigeria, MTN Nigeria, Nigeria
Using Artificial Intelligence to Digitize Parliamentary Bills in Sub-Saharan Africa 
To improve the categorization of parliamentary bills in Nigeria using Optical Character Recognition (OCR), document embedding, and recurrent neural networks, and to expand it to three other countries in Africa: Kenya, Ghana, and South Africa.

 

Dr. Amelia Taylor, Eva Mfutso-Bengo and Binart Kachule
University of Malawi and the Polytechnic, University of Malawi, Malawi
A Semi-Automatic Tool for Meta-Data Extraction from Malawi Court Judgments 
To develop a methodology for a semi-automatic classification of judgments disseminated by the High Court Library of the Malawi Judiciary with the purpose of enabling ‘intelligent searching’ within this body of knowledge.

 

Dr. Aminata Zerbo Sabane, Dr. Tegawendé Bissyande, and T. Idriss Tinto 
L’université Joseph Ki-Zerbo and La Communauté Afrique Francophone des Données Ouvertes, Burkina Faso
Preservation of Indigenous Languages 
To initiate a research roadmap for the preservation of indigenous languages by collecting, categorizing and archiving translation and voice synthesis resources to perform automatic translation in official and indigenous languages.

 

Denis Pastory Rubanga, Dr. Zekaya Never, Dr. Machuve Dina, Lilian Mkonyi, Loyani K. Loyani, and Richard Mgaya
Tokyo University of Agriculture, The Nelson Mandela African Institution of Science and Technology, and Sokoine University of Agriculture, Tanzania
A Computer Vision Tomato Pest Assessment and Prediction Tool
To monitor pests using a data-driven computer vision technique that directs extension officers' support services across sub-Saharan Africa through a real-time pest damage assessment and recommendation support system for small-scale tomato farmers.

 

Martha Shaka, Nyamos Waigama, Emilian Ngatunga, Halidi Maneno, Said Said, Said Mmaka, Frederick Apina, Simon Chaula, Emani Sulutya, Merikiadi Mashaka
University of Dodoma and Benjamin Mkapa Hospital, Tanzania
Effective Creation of Ground Truth Data-Set for Malaria Diagnosis Using Deep Learning 
To create an automatic data annotation tool and ground truth dataset for malaria diagnosis using deep learning. The ground truth dataset and the tool will streamline the development of AI tools for pathology diagnosis.

 

Dr. Moses Thiga and Dr. Pamela Kimeto
Kabarak University, Kenya
Early Detection of Pre-Eclampsia Using Wearable Devices and Long Short Term Memory Networks
To determine the effectiveness of Long Short Term Memory networks in predicting which pregnant mothers are at high risk of developing pre-eclampsia, and the effectiveness of pre-eclampsia prophylaxis.

 

Ronald Ojino and Khushal Brahmbhatt
Cooperative University of Kenya, Kenya
A Public Dataset on Poaching Trends in Kenya and a Study on the Predictive Modeling of Poaching Attacks
To test the feasibility of deploying Unmanned Ground Vehicles (UGVs) for automated intelligent patrol, detection, wildlife monitoring, and identification across the national parks and reserves in Kenya.

 

Steven Edward, Edward James, and Deo Shao
Nelson Mandela African Institution of Science and Technology, Tanzania
Improving the Pharmacovigilance System Using Natural Language Processing on Electronic Medical Records
To improve the pharmacovigilance system by proposing a novel algorithm for the automatic extraction of adverse drug reaction cases from Electronic Medical Records, reducing the time taken and introducing confidentiality of reporting.

 

Dr. Tegawendé F. Bissyande, Dr. Aminata Zerbo Sabane, and T. Idriss Tinto 
Université Joseph Ki-Zerbo and La Communauté Afrique Francophone des Données Ouvertes, Burkina Faso
Building a Medicinal Plant Database for Preserving Ethnopharmacological Knowledge in the Sahel 
To initiate the collection and construction of a medicinal plant database, on top of which a search engine and AI-based image recognition for plants will be built to enable scalable search of preserved knowledge.

Moses Thiga from Kabarak University on Artificial Intelligence

 

Play video by Moses Thiga, Kabarak University at the workshop “Toward a Network of Excellence in Artificial Intelligence for Development (AI4D) in sub-Saharan Africa”, Nairobi, Kenya, April 2019

What are you working on?

My name is Dr Moses Thiga. I work at Kabarak University as a researcher, lecturer and research administrator. I am working on health informatics, specifically looking at how we can apply Artificial Intelligence and the Internet of Things in the area of health. My current project is in the area of predicting blood pressure for expectant mothers in order to be able to prevent and manage pre-eclampsia. We collect data using smartwatches and want to take that into a machine learning pipeline to be able to predict future occurrences of abnormal blood pressure.

How do you perceive development and Artificial Intelligence?

AI for development in the context of Africa needs to be an initiative that begins by identifying what Africa's real problems on the ground are. Once these problems are identified, we need to have a viable approach to solving them, one that involves both stakeholder focus and community engagement. It needs to have a keen emphasis on the capacity building of Africans to solve African problems, and funding needs to be there. And not necessarily from donors, but especially from the government.

What is your blue sky project in Africa?

My blue sky project for Africa would be an activity that helps Africa deal with its health challenges. Something that helps us to both predict and manage medical conditions, especially epidemics. Even in the area of noncommunicable diseases, lifestyle diseases are beginning to increase in Africa. Those would be areas of keen interest for me.