Machine learning models for the identification of swollen shoot disease in Côte d’Ivoire.
Keywords: Machine Learning, Image classification, swollen-shoot, deep-learning.
Description of the project
Cocoa cultivation, introduced into Côte d’Ivoire during the colonial period, still plays a major role in the Ivorian economy, accounting for nearly 15% of GDP and 50% of export earnings (AfDB, 2019). Despite this importance, cocoa producers do not receive a significant share of the value of the crop: according to estimates by the Ivorian authorities, cocoa farmers receive only 6% of the global value of cocoa. In addition to receiving a small share of this value, farmers also face crop losses due to cocoa diseases, including swollen shoot.
Beyond threatening farmers’ income, swollen shoot threatens the sustainability of cocoa farming. Cocoa swollen shoot disease is a viral disease that has been present in Côte d’Ivoire since 1943. The disease is endemic, and virulent forms were confirmed in 2003 by the CNRA in large production areas. It causes crop losses that can range from 25% to 50% in the first two years and reach up to 100% in the third year. The prevalence of the disease is estimated at around 9% to 10%.
Research on swollen shoot identifies the disease through several symptoms: stem and root swellings, reddish or yellowish leaf spots, and changes in pod shape. One of the most effective methods of controlling the disease is the development of resistant strains through outbreeding (Andres et al., 2017). In view of the threat posed by swollen shoot, the Ivorian government, through the Cocoa Coffee Council (CCC), has initiated a national programme to combat the disease. This programme aims to identify infected orchards, uproot them and then replant new cocoa plants.
This research project contributes to that objective by identifying swollen shoot disease through models. Based on the known symptoms of the disease, it aims to develop mathematical models and then implement them to help producers identify the disease early.
The main objective of this study is to set up a model for identifying the presence of swollen shoot on a plantation. More specifically, this project consists of:
1. Collecting information on cocoa orchards in the form of images and GPS coordinates;
2. Developing and training a model to identify swollen shoot disease;
3. Producing forecasts of the disease.
The mission will consist of:
- Carrying out an inventory of existing data;
- Collecting additional data;
- Building models (statistical, probabilistic, mathematical) for the recognition of the disease;
- Developing a proof of concept (POC) for the application of the models.
The study will be implemented in three main phases:
o Inventory of available data on cocoa from the INS, CCC, ANADER and CNRA;
o Collection of missing information;
o Modelling and then forecasting.
Data collection
Once the information needs are known, data will be collected in the major cocoa production areas with a high prevalence of the disease. The collection will focus, in principle, on images of stems, leaves and pods. GPS coordinates of the cocoa plants will also be collected.
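As an illustration of what one collected record could look like, here is a minimal Python sketch. The class and field names (FieldObservation, PlantPart, image_path, infected) are hypothetical and not part of the project specification, and the coordinates are assumed to be stored in decimal degrees.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PlantPart(Enum):
    """Plant parts the collection focuses on (stems, leaves, pods)."""
    STEM = "stem"
    LEAF = "leaf"
    POD = "pod"


@dataclass(frozen=True)
class FieldObservation:
    """One photographed plant part with its location (hypothetical schema)."""
    image_path: str
    part: PlantPart
    latitude: float    # decimal degrees (assumption)
    longitude: float   # decimal degrees (assumption)
    infected: Optional[bool] = None  # None until the image has been labelled

    def __post_init__(self):
        # Reject coordinates that cannot be valid GPS positions.
        if not -90.0 <= self.latitude <= 90.0:
            raise ValueError(f"latitude out of range: {self.latitude}")
        if not -180.0 <= self.longitude <= 180.0:
            raise ValueError(f"longitude out of range: {self.longitude}")
```

Keeping the label optional separates field collection from the later labelling step, so a record can be stored as soon as the photograph and position are taken.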
Modelling
Once the data are available, the images will be labelled before the models are trained. Several competing models will be trained and compared in order to select one with satisfactory predictive performance.
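The comparison of competing models can be sketched as follows, assuming each labelled image has already been reduced to a numeric feature vector. The two toy classifiers below (a majority-class baseline and a nearest-centroid rule) are stand-ins chosen to keep the example dependency-free; they are not the models the project will use, and all function and class names are hypothetical.

```python
import random
from collections import Counter


def train_val_split(samples, val_fraction=0.2, seed=0):
    """Shuffle (features, label) pairs and split them into train and validation sets."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]


class MajorityBaseline:
    """Always predicts the most frequent training label."""
    def fit(self, samples):
        self.label = Counter(y for _, y in samples).most_common(1)[0][0]
        return self

    def predict(self, x):
        return self.label


class NearestCentroid:
    """Predicts the label whose mean feature vector is closest to x."""
    def fit(self, samples):
        sums, counts = {}, Counter()
        for x, y in samples:
            acc = sums.setdefault(y, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[y] += 1
        self.centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}
        return self

    def predict(self, x):
        def sq_dist(c):
            return sum((a - b) ** 2 for a, b in zip(x, c))
        return min(self.centroids, key=lambda y: sq_dist(self.centroids[y]))


def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(model.predict(x) == y for x, y in samples) / len(samples)


def select_best(candidates, train, val):
    """Fit every candidate on train; return the best name and all validation scores."""
    scores = {name: accuracy(model.fit(train), val) for name, model in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores
```

A call such as select_best({"baseline": MajorityBaseline(), "centroid": NearestCentroid()}, train, val) returns the name of the candidate with the highest validation accuracy together with every candidate's score. Including a trivial baseline makes it easier to judge whether a more complex model is actually learning from the images. Accuracy is used here only for simplicity; another criterion could be substituted in the same structure.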
Methodological approach
Scope of the study
Information will be collected in cocoa-producing areas. The INS identifies 51 cocoa-producing departments, which have been classified into three zones: (i) a low production zone (the west and centre-east), (ii) a medium production zone (south-east) and (iii) a high production zone (centre-west and south-west).
Supervisors
MOUSSA K. Richard, PR at the École Nationale Supérieure de Statistique et d’Économie Appliquée (Côte d’Ivoire)
Beatrice Akissi Boutin, Researcher at the Department of European, American and Intercultural Studies of La Sapienza University, Rome, Italy (Position “French on the African Continent”) and at the Institute of Applied Linguistics of the Félix Houphouët Boigny University, Abidjan. HDR in Language Sciences (Université Paris Ouest Nanterre La Défense, France).
Anne-Françoise Yao, PR at the University of Clermont Auvergne (France) and Lecturer at the École Polytechnique Paris (France)
Application form
- A letter of motivation addressed to the Director of ENSEA;
- A recent detailed curriculum vitae (CV);
- A legalized copy of the High School Diploma;
- A legalized copy of diplomas obtained after the High School Diploma and transcripts, and any other evidence that may support the application;
- A thesis research proposal of five (5) pages maximum clearly indicating the title, the problem, the objectives, the analysis approach, a literature review, the hypotheses and the expected results as well as the corresponding bibliographical references. Particular attention will be paid to the quality of the proposed research topic and its relevance to the research interests of ACE researchers or to economic development issues;
- Two (2) letters of recommendation attesting to the student’s research capacity (preferably two letters from professors qualified to supervise a thesis).
Applications, including all certified documents, must be submitted exclusively by email to ecoledoctorale@ensea.edu.ci
Please mention in the subject line: “Thesis project: Statistics, Machine Learning and Linguistics”.
The submission deadline is 12 August 2022.
For further information, please visit ENSEA at Office 802 or call (+225) 27 22 44 08 42.
Deliverables
- Papers on the methods to be developed.
- Context-specific image recognition software.
Some examples of spin-offs of the results of the thesis work
- Collection of identification data for the implementation of an alert system.
- Implementation of an artificial intelligence (AI) tool for identifying the disease.
Profile required
- Master’s Degree in statistics, data science or equivalent.
- Good knowledge of R and Python.
- Ability to work in a multidisciplinary context.
Some references
- Abrokwah, F., Dzahini-Obiatey, H., Galyuon, I., Osae-Awuku, F., & Muller, E. (2016). Geographical distribution of cocoa swollen shoot virus molecular variability in Ghana. Plant Disease, 100(10), 2011-2017.
- Andres, C., Gattinger, A., Dzahini-Obiatey, H. K., Blaser, W. J., Offei, S. K., & Six, J. (2017). Combatting cocoa swollen shoot virus disease: what do we know? Crop Protection, 98, 76-84.
- Coulibaly, M., Kouassi, K. H., Kolo, S., & Asseu, O. (2020). Detection of “Swollen Shoot” Disease in Ivorian Cocoa Trees via Convolutional Neural Networks. Engineering, 12(03), 166.