Enhancing Utilization of Open Research Data and Shared Infrastructure in Tanzania through Improved Expertise in Big Open Data Analytics (BODA)

31 Jul 2023 (modified: 01 Aug 2023) | InvestinOpen 2023 OI Fund Submission
Funding Area: Capacity building / Construcción de capacidad
Problem Statement: There is a shortage of qualified faculty and experts in machine learning (ML) and open data analytics. Faculty members have limited experience and skills in these fields and may struggle to teach the subject matter effectively. Sokoine University of Agriculture (SUA) is currently conducting three research projects aimed at generating big data: YEESI Lab, AI4MoreCrops, and ACHE. While the YEESI Lab dataset is already available on Zenodo, the AI4MoreCrops and ACHE datasets are expected to be made public by the end of this year. The use of open ML data in sectors such as agriculture and health has the potential to benefit a wide range of stakeholders, including researchers, policymakers, and the general public. It can promote long-term agricultural development, increase productivity, and drive socioeconomic growth in developing countries like Tanzania. Furthermore, open data promotes transparency, collaboration, and knowledge sharing among stakeholders. However, effective use of open ML data in Tanzania faces several challenges, chief among them the scarcity of data analysts skilled in big data analytics. Without the required expertise, valuable insights from these datasets may go untapped. The current ICT curricula in Tanzania also lack comprehensive courses on ML and deep learning (DL), which are essential for preparing undergraduate and graduate students to analyze data using these techniques.
Proposed Activities:
Act01: Conduct Training and Awareness Program (Nov 2023 – Oct 2024): The project's initial phase will recruit participants, mainly undergraduate students at SUA studying towards degrees in IT or Engineering. The project will then implement a comprehensive capacity-building program to train young Tanzanian data analysts in big data analytics. The 12-month program will cover fundamental data preprocessing, data visualization, statistical analysis, ML, and DL techniques. The training will be delivered both in person and online as free courseware, following the YEESI Lab training modality.
Act02: Participate in Competitions/Hackathons (May 2024 – Dec 2024): The trained students will use the datasets from the three projects, together with the master Quad AI workstation GPU server, to work on real-world case studies, primarily through competitions and hackathons. The initial plan is to involve undergraduate students from three Tanzanian universities: SUA, the University of Dodoma (UDOM), and the Nelson Mandela African Institution of Science and Technology (NM-AIST).
Act03: Hands-on Practice, Collaboration, and Creation of a Community of Practice (Nov 2024 – Feb 2025): The project will coordinate workshops, conferences, and networking events to promote knowledge sharing and the exchange of best practices among young data analysts, researchers, policymakers, and other stakeholders in open data analytics. Participants will have the opportunity to work on hands-on practice projects using real-world datasets and the open GPU server. We will also provide mentorship programs and establish online forums and discussion groups to support continuous learning and knowledge exchange.
Act04: Engage Stakeholders and Conduct Monitoring and Impact Assessment (Nov 2024 – Feb 2025): Finally, the project will conduct a qualitative study to assess the socioeconomic impact of using open data in sectors such as agriculture and health, highlighting and quantifying the benefits in terms of increased productivity, reduced resource waste, and improved livelihoods. Overall, the project will require a combination of subject matter experts, trainers, mentors, facilitators, and resources such as datasets, online learning platforms, venues, and relevant materials to carry out these activities within budget and time constraints. The timeline and expertise required for each activity are summarized below.
Timeline:
Act01 (Nov 2023 – Oct 2024): Trainers in big data analytics (statistics, visualization, ML, and DL).
Act02 (May 2024 – Dec 2024): Mentors with experience in ML and DL.
Act03 (Nov 2024 – Feb 2025): Workshop/seminar facilitators.
Act04 (Nov 2024 – Feb 2025): Experts in impact assessment and qualitative research methods.
Openness: Openness of Infrastructure and Learning Facilities: This project will use open access platforms such as the open GPU server (http://www.yeesi.sua.ac.tz) and the e-Learning portal (http://41.59.85.2:839) to share training materials and conduct online training sessions (webinars), reaching more young innovators in the country and beyond. Open Datasets: The project will use free and openly available datasets from SUA, UDOM, and NM-AIST to facilitate training. Open Collaboration Platform and Discussion Forums: The project will set up an open collaboration platform that allows research participants to engage in project activities, including asking questions and sharing ideas. Activities to Engage a Broader Community: Webinars and workshops will be conducted to equip research participants with key skills and knowledge in data management and analysis. The project will also set up a competition for young innovators from SUA, UDOM, and NM-AIST. In addition, research participants and other stakeholders will be encouraged to provide feedback and contributions through meetings, workshops, and online platforms. Plans to Share Project Output Openly: The project expects several outputs, including training materials, research reports, and research papers. The project will use open licenses and open platforms such as the SUA institutional repository and YouTube to ensure accessibility.
Challenges: This project anticipates the following challenges. Oversubscription: more research participants may apply than the project can accommodate, so the project team will set selection criteria to admit only the required number of young innovators. Diverse participants: the project will include university youths with different backgrounds and different levels of ICT skills, so the training materials and environment must be prepared with these differences in mind. Participant engagement: maintaining engagement throughout the training can be difficult, so the training sessions will be interactive, with hands-on activities to keep participants engaged. Time constraints: covering all relevant topics alongside other project activities within the available time can be a challenge, so careful planning of project activities is essential to ensure effective use of project time. Financial limitations: the analysis of research data requires specialised software (proprietary or non-proprietary), so this project will use open and freely available software to save costs without affecting the project outcome. Gender inequality: the project is designed and planned with a gender perspective in mind to encourage diverse participation of young innovators, especially female researchers and students.
Neglectedness: As previously stated, this initiative will make use of open data from three projects: YEESI Lab, funded by USAID; AI4MoreCrops, funded by IDRC via ATPS; and ACHE, funded by Meridian Institute via Lacuna Fund. While YEESI Lab includes a capacity-building component, the other projects primarily focus on collecting and curating big data for ML and AI applications, with little emphasis on capacity building for effective data utilization. With the completion of the project phase of YEESI Lab, there is a strong desire to broaden the project's reach to include more Tanzanian youths and early-career researchers. Tanzanian youths and early-career researchers face a critical shortage of ML and AI expertise, required for the effective use of open data. As a result, this project seeks funding to enhance the ML and AI skills of Tanzanian youths and early-career researchers at three centers: SUA in Morogoro, UDOM in Dodoma, and NM-AIST in Arusha, broadening the project's impact compared to YEESI Lab's limited scale.
Success: The success of the project will be measured using Key Performance Indicators (KPIs) and other metrics, as explained below. Number of Experts and Researchers Trained: We will measure the number of experts who successfully complete training and mentorship programs or workshops in ML, DL, big data analytics, and open research data utilization. Research Mini-projects and Hackathons Utilizing Open Research Data: We will track the number of trained experts and the number of research mini-projects conducted in hackathons. Open Data Utilization Rate: We will measure the percentage of open research data that is utilized by trained experts in hackathons, competitions, reports, or publications. Skills Improvement: We will evaluate the level of improvement in participants' skills and expertise in ML, AI, DL, big data analytics, and data management. Stakeholder Feedback: We will collect feedback from trained experts, researchers, participating institutions, and policymakers to understand their level of satisfaction with the project's outcomes and impact. Rate of Data Sharing and Collaboration: We will assess the increase in data sharing and collaboration among researchers and institutions facilitated by the project's activities. Rate of Technology Adoption and Data Accessibility: We will measure the adoption rate of ML and data analytics tools by trained experts, researchers, and institutions in Tanzania.
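As a rough illustration of how three of these KPIs (number trained, open data utilization rate, and skills improvement) might be computed, here is a minimal Python sketch. The proposal does not specify any formulas or data formats, so the `Participant` record, the 0-100 pre/post assessment scores, and the reading of "utilization rate" as the share of available datasets used at least once are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical record for one trainee (not defined in the proposal)."""
    name: str
    pre_score: float   # assumed assessment score before training, 0-100
    post_score: float  # assumed assessment score after training, 0-100
    completed: bool    # whether the trainee finished the program


def trained_count(participants: list[Participant]) -> int:
    """Number of participants who completed the training."""
    return sum(1 for p in participants if p.completed)


def utilization_rate(used: set[str], available: set[str]) -> float:
    """Percentage of available open datasets used at least once.

    Assumption: 'utilization' is counted per dataset, not per record or byte.
    """
    if not available:
        return 0.0
    return 100.0 * len(used & available) / len(available)


def mean_skill_gain(participants: list[Participant]) -> float:
    """Average post-minus-pre score among participants who completed."""
    done = [p for p in participants if p.completed]
    if not done:
        return 0.0
    return sum(p.post_score - p.pre_score for p in done) / len(done)


if __name__ == "__main__":
    # Illustrative values only; these are not project data.
    cohort = [
        Participant("A", 40.0, 70.0, True),
        Participant("B", 50.0, 60.0, True),
        Participant("C", 30.0, 80.0, False),
    ]
    available = {"yeesi", "ai4morecrops", "ache"}
    used = {"yeesi", "ai4morecrops"}
    print(trained_count(cohort))
    print(round(utilization_rate(used, available), 1))
    print(mean_skill_gain(cohort))
```

Counting utilization per dataset keeps the metric simple to audit; a project that wanted finer granularity could instead weight each dataset by size or by the number of mini-projects that used it.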
Total Budget: 25000
Budget File: pdf
Affiliations: Sokoine University of Agriculture (SUA), Morogoro, Tanzania
LMIE Carveout: This project will be implemented in three higher learning institutions based in Tanzania, which is among the Low- and Middle-Income Economies of Sub-Saharan Africa, and therefore fits this category. The project team and the leading organizations of the project are based in Tanzania. Additionally, the project's community, who are the beneficiaries of this project, is located in Tanzania.
Team Skills: The project team consists of Dr. Alcardo Alex Barakabitze (PI), Dr. Joseph Telemala, Dr. Ester Ernest, Dr. Kadeghe Fue, and Prof. Camilius Sanga. Dr. Barakabitze has more than 8 years of experience working in the fields of ML/AI, big data analytics and open infrastructure, telecommunication systems, and ICT4D. He has experience in conducting successful training and mentoring students in AI/ML, and has lived in China (2 years), the UK (5 years), Ireland (2 years), Italy (3 months), and Switzerland (4 months). Dr. Kadeghe Fue from the School of Engineering and Technology has expertise and research experience in precision agriculture and farm automation. He has lived in the USA (8 years) and has been conducting training and mentorship and collaborating with various industry partners on different projects. Dr. Joseph Telemala has expertise and skills in information retrieval, specifically multilingual information retrieval and natural language processing (NLP). He has lived in India (2 years) and South Africa (4 years). Dr. Ester Ernest has more than 10 years of working experience in training students and in developing training and assessment tools for information use and management, and has knowledge of information and data use ethics.
How Did You Hear About This Call: Word of mouth (e.g. conversations and emails from IOI staff, friends, colleagues, etc.) / Boca a boca (por ejemplo, conversaciones y correos electrónicos del personal del IOI, amigos, colegas, etc.)
Submission Number: 179