Unleashing Data Science in Africa for Infectious Disease Research: Hands-On Workshop on AI/ML Tools

28 Jul 2023 (modified: 01 Aug 2023), InvestinOpen 2023 OI Fund Submission
Funding Area: Capacity building / Construcción de capacidad
Problem Statement: Scientists in LMIE face numerous challenges in conducting their research, including scarce funding, poor infrastructure, and a lack of skills development opportunities. Traditionally, the philanthropic and academic sectors of HIC have led R&D activities for endemic diseases of the Global South. As a result, LMIEs have largely relied on solutions devised abroad, which are often not adapted to the specific needs of their populations. For example, 94% of malaria deaths worldwide occur in Africa, but most malaria drugs are investigated and manufactured in the Global North. Data science and, in particular, artificial intelligence (AI) are transforming drug discovery by reducing the costs and the number of experiments needed at all stages of the process, from the identification of early drug candidates to drug administration and patient follow-up. The advent of these AI tools offers a unique opportunity for low-resource settings where, traditionally, the costs of healthcare research and innovation have been prohibitive. Unfortunately, multiple barriers hamper the adoption of AI tools by researchers based in LMIE, including a lack of data science training and expertise, AI models tailored to endemic disease areas that are biased or unavailable, insufficient computing infrastructure, and limited access to software and datasets. Previous workshops have been offered in South Africa, which limited attendance by researchers from West Africa.
Proposed Activities: We propose a skills development workshop on open source artificial intelligence (AI) and machine learning (ML) tools for infectious disease drug discovery. The aims of the event are to increase community knowledge of data science tools, create a space for dialogue on how to use them effectively in day-to-day research, and gather community feedback on what infrastructure, documentation and training opportunities are missing for LMIE scientists to sustainably incorporate these novel technologies into their work. At the end of the course, participants will have a deeper understanding of the basic concepts of AI/ML, know about current initiatives in the field and be equipped to use existing open source tools in their own research. The event will be in-person to facilitate engagement and will include hands-on practical sessions and group work. The target audience is early-career researchers (MSc or PhD) in biomedicine, including chemists, biologists, biotechnologists, pharmacologists, and researchers in other related health fields in West Africa. No prior expertise in data science or computational pharmacology will be required. The event will be held over four days in Accra, Ghana, and will include the following activities: keynote talks by Ersilia and H3D researchers, hands-on workshops with live coding sessions, and breakout discussions in which participants will collectively solve a problem by applying the concepts and tools introduced in the keynote and hands-on sessions. At the end of each day, breakout groups will present and discuss their learnings with the rest of the group. In addition, team and community building will be encouraged through several activities, such as a welcome session, an after-work outing, and a course dinner. To maintain communication after the event concludes, participants will be invited to join a Slack channel.
The topics of the event will be: a) publicly available data for drug discovery: where to find it and how to process it; b) supervised machine learning methods for drug profiling and activity prediction; c) open source and generative models: sharing data, code, and machine learning assets. Regarding resources, we would like to offer free registration to encourage participation, fully funded travel grants for scientists outside Accra, and honorariums for two Ersilia and three H3D scientists for workshop development and facilitation. Participants will need to bring their own laptop or computer for the live coding sessions. To support inclusivity and accessibility, we will also offer childcare stipends to any researcher who would otherwise be unable to attend the workshop.
Openness: The workshop is centered around openness in research at several levels. First, the event will include dedicated sessions on Open Data: how to identify useful datasets in public databases, process them using data-driven approaches and apply them to specific research problems. Second, the software used throughout the course, the Ersilia Model Hub, is fully open source, well documented and free to use (https://ersilia.io/model-hub). Third, all course materials will be made public under a CC BY 4.0 license. To ensure proper sharing of all resources, we will leverage the existing Ersilia infrastructure: GitHub for code sharing (https://github.com/ersilia-os), GitBook for documentation (https://ersilia.gitbook.io/ersilia-book) and a Slack workspace for networking and discussions. Keynote sessions will be recorded and shared via YouTube (https://www.youtube.com/@H3DFoundation/videos). To engage a broader community, an open seminar will be organized during the week of the course to welcome other local scientists and introduce them to the work and tools developed by the course facilitators. The Slack channel, hosted by Ersilia, will remain open to welcome new contributors and users to the community.
Challenges: H3D Foundation and Ersilia have previously hosted a similar capacity building workshop in South Africa (SA), where the majority of attendees were from SA. Our goal for this workshop is to extend the capacity building work to West Africa. The drug discovery research community in Ghana is less developed than in SA and will require more support and continued engagement to adopt drug discovery data science tools. H3D Foundation is actively supporting the establishment of the Ghana Drug Discovery Hub, with support from the Bill and Melinda Gates Foundation and the African Capacity Building Foundation. This workshop will help to strengthen the community; we are committed to supporting the researchers beyond the workshop and will leverage Ersilia’s existing online community to do so. We anticipate that the main challenges during the event will be internet connectivity and electricity supply. To mitigate them, we will select a venue with backup power and budget for mobile SIM cards with data packs that facilitators can use as hotspots if the venue’s Wi-Fi does not work. In terms of computing resources, the infrastructure used during the event has been specially designed to run on a standard laptop, with all expensive computations taking place in the cloud. Finally, we will leverage the existing partnership with the University of Ghana so that it can act as the local host for the course and facilitate organisation, venue hire and participant accommodation.
Neglectedness: Funding for AI courses and conferences is usually aimed at software development, with rare opportunities to focus on the implementation rather than the development of the tools. This proposal addresses the need for courses that help end users learn about and adopt open source solutions instead of licensed software like Schrodinger. We previously applied to the CS&S Event Fund for a similar event and ran the course in Cape Town, South Africa, in September 2022, with excellent feedback from the participants and at least 5 follow-up projects stemming from what participants learned. H3D Foundation and Ersilia have been engaging with other potential funders for skills development workshops, including the Bill and Melinda Gates Foundation and Schmidt Futures, but have not yet been able to secure support. If we succeed in securing additional funding through these sources, we will be able to host additional workshops and expand the course content and modules to cover more advanced topics.
Success: We will measure the success of the event in the following areas. Number and diversity of participants: we expect to reach 25-30 participants from Ghana and neighboring countries, ranging from MSc students to early-career researchers, with a balanced gender representation. Participation and engagement in activities: a successful event will allow enough time and space to debate and contribute to the workshops. Accomplishment of curricula: we aim to ensure students are able to complete sentences of the type “The purpose of X technique is: …” at the end of each session. The entry level of participants will be gathered in a pre-event survey, and learning pace will be monitored during the hands-on training. Application of learnt concepts in own research: we expect at least 10 participants to be able to adapt one or more of the assets to their own needs. To measure this, the final session will include a brief presentation by each participant on their plan to try one or more assets in their research. Participant comfort and satisfaction: the event is designed to provide a safe and inclusive learning environment. We will measure this using an end-of-event survey. Networking: the event aims to create new relationships amongst West African researchers. Success in this area will be measured by the activity on the Slack channel set up for the event. Overall impact and participant satisfaction will be evaluated through pre- and post-workshop surveys of participants.
Total Budget: 20900
Budget File: pdf
Affiliations: Ersilia Open Source Initiative and H3D Foundation
LMIE Carveout: Our project is fully focused on LMIEs. One of the partner organizations, the H3D Foundation, is located in an LMIE (South Africa), and the event will take place in Ghana and will be limited to participants from LMIEs in the region. All the tools and infrastructure used in the event are geared towards users in LMIEs, and half of the contributors to the Ersilia Model Hub, the main piece of infrastructure, are from LMIEs.
Team Skills: Ersilia and the H3D Foundation have been working together since 2021. The goal of both organizations is to strengthen the biomedical research ecosystem in Africa, focusing on the identification of new drugs for endemic diseases like malaria and tuberculosis and aiming to build world-class scientific leadership on the continent. The event facilitators are skilled scientists with an excellent track record in AI/ML, as well as experience in performing research in low-resource settings, implementing computational tools in experimental pipelines and delivering courses and training. To ensure code of conduct compliance and deal with any potential issues, one of the event facilitators has completed the Code of Conduct Incident Response Training offered by Otter Technologies.
Submission Number: 59