Open-Source AI Infrastructure for Social Impact: Enhancing Livelihoods in Marginalized Communities

Elimboto Mwiki Yohana

31 Jul 2023 (modified: 01 Aug 2023), InvestinOpen 2023 OI Fund Submission

Funding Area: Critical shared infrastructure

Problem Statement: Advances in machine learning and artificial intelligence have the potential to transform lives, especially in the developing world. However, access to the tools, infrastructure, and expertise required to develop impactful AI solutions remains limited for researchers, social enterprises, and public institutions in low-resource settings. As a result, the populations that stand to benefit most from AI-driven innovations are deprived of the solutions they need. The proposed work addresses this challenge by building an open AI infrastructure that lowers the barriers to developing and deploying AI models that could improve livelihoods. The infrastructure will provide computing resources, developer tools, and deployment platforms that enable innovators, researchers, data scientists, social enterprises, and public institutions serving marginalized communities to build AI solutions that respond to local challenges, needs, and contexts, with a focus on domains such as healthcare, agriculture, financial inclusion, and energy. These solutions could enhance the lives of millions of people who currently lack access to basic services and opportunities. The open-source, collaborative nature of the initiative also aims to cultivate responsible and ethical AI practices by promoting transparency, verifiability, public accountability, and effective, accessible, and equitable AI use. The AI infrastructure will be integrated into the TSSFL Technology Stack.

Proposed Activities: The project will:
1. Build an AI infrastructure backend (1 - 3 months):
   - Purchase a 2-year VPS subscription
   - Install and configure the open-source software stack: SageMath, Sage Cell Server on Node.js
   - Hire 2 full-stack DevOps/software engineers to set up and maintain the infrastructure
2. Integrate the AI infrastructure into the TSSFL Technology Stack (3 - 12 months):
   - Integrate the AI stack into the open-source TSSFL Technology Stack
   - Develop/integrate deployment platforms: API services and web apps
   - Partner with TANESCO and the University of Dar es Salaam for pilot development, deployment, and user testing of an energy/power ML model for energy optimization (data is available) as a use case of the AI infrastructure
3. Refine and scale the solution (12 - 20 months):
   - Continuously retrain the model using feedback from end users
   - Integrate the updated model into the deployment platform
   - Provide well-documented instructions to help users get started with developing, testing, and deploying models
4. 20 - 24 months:
   - Expand partnerships to 4 organizations serving marginalized communities
   - Reach and train new organizations on using the AI solutions
VPS requirements:
   - 128 GB RAM
   - Dual 12-core 2.2 GHz processors
   - 240 GB SSD + 2 TB SSD
   - Network up to 1 Gbps
   - Nvidia RTX A4000 GPUs (x2) with:
      - 6144 CUDA cores
      - 192 Tensor cores
      - 16 GB GDDR6 memory
      - 19.2 TFLOPS FP32 performance
   - Linux OS support
Expertise requirements:
   - Full-stack software engineers: 2
   - Data scientist/AI engineer/ML researcher: 1
   - Domain expert in energy: 1

Openness: The proposed work is highly open in the following ways:
- Infrastructure: The AI backend will be built using open-source software such as SageMath, Sage Cell Server, and Node.js. Because it relies on open infrastructure, this work can easily be built upon and extended by the broader community.
- Activities: The AI solutions and deployment platforms will be integrated into the open-source TSSFL Technology Stack (https://www.tssfl.com), making them openly accessible to social/public enterprises and developers. In addition, the collaborative development of the models will use version control platforms such as GitHub.
- Community Engagement: Clear documentation will be provided to help users get started with developing and deploying AI solutions. Online training sessions and workshops will also be organized to enable broader participation.
- Output Sharing: The refined AI models and deployment platforms will be released as open-source software under a permissive license. This will allow any developer or organization to freely build upon and extend the work.

Challenges:
- Data Quality: Many social issues lack the high-quality curated datasets required to develop accurate AI models. Collecting and labeling large datasets to train effective models may be challenging, especially for health data that requires patient consent.
- Communicating Complexity: AI and ML can be difficult concepts for non-technical stakeholders to understand. Effectively communicating the capabilities and limitations of the solutions will be important for managing expectations and building trust.
- Model Bias: Since the training data often comes from limited sources, there is a risk of introducing unintended biases into the models that disproportionately impact marginalized groups. Mitigating bias through ethical AI practices will require expertise and diligence.
- Technological Constraints: Many communities have limited access to the reliable internet, smartphones, and computers needed to use AI solutions. Ensuring the technologies are accessible to the most under-resourced populations may require creative approaches.
- Scaling: Scaling the solutions beyond the initial pilot deployments will require expanding partnerships with social enterprises and NGOs. Identifying the right partners and aligning on goals and timelines can be challenging.
- Measuring Impact: It can be difficult to clearly attribute positive social impacts to AI solutions. Developing rigorous, practical frameworks for evaluating the real-world impact of the models will be important for transparency and improvement.

Neglectedness: To the best of my knowledge, there are several sources of funding available for this type of work from organizations focusing on AI for social good. While I have not personally applied for such funding before, I understand there are grants and donor funds targeting AI solutions for challenges in developing countries. Currently, I am the principal investigator of an ongoing project funded locally by the University of Dar es Salaam. This project aims to develop machine learning solutions that use real consumer data to detect energy theft and improve power services. The data is provided by TANESCO, Tanzania's power distribution company. The funding and data access for this current initiative demonstrate that there is local interest in and support for using AI to address important socioeconomic issues. However, additional funding would be required to build an AI platform, refine the ML models, deploy the solutions, and evaluate the real-world impact, which is the focus of the proposed work. While there are likely multiple funding options available internationally for AI for social good initiatives, having at least one local project already underway is a good starting point. Securing more funding to build upon our current work and scale the solutions can help maximize the potential social impact.

Success: We would measure the success of the proposed work in the following ways:
- Social impact: While technical performance metrics are important, we would place the greatest emphasis on measures indicating that the solutions make a tangible difference to the livelihoods and well-being of our underserved communities. Gathering qualitative and quantitative feedback from our partners and end users would be critical to iteratively refining the work and maximizing real-world impact over time.
- Model performance: We would evaluate the performance of the developed AI models on relevant metrics such as accuracy, error rate, recall, precision, and F1 score. Continuous improvement of these metrics over time would indicate success.
- Partner/stakeholder satisfaction: We would survey our partner organizations regularly to understand how useful and effective they find the solutions, and what could be improved. High levels of partner satisfaction and retention would show we are meeting their needs.
- Scalability: We would track metrics such as the number of partners, users, and deployments over time. The ability to scale the solutions to reach more people and organizations would demonstrate success.
- Open adoption: We would monitor how many external developers and organizations begin using and building upon our open-source AI infrastructure and solutions. Wider adoption of our work by the community would be a measure of success.
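The model performance metrics named above can be illustrated with a minimal sketch. It assumes a binary classification task (for example, flagging energy theft versus normal consumption) with labels encoded as 0 and 1. The names `BinaryMetrics` and `binary_metrics` and the sample labels are hypothetical and used only for illustration; they are not part of the proposed infrastructure.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class BinaryMetrics:
    """Hypothetical container for the metrics listed in the proposal."""
    accuracy: float
    error_rate: float
    precision: float
    recall: float
    f1: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> BinaryMetrics:
    """Compute accuracy, error rate, precision, recall, and F1 for 0/1 labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    accuracy = (tp + tn) / len(y_true)
    # Assumption: report 0.0 when a ratio is undefined (no predicted or no actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return BinaryMetrics(accuracy, 1.0 - accuracy, precision, recall, f1)


if __name__ == "__main__":
    # Made-up labels for illustration only: 1 = flagged case, 0 = normal case.
    actual = [1, 0, 1, 1, 0, 0, 1, 0]
    predicted = [1, 0, 0, 1, 0, 1, 1, 0]
    print(binary_metrics(actual, predicted))
```

Tracking these values after each retraining round would be one way to check whether the metrics improve over time. On imbalanced data, where positive cases are rare, precision, recall, and F1 are generally more informative than accuracy alone.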

Total Budget: US$ 25,000

Budget File: pdf

Affiliations: Yes, this proposal is affiliated with Tanzania Students and Scholars Foundation Limited (TSSFL): https://www.tssfl.co

LMIE Carveout: Yes, this project is led by TSSFL, an organization focused on transforming communities with innovative digital solutions to address challenges in marginalized, developing, and African communities. TSSFL is a registered company in Tanzania aiming to innovate and solve numerous problems by prioritizing the use of technology. Its mission is "to develop, educate, train and equip African women, men, student and scholar community in particular, with knowledge, information, skills, competence and technical know-how that will generate a mass of future leaders capable of evolving African answers to Africa's challenges and empower them to solve their own problems and realize social economic development for the betterment of Africa in particular and mankind worldwide in general."

Team Skills: Our team at Tanzania Students and Scholars Foundation Limited (TSSFL) brings diverse technical skills and experience vital to the success of this project. TSSFL has existed for over 10 years, developing and integrating various technologies to solve problems in marginalized communities. Our team members have expertise in web development, mobile app development, data science, machine learning, and software engineering. We have deep experience building e-learning platforms and cloud solutions (see https://www.tssfl.com/viewforum.php?f=278). As a technology-focused organization, we combine our technical skills with an understanding of the social and cultural needs of the communities we aim to serve. TSSFL has partnered with local universities, NGOs, schools, and governments that provide valuable insights into the specific needs of communities in Tanzania and Africa as a whole. Our team members come from these same communities and have firsthand knowledge of technological barriers and opportunities. As an organization, we are motivated by a vision of transforming communities in Africa through innovative technological solutions. We believe that by leveraging our team's diverse technological skills, experience in building solutions, partnerships with local organizations, and lived experience within African communities, we can develop technologies that meaningfully improve lives.

How Did You Hear About This Call: Word of mouth (e.g. conversations and emails from IOI staff, friends, colleagues, etc.)

Submission Number: 168
