Not just another research product visualization platform: assuming the commitments of transparency and purpose with Wapi

31 Jul 2023 (modified: 01 Aug 2023), InvestinOpen 2023 OI Fund Submission
Funding Area: Critical shared infrastructure
Problem Statement: Wapi is a platform that will offer an accessible dashboard for research data visualization, guided by principles of transparency and data integrity. It addresses the limited accessibility of academic data and the lack of tools for comprehending it. Wapi will enable users such as researchers, administrators, and science communicators to navigate the research landscape by analyzing research products and their actors. It does so by connecting to academic indexers such as OpenAlex ([http://OpenAlex.org](http://openalex.org/)), primarily by enriching their data with affiliation information from ORCID ([http://Orcid.org](http://orcid.org/)), and by providing an interactive visual layer that makes the data easier to comprehend and communicate. One innovative aspect that sets Wapi apart from traditional research visualization platforms is its integration of ActivityPub, a protocol that fosters cross-platform communication and the decentralization of users and information. Embracing ActivityPub opens a new channel through which any user with a research interest can access information and receive updates. The first objective of Wapi is to visualize the publications of 15 Chilean universities and to enrich the data by identifying, through ORCID integration, the authors affiliated with these universities. This will ultimately make it possible to extract each institution's proportional academic production based on the actual affiliation of its current researchers.
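The proportional-production measure described in the problem statement can be sketched as follows. This is a minimal illustration, not the project's implementation: the `Work` shape, the `productionShare` function, and the placeholder ORCID iDs are all assumptions, and real OpenAlex and ORCID payloads carry far more fields.

```typescript
// Hypothetical, simplified record shape; real OpenAlex works are much richer.
interface Work {
  id: string;
  authorOrcids: string[]; // ORCID iDs of the work's authors
}

// Share of an institution's indexed works that have at least one author
// whose ORCID record shows a current affiliation with that institution.
function productionShare(works: Work[], currentAffiliates: Set<string>): number {
  if (works.length === 0) return 0;
  const attributed = works.filter((w) =>
    w.authorOrcids.some((orcid) => currentAffiliates.has(orcid))
  );
  return attributed.length / works.length;
}

// Placeholder iDs for illustration only.
const currentAffiliates = new Set<string>(["0000-0000-0000-0001", "0000-0000-0000-0002"]);
const sampleWorks: Work[] = [
  { id: "W1", authorOrcids: ["0000-0000-0000-0001"] },
  { id: "W2", authorOrcids: ["0000-0000-0000-0009"] },
  { id: "W3", authorOrcids: ["0000-0000-0000-0002", "0000-0000-0000-0009"] },
  { id: "W4", authorOrcids: [] },
];
const share = productionShare(sampleWorks, currentAffiliates);
console.log(share);
```

Counting a work once when any current affiliate appears among its authors is only one possible attribution rule; fractional counting per author would be an equally defensible choice.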
Proposed Activities:
1. System Architecture Design (1 week): We will design the overall flow of the system, considering all components and how they interact with each other: NodeJS, MongoDB, SvelteKit, D3.js, ActivityPub, OpenAlex, and ORCID. Resources: Working hours from the full-stack programmer.
2. User Requirements Definition (2 weeks): We will establish what users expect from the platform in terms of functionality and user experience. Resources: Working hours from the main researcher.
3. Selection of Universities for initial analysis (1 week): We will select 15 Chilean universities that will serve as the initial testing ground. Resources: Working hours from the main researcher.
4. Backend and development server setup (3 weeks): We will configure a testing server environment for the project's development, install all the needed software and packages, and start the project's documentation. Resources: Working hours from the main researcher and the full-stack programmer. Virtual server on a hosting service.
5. API integration with OpenAlex and ORCID (6 weeks): We will program the API connections and write scripts to query these academic indexers. Resources: Working hours from the team. Virtual server on a hosting service.
6. Data storage design in MongoDB (2 weeks): We will design and test the MongoDB schemas to store the responses from the queries. Resources: Working hours from the team. Virtual server on a hosting service.
7. Front-end development with SvelteKit (4 weeks): We will develop the system's front end with SvelteKit, creating interfaces that allow users to interact effectively with the system and visualize the data. Resources: Working hours from the team. Virtual server on a hosting service.
8. Integration of ActivityPub (8 weeks): We plan to incorporate ActivityPub into the platform to enable a communication channel reachable by other social platforms that also use ActivityPub. Resources: Working hours from the full-stack programmer. Virtual server on a hosting service.
9. Data visualization (5 weeks): We will design and program in D3.js the data visualizations to be displayed in the user interface, including their animations and interactions. Resources: Working hours from the team. Virtual server on a hosting service.
10. Testing and refinement (2 weeks): We will extensively test the system and refine it based on the results. This will involve evaluating the queries' effectiveness, the data storage's robustness, the NodeJS layer's responsiveness, the front end's intuitiveness, and the seamless integration of ActivityPub. Resources: Working hours from the team. Virtual server on a hosting service.
11. Implementation and deployment (2 weeks): We will launch the platform on a secondary virtual server and monitor its performance and user feedback for further improvements. Resources: Working hours from the team. Virtual server on a hosting service.
12. Holidays: Between February 1st and March 1st.
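To make the ActivityPub step (activity 8) more concrete, the sketch below builds the kind of `Create` activity wrapping a `Note` that a server could deliver to followers on other ActivityPub platforms. The `@context` and the public-addressing URI come from the ActivityStreams 2.0 vocabulary; the `wapi.example` URLs, the `buildCreateNote` helper, and the choice of a `Note` object are illustrative assumptions, and a real deployment also needs actor documents, inbox/outbox endpoints, and signed delivery.

```typescript
interface Note {
  type: "Note";
  id: string;
  attributedTo: string;
  content: string;
  published: string;
  to: string[];
}

interface CreateActivity {
  "@context": string;
  type: "Create";
  id: string;
  actor: string;
  published: string;
  to: string[];
  object: Note;
}

// Special collection URI that addresses an activity to the public.
const PUBLIC = "https://www.w3.org/ns/activitystreams#Public";

// Hypothetical helper: wraps an update about new publications in a Create activity.
function buildCreateNote(actorUrl: string, noteId: string, content: string, when: Date): CreateActivity {
  const published = when.toISOString();
  return {
    "@context": "https://www.w3.org/ns/activitystreams",
    type: "Create",
    id: `${noteId}/activity`,
    actor: actorUrl,
    published,
    to: [PUBLIC],
    object: { type: "Note", id: noteId, attributedTo: actorUrl, content, published, to: [PUBLIC] },
  };
}

// Placeholder URLs; wapi.example is not a real deployment.
const activity = buildCreateNote(
  "https://wapi.example/actors/some-university",
  "https://wapi.example/notes/1",
  "New publications indexed for this institution.",
  new Date(Date.UTC(2023, 7, 1))
);
console.log(JSON.stringify(activity, null, 2));
```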
Openness: Publication data for Chilean institutions will be obtained by querying OpenAlex by GRID, ROR, or another identifier, which makes the queries transparent and open to improvement. The server configuration will be maintained as a Docker image, available on GitHub. Having ActivityPub as a communications channel will allow users of social media platforms that also use this protocol to access information and receive updates. The potential users mentioned in the proposal are:
1. **University administrators and librarians:** The platform would help university administrators visualize and understand their institution's academic production in relation to others, offering insights into the university's research strengths and areas for improvement.
2. **Policy makers and funders:** Those who influence and direct funding policies at universities could use Wapi to identify trends, strengths, and weaknesses in academic production.
3. **Journalists and science communicators:** They could use Wapi to easily find trends in academic research for their articles or reports, gaining a new channel to access and receive updates on research news and trends.
4. **Open data advocates and developers:** Given the platform's principles of transparency, data integrity, and decentralization, these users could build on it to further their own projects or analyses.
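As an illustration of how a transparent, reproducible query by ROR might look, the sketch below builds an OpenAlex works URL. It assumes the `institutions.ror` filter, the `per-page` and `cursor` paging parameters, and the optional `mailto` parameter as we understand them from the OpenAlex documentation; these should be verified against the current API. The `buildWorksUrl` helper and the ROR ID shown are placeholders, not real values.

```typescript
const OPENALEX_WORKS = "https://api.openalex.org/works";

// Hypothetical helper: returns the URL for one page of works linked to an institution's ROR ID.
// cursor "*" requests the first page when using cursor paging.
function buildWorksUrl(rorId: string, cursor: string = "*", perPage: number = 200, mailto?: string): string {
  const params: string[] = [
    `filter=${encodeURIComponent(`institutions.ror:${rorId}`)}`,
    `per-page=${perPage}`,
    `cursor=${encodeURIComponent(cursor)}`,
  ];
  if (mailto) params.push(`mailto=${encodeURIComponent(mailto)}`);
  return `${OPENALEX_WORKS}?${params.join("&")}`;
}

// Placeholder ROR ID for illustration only.
const firstPageUrl = buildWorksUrl("https://ror.org/0example00");
console.log(firstPageUrl);
```

Publishing the exact URLs used for each institution alongside the visualizations is one way to let others rerun and audit the queries.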
Challenges:
1. **API rate limiting**: Both ORCID and OpenAlex limit the number of requests that can be made within a certain timeframe. These limits can slow down data collection and require careful management of request rates and error handling. OpenAlex is limited to 100,000 calls per day with a burst rate limit of 10 requests per second ([link](https://docs.openalex.org/how-to-use-the-api/rate-limits-and-authentication)). ORCID allows 24 requests per second with a burst of 40 ([link](https://info.orcid.org/ufaqs/what-are-the-api-limits/)).
2. **Data integrity and quality**: The accuracy and completeness of the data can affect the overall system's effectiveness.
3. **Scalability**: As the project scales to accommodate an expanding corpus of academic information, it will face challenges related to database management, performance, system design, and future funding.
4. **Integration of components**: This project integrates various components, including external APIs, MongoDB, NodeJS, SvelteKit, and ActivityPub. Each of these technologies has its own requirements and potential pitfalls, and ensuring that they work seamlessly together is a complex task.
5. **Time constraints**: Given the 9-month timeframe, the project will require rigorous project management to ensure all tasks are completed on time.
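The request-rate management mentioned under the first challenge could take many forms. The sketch below is one simple option, a fixed-window limiter configured with the OpenAlex figures quoted above (10 requests per second, 100,000 per day). The `FixedWindowLimiter` class is hypothetical, and the clock is passed in as a parameter so the behaviour can be checked without waiting.

```typescript
// Hypothetical fixed-window limiter: counts requests per calendar second and per day.
class FixedWindowLimiter {
  private readonly perSecond: number;
  private readonly perDay: number;
  private secondWindow = -1;
  private secondCount = 0;
  private dayWindow = -1;
  private dayCount = 0;

  constructor(perSecond: number, perDay: number) {
    this.perSecond = perSecond;
    this.perDay = perDay;
  }

  // Returns true (and records the request) if a request may be sent at nowMs.
  tryAcquire(nowMs: number): boolean {
    const second = Math.floor(nowMs / 1000);
    const day = Math.floor(nowMs / 86400000);
    if (second !== this.secondWindow) {
      this.secondWindow = second;
      this.secondCount = 0;
    }
    if (day !== this.dayWindow) {
      this.dayWindow = day;
      this.dayCount = 0;
    }
    if (this.secondCount >= this.perSecond || this.dayCount >= this.perDay) return false;
    this.secondCount += 1;
    this.dayCount += 1;
    return true;
  }
}

const openAlexLimiter = new FixedWindowLimiter(10, 100000);
let granted = 0;
for (let i = 0; i < 12; i++) {
  if (openAlexLimiter.tryAcquire(0)) granted += 1; // 12 attempts within the same second
}
const grantedNextSecond = openAlexLimiter.tryAcquire(1000);
console.log(granted, grantedNextSecond);
```

A fixed window can let through up to twice the per-second limit across a window boundary, so a token bucket or a small safety margin may be preferable in practice. Rejected requests would also need to be queued and retried, which this sketch leaves out.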
Neglectedness: No, we haven't found any funding sources focused on Latin America.
Success: Primarily, the platform will be successful in itself if it enables the transparent and citable display of academic output. This is currently not possible with the existing national platform ([https://dataciencia.anid.cl](https://dataciencia.anid.cl/)), which does not provide sufficient background for its results to be replicated. Our first measure of success will therefore be to facilitate transparency, replicability, and citation of academic production, surpassing the current national platform's capabilities. Furthermore, success will be demonstrated by:
1. **User engagement**: We will track the number of active users, the frequency of usage, and user engagement, measured as requests to the database by IP.
2. **Positive user feedback**: We will make it easy for users to give feedback and will measure success by their degree of satisfaction.
3. **Accurate data visualization**: The accuracy and completeness of the visualized data will also be a measure of success. We will make public each inconsistency found in the metadata, as well as the percentage of missing (NA) values.
4. **System performance**: Success will also be determined by the system's performance, including query response times, the rate of successful requests to ORCID and OpenAlex, the reliability of the MongoDB database, and platform uptime.
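The percentage of NA values named in the third success measure could be computed per metadata field along these lines. This is a sketch under assumptions: the `naPercentByField` function, the field names `doi` and `orcid`, and the rule that `null`, `undefined`, and the empty string all count as missing are illustrative choices, and the sample records are made up.

```typescript
type Metadata = Record<string, string | null | undefined>;

// Percentage of records in which each listed field is missing.
function naPercentByField(records: Metadata[], fields: string[]): Record<string, number> {
  const result: Record<string, number> = {};
  for (const field of fields) {
    const missing = records.filter((r) => {
      const value = r[field];
      return value === null || value === undefined || value === "";
    }).length;
    result[field] = records.length === 0 ? 0 : (100 * missing) / records.length;
  }
  return result;
}

// Made-up records for illustration only.
const sampleRecords: Metadata[] = [
  { doi: "10.0000/a", orcid: "0000-0000-0000-0001" },
  { doi: null, orcid: "" },
  { doi: "10.0000/b" },
  { doi: "10.0000/c", orcid: "0000-0000-0000-0002" },
];
const naReport = naPercentByField(sampleRecords, ["doi", "orcid"]);
console.log(naReport);
```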
Total Budget: 13600
Budget File: pdf
Affiliations: Universidad Central de Chile
LMIE Carveout: No
Team Skills: **Ricardo Hartley Belmar**. Medical Technologist, Master's and Doctorate in Cellular and Molecular Biology. Educator, self-taught, and enthusiastic about open science and scientific data management. I hold a diploma in Engineering and Data Science from the University of Chile and have taken various courses on reproducibility and information access offered by FSCI (Force11 Scholarly Communications Institute), the most recent course being the Essential for Data Support offered by Research Data Netherlands (RDNL). **Luis Alfonso Orellana Martinez**. Bioengineer, Master in Statistics, Data Scientist. Data professional skilled in technology and passionate about digital media and literature, with vast experience in a range of data-centric problems, such as regulatory affairs compliance, wildfire prediction platforms, and modeling of biological networks, and in web development with JavaScript architectures, MongoDB databases, and front-end frameworks including React, Vue, and more recently Svelte.
How Did You Hear About This Call: Word of mouth (e.g. conversations and emails from IOI staff, friends, colleagues, etc.)
Submission Number: 145