Developing an automated tool to examine the limited openness of Open Science infrastructures

27 Jul 2023 (modified: 01 Aug 2023), InvestinOpen 2023 OI Fund Submission
Funding Area: Critical shared infrastructure / Infraestructura compartida critica
Problem Statement: The rapid expansion of Open Science (OS) draws attention away from existing inequities in access to shared resources. Many resources in the OS landscape are not designed to support access from countries with low bandwidth and/or geopolitical barriers such as sanctions. This prevents researchers in these countries from engaging with open data, journals and source code. To date, discussions about equitable access have mainly been regional and focused on large-scale infrastructural developments and policies. Most other information has come from anecdotal evidence. There is a lack of comprehensive data that maps this problem in real time and offers enough granularity to start meaningful discussions. The level of accessibility to OS-supporting infrastructures is extremely heterogeneous. In 2022 we led a pilot study to test varying access to repositories and digital OS tools from 14 countries around the world. The results of this study clearly illustrated considerable variability in access to OS resources around the world due to timeouts and geoblocking. The study strongly indicated that more systematic and dynamic monitoring is urgently needed. The Equitable Access Observatory (EAO) will extend the pilot work. Through the data it gathers, it will enable OS-supporting infrastructures to make targeted changes to their structures to support equitable access. The development of this observatory is crucial to the advancement of an OS landscape that is truly without barriers, marginalisation or restrictions.
Proposed Activities: Month 1 - 6: Refine and extend existing accessibility-checking code for use in the Equitable Access Observatory (EAO). This code was developed by Prof Shanahan for the Shanahan and Bezuidenhout 2022 study. The software extensions will allow the observatory to monitor accessibility on an ongoing basis. It will check the accessibility of relevant OS sites from as many territories/countries as possible (over 190) by examining the status codes returned by HTTP requests to those sites. Code will also be developed for a tool that enables repositories to test their own accessibility. Expertise needed: Development work will be led by Prof Shanahan at RHUL. Dr Bezuidenhout will oversee the development of the data management plan for the EAO. She will also extend the lists of OS resources developed in the Shanahan and Bezuidenhout 2022 and Bezuidenhout and Havemann 2021 studies to produce a comprehensive list of OS infrastructural elements (software and data repositories, digital OS tools, publication platforms etc). Resources needed: Confirmed donated access from Bright Initiative will provide VPNs for over 190 territories/countries. A server (100 GB storage, single CPU + minimal main storage) is also required to run the queries and store the data. Discussions about hosting the server are under way with the Confederation of Open Access Repositories (COAR) and the World Data System (WDS).
Month 6 - 12: Develop an outward-facing tool, e.g. a dashboard and regular report, plus a method for rapid identification of access issues for repositories. The project team will also engage with related projects/organisations that hold relevant data sets (e.g. RAMP, Global Digital Inclusion Partnership, Freedom House, NetBlocks) to assess data integration and visualisation possibilities. Expertise needed: Development work will be led by Prof Shanahan at RHUL. Resources needed: Developer time to build the dashboard and implement data integration, visualisation and management strategies.
Month 12 - 24: Collect data from the observatory. These data will be used to refine the presentation and visualisation of the data. During this period the EAO will be promoted within the OS community, with specific attention paid to promoting the use of the EAO by infrastructure hosts to self-test accessibility. A series of workshops will also be hosted to discuss the data collected by the EAO. These workshops will involve infrastructure stakeholders as well as legal scholars specialising in issues relating to OS. The purpose of these workshops will be to develop equitable access guidelines that will inform the design and deployment of the OS landscape in the future. Expertise needed: Dr Bezuidenhout will take the lead on the dissemination activities and the design and roll-out of the workshops. Resources needed: Dissemination activities (conference attendance, flyers and graphic design etc). Workshop hosting (1 x in-person, 2 x online).
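A minimal sketch of the kind of check described above, in which a site is requested through a country-specific proxy and the HTTP status code is mapped to a coarse outcome. This is illustrative only and is not the pilot study's code: the function names `classify` and `check_site`, the outcome labels, the choice to treat 401/403/451 as "blocked", and the `proxy` parameter are all assumptions. How a VPN or proxy endpoint is obtained from Bright Initiative is not covered here.

```python
import socket
import urllib.error
import urllib.request
from typing import Optional


def classify(status: Optional[int], error: Optional[str] = None) -> str:
    """Map an HTTP status code (or a transport failure) to a coarse outcome.

    The labels and thresholds are assumptions made for this sketch.
    """
    if status is None:
        return "timeout" if error == "timeout" else "unreachable"
    if 200 <= status < 400:
        return "accessible"
    if status in (401, 403, 451):
        # These codes may indicate geoblocking, but can have other causes.
        return "blocked"
    return "error"


def check_site(url: str, proxy: Optional[str] = None, timeout: float = 10.0) -> str:
    """Send a HEAD request, optionally through a proxy, and classify the result."""
    handlers = []
    if proxy:
        # Hypothetical proxy URL, e.g. one endpoint per country.
        handlers.append(urllib.request.ProxyHandler({"http": proxy, "https": proxy}))
    opener = urllib.request.build_opener(*handlers)
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener.open(request, timeout=timeout) as response:
            return classify(response.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with a 4xx/5xx status.
        return classify(exc.code)
    except (socket.timeout, TimeoutError):
        return classify(None, "timeout")
    except (urllib.error.URLError, OSError) as exc:
        # URLError may wrap a timeout in its `reason` attribute.
        reason = getattr(exc, "reason", exc)
        if isinstance(reason, (socket.timeout, TimeoutError)):
            return classify(None, "timeout")
        return classify(None, "unreachable")
```

Separating `classify` from `check_site` keeps the outcome rules testable without network access, and the same rules could be reused by a self-test tool for repositories.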
Openness: The code from the pilot study is freely available on GitHub and archived at Zenodo. The findings of the study were published in an OA journal. Further developments of the code will likewise be published on GitHub, and reuse, pull requests and forking will be strongly encouraged. Major releases will be mirrored onto Zenodo with appropriate metadata. The implementation will be documented so that others can easily deploy it as they see fit. Though the code will make use of the Bright Initiative VPN service, it will be designed so that other VPN services can be used instead. The data collected by the observatory will be made as open as possible, in line with a data management plan that takes into account the sensitive elements of the data to be collected. The objective of the EAO is to foster discussion on limits to openness and to inform evidence-based policy and OS monitoring. The EAO data will support revisions of existing OS infrastructures to improve accessibility, as well as the responsible development of new OS tools and infrastructures. The EAO will play an important role in the OS community, and considerable effort will be directed towards capacity building and engagement with key stakeholders. This will be done through presentations and representation at conferences. Engagement with key stakeholders will be supported by existing dialogue with international organisations such as the WDS, RDA, CODATA and COAR.
Challenges: The main challenge of this work is to ensure that the data collected by the EAO supports positive discussion and a commitment to advancing openness around the world. There is always the possibility that the data collected could be misused to “name and shame” certain OS infrastructure providers, or by providers to block certain geographic regions from the use of their resources. These issues will be carefully considered in the data management plan, and a clear set of ethical guidelines will be developed for both the internal use and external reuse of the data collected. In terms of software development, the main challenges will be updating the interface to the Bright Initiative API and optimising the analysis software so that it can automatically generate reports. There is a small possibility that Bright Initiative will not be able to provide such a wide range of VPNs. In that case other vendors can be approached.
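As one illustration of the automatic report generation mentioned above, the sketch below aggregates per-check outcomes into a per-country summary and renders it as plain text. The record layout (country code, site URL, outcome label), the function names `summarise` and `render_report`, and the "accessible" label are assumptions for this example and do not describe the existing analysis software.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical record layout: (country code, site URL, outcome label).
Record = Tuple[str, str, str]


def summarise(records: Iterable[Record]) -> Dict[str, Dict[str, float]]:
    """Count checks per country and compute the share that were accessible."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for country, _site, outcome in records:
        counts[country][outcome] += 1
    summary: Dict[str, Dict[str, float]] = {}
    for country, outcomes in counts.items():
        total = sum(outcomes.values())
        summary[country] = {
            "checked": total,
            "accessible_share": outcomes["accessible"] / total,
        }
    return summary


def render_report(summary: Dict[str, Dict[str, float]]) -> str:
    """Render the summary as a fixed-width text table, sorted by country."""
    lines = ["country  checked  accessible"]
    for country in sorted(summary):
        row = summary[country]
        lines.append(
            f"{country:<7}  {row['checked']:>7}  {row['accessible_share']:>10.0%}"
        )
    return "\n".join(lines)
```

Reporting aggregated shares per country, rather than listing individual providers, is one design choice that could be weighed against the "name and shame" risk; the appropriate level of detail would be set by the data management plan.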
Neglectedness: No, recent funding applications by the team have not been successful. The difficulty of finding funding for this project relates to its focus (technical development rather than research) as well as its interdisciplinarity (computer and social sciences as well as law). We believe that IOI is a good fit for this project, which will support further openness in the OS landscape. This field of research, together with the development of a practical tool for use within the Open Science community, differs from most other OS activities in that it focuses on the limits of openness. Providing the OS community - in particular infrastructure providers - with the data to facilitate critical self-reflection is vital to the equitable development of the OS movement. To date, however, such activities have been almost entirely neglected.
Success: Basic success: a working observatory able to collect and visualise data by month 24, as well as a policy brief for OS infrastructure providers. Awareness within key stakeholder communities about the EAO and the data it is collecting. Moderate success: basic success together with a robust discussion within the OS community and integration of the EAO data into other studies. Adoption of the policy brief by stakeholders to update infrastructures and governing policies. Significant success: integration of the EAO into Open Science training courses to empower students to make non-exclusionary decisions about openness. Adoption of EAO monitoring by key OS bodies such as COAR, RDA, arXiv, GitHub and CoreTrustSeal.
Total Budget: USD25000
Budget File: pdf
Affiliations: no
LMIE Carveout: This project directly supports researchers in LMIEs, as the majority of the countries experiencing access issues are located in these regions. The ability of researchers in these regions to start conversations about limited access by using the data from the EAO will empower them to voice ways in which the global OS community can be more inclusive. Both Prof Shanahan and Dr Bezuidenhout have extensive networks within LMIEs through their work as co-chairs of the CODATA-RDA Schools for Research Data Science. This network is specifically aimed at training ECRs from LMIEs. Furthermore, as a South African, Dr Bezuidenhout has strong research contacts throughout the SADC region. These networks of contacts will ensure that the EAO services are widely promoted and adopted across LMIEs.
Team Skills: Hugh is a professor of Open Science at the Department of Computer Science at Royal Holloway, University of London. He has been a vocal advocate for OS throughout his career and has made software he developed freely available since the 1990s. He has a background in Computational Physics and Computational Biology. Through this he has expertise in developing and managing software projects and providing corresponding web services. He has been involved in work on adapting the FAIR data principles to software and investigating how computational notebooks can be made publishable and findable. Louise is a senior data expert at the Data Archiving and Networked Services (DANS), Royal Netherlands Academy of Arts and Sciences. She is a sociologist and has spent much of her career researching barriers to OS as experienced by researchers in LMIEs. This research has involved considerable embedded research to identify daily challenges that shape individual researchers’ ability to engage with OS resources online. Through this research she has a deep understanding of challenges to OS practices relating to infrastructural design and contextual idiosyncrasies. As long-standing colleagues through the CODATA-RDA Schools for Research Data Science, Hugh and Louise have established an effective interdisciplinary collaboration. The ability to combine technical expertise with sociological and ethical insights offers a unique pathway to developing robust and value-informed OS infrastructures.
How Did You Hear About This Call: Word of mouth (e.g. conversations and emails from IOI staff, friends, colleagues, etc.) / Boca a boca (por ejemplo, conversaciones y correos electrónicos del personal del IOI, amigos, colegas, etc.)
Submission Number: 47