A data infrastructure to bridge BI and research analytics tools and methods

15 May 2020 (modified: 05 May 2023) | VIVO2020, as presentation | Readers: Everyone
Keywords: Research analytics, Research systems, Research evaluation, Research assessment, Scientometrics, Bibliometrics, Data infrastructure
TL;DR: Development and deployment of a data infrastructure that bridges traditional business intelligence (BI) components (e.g. SQL and Tableau) and research analytics tools and methods (e.g. VIVO RDF ontologies and VOSviewer).
Abstract: The toolsets used in business intelligence (BI) and in research analytics have historically been developed in separate silos. As a result, although the two fields share some underlying challenges, solutions are not always easy to repurpose. This creates data and tool portability problems, inhibits collaboration and leads to duplicated effort. In the context of a "National Open science Research Analytics" (NORA) pilot for 8 Danish universities, we have developed a data infrastructure that bridges traditional BI components and research analytics tools and methods. Key elements of this infrastructure include: 1) a pipeline orchestrator to manage data updates and maintenance tasks; 2) a "single source of truth" for structured and unstructured data, in the form of a service-agnostic NoSQL document database that collects data from multiple external sources (e.g. APIs and CSVs); and 3) interfaces between the NoSQL document database and both the BI toolset (e.g. graph database and Tableau) and the research analytics toolset (e.g. VIVO RDF and VOSviewer). Benefits of this data infrastructure include: a) Wider availability of scientometric data across a range of tools and services, so that different areas and stakeholders can access the data in the format and through the tools they are most used to. b) A better fit between the database and the application. Relational (e.g. PostgreSQL), document (e.g. MongoDB) and graph databases (e.g. Neo4j and a VIVO RDF triplestore) are all now available; a data infrastructure that combines them provides a more flexible and powerful architecture, able to bridge the gap between the traditional BI and research analytics worlds. c) The inclusion of tools and libraries backed by large and robust communities that maintain a wide range of open source initiatives.
In our presentation, we will share the benefits, challenges and learnings of setting up this data infrastructure through a guided tour that takes us from data ingestion all the way to an interactive BI-inspired dashboard for research collaboration.