Prototyping a HEALTH DCAT-AP data catalogue to support population health indicator identification and quality assessment
Keywords: W3C DCAT, EHDS, metadata, data quality, dataspace
Abstract: This paper describes experiences prototyping, in a population health use case, the draft HEALTH DCAT-AP specification for health data catalogues under the European Health Dataspaces Regulation. The work comprised developing a data catalogue metadata model, populating the catalogue through direct data entry and scraping of open data, and developing health indicator quality and feasibility reports. The use case required extending the catalogue with new classes and properties, some of them taken from the Data Privacy Vocabulary (DPV), and it exposed a number of limitations in the current HEALTH DCAT-AP specification draft. Stakeholders were generally positive in their assessment of the contribution of this novel structured approach to health data indicator discovery and assessment. This shows the potential for the semantic data governance infrastructure specified by the European Health Dataspaces Regulation to influence future data-driven decision making at all levels of European health services. The catalogue metadata model, report queries and data scraping code are all available as open source resources for reuse by others. One new property has been added to DPV as a result of this work, and the work will feed into the HEALTH DCAT-AP standardisation process in the ETSI/TC Data. Specifically, the paper presents a population health use case based on defining a health and wellbeing profile for older adults, data catalogue competency questions for this use case, a metadata model for the catalogue that meets these requirements, and a data quality and feasibility assessment reporting workflow, along with stakeholder feedback.
Submission Number: 4