The SDG ontology is part of an emerging system of SDG-related ontologies that aim to provide data interoperability and a flexible interface for querying linkages across independent information systems. Mapping the identifiers described in these ontologies to each other and to external vocabularies allows SDG data to be clearly identified and found by semantic web agents, which can then establish further links and connections, thereby facilitating knowledge discovery.

\begin{comment}
\begin{figure*}[h]		
\centering
		\includegraphics[width=12.2cm]{"04.05.app.linkedsdg.01.png"}

		\caption{Extracted concepts and geographic locations from one of the Voluntary National Review documents (French version).}
		\label{fig:sdg-app-linkedsdg}
\end{figure*}
\end{comment}

\begin{table}[h]
	\small
	\begin{minipage}{0.48\textwidth}
		\centering
		\begin{tabular}{@{}l ccc@{}}
			\includegraphics[angle=180,scale=0.44]{"05.fig.3.1.pdf"}			
		\end{tabular}
	\end{minipage}
	\begin{minipage}{0.48\textwidth}
		\centering
		\begin{tabular}{@{}l ccc@{}}	
			\includegraphics[scale=0.84]{"05.fig.3.2.pdf"}	\\
			\includegraphics[scale=0.32]{"05.fig.3.3.pdf"}	
		\end{tabular}
	\end{minipage}\vspace{5mm}\captionof{figure}{Application displaying extracted concepts and corresponding tag cloud, and a map highlighted with geographic regions mentioned in a document.}
\label{fig:sdg-app-linkedsdg}
\end{table}
A pilot application, LinkedSDG\footnote{\href{http://linkedsdg.apps.officialstatistics.org/}{http://linkedsdg.apps.officialstatistics.org/}; the source code for the application is available at \href{https://github.com/UNGlobalPlatform/linkedsdg}{https://github.com/UNGlobalPlatform/linkedsdg}.}, has been built to showcase the usefulness of adopting SDG KOS for extracting SDG-related metadata from documents and establishing connections among the SDGs. The application automatically extracts the relevant SDG concepts mentioned in a given document using SDG KOS and provides a unified overview of them. All SDGs related to the identified concepts are displayed in an interactive wheel chart that users can explore further by drilling into the associated goals, targets, indicators and series. Figures \ref{fig:sdg-app-linkedsdg} and \ref{fig:sdg-app-linkedsdg-data} depict the different components of the application, including concept extraction, the map, the SDG wheel chart and the associated data series, for one of the Voluntary National Reviews (VNRs)\footnote{See \href{https://sustainabledevelopment.un.org/}{https://sustainabledevelopment.un.org/} for VNR documents.}. The application can process documents written in any of the six official UN languages.

\begin{table}[h]
	\small
	\begin{minipage}{0.48\textwidth}
		\centering
		\begin{tabular}{@{}l ccc@{}}
		\includegraphics[angle=0,scale=0.55]{"05.fig.4.1.pdf"}
		\\
		\end{tabular}
	\end{minipage}\hfill % maximize space between the minipages
	\begin{minipage}{0.48\textwidth}
		\centering
		\begin{tabular}{@{}l ccc@{}}	
		\includegraphics[scale=0.52]{"05.fig.4.2.pdf"}	\\
		\includegraphics[scale=0.44]{"05.fig.4.3.pdf"}	
		\end{tabular}
	\end{minipage}\vspace{5mm}\captionof{figure}{Application showing the most relevant SDG keywords corresponding to Target 3.3, an interactive SDG wheel corresponding to a document, and related data series for one of the indicators.}
\label{fig:sdg-app-linkedsdg-data}
\end{table}
The two key analytical techniques employed in LinkedSDG are taxonomy-based \emph{term extraction} and \emph{knowledge graph traversal}. The term extraction mechanism, implemented using the \emph{spaCy} library\footnote{\href{https://spacy.io/}{https://spacy.io/}}, first detects the language of the submitted document and then scans it for all literal mentions of the relevant UNBIS and EuroVoc concept labels in that language, associating each mention with its concept identifier. The traversal, performed using SPARQL over the underlying \emph{Apache Jena} RDF store\footnote{\href{https://jena.apache.org/}{https://jena.apache.org/}}, starts from these extracted concept identifiers and follows the links to broader concepts until it reaches those connected directly to elements of the SDG system via the \emph{dct:subject} and \emph{skos:exactMatch} predicates. The algorithm then traces the paths to broader SDG entities in the SDG KOS hierarchy. For instance:
\begin{itemize}
	\item {\textbf{text}: ``[...] beaches, estuaries, dune systems, mangroves, marshes lagoons, swamps, reefs, etc are [...]''}
	\item {\textbf{extracted concept}: WETLANDS (unbis:1007000) via the matched synonym ``marshes''}
	\item {\textbf{traversed path}: WETLANDS - (broader) $\rightarrow$ SURFACE WATERS (unbis:1006307) - (broader) $\rightarrow$ WATER (unbis:030500)}
	\item {\textbf{connected target}: 6.5 By 2030, implement integrated water resources management at all levels, including through transboundary cooperation as appropriate}
	\item {\textbf{connected goal}: 6. Clean water and sanitation}
\end{itemize}
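
The two steps in this example can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the LinkedSDG implementation: the application itself uses spaCy for matching and SPARQL over Apache Jena for traversal, whereas the sketch replaces both with in-memory dictionaries. The names \texttt{LABELS}, \texttt{BROADER}, \texttt{SDG\_LINKS}, \texttt{SDG\_BROADER}, \texttt{extract\_concepts} and \texttt{traverse} are hypothetical, and the tiny taxonomy contains only the concepts from the example.

```python
import re

# Hypothetical miniature taxonomy: concept id -> labels (preferred label and synonyms).
LABELS = {
    "unbis:1007000": ["wetlands", "marshes"],
    "unbis:1006307": ["surface waters"],
    "unbis:030500": ["water"],
}

# "broader" links between concepts (narrower -> broader), as in the example path.
BROADER = {
    "unbis:1007000": "unbis:1006307",
    "unbis:1006307": "unbis:030500",
}

# Assumed direct link from a concept to an SDG element
# (a stand-in for the dct:subject / skos:exactMatch connections).
SDG_LINKS = {"unbis:030500": "target 6.5"}

# Assumed position of SDG elements in the SDG hierarchy (narrower -> broader).
SDG_BROADER = {"target 6.5": "goal 6"}


def extract_concepts(text):
    """Return ids of concepts with at least one label occurring literally in the text."""
    lowered = text.lower()
    found = []
    for concept_id, labels in LABELS.items():
        if any(re.search(r"\b" + re.escape(label) + r"\b", lowered) for label in labels):
            found.append(concept_id)
    return found


def traverse(concept_id):
    """Climb broader links to a concept linked to the SDG system, then climb the SDG hierarchy.

    Returns (concept_path, sdg_path); sdg_path is empty if no link is reached.
    """
    path = [concept_id]
    current = concept_id
    while current not in SDG_LINKS:
        if current not in BROADER:
            return path, []
        current = BROADER[current]
        path.append(current)
    sdg_path = [SDG_LINKS[current]]
    while sdg_path[-1] in SDG_BROADER:
        sdg_path.append(SDG_BROADER[sdg_path[-1]])
    return path, sdg_path


if __name__ == "__main__":
    text = "beaches, estuaries, dune systems, mangroves, marshes lagoons, swamps, reefs"
    for concept in extract_concepts(text):
        print(concept, traverse(concept))
```

Word-boundary matching keeps the label ``water'' from matching inside longer words; a production matcher would additionally need tokenisation and language-specific label sets, which the sketch leaves out.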


The computation of the final relevance scores for specific goals, targets and indicators relies on their exact position in the SDG hierarchy, which is reflected by the SKOS representation of the system, and on their types, which are asserted in the SDG ontology. Intuitively, the broader the term (i.e., the higher it sits in the hierarchy), the higher the score it receives, as it aggregates the scores from the lower parts of the hierarchy.
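
The aggregation intuition can be illustrated with a short Python sketch. The exact scoring formula is not specified here, so the sketch assumes the simplest possible rule, in which every element adds its directly assigned score to each of its ancestors; the hierarchy fragment \texttt{PARENT}, the function \texttt{aggregate\_scores} and the input scores are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fragment of the SDG hierarchy (narrower -> broader).
PARENT = {
    "indicator 6.5.1": "target 6.5",
    "target 6.5": "goal 6",
    "target 6.3": "goal 6",
}


def aggregate_scores(direct_scores):
    """Propagate each element's direct score to itself and all of its ancestors.

    Assumes plain additive aggregation, which is an illustrative choice only.
    """
    totals = defaultdict(float)
    for element, score in direct_scores.items():
        node = element
        while node is not None:
            totals[node] += score
            node = PARENT.get(node)  # None once the top of the hierarchy is passed
    return dict(totals)


if __name__ == "__main__":
    # Made-up direct scores, e.g. counts of matched concepts per element.
    direct = {"indicator 6.5.1": 1.0, "target 6.5": 2.0, "target 6.3": 1.5}
    print(aggregate_scores(direct))
```

Under this rule a goal can never score lower than any of its targets, which matches the observation that broader elements receive higher scores.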

The application also provides access to the statistical data of the specific SDG series, which is represented as linked open statistical data using the \emph{RDF Data Cube} vocabulary\footnote{\href{https://www.w3.org/TR/vocab-data-cube/}{https://www.w3.org/TR/vocab-data-cube/}}. The relevant SDG series identifiers are referenced from the extraction results delivered by the application, and the series themselves are served independently by the platform's dedicated GraphQL API\footnote{\href{http://linkedsdg.apps.officialstatistics.org/graphql/}{http://linkedsdg.apps.officialstatistics.org/graphql/}}. Consequently, SDG KOS, which powers the LinkedSDG platform, supports the user along the entire journey from a text document to the relevant statistics, helping to put originally unstructured third-party information in the context of narrowly focused, UN-owned structured data.

