
Metadata about the linked open dataset is reported in machine-accessible VoID \cite{alexander2009describing} and LIME \cite{fiorelli2015lime} files on the UN Metadata site\footnote{\href{http://metadata.un.org/sdg/void.ttl}{http://metadata.un.org/sdg/void.ttl}}. The VoID file also contains entries linking to the downloadable data dumps.

VoID is an RDF vocabulary for describing linked datasets, published as a W3C Interest Group Note\footnote{\href{http://www.w3.org/TR/void/}{http://www.w3.org/TR/void/}}. It sets out policies for publishing these descriptions and linking them to the data \cite{gandon2015semantic}, and it defines a protocol for publishing dataset metadata alongside the actual data, so that consumers can discover the dataset description right after encountering a resource in the dataset. Developed within the scope of the OntoLex W3C Community Group\footnote{\href{https://www.w3.org/community/ontolex/}{https://www.w3.org/community/ontolex/}}, LIME is an extension of VoID for linguistic metadata. Although it was initially developed as the metadata module of the OntoLex-Lemon model\footnote{\href{https://www.w3.org/2016/05/ontolex/}{https://www.w3.org/2016/05/ontolex/}} \cite{mccrae2017ontolex}, LIME intentionally provides descriptors that can be adapted to different scenarios (e.g., ontologies or thesauri being lexicalized, resources being onomasiologically or semasiologically conceived) and to the different models adopted for the lexicalization work (\emph{rdfs:labels}, SKOS or SKOS-XL terminological labels, or OntoLex lexical entries).

The VoID description first provides general information about the dataset through the usual Dublin Core \cite{bird2003extending} properties, such as description, creator and date of publication. Of particular note is the \emph{dct:conformsTo} property, which in this case points to the SKOS namespace and which metadata consumers can use to identify the core modelling vocabularies adopted in representing the dataset. The description is followed by a few \emph{void:classPartitions} providing statistics about the resource types that characterize this kind of dataset (as indicated by the aforementioned \emph{dct:conformsTo}).

In the case of the SDGs, the template for SKOS has therefore been applied, providing statistics for \emph{skos:Concepts}, \emph{skos:Collections} and \emph{skos:ConceptSchemes}, and reporting a total of 812 concepts, 1 concept scheme and 1 collection. The description continues with typical \emph{void} information, such as the number of distinct subjects (1002), objects (6842) and triples (14645), the availability of a SPARQL endpoint, and the downloadable data dumps (\emph{void:dataDump}), which we provide for the full dataset as well as for some partitioned versions of it (e.g., ontology only, $\langle$ Goal, Target, Indicator $\rangle$ only, etc.). Finally, a list of subsets is described in detail in the rest of the file. This list is mainly composed of \emph{void:Linksets}, which are datasets consisting of a series of alignment triples between the described dataset and other target datasets, and of \emph{lime:Lexicalizations}, the portions of the described dataset containing all the triples related to the (possibly multiple, as in this case) lexicalizations that are available for it. Each lexicalization is described in terms of its lexicalization model (which in the case of the SDGs is SKOS) and of the natural language it covers. The language is expressed both as an ISO 639-1 two-letter code given as a literal (through the property \emph{lime:language}) and as ISO 639-1 two-letter and ISO 639-3 three-letter codes given as URIs, using the vocabulary of languages\footnote{\href{http://id.loc.gov/vocabulary/}{http://id.loc.gov/vocabulary/}} of the Library of Congress\footnote{\href{https://loc.gov/}{https://loc.gov/}}. The description also includes information about the lexicalized dataset (\emph{lime:referenceDataset}) and void-like statistical information such as the total number of lexicalized references, the number of lexicalizations, the average number of lexicalizations per reference and the percentage of lexicalized references.
More details about the available lexicalizations will be provided in section \ref{section:multilingualism} on multilingualism.
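
As an illustration of how the last two statistics relate to the first two, let $R$ denote the number of resources in the reference dataset, $R_{\mathrm{lex}}$ the number of lexicalized references and $L$ the number of lexicalizations (these symbols are introduced here only for illustration and do not appear in the metadata files). Assuming that the average is taken over the lexicalized references, the derived figures can be read as
\[
  \mathit{avg} = \frac{L}{R_{\mathrm{lex}}}, \qquad
  \mathit{percentage} = \frac{R_{\mathrm{lex}}}{R}.
\]
The LIME specification remains the normative source for the exact denominators; in particular, if the average is instead computed over all $R$ resources, it becomes $L/R$.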