The following paragraphs explain how to reproduce the experiments in the paper and provide detailed information about the training of the KGE models and the hyperparameter searches performed.
Reproducing the experiments is demonstrated using the example of the ComplEx model on the CoDEx-M dataset, i.e., the entry with an MRR of 0.349 in the table in Appendix A and the corresponding marker in Figure 1 of the main text.
To reproduce the example for ComplEx on CoDEx-M, please follow the steps below:
Follow the quick start installation instructions of the libKGE library. Also run download.sh as mentioned in the installation instructions.
To check whether the installation was successful, run kge in your console, which should output:
usage: kge [-h] {start,create,resume,eval,valid,test,dump,package} ...
kge: error: the following arguments are required: command
Additionally, keep track of the folder you cloned from GitHub, which contains the library and is named kge. This folder will be referred to as /path/to/libKGE/kge below.
Download the file codex-m-lp-complex.pt from here; it contains the pre-trained model checkpoint. Place the file in this folder, i.e., the folder where this readme.html is located.
Ensure that the files anyburl-test and anyburl-valid, which contain the original AnyBURL rankings, are located in this folder, i.e., the folder where the readme.html is located.
You also need the path to the CoDEx-M dataset folder, which is located at /path/to/libKGE/kge/data/codex-m.
Ensure that you are within the python environment under which libKGE is installed.
Navigate into this folder (alternatively, simply ensure that all relative/absolute paths to the files are correct) and run
python rescorex.py --dataset_folder /path/to/libKGE/kge/data/codex-m --checkpoint codex-m-lp-complex.pt --ab_ranking anyburl-test --model_name complex
Subsequently, run
python rescorex.py --dataset_folder /path/to/libKGE/kge/data/codex-m --checkpoint codex-m-lp-complex.pt --ab_ranking anyburl-valid --model_name complex
When you are using Windows, additionally use the flag -Xutf8.
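Since the two calls differ only in the --ab_ranking argument, they can also be scripted. A small dry-run sketch (not part of the supplement) that assembles both commands using only the flags shown above; remove the echo to actually execute them:

```shell
# Print both rescorex.py calls; the two runs differ only in --ab_ranking.
DATA=/path/to/libKGE/kge/data/codex-m
CMDS=""
for ranking in anyburl-test anyburl-valid; do
  CMD="python rescorex.py --dataset_folder $DATA --checkpoint codex-m-lp-complex.pt --ab_ranking $ranking --model_name complex"
  echo "$CMD"   # dry run: print the command instead of running it
  CMDS="$CMDS$CMD
"
done
```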
After running the two commands, the two files anyburl-test-complex and anyburl-valid-complex will have been created in the same folder.
For help on the parameters of the script, run python rescorex.py --help.
Running these scripts generates some warnings, which can be ignored; they occur because the checkpoints available on the libKGE page were generated with an older libKGE version.
The code for aggregating the ranking files is written in Java and uses AnyBURL as a dependency. This means that you have to download the latest AnyBURL version, which is available as AnyBURL-JUNO.jar. The code for learning and performing the aggregation is available in the jar file JoinRE.jar, located within this folder, i.e., the folder where this readme.html is located. Both jar files are compiled against the Java 1.8 profile, which means that they should also be executable with older Java versions.
Run the aggregation code with the following command (on Windows you have to separate the two jar files in the command with a semicolon ;, on Linux with a colon :). You have to specify seven input parameters separated by blanks: (1) path to the training file, (2) path to the validation file, (3) path to the ranking file created by a rule learner for the validation set, (4) path to the rule-based ranking file with scores from the KGE model for the validation set, (5) path to the ranking file created by a rule learner for the test set, (6) path to the rule-based ranking file with scores from the KGE model for the test set, (7) output path where the aggregated ranking file should be stored.
You can use this command line call and run it from this folder, i.e., the folder where the readme.html is located. You have to change the first two arguments to refer to your libKGE installation. The other parameters should fit the content of this directory if you conducted all previous steps.
java -cp JoinRE.jar;AnyBURL-JUNO.jar x.y.z.Rescorer /path/to/libKGE/kge/data/codex-m/train.txt /path/to/libKGE/kge/data/codex-m/valid.txt anyburl-valid anyburl-test anyburl-valid-complex anyburl-test-complex output
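The command above shows the Windows form of the classpath. As a small sketch (not part of the supplement), the correct separator can also be picked programmatically in a POSIX shell:

```shell
# Choose the Java classpath separator: ';' on Windows-style shells
# (Cygwin/MSYS/Git Bash), ':' on Linux and macOS.
case "$(uname -s)" in
  CYGWIN*|MINGW*|MSYS*) SEP=';' ;;
  *)                    SEP=':' ;;
esac
CP="JoinRE.jar${SEP}AnyBURL-JUNO.jar"
echo "$CP"
```

The resulting $CP string can then be passed to java -cp in place of the hard-coded classpath.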
As a result, an additional ranking file named output is stored in this folder.
You can use the evaluation method from AnyBURL to evaluate it. This requires modifying the first three paths (which refer to the training, validation, and test set of the evaluation dataset used) to /path/to/libKGE/kge in the file config-eval.properties, which is also available in this folder. Compute hits@k and MRR with this command line call.
java -cp AnyBURL-JUNO.jar de.unima.ki.anyburl.Eval
The output should look like this:
... 0.2772 0.3838 0.4922 0.3494
The numbers have the following meaning: hits@1, hits@3, hits@10, and MRR.
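For reference, a minimal sketch (not part of the supplement) of how these metrics are computed from the filtered ranks of the correct answers, one rank per test query:

```python
# Sketch: hits@k and MRR from a list of ranks of the correct entities.

def hits_at_k(ranks, k):
    """Fraction of queries whose correct answer is ranked in the top k."""
    return sum(r <= k for r in ranks) / len(ranks)

def mrr(ranks):
    """Mean reciprocal rank of the correct answers."""
    return sum(1.0 / r for r in ranks) / len(ranks)

# Toy example with four queries whose correct answers are ranked 1, 2, 5, 10:
ranks = [1, 2, 5, 10]
print(hits_at_k(ranks, 1))   # 0.25
print(hits_at_k(ranks, 3))   # 0.5
print(hits_at_k(ranks, 10))  # 1.0
print(round(mrr(ranks), 3))  # 0.45
```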
For all KGE models except for the transformer implementation, we use the pre-trained models provided by libKGE. The AnyBURL rankings can be generated as described on the AnyBURL webpage. You can use the AnyBURL default parameter setting, with only two exceptions: First, add TOP_K_OUTPUT = 100 to the config-apply.configuration; this generates the top-100 rankings (10 in the default setting). Second, a setting in the config-learn.configuration has to be adjusted for the WN18RR dataset; the default is 3, which should be used for the other datasets.
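Expressed as a configuration fragment, the first exception amounts to the following line (value taken from the step above; see the AnyBURL documentation for the full set of options):

```
TOP_K_OUTPUT = 100
```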
As mentioned in the main text, for all KGE models except for the transformer implementation we use the pre-trained models provided by the libKGE library, which can be downloaded here.
For the transformer implementation, we provide detailed config information in the folder
transformer-configs
within the supplementary materials. All the configs can be run seamlessly under the current libKGE master (June 2021).
For the transformer model, we obtained hyperparameters for the FB237 dataset from the libKGE developers (transformer-configs/training-configs/transformer-fb237-config.yaml).
For the remaining datasets, we use the Ax hyperparameter search provided by libKGE where we centered the search space around the FB237 parameters. For each dataset, we use the same search space.
On CoDEx-L we run 15 trials, 7 of which are quasi-random. For the remaining datasets we use 30 trials with 15 quasi-random trials. The detailed specifications can be found in transformer-configs/search-configs.
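For orientation, the trial budgets above correspond to settings like the following in a libKGE Ax search config (a sketch only; key names follow libKGE's example search configs, and the authoritative files are in transformer-configs/search-configs):

```yaml
# Trial budget for CoDEx-L (15 trials, 7 quasi-random);
# the other datasets use 30 trials with 15 quasi-random (Sobol) trials.
job.type: search
search.type: ax
ax_search.num_trials: 15
ax_search.num_sobol_trials: 7
```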
The final models for the transformer are obtained by using the configurations of the
search trials achieving the highest MRR on the validation split. The resulting configurations can be found in
transformer-configs/training-configs
.