The Transformer-on-QuPairs.py file contains the code for training the transformer model with reinforcement learning. It consists of a model initialization part and a training part. To train the model, we need to comment out the initialization code.

This code can be applied to other training environments, such as different fidelity distributions (X40std0.txt, X40std3.txt, X40std6.txt, X40std9.txt) or different qubit numbers (X40std9.txt, X80std9.txt, X120std9.txt, X160std9.txt). To do so, change the lines that load the variables X and R and adjust Nqubit and Npath to match the chosen data.
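
As an illustration of the switch described above, the following is a minimal sketch of how the environment-specific settings could be gathered in one place. It is not taken from Transformer-on-QuPairs.py: the helper name load_environment is hypothetical, and it assumes that the X and R files are plain whitespace-separated numeric text readable by numpy.loadtxt. The actual loading lines in the script may differ.

```python
import numpy as np


def load_environment(x_file, r_file, n_qubit, n_path):
    """Load one training environment (hypothetical helper, not part of the script).

    Assumes both files are plain numeric text that numpy.loadtxt can parse.
    The values of n_qubit and n_path are passed in by the caller and must
    match the chosen data files; they are not inferred from the files.
    """
    X = np.loadtxt(x_file)  # assumed format: whitespace-separated numbers
    R = np.loadtxt(r_file)  # assumed format: whitespace-separated numbers
    return {"X": X, "R": R, "Nqubit": n_qubit, "Npath": n_path}
```

With such a helper, changing the environment would be a single call, for example load_environment("X80std9.txt", r_file, n_qubit, n_path), where r_file, n_qubit and n_path stand for the R file and the values that belong to that setting.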