Abstract: Highlights
• A Trainer Software Agent (TA) makes its behaviour available via blockchains.
• Trainee Agents seek effective training for interacting with environments similar to the TA's.
• Reinforcement Learning (RL) is used for the training, i.e., agents are rewarded or penalized.
• We propose enriching RL with behavioural cloning and imitation learning.
• Blockchain smart contracts are utilized for ‘storing’ demonstration files.
• Blockchains assure the quality and non-modifiability of the demonstration files.