Abstract: Highlights
• The details of a crowd simulation affect the results when agents are trained with reinforcement learning.
• It is typically beneficial for agents to directly observe others’ positions.
• Egocentric observations and actions are better than global ones.
• Reward design is highly impactful and nontrivial.
• Simple reward functions do not produce human-like behavior.
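
To illustrate the distinction between global and egocentric observations mentioned in the highlights, the following is a minimal sketch of how another agent's global position could be expressed in the observing agent's own frame. The function name `to_egocentric`, the 2D setting, the heading measured in radians counterclockwise from the global x-axis, and the (forward, left) axis convention are all assumptions made for illustration; they are not taken from the paper.

```python
import math


def to_egocentric(agent_pos, agent_heading, other_pos):
    """Express another agent's global 2D position in the observer's frame.

    Hypothetical helper (not from the paper). Assumes:
      - positions are (x, y) tuples in a shared global frame,
      - agent_heading is in radians, counterclockwise from the global x-axis,
      - the egocentric frame has its first axis pointing forward and its
        second axis pointing to the observer's left.
    """
    # Offset from the observer to the other agent in global coordinates.
    dx = other_pos[0] - agent_pos[0]
    dy = other_pos[1] - agent_pos[1]

    # Rotate the offset by -agent_heading to align it with the observer's frame.
    cos_h = math.cos(agent_heading)
    sin_h = math.sin(agent_heading)
    forward = dx * cos_h + dy * sin_h
    left = -dx * sin_h + dy * cos_h
    return forward, left


if __name__ == "__main__":
    # Observer at (1, 1) facing the global +y direction; neighbor at (1, 3).
    print(to_egocentric((1.0, 1.0), math.pi / 2, (1.0, 3.0)))
```

Under these assumptions, the egocentric observation does not change when the whole scene is translated or rotated, which is one plausible reason such observations could be easier to learn from than global coordinates.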