Below is a reward function, together with its design idea and code.
Design Idea: {design_idea}
Code: {reward_function}
We trained an RL policy using the provided reward function code. After every {epoch_freq} epochs, we tracked the values of the individual components of the reward function, as well as global policy metrics such as success rate and episode length, and recorded the maximum, mean, and minimum values encountered:
{trained_results}
Tips for analyzing the training results:
{trained_result_analysis_tip}

Please create a new reward function. It should take a different form, though it may be a modified version of the provided reward function, and it should achieve a higher task score. Try to introduce more novel ideas, and add or remove some of the basic reward components from the environment.