purpose: "Image Observation Unknown Dynamics"
domain: MiniGrid-DoorKey-5x5-v0
date: "2025 July 10"
base_prompt: "You need to solve the MiniGrid DoorKey task.
The state is depicted in an image.
The possible actions are:
1. TURN LEFT
2. TURN RIGHT
3. MOVE FORWARD
4. PICK UP THE KEY
5. UNLOCK THE DOOR"

if_optimal_prompt_cot: "
Is action ACTION the best action you can take? Please think step by step.
Give only the answer, on a new line, in JSON format:
{\"reasoning\": <REASONING>, \"feedback\": <FEEDBACK>}
where <FEEDBACK> is either \"YES\" or \"NO\" and <REASONING> is a string containing your reasoning steps.
"
action_advising_base_prompt_cot: "
Which action do you choose? Please think step by step.
Give only the answer, on a new line, in JSON format:
{\"reasoning\": <REASONING>, \"action\": <ACTION>}
where <ACTION> is one of \"TURN LEFT\", \"TURN RIGHT\", \"MOVE FORWARD\", \"PICK UP THE KEY\" or \"UNLOCK THE DOOR\" and <REASONING> is a string containing your reasoning steps.
"

preference_base_prompt_cot: "
Between ACTION1 and ACTION2, which action is better? Please think step by step.
Give only the answer, on a new line, in JSON format:
{\"reasoning\": <REASONING>, \"preference\": <PREFERENCE>}
where <PREFERENCE> is either \"FIRST\" (the first action listed) or \"SECOND\" (the second action listed) and <REASONING> is a string containing your reasoning steps.
"