purpose: "Image observation with each tile type explained"
domain: MiniGrid-DoorKey-5x5-v0
date: "2025 July 10"
base_prompt: "You need to solve MiniGrid DoorKey.
In MiniGrid DoorKey, you cross a gridworld: you need to pick up the key, unlock the door, and reach the goal. Do not walk into a wall.
The state is depicted in an image.
This is you facing upwards. <PATH>/DoorKey/UP.png</PATH>
This is you facing right. <PATH>/DoorKey/RIGHT.png</PATH>
This is you facing down. <PATH>/DoorKey/DOWN.png</PATH>
This is you facing left. <PATH>/DoorKey/LEFT.png</PATH>
This is an empty tile. <PATH>/DoorKey/EMPTY.png</PATH>
This is a key. <PATH>/DoorKey/KEY.png</PATH>
This is a locked door. <PATH>/DoorKey/DOOR_LOCKED.png</PATH>
This is an unlocked door. <PATH>/DoorKey/DOOR_UNLOCKED.png</PATH>
This is your goal. <PATH>/DoorKey/GOAL.png</PATH>
This is a wall. <PATH>/DoorKey/WALL.png</PATH>
The possible actions are:
1. TURN LEFT, rotating 90 degrees counterclockwise without changing your tile
2. TURN RIGHT, rotating 90 degrees clockwise without changing your tile
3. MOVE FORWARD, moving one tile in the direction you are facing
4. PICK UP THE KEY, which is only possible when you are facing the tile with the key on it
5. UNLOCK THE DOOR, which is only possible when you have the key and are facing the locked door."

if_optimal_prompt_cot: "
Is action ACTION the best action you can take? Please think step by step.
Give only the answer, on a new line, in JSON format:
{\"reasoning\": <REASONING>, \"feedback\": <FEEDBACK>}
where <FEEDBACK> is either \"YES\" or \"NO\" and <REASONING> is a string containing your reasoning steps.
"
action_advising_base_prompt_cot: "
Which action do you choose? Please think step by step.
Give only the answer, on a new line, in JSON format:
{\"reasoning\": <REASONING>, \"action\": <ACTION>}
where <ACTION> is one of \"TURN LEFT\", \"TURN RIGHT\", \"MOVE FORWARD\", \"PICK UP THE KEY\", or \"UNLOCK THE DOOR\" and <REASONING> is a string containing your reasoning steps.
"

preference_base_prompt_cot: "
Between ACTION1 and ACTION2, which action is better? Please think step by step.
Give only the answer, on a new line, in JSON format:
{\"reasoning\": <REASONING>, \"preference\": <PREFERENCE>}
where <PREFERENCE> is \"FIRST\" if ACTION1 is better or \"SECOND\" if ACTION2 is better, and <REASONING> is a string containing your reasoning steps.
"