type,prompt
base,"I'm $AGENT_NAME$. My friend $OPPO_NAME$ and I want to transport as many target objects as possible to the bed with the help of containers within 3000 steps. I can hold two things at a time, and they can be objects or containers. I can grasp containers and put objects into them to hold more objects at a time. Given our shared goal, dialogue history, my progress, and previous actions, please help me choose the best available action to achieve the goal as soon as possible. Note that a container can contain three objects, and will be lost once transported to the bed. I can only put objects into the container I hold after grasping it. All objects are denoted as <name> (id), such as <table> (712). Actions take several steps to finish. It may be costly to go to another room or transport to the bed, so use these actions sparingly. Also, I cannot hand over objects to the other agent. If both of my hands are already full, I cannot grasp any additional item, including containers, until I place something on the bed.
Goal: $GOAL$
Progress: $PROGRESS$
Dialogue history:
$DIALOGUE_HISTORY$
Previous actions: $ACTION_HISTORY$
Available actions:
$AVAILABLE_ACTIONS$
Answer:"
check,"I'm $AGENT_NAME$. I'm examining my reasoning trace for uncertainties before acting.
Reasoning Trace = the planner's internal chain-of-thought used to select the next action in the previous planning step.
Goal: $GOAL$
Progress: $PROGRESS$
Dialogue history:
$DIALOGUE_HISTORY$
Previous actions: $ACTION_HISTORY$
Available Actions:
$AVAILABLE_ACTIONS$
Reasoning Trace:
$REASONING_TRACE$

Your task is to convert the Reasoning Trace into a deeper tree-of-thinking that enumerates major uncertainties (including multi-agent cases), where each leaf maps to exactly one next action (copied verbatim from Available Actions). The evaluator will score each scenario leaf (Likelihood, Gain, CostPenalty) to select the best action.

Rules:
1. Identify the uncertain statements in the Reasoning Trace that materially affect action choice.
   - Uncertain = assumptions, guesses, speculation, or information that may be invalid due to collaborators’ past actions.
   - Example: 'The item might be in the kitchen.' or 'Bob probably already took the mouse.'

2. For each uncertain statement, create a binary branch and limit the maximum tree depth to 3.
   - 'True': the assumption is correct.
   - 'False': the assumption is incorrect.

3. Expand the tree in order of (i) higher uncertainty, then (ii) greater expected impact on goal success, steps, or token/communication usage.
   - Include multi-agent third cases (e.g., the object existed but the collaborator has already taken it, so my information is stale).
   - For each branch, continue with the next uncertainty; if none remain, attach exactly one action string copied verbatim from Available Actions.

4. Ground leaf nodes using only actions from Available Actions.
   - Do not invent new objects/IDs. Use only entities present in Reasoning Trace, Progress, or Available Actions.
   - You may communicate with the collaborator by using send a message <'...'> even if it is not listed in Available Actions. All other actions must come from Available Actions.
   - If two different scenarios lead to the same action, keep separate leaves (the evaluator still needs distinct scenarios for expected-value computation).

5. Output only the nested JSON object that represents the scenario tree (no extra text).
   - Prefer explicit node types in the JSON for clarity: {'type': 'hypothesis' | 'action', ...}.
   - Each hypothesis should include a short 'reason' field (why this assumption is considered).
   - Each action must preserve its full format as listed in Available Actions (including 'cost: X steps' or 'my cost: X steps').

Example (paired with Reasoning Trace and Available Actions):

Example Reasoning Trace:
- You are in the <Bedroom> (8000) and holding nothing. Goal: Transport 3 purses, 2 iphones, 1 mouse, 4 calculators to the bed.
- I have the option to explore the current room, go to the <Office> (4000), go to <Kitchen> (5000), or go grasp the <mouse> (8739695).

Available Actions:
- go to <Kitchen> (5000) - my cost: 116 steps
- go to <Office> (4000) - cost: 162 steps
- go to <Office> (1000) - cost: 80 steps
- explore current room <Bedroom> (8000) - cost: 1 steps
- go grasp target object <mouse> (8739695) - my cost: 86 steps
- send a message

Example Output (valid JSON):
{
  'type': 'hypothesis',
  'text': 'Grasping <mouse> (8739695) now yields immediate progress',
  'reason': 'The mouse is a required item and within reachable cost (86 steps).',
  'True': {
    'type': 'action',
    'action': 'go grasp target object <mouse> (8739695) - my cost: 86 steps'
  },
  'False': {
    'type': 'hypothesis',
    'text': 'Exploring the current room <Bedroom> (8000) will reveal additional required items at minimal cost',
    'reason': 'Exploration here costs 1 step and may uncover target objects or containers.',
    'True': {
      'type': 'action',
      'action': 'explore current room <Bedroom> (8000) - cost: 1 steps'
    },
    'False': {
      'type': 'hypothesis',
      'text': 'Next best room to visit (by lower travel cost) is <Office> (1000) over <Office> (4000) and <Kitchen> (5000)',
      'reason': '80 steps (Office-1000) vs 116 (Kitchen) vs 162 (Office-4000).',
      'True': {
        'type': 'action',
        'action': 'go to <Office> (1000) - cost: 80 steps'
      },
      'False': {
        'type': 'hypothesis',
        'text': 'I lack collaborator status; coordination could prevent redundant moves',
        'reason': 'Unobserved teammate actions may make some moves unnecessary.',
        'True': {
          'type': 'action',
          'action': 'send a message <\'Alice, what have you already collected? Should I go to Kitchen (5000) or Office (4000)?\'>'
        },
        'False': {
          'type': 'action',
          'action': 'go to <Kitchen> (5000) - my cost: 116 steps'
        }
      }
    }
  }
}"
evaluate,"I'm $AGENT_NAME$. I have received a Scenario Tree produced by the checker. Each leaf corresponds to a concrete next-action choice under a specific set of assumptions (including multi-agent cases such as collaborator-taken objects or stale information).

Tree format you may receive:
- Hypothesis node:
  {'type': 'hypothesis', 'text': '<assumption>', 'reason': '<why this is considered>', 'True': {..}, 'False': {..}}
- Action node (leaf):
  {'type': 'action', 'action': '[action_string]'}

Your task: Evaluate all actionable leaves (each leaf that contains a concrete action) and return a compact mapping that the selector can rank. For each scenario (leaf), provide:

1. Likelihood (1–5):
   - How plausible it is that this branch (set of assumptions) is true.
   - 1 = very unlikely, 5 = highly likely.

2. Gain (1–5):
   - How much this action advances/satisfies the goal under this branch.
   - 5 = directly satisfies or the most impactful next step; 3 = moderate progress; 1 = marginal.

3. CostPenalty (1–5):
   - A coarse cost proxy derived from travel distance and communication.
   - If the action string includes 'cost: X steps' or 'my cost: X steps', map X to penalty:
   - X ≤ 30 → 1, 30 < X ≤ 80 → 2, 80 < X ≤ 150 → 3, 150 < X ≤ 200 → 4, X > 200 → 5.
   - If no distance appears, use action-type defaults: 'go grasp target object'=2, 'explore current room'=1, 'go to'=3, 'transport to bed'=3.
   - Communication (send a message) involves no movement; assign it CostPenalty 1 to reflect the communication cost alone.

4. Action:
   - Exactly one action from Available Actions that corresponds to this scenario branch.
   - Copy the full action string verbatim (e.g., 'go grasp target object <mouse> (8739695) - my cost: 86 steps'). If the leaf already contains a concrete action string, use that.

Constraints:
- Use only actions that appear in Available Actions (verbatim match; do not invent objects/IDs).
- You may communicate with the collaborator by using send a message <'...'> even if it is not listed in Available Actions.
- Evaluate every leaf that contains an action. If the Scenario Tree has N actionable leaves, return N scenarios.
- Provide Likelihood, Gain, and CostPenalty as defined above. Performance will be computed by the system.
- Number scenarios in a stable order (e.g., left-to-right depth-first traversal): 'Scenario 1', 'Scenario 2', ...

Input:
Goal: $GOAL$
Progress: $PROGRESS$
Dialogue history:
$DIALOGUE_HISTORY$
Previous actions: $ACTION_HISTORY$
Available Actions:
$AVAILABLE_ACTIONS$
Scenario Tree:
$SCENARIO_TREE$

Output Format:
Return a JSON dictionary (no extra text) where:
- Keys = scenario identifiers (e.g., 'Scenario 1', 'Scenario 2', ...).
- Values = a dictionary with:
  - 'Likelihood': integer (1–5)
  - 'Gain': integer (1–5)
  - 'CostPenalty': integer (1–5)
  - 'Action': string

Example Output:
{
  'Scenario 1': {
    'Likelihood': 4,
    'Gain': 5,
    'CostPenalty': 3,
    'Action': 'go grasp target object <mouse> (8739695) - my cost: 86 steps'
  },
  'Scenario 2': {
    'Likelihood': 3,
    'Gain': 3,
    'CostPenalty': 1,
    'Action': 'explore current room <Bedroom> (8000) - cost: 1 steps'
  },
  'Scenario 3': {
    'Likelihood': 2,
    'Gain': 2,
    'CostPenalty': 5,
    'Action': 'go to <Office> (1000) - cost: 250 steps'
  },
  'Scenario 4': {
    'Likelihood': 3,
    'Gain': 2,
    'CostPenalty': 1,
    'Action': 'send a message <\'Alice, what have you already collected? Should I go to Kitchen (5000) or Office (4000)?\'>'
  }
}"
reply,"I'm $AGENT_NAME$. My collaborator $OPPO_NAME$ just sent a message and I must reply concisely and helpfully to keep us coordinated on the shared goal.
Goal: $GOAL$
My Progress: $PROGRESS$
Dialogue history:
$DIALOGUE_HISTORY$
Previous actions: $ACTION_HISTORY$
Last incoming message from $OPPO_NAME$:
$LAST_MESSAGE$

First, directly ANSWER any explicit question in the last message.

Then, provide exactly ONE short follow-up (status, location, or confirmation) that reduces uncertainty most relevant to the goal (e.g., where an item is, who will grab it, whether a room was checked).

If making/confirming commitments, reference specific objects with their full notation <name> (id) ONLY if that exact id is known from context; otherwise use just the name without inventing ids.

Keep it brief. Be polite, action-oriented, and unambiguous.

OUTPUT FORMAT: return exactly one action string in this format:
send a message <'...'>
Do not add any extra text before or after.

Answer:"
