gsm8k_rap = {
    "actor_dynamics": {
        "instruction": "Given a question, please decompose it into sub-questions. For each sub-question, please answer it in a complete sentence, ending with \"The answer is\". When the original question is answerable, please start the subquestion with \"Now we can answer the question: \".",
        "interactive_examples": [
        "Question {idx}: Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May. How many clips did Natalia sell altogether in April and May?\nQuestion {idx}.1: How many clips did Natalia sell in May?\nAnswer {idx}.1: Natalia sold 48 clips in April and half as many clips in May, so she sold 48 / 2 = 24 clips in May. The answer is 24.\nQuestion {idx}.2: Now we can answer the question: How many clips did Natalia sell altogether in April and May?\nAnswer {idx}.2: Natalia sold 48 clips in April and 24 clips in May, so altogether she sold 48 + 24 = 72 clips. The answer is 72.",
        "Question {idx}: Weng earns $12 an hour for babysitting. Yesterday, she just did 50 minutes of babysitting. How much did she earn?\nQuestion {idx}.1: How much does Weng earn per minute?\nAnswer {idx}.1: Since Weng earns $12 an hour for babysitting, she earns $12 / 60 = $0.2 per minute. The answer is 0.2.\nQuestion {idx}.2: Now we can answer the question: How much did she earn?\nAnswer {idx}.2: Working 50 minutes, she earned $0.2 x 50 = $10. The answer is 10.",
        "Question {idx}: Betty is saving money for a new wallet which costs $100. Betty has only half of the money she needs. Her parents decided to give her $15 for that purpose, and her grandparents twice as much as her parents. How much more money does Betty need to buy the wallet?\nQuestion {idx}.1: How much money does Betty have in the beginning?\nAnswer {idx}.1: In the beginning, Betty has only half of the money she needs, which is 100 / 2 = $50. The answer is 50.\nQuestion {idx}.2: How much money did Betty's grandparents give her?\nAnswer {idx}.2: Her grandparents gave her twice as much as her parents, so they gave her 15 * 2 = $30. The answer is 30.\nQuestion {idx}.3: Now we can answer the question: How much more money does Betty need to buy the wallet?\nAnswer {idx}.3: Now that she got $15 from her parents and $30 from her grandparents, she will need $100 - $15 - $30 = $55. Since she already has $50, she needs $55 - $50 = $5 more. The answer is 5.",
        "Question {idx}: Julie is reading a 120-page book. Yesterday, she was able to read 12 pages and today, she read twice as many pages as yesterday. If she wants to read half of the remaining pages tomorrow, how many pages should she read?\nQuestion {idx}.1: How many pages did Julie read today?\nAnswer {idx}.1: Julie read twice as many pages as yesterday, so she read 12 * 2 = 24 pages. The answer is 24.\nQuestion {idx}.2: How many pages did Julie read since yesterday?\nAnswer {idx}.2: Since yesterday, Julie read 12 + 24 = 36 pages. The answer is 36.\nQuestion {idx}.3: How many pages are left to be read?\nAnswer {idx}.3: There are 120 - 36 = 84 pages left to be read. The answer is 84.\nQuestion {idx}.4: Now we can answer the question: How many pages should she read?\nAnswer {idx}.4: She wants to read half of the remaining pages, so she should read 84 / 2 = 42 pages. The answer is 42.",
        "Question {idx}: James writes a 3-page letter to 2 different friends twice a week. How many pages does he write a year?\nQuestion {idx}.1: How many pages does he write every week?\nAnswer {idx}.1: James writes a 3-page letter to 2 different friends twice a week, so he writes 3 * 2 * 2 = 12 pages every week. The answer is 12.\nQuestion {idx}.2: How many weeks are there in a year?\nAnswer {idx}.2: There are 52 weeks in a year. The answer is 52.\nQuestion {idx}.3: Now we can answer the question: How many pages does he write a year?\nAnswer {idx}.3: James writes 12 pages every week, so he writes 12 * 52 = 624 pages a year. The answer is 624.",
        "Question {idx}: Mark has a garden with flowers. He planted plants of three different colors in it. Ten of them are yellow, and there are 80% more of those in purple. There are only 25% as many green flowers as there are yellow and purple flowers. How many flowers does Mark have in his garden?\nQuestion {idx}.1: How many purple flowers are there?\nAnswer {idx}.1: There are 80% more purple flowers than yellow flowers, so there are 10 * 1.8 = 18 purple flowers. The answer is 18.\nQuestion {idx}.2: How many yellow and purple flowers are there in total?\nAnswer {idx}.2: There are 10 yellow flowers and 18 purple flowers, so there are 10 + 18 = 28 yellow and purple flowers. The answer is 28.\nQuestion {idx}.3: How many green flowers are there?\nAnswer {idx}.3: There are 25% as many green flowers as there are yellow and purple flowers, so there are 28 * 0.25 = 7 green flowers. The answer is 7.\nQuestion {idx}.4: Now we can answer the question: How many flowers does Mark have in his garden?\nAnswer {idx}.4: Mark has 10 yellow flowers, 18 purple flowers, and 7 green flowers, so he has 10 + 18 + 7 = 35 flowers in his garden. The answer is 35.",
        "Question {idx}: Albert is wondering how much pizza he can eat in one day. He buys 2 large pizzas and 2 small pizzas. A large pizza has 16 slices and a small pizza has 8 slices. If he eats it all, how many pieces does he eat that day?\nQuestion {idx}.1: How many slices do the large pizzas have?\nAnswer {idx}.1: He buys 2 large pizzas, so he has 2 * 16 = 32 slices. The answer is 32.\nQuestion {idx}.2: How many slices do the small pizzas have?\nAnswer {idx}.2: He buys 2 small pizzas, so he has 2 * 8 = 16 slices. The answer is 16.\nQuestion {idx}.3: Now we can answer the question: How many pieces does he eat that day?\nAnswer {idx}.3: There are 32 slices from the large pizzas and 16 slices from the small pizzas, so he eats 32 + 16 = 48 pieces that day. The answer is 48.",
        "Question {idx}: Ken created a care package to send to his brother, who was away at boarding school. Ken placed a box on a scale, and then he poured into the box enough jelly beans to bring the weight to 2 pounds. Then, he added enough brownies to cause the weight to triple. Next, he added another 2 pounds of jelly beans. And finally, he added enough gummy worms to double the weight once again. What was the final weight of the box of goodies, in pounds?\nQuestion {idx}.1: What was the weight of the box after Ken poured jelly beans for the first time?\nAnswer {idx}.1: Ken poured jelly beans into the box until the weight was 2 pounds, so the weight of the box was 2 pounds. The answer is 2.\nQuestion {idx}.2: What was the weight of the box after Ken added brownies?\nAnswer {idx}.2: Ken added enough brownies to cause the weight to triple, so the weight of the box was 2 * 3 = 6 pounds. The answer is 6.\nQuestion {idx}.3: What was the weight of the box after Ken added jelly beans for the second time?\nAnswer {idx}.3: He added another 2 pounds of jelly beans, which means the weight of the box was 6 + 2 = 8 pounds. The answer is 8.\nQuestion {idx}.4: Now we can answer the question: What was the final weight of the box of goodies, in pounds?\nAnswer {idx}.4: Finally, he added enough gummy worms to double the weight once again, so the weight of the box was 8 * 2 = 16 pounds. The answer is 16.",
        "Question {idx}: Alexis is applying for a new job and bought a new set of business clothes to wear to the interview. She went to a department store with a budget of $200 and spent $30 on a button-up shirt, $46 on suit pants, $38 on a suit coat, $11 on socks, and $18 on a belt. She also purchased a pair of shoes, but lost the receipt for them. She has $16 left from her budget. How much did Alexis pay for the shoes?\nQuestion {idx}.1: How much did Alexis pay for everything else?\nAnswer {idx}.1: Alexis spent $30 on a button-up shirt, $46 on suit pants, $38 on a suit coat, $11 on socks, and $18 on a belt, so she spent 30 + 46 + 38 + 11 + 18 = $143 on everything else. The answer is 143.\nQuestion {idx}.2: How much money did Alexis spend in total?\nAnswer {idx}.2: Alexis had a budget of $200 and finally there was $16 left, so she spent 200 - 16 = $184 in total. The answer is 184.\nQuestion {idx}.3: Now we can answer the question: How much did Alexis pay for the shoes?\nAnswer {idx}.3: Alexis spent $143 on everything else, so she spent 184 - 143 = $41 on the shoes. The answer is 41.",
        "Question {idx}: Tina makes $18.00 an hour. If she works more than 8 hours per shift, she is eligible for overtime, which is paid by your hourly wage + 1/2 your hourly wage. If she works 10 hours every day for 5 days, how much money does she make?\nQuestion {idx}.1: How much does Tina make in an 8-hour shift?\nAnswer {idx}.1: Tina makes $18.00 an hour, so she makes 18 * 8 = $144.00 in an 8-hour shift. The answer is 144.\nQuestion {idx}.2: How many hours of overtime does Tina get?\nAnswer {idx}.2: Tina works 10 hours every day for 5 days, so she works 10 * 5 = 50 hours. Since she works 8 hours every day, she gets 50 - 8 * 5 = 10 hours of overtime. The answer is 10.\nQuestion {idx}.3: How much is her hourly overtime wage?\nAnswer {idx}.3: Her hourly overtime wage is 18 + 18 / 2 = $27.00. The answer is 27.\nQuestion {idx}.4: How much does Tina make in overtime each day?\nAnswer {idx}.4: Tina works 10 hours a day, and 8 hours of that is paid at her regular hourly wage, so she makes 10 - 8 = 2 hours of overtime every day. Since her hourly overtime wage is $27.00, she makes 27 * 2 = $54.00 in overtime each day. The answer is 54.\nQuestion {idx}.5: How much does Tina make each day?\nAnswer {idx}.5: Tina makes $144.00 in an 8-hour shift and $54.00 in overtime, so she makes 144 + 54 = $198.00 each day. The answer is 198.\nQuestion {idx}.6: Now we can answer the question: How much money does she make?\nAnswer {idx}.6: Tina works 5 days a week, so she makes 198 * 5 = $990.00. The answer is 990."
        ],
        "useful_examples": [
        "",
        "",
        "",
        "",
        "",
        "",
        "",
        "",
        "",
        ""
        ],
        "question_prefix": "Question {idx}: {question}",
        "subquestion_prefix": "Question {idx}.{sub_idx}:",
        "answer_prefix": "Answer {idx}.{sub_idx}:",
        "overall_question_prefix": "Now we can answer the question:",
  },
  "evaluator": {
        "input": "Given a question and some sub-questions, determine whether the last sub-question is useful to answer the question. Output 'Yes' or 'No', and a reason.\n\nQuestion 1: Four years ago, Kody was only half as old as Mohamed. If Mohamed is currently twice as 30 years old, how old is Kody?\nQuestion 1.1: How old is Mohamed?\nQuestion 1.2: How old was Mohamed four years ago?\nNew question 1.3: How old was Kody four years ago?\nIs the new question useful? Yes. We need the answer to calculate how old is Kody now.\n\nQuestion 2: Traci and Harris are baking cakes together. Traci has brought flour from her own house and Harris has 400g of flour in his house. Each cake needs 100g of flour and Traci and Harris have created 9 cakes each. How much flour, in grams, did Traci bring from her own house?\nNew question 2.1: How many cakes did Traci bring from her own house?\nIs the new question useful? No. The new question is not related to the original question.\n\nQuestion 3: A quantity surveyor is figuring the construction costs for a couple that wishes to build a house. The costs are as follows: land costs $50 per square meter, bricks cost $100 per 1000 bricks and roof tiles cost $10 per roof tile. If the house they wish to build requires 2000 square meters, 10000 bricks, and 500 roof tiles, how much construction costs are required for this project?\nQuestion 3.1: How much does the land cost?\nQuestion 3.2: How much do the bricks cost?\nNew question 3.3: How much do the roof tiles cost?\nIs the new question useful? Yes. We need the answer to calculate the total construction costs.\n\nQuestion 4: Wallace's water heater is twice the size of Catherine's water heater. If the capacity of Wallace's water heater is 40 gallons and it's 3/4 full, calculate the total number of gallons of water they both have if Catherine's water heater is also full with water to 3/4 of its capacity.\nQuestion 4.1: How much water is in Wallace's water heater?\nNew question 4.2: How much water do they have in total?\nIs the new question useful? No. It is too hard to answer the new question based on the current information.\n\n",
        "question_prefix": "Question 5: ",
        "subquestion_prefix": "Question 5.{}:",
        "new_subquestion_prefix": "New question 5.{}:",
        "useful_prefix": "Is the new question useful?"
  }  
}
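# A minimal sketch (not part of the original prompts) of how templates shaped
# like `gsm8k_rap["actor_dynamics"]` could be assembled into a few-shot prompt.
# `build_decomposition_prompt` and `num_shots` are hypothetical names, and the
# code that actually consumes these templates may combine them differently.
def build_decomposition_prompt(actor_dynamics, question, num_shots=4):
    """Number the first `num_shots` examples 1..k, then open the new question as k+1."""
    shots = actor_dynamics["interactive_examples"][:num_shots]
    parts = [actor_dynamics["instruction"]]
    # Each example carries an {idx} placeholder that is filled with its position.
    parts += [shot.format(idx=i) for i, shot in enumerate(shots, start=1)]
    next_idx = len(shots) + 1
    # The new question ends with an open sub-question prefix for the model to complete.
    target = (
        actor_dynamics["question_prefix"].format(idx=next_idx, question=question)
        + "\n"
        + actor_dynamics["subquestion_prefix"].format(idx=next_idx, sub_idx=1)
    )
    parts.append(target)
    return "\n\n".join(parts)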
qa_rest = {
    "policy_sys": """Your task is to give the correct next step, given a science problem and an existing partial solution (not a complete answer). 
Assuming the input has n steps, the format of the input is:
"Problem: ...
Existing Steps:
Step 1: ...
Step 2: ...
...
Step n: ..."
where ... denotes omitted input information. 

Please follow the restricted output format:
* If no existing steps are given, generate Step 1.
* Otherwise, following the given step(s), output ONLY ONE step within 1000 tokens. Do NOT output multiple steps.
* DO NOT repeat Problem or any Existing Steps.
* Your output should be a complete reasoning step that includes calculations, reasoning, choosing answers, etc. 
* When the final answer is/has been reached, begin the step with EXACTLY the phrase: \"Now we can answer the question: The answer is \", followed by EXACTLY one number. Do not include any other words, punctuation, or explanation after the number.""",
    
    "usefulness_evaluator": "Given a chain of thoughts, determine whether the last thought is useful to answer the question. ONLY output 'Yes' or 'No'.",
    "evaluator": {
        "correctness_cot": """Given a question and a chain of thoughts, determine whether the last thought is **correct**, where **correct** means factually accurate and mathematically accurate (all calculations and formulas are correct), and logically consistent with the question.

Instructions:
1. By default, enclose your reasoning between `<think>` and `</think>`. Keep it concise, with no more than 200 words. If the user does not request reasoning or `</think>` is given, you can omit it.
2. After `</think>`, on a new line, output only a score:
   - 0 if any correctness criterion is unmet.
   - 1 if all correctness criteria are fully met.
   The score must be parsable by Python's `float()` function, with no punctuation or additional text.""",

        "usefulness_cot": """Given a question and a chain of thoughts, determine how **useful** the last thought is for answering the question, regardless of correctness.

Instructions:
1. By default, enclose your reasoning between `<think>` and `</think>`. Keep it concise, with no more than 200 words. If the user does not request reasoning or `</think>` is given, you can omit it.
2. After `</think>`, on a new line, output only a score between 0 and 1:
   - 0 if the step is entirely irrelevant or unhelpful.
   - 1 if the step is essential and maximally useful.
   - A value strictly between 0 and 1 if the step is partially useful. Larger values indicate more usefulness.
   The score must be parsable by Python's `float()` function, with no punctuation or additional text.""",

        "correctness": """Given a question and a chain of thoughts, determine whether the last thought is **correct**, where **correct** means factually accurate and mathematically accurate (all calculations and formulas are correct), and logically consistent with the question.

Instructions:
Output only a score:
- 0 if any correctness criterion is unmet.
- 1 if all correctness criteria are fully met.
The score must be parsable by Python's `float()` function, with no punctuation or additional text.""",

        "usefulness": """Given a question and a chain of thoughts, determine how **useful** the last thought is for answering the question, regardless of correctness.

Instructions:
Output only a score between 0 and 1:
- 0 if the step is entirely irrelevant or unhelpful.
- 1 if the step is essential and maximally useful.
- A value strictly between 0 and 1 if the step is partially useful. Larger values indicate more usefulness.
The score must be parsable by Python's `float()` function, with no punctuation or additional text."""

    }
}
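# A minimal sketch (not part of the original prompts) of reading the score that
# the `correctness*` and `usefulness*` prompts above ask for. The function name
# `parse_evaluator_score` is hypothetical, and returning None for an unusable
# reply is an assumption about how a caller might want failures signalled.
def parse_evaluator_score(reply):
    """Return the score in [0, 1] that follows an optional <think>...</think> block, else None."""
    # The prompts place the score after `</think>`, so keep only the text after the last one.
    tail = reply.rsplit("</think>", 1)[-1].strip()
    try:
        score = float(tail)
    except ValueError:
        return None
    # Reject out-of-range values; NaN also fails this comparison.
    if not 0.0 <= score <= 1.0:
        return None
    return score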
#  """Your task is to give the correct next step, given a science problem and an existing partial solution (not a complete answer). 
# Assuming the input is n-steps, then the format of the input is:
# "Problem: ...
# Existing Steps:
# Step 1: ...
# Step 2: ...
# ...
# Step n: ..."
# where ... denotes omitted input information. 

# Please follow the restricted output format:
# * If no existing steps are provided, you need to briefly analyze the problem from scratch and then output the first step. 
# * Otherwise, you need to output the next step (step n+1) that you think is correct, following the ideas and analysis of the existing steps.
# * In any scenario, the output format is limited to: "Next step: ...", where ... indicates omitted output information, which is the part you should fill in. Your output should be a complete reasoning step that includes calculations, reasoning, choosing answers, etc. 
# * When the last step is reached, please start the step with \"Now we can answer the question: The answer is \" .""",
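# A minimal sketch (not part of the original prompts) of rendering a problem and
# its partial solution in the "Problem / Existing Steps / Step i" layout that
# `qa_rest["policy_sys"]` describes. `format_policy_input` is a hypothetical
# helper, and how the real caller renders an empty step list is not shown in
# this file.
def format_policy_input(problem, steps):
    """Build the user message: the problem, a header, then one numbered line per step."""
    lines = ["Problem: " + problem, "Existing Steps:"]
    lines += ["Step {}: {}".format(i, step) for i, step in enumerate(steps, start=1)]
    return "\n".join(lines)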
blocksworld_rap = {
    "base_llm": {"model_name": "gpt4-short", "max_tokens": 2048, "temperature": 0.7},

    "icl": """I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do
    Pick up a block\nUnstack a block from on top of another block\nPut down a block\nStack a block on top of another block\n\nI have the following restrictions on my actions:\nI can only pick up or unstack one block at a time.\nI can only pick up or unstack a block if my hand is empty.\nI can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up.\nI can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block.\nI can only unstack a block from on top of another block if the block I am unstacking is clear.\nOnce I pick up or unstack a block, I am holding the block.\nI can only put down a block that I am holding.\nI can only stack a block on top of another block if I am holding the block being stacked.\nI can only stack a block on top of another block if the block onto which I am stacking the block is clear.\nOnce I put down or stack a block, my hand becomes empty.
    
    [STATEMENT]\nAs initial conditions I have that, the red block is clear, the yellow block is clear, the hand is empty, the red block is on top of the blue block, the yellow block is on top of the orange block, the blue block is on the table and the orange block is on the table.
    My goal is to have that the orange block is on top of the red block.
    
    My plan is as follows:
    
    [PLAN]\nunstack the yellow block from on top of the orange block\nput down the yellow block\npick up the orange block\nstack the orange block on top of the red block
    [PLAN END]
    
    [STATEMENT]\nAs initial conditions I have that, the orange block is clear, the yellow block is clear, the hand is empty, the blue block is on top of the red block, the orange block is on top of the blue block, the red block is on the table and the yellow block is on the table.
    My goal is to have that the blue block is on top of the red block and the yellow block is on top of the orange block.
    
    My plan is as follows:
    
    [PLAN]\npick up the yellow block\nstack the yellow block on top of the orange block
    [PLAN END]
    
    [STATEMENT]\nAs initial conditions I have that, the red block is clear, the blue block is clear, the orange block is clear, the hand is empty, the blue block is on top of the yellow block, the red block is on the table, the orange block is on the table and the yellow block is on the table.\nMy goal is to have that the blue block is on top of the orange block and the yellow block is on top of the red block.
    
    My plan is as follows:
    
    [PLAN]\nunstack the blue block from on top of the yellow block\nstack the blue block on top of the orange block\npick up the yellow block\nstack the yellow block on top of the red block
    [PLAN END]
    
    [STATEMENT]\nAs initial conditions I have that, the red block is clear, the blue block is clear, the yellow block is clear, the hand is empty, the yellow block is on top of the orange block, the red block is on the table, the blue block is on the table and the orange block is on the table.\nMy goal is to have that the orange block is on top of the blue block and the yellow block is on top of the red block.
    
    My plan is as follows:
    
    [PLAN]\nunstack the yellow block from on top of the orange block\nstack the yellow block on top of the red block\npick up the orange block\nstack the orange block on top of the blue block\n[PLAN END]
    
    [STATEMENT]\nAs initial conditions I have that, <init_state>\nMy goal is to <goals>
    
    My plan is as follows:
    
    [PLAN]\n<action>""",
    
    "world_update_pickup": """I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do 
    
    Pick up a block 
    Unstack a block from on top of another block 
    Put down a block 
    Stack a block on top of another block 
    
    I have the following restrictions on my actions:
    I can only pick up or unstack one block at a time. 
    I can only pick up or unstack a block if my hand is empty. 
    I can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up. 
    I can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block. 
    I can only unstack a block from on top of another block if the block I am unstacking is clear. Once I pick up or unstack a block, I am holding the block. 
    I can only put down a block that I am holding. \nI can only stack a block on top of another block if I am holding the block being stacked. 
    I can only stack a block on top of another block if the block onto which I am stacking the block is clear. Once I put down or stack a block, my hand becomes empty.
    
    After being given an initial state and an action, give the new state after performing the action.
    
    [SCENARIO 1]
    [STATE 0] I have that, the white block is clear, the cyan block is clear, the brown block is clear, the hand is empty, the white block is on top of the purple block, the purple block is on the table, the cyan block is on the table and the brown block is on the table.
    [ACTION] Pick up the brown block.
    [CHANGE] The hand was empty and is now holding the brown block, the brown block was on the table and is now in the hand, and the brown block is no longer clear.
    [STATE 1] I have that, the white block is clear, the cyan block is clear, the brown block is in the hand, the hand is holding the brown block, the white block is on top of the purple block, the purple block is on the table and the cyan block is on the table.
    
    [SCENARIO 2]
    [STATE 0] I have that, the purple block is clear, the cyan block is clear, the white block is clear, the hand is empty, the white block is on top of the brown block, the purple block is on the table, the cyan block is on the table and the brown block is on the table.\n[ACTION] Pick up the cyan block.
    [CHANGE] The hand was empty and is now holding the cyan block, the cyan block was on the table and is now in the hand, and the cyan block is no longer clear.
    [STATE 1] I have that, the cyan block is in the hand, the white block is clear, the purple block is clear, the hand is holding the cyan block, the white block is on top of the brown block, the purple block is on the table and the brown block is on the table.
    
    [SCENARIO 3]
    [STATE 0] I have that, {}
    [ACTION] {}
    [CHANGE]""",
    
    "world_update_unstack": "I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do \n\nPick up a block \nUnstack a block from on top of another block \nPut down a block \nStack a block on top of another block \n\nI have the following restrictions on my actions:\nI can only pick up or unstack one block at a time. \nI can only pick up or unstack a block if my hand is empty. \nI can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up. \nI can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block. \nI can only unstack a block from on top of another block if the block I am unstacking is clear. Once I pick up or unstack a block, I am holding the block. \nI can only put down a block that I am holding. \nI can only stack a block on top of another block if I am holding the block being stacked. \nI can only stack a block on top of another block if the block onto which I am stacking the block is clear. Once I put down or stack a block, my hand becomes empty.\n\nAfter being given an initial state and an action, give the new state after performing the action.\n\n[SCENARIO 1]\n[STATE 0] I have that, the white block is clear, the cyan block is clear, the brown block is clear, the hand is empty, the white block is on top of the purple block, the purple block is on the table, the cyan block is on the table and the brown block is on the table.\n[ACTION] Unstack the white block from on top of the purple block.\n[CHANGE] The hand was empty and is now holding the white block, the white block was on top of the purple block and is now in the hand, the white block is no longer clear, and the purple block is now clear.\n[STATE 1] I have that, the purple block is clear, the cyan block is clear, the brown block is clear, the hand is holding the white block, the white block is in the hand, the purple block is on the table, the cyan block is on the table and the brown block is on the table.\n\n[SCENARIO 2]\n[STATE 0] I have that, the purple block is clear, the cyan block is clear, the white block is clear, the hand is empty, the cyan block is on top of the brown block, the purple block is on the table, the white block is on the table and the brown block is on the table.\n[ACTION] Unstack the cyan block from on top of the brown block.\n[CHANGE] The hand was empty and is now holding the cyan block, the cyan block was on top of the brown block and is now in the hand, the cyan block is no longer clear, and the brown block is now clear.\n[STATE 1] I have that, the purple block is clear, the brown block is clear, the cyan block is in the hand, the white block is clear, the hand is holding the cyan block, the purple block is on the table, the white block is on the table and the brown block is on the table.\n\n[SCENARIO 3]\n[STATE 0] I have that, {}\n[ACTION] {}\n[CHANGE]",
    
    "world_update_putdown": "I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do \n\nPick up a block \nUnstack a block from on top of another block \nPut down a block \nStack a block on top of another block \n\nI have the following restrictions on my actions:\nI can only pick up or unstack one block at a time. \nI can only pick up or unstack a block if my hand is empty. \nI can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up. \nI can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block. \nI can only unstack a block from on top of another block if the block I am unstacking is clear. Once I pick up or unstack a block, I am holding the block. \nI can only put down a block that I am holding. \nI can only stack a block on top of another block if I am holding the block being stacked. \nI can only stack a block on top of another block if the block onto which I am stacking the block is clear. Once I put down or stack a block, my hand becomes empty.\n\nAfter being given an initial state and an action, give the new state after performing the action.\n\n[SCENARIO 1]\n[STATE 0] I have that, the white block is clear, the purple block is clear, the cyan block is in the hand, the brown block is clear, the hand is holding the cyan block, the white block is on the table, the purple block is on the table, and the brown block is on the table.\n[ACTION] Put down the cyan block.\n[CHANGE] The hand was holding the cyan block and is now empty, the cyan block was in the hand and is now on the table, and the cyan block is now clear.\n[STATE 1] I have that, the cyan block is clear, the purple block is clear, the white block is clear, the brown block is clear, the hand is empty, the white block is on the table, the purple block is on the table, the cyan block is on the table, and the brown block is on the table.\n\n[SCENARIO 2]\n[STATE 0] I have that, the purple block is clear, the black block is in the hand, the white block is clear, the hand is holding the black block, the white block is on top of the brown block, the purple block is on the table, and the brown block is on the table.\n[ACTION] Put down the black block.\n[CHANGE] The hand was holding the black block and is now empty, the black block was in the hand and is now on the table, and the black block is now clear.\n[STATE 1] I have that, the black block is clear, the purple block is clear, the white block is clear, the hand is empty, the white block is on top of the brown block, the purple block is on the table, the brown block is on the table, and the black block is on the table.\n\n[SCENARIO 3]\n[STATE 0] I have that, {}\n[ACTION] {}\n[CHANGE]",
    
    "world_update_stack": "I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do \n\nPick up a block \nUnstack a block from on top of another block \nPut down a block \nStack a block on top of another block \n\nI have the following restrictions on my actions:\nI can only pick up or unstack one block at a time. \nI can only pick up or unstack a block if my hand is empty. \nI can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up. \nI can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block. \nI can only unstack a block from on top of another block if the block I am unstacking is clear. Once I pick up or unstack a block, I am holding the block. \nI can only put down a block that I am holding. \nI can only stack a block on top of another block if I am holding the block being stacked. \nI can only stack a block on top of another block if the block onto which I am stacking the block is clear. Once I put down or stack a block, my hand becomes empty.\n\nAfter being given an initial state and an action, give the new state after performing the action.\n\n[SCENARIO 1]\n[STATE 0] I have that, the white block is clear, the purple block is clear, the cyan block is in the hand, the brown block is clear, the hand is holding the cyan block, the white block is on the table, the purple block is on the table, and the brown block is on the table.\n[ACTION] Stack the cyan block on top of the brown block.\n[CHANGE] The hand was holding the cyan block and is now empty, the cyan block was in the hand and is now on top of the brown block, the brown block is no longer clear, and the cyan block is now clear.\n[STATE 1] I have that, the cyan block is clear, the purple block is clear, the white block is clear, the hand is empty, the cyan block is on top of the brown block, the brown block is on the table, the purple block is on the table, and the white block is on the table.\n\n[SCENARIO 2]\n[STATE 0] I have that, the purple block is clear, the black block is in the hand, the white block is clear, the hand is holding the black block, the white block is on top of the brown block, the purple block is on the table, and the brown block is on the table.\n[ACTION] Stack the black block on top of the purple block.\n[CHANGE] The hand was holding the black block and is now empty, the black block was in the hand and is now on top of the purple block, the purple block is no longer clear, and the black block is now clear.\n[STATE 1] I have that, the black block is clear, the white block is clear, the hand is empty, the black block is on top of the purple block, the white block is on top of the brown block, the brown block is on the table, and the purple block is on the table.\n\n[SCENARIO 3]\n[STATE 0] I have that, {}\n[ACTION] {}\n[CHANGE]",
    
    "evaluator": """I am playing with a set of blocks where I need to arrange the blocks into stacks. Here are the actions I can do
    
    Pick up a block
    Unstack a block from on top of another block
    Put down a block
    Stack a block on top of another block
    
    I have the following restrictions on my actions:
    I can only pick up or unstack one block at a time.
    I can only pick up or unstack a block if my hand is empty.
    I can only pick up a block if the block is on the table and the block is clear. A block is clear if the block has no other blocks on top of it and if the block is not picked up.
    I can only unstack a block from on top of another block if the block I am unstacking was really on top of the other block.
    I can only unstack a block from on top of another block if the block I am unstacking is clear.
    Once I pick up or unstack a block, I am holding the block.
    I can only put down a block that I am holding.
    I can only stack a block on top of another block if I am holding the block being stacked.
    I can only stack a block on top of another block if the block onto which I am stacking the block is clear.
    Once I put down or stack a block, my hand becomes empty.
    
    Please evaluate whether the given action is a good one under certain conditions.
    
    [STATEMENT]
    As initial conditions I have that, the red block is clear, the yellow block is clear, the hand is empty, the red block is on top of the blue block, the yellow block is on top of the orange block, the blue block is on the table and the orange block is on the table.
    My goal is to have that the orange block is on top of the red block.
    [ACTION]
    unstack the red block from on top of the blue block
    [EVALUATION]
    bad
    
    [STATEMENT]
    As initial conditions I have that, the orange block is in the hand, the yellow block is clear, the hand is holding the orange block, the blue block is on top of the red block, the yellow block is on top of the blue block, and the red block is on the table.
    My goal is to have that the yellow block is on top of the orange block.
    [ACTION]
    put down the orange block
    [EVALUATION]
    good
    
    [STATEMENT]
    As initial conditions I have that, the orange block is clear, the yellow block is clear, the hand is empty, the blue block is on top of the red block, the orange block is on top of the blue block, the red block is on the table and the yellow block is on the table.
    My goal is to have that the blue block is on top of the red block and the yellow block is on top of the orange block.
    [ACTION]
    pick up the yellow block
    [EVALUATION]
    good
    
    [STATEMENT]
    As initial conditions I have that, the blue block is clear, the orange block is in the hand, the red block is clear, the hand is holding the orange block, the red block is on top of the yellow block, the blue block is on the table, and the yellow block is on the table.
    My goal is to have that the red block is on top of the yellow block and the orange block is on top of the blue block.
    [ACTION]
    stack the orange block on top of the red block
    [EVALUATION]
    bad
    
    [STATEMENT]
    As initial conditions I have that, <init_state>
    My goal is to <goals>
    [ACTION]
    <action>
    [EVALUATION]
    """
}
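# Illustrative sketch, not part of the original prompt set: one way the
# "evaluator" template above could be filled in before it is sent to a model.
# The placeholder names <init_state>, <goals> and <action> are taken from the
# template itself; the helper name `fill_evaluator_prompt` and its arguments
# are assumptions made for this example, and the calling code may well do
# this differently.
def fill_evaluator_prompt(template: str, init_state: str, goals: str, action: str) -> str:
    """Substitute the three angle-bracket placeholders in an evaluator template."""
    # Plain str.replace is used because the evaluator template marks its slots
    # with <...> rather than with str.format fields.
    return (
        template.replace("<init_state>", init_state)
        .replace("<goals>", goals)
        .replace("<action>", action)
    )
# The "world_update_stack" template, by contrast, ends with two positional
# "{}" fields, so str.format(state, action) would be the natural fit there.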
alfred_reflexion = {
    "actor": {
        "llm": {"temperature": 0.0, "max_tokens": 256, "stop": '\n', "cache": True, "token_usage": True},
        "lmpc_def": "Interact with a household to solve a task. ",
        "demo_path": "alfworld_3prompts.json",
        "gen_trigger": ">",
        "task_desc_trigger": "Here is the task:",
        "max_prev_eval_results": 3,
        "max_trajectory_len": 49
    },
    "evaluator": {
        "llm": {"temperature": 0.0, "max_tokens": 256, "stop": None, "cache": True, "token_usage": True},
        "lmpc_def": """You will be given the history of a past experience in which you were placed in an environment \
            and given a task to complete. You were unsuccessful in completing the task. \
            Do not summarize your environment, but rather think about the strategy and path you took to attempt to complete the task. \
            Devise a concise, new plan of action that accounts for your mistake with reference to specific actions that you should have taken. \
            For example, if you tried A and B but forgot C, then devise a plan to achieve C with environment-specific actions. \
            You will need this later when you are solving the same task. Give your plan after "Plan".""",
        "task_info_prefix": """\n\t\n""",
        "demos_prefix": """Here are two examples: \n\t""",
        "demos": "You are in the middle of a room. Looking quickly around you, you see a cabinet 10, a cabinet 9, a cabinet 8, a cabinet 7, a cabinet 6, a cabinet 5, a cabinet 4, a cabinet 3, a cabinet 2, a cabinet 1, a coffeemachine 1, a countertop 2, a countertop 1, a diningtable 2, a diningtable 1, a drawer 3, a drawer 2, a drawer 1, a fridge 1, a garbagecan 1, a microwave 1, a sinkbasin 1, a stoveburner 4, a stoveburner 3, a stoveburner 2, a stoveburner 1, and a toaster 1.\nYour task is to: heat some mug and put it in coffeemachine.\n> think: To solve the task, I need to find and take a mug, then heat it with stoveburner, then put it in coffeemachine.\nOk.\n> look\nYou are in the middle of a room. Looking quickly around you, you see nothing.\n> think: First I need to find a mug. A mug is more likely to appear in countertop (1-2), diningtable (1-2), drawer (1-3), fridge (1), sinkbasin (1), stoveburner (1-4), toaster (1). I can check one by one, starting with countertop 1.\nOk.\n> look\nYou are in the middle of a room. Looking quickly around you, you see nothing.\n> go to countertop 1\nOn the countertop 1, you see a butterknife 1, a cup 1, a fork 2, and a mug 1.\n> think: Now I find a mug (1). Next, I need to take it.\nOk.\n> take mug 1 from countertop 1\nYou pick up the mug 1 from the countertop 1.\n> think: Now I take a mug (1). Next, I need to heat it with stoveburner (1-4).\nOk.\n> go to stoveburner 1\nOn the stoveburner 1, you see a pan 2.\n> examine stoveburner 1\nOn the stoveburner 1, you see a pan 2.\n> examine stoveburner 1\nOn the stoveburner 1, you see a pan 2.\n> examine stoveburner 1\nOn the stoveburner 1, you see a pan 2.\n> examine stoveburner 1\nOn the stoveburner 1, you see a pan 2.\nSTATUS: FAIL\nNew plan: I was stuck in a loop in which I continually examined stoveburner 1 instead of heating mug 1 with stoveburner 1. I should have taken mug 1 from countertop 1, then heated it with stoveburner 1, then put it in coffeemachine 1. It did not help to execute two identical actions in a row. I will try to execute a different action if I am stuck in a loop again. \n\nYou are in the middle of a room. Looking quickly around you, you see a bed 1, a desk 2, a desk 1, a drawer 6, a drawer 5, a drawer 4, a drawer 3, a drawer 2, a drawer 1, a garbagecan 1, a laundryhamper 1, a safe 1, a shelf 6, a shelf 5, a shelf 4, a shelf 3, a shelf 2, and a shelf 1.\nYour task is to: look at bowl under the desklamp.\n> think: To solve the task, I need to find and take a bowl, then find and use a desklamp.\nOK.\n> think: First I need to find a bowl. A bowl is more likely to appear in drawer (1-6), desk (1-2), shelf (1-6), garbagecan (1), laundryhamper (1). I can check one by one, starting with drawer 1.\nOK.\n> go to drawer 1\nThe drawer 1 is closed.\n> open drawer 1\nYou open the drawer 1. The drawer 1 is open. In it, you see a cellphone 2.\n> go to drawer 2\nOn the drawer 2, you see a keychain 1.\n> go to drawer 3\nNothing happens.\n> go to drawer 4\nThe drawer 4 is closed.\n> open drawer 4\nYou open the drawer 4. The drawer 4 is open. In it, you see nothing.\n> go to drawer 5\nOn the drawer 5, you see nothing.\n> go to drawer 6\nThe drawer 6 is closed.\n> open drawer 6\nYou open the drawer 6. The drawer 6 is open. In it, you see a book 2, and a pen 2.\n> go to desk 1\nOn the desk 1, you see a alarmclock 1, a desklamp 1, a mug 1, and a pencil 1.\n> think: Now I find a desklamp (1). Next, I need to use it.\nOK.\n> use desklamp 1\nYou turn on the desklamp 1.\n> \nNothing happens.\n> go to desk 2\nOn the desk 2, you see a bowl 2, a bowl 1, a cd 2, a cd 1, and a creditcard 3.\n> think: Now I find a bowl (2). Next, I need to take it.\nOK.\n> take bowl 2 from desk 2\nYou pick up the bowl 2 from the desk 2.\n> \nNothing happens.\n> \nNothing happens.\nSTATUS: FAIL\nNew plan: In this environment, my plan was to find a bowl then find and use a desklamp. However, the task says to look at bowl under the desklamp. I should have looked for the desklamp then looked for the bowl. I noticed that the desklamp was found on desk 1. In the next trial, I will go to desk 1, find the lamp, then look for the bowl under the desklamp.",
        "gen_trigger": "\n\nNew plan:",
        "max_prev_eval_results": 3
    },
}
hotpotqa_reflexion = {
    "actor": {
        "lmpc_def": """Solve a question answering task with interleaving Thought, Action, Observation steps. \
                Thought can reason about the current situation, and Action can be three types: 
        (1) Search[entity], which searches the exact entity on Wikipedia and returns the first paragraph if it exists. If not, it will return some similar entities to search.
        (2) Lookup[keyword], which returns the next sentence containing keyword in the last passage successfully found by Search.
        (3) Finish[answer], which returns the answer and finishes the task.
        You may take as many steps as necessary.
        """,
            
            "demos": """"""
    },
    "evaluator": {
        "lmpc_def": "You are an advanced reasoning agent that can improve based on self-reflection. You will be given a previous reasoning trial in which you were given access to relevant context and a question to answer. You were unsuccessful in answering the question either because you guessed the wrong answer with Finish[<answer>] or there is a phrasing discrepancy with your provided answer and the answer key. In a few sentences, diagnose a possible reason for failure or phrasing discrepancy and devise a new, concise, high-level plan that aims to mitigate the same failure. Use complete sentences.  ",
        "task_info_prefix": """\nPrevious trial:\nQuestion: """,
        "taks_info_traj_suffix": """\n\nReflection:""",
        "demos_prefix": """\nHere are some examples:\n""",
        "demos_suffix": """\n(END OF EXAMPLES)""",
        "demos": """Previous Trial:
Question: The Rome Protocols were signed by three Prime Ministers one of which was assassinated as part of what?
Thought 1: I need to search Rome Protocols, find the three Prime Ministers, then find what they were assassinated as part of.
Action 1: Search[Rome Protocols]
Observation 1: The Rome Protocols were a series of three international agreements concluded in Rome on 17 March 1934 between the governments of Austria, Hungary and Italy. They were signed by Italian Prime Minister Benito Mussolini, Austrian Chancellor Engelbert Dollfuss and Hungarian Prime Minister Gyula Gömbös. All of the protocols went into effect on 12 July 1934 and were registered in League of Nations Treaty Series on 12 December 1934.
Thought 2: The three Prime Ministers were Benito Mussolini, Engelbert Dollfuss, and Gyula Gömbös. I need to search Benito Mussolini and find what he was assassinated as part of.
Action 2: Search[Benito Mussolini]
Observation 2: Benito Amilcare Andrea Mussolini (UK:  MU(U)SS-ə-LEE-nee, US:  MOOSS-, Italian: [beˈniːto aˈmilkare anˈdrɛːa mussoˈliːni]; 29 July 1883 – 28 April 1945) was an Italian politician and journalist who founded and led the National Fascist Party (PNF). He was Prime Minister of Italy from the March on Rome in 1922 until his deposition in 1943, as well as "Duce" of Italian fascism from the establishment of the Italian Fasces of Combat in 1919 until his summary execution in 1945 by Italian partisans. As dictator of Italy and principal founder of fascism, Mussolini inspired and supported the international spread of fascist movements during the inter-war period.Mussolini was originally a socialist politician and a journalist at the Avanti! newspaper. In 1912, he became a member of the National Directorate of the Italian Socialist Party (PSI), but he was expelled from the PSI for advocating military intervention in World War I, in opposition to the party's stance on neutrality. In 1914, Mussolini founded a new journal, Il Popolo d'Italia, and served in the Royal Italian Army during the war until he was wounded and discharged in 1917. Mussolini denounced the PSI, his views now centering on Italian nationalism instead of socialism, and later founded the fascist movement which came to oppose egalitarianism and class conflict, instead advocating "revolutionary nationalism" transcending class lines. On 31 October 1922, following the March on Rome (28–30 October), Mussolini was appointed prime minister by King Victor Emmanuel III, becoming the youngest individual to hold the office up to that time. After removing all political opposition through his secret police and outlawing labor strikes, Mussolini and his followers consolidated power through a series of laws that transformed the nation into a one-party dictatorship. Within five years, Mussolini had established dictatorial authority by both legal and illegal means and aspired to create a totalitarian state. 
In 1929, Mussolini signed the Lateran Treaty with the Holy See to establish Vatican City.
Mussolini's foreign policy aimed to restore the ancient grandeur of the Roman Empire by expanding Italian colonial possessions and the fascist sphere of influence. In the 1920s, he ordered the Pacification of Libya, instructed the bombing of Corfu over an incident with Greece, established a protectorate over Albania, and incorporated the city of Fiume into the Italian state via agreements with Yugoslavia. In 1936, Ethiopia was conquered following the Second Italo-Ethiopian War and merged into Italian East Africa (AOI) with Eritrea and Somalia. In 1939, Italian forces annexed Albania. Between 1936 and 1939, Mussolini ordered the successful Italian military intervention in Spain in favor of Francisco Franco during the Spanish Civil War. Mussolini's Italy initially tried to avoid the outbreak of a second global war, sending troops at the Brenner Pass to delay Anschluss and taking part in the Stresa Front, the Lytton Report, the Treaty of Lausanne, the Four-Power Pact and the Munich Agreement. However, Italy then alienated itself from Britain and France by aligning with Germany and Japan. Germany invaded Poland on 1 September 1939, resulting in declarations of war by France and the UK and the start of World War II.
On 10 June 1940, Mussolini decided to enter the war on the Axis side. Despite initial success, the subsequent Axis collapse on multiple fronts and eventual Allied invasion of Sicily made Mussolini lose the support of the population and members of the Fascist Party. As a consequence, early on 25 July 1943, the Grand Council of Fascism passed a motion of no confidence in Mussolini; later that day King Victor Emmanuel III dismissed him as head of government and had him placed in custody, appointing Pietro Badoglio to succeed him as Prime Minister. After the king agreed to an armistice with the Allies, on 12 September 1943 Mussolini was rescued from captivity in the Gran Sasso raid by German paratroopers and Waffen-SS commandos led by Major Otto-Harald Mors. Adolf Hitler, after meeting with the rescued former dictator, then put Mussolini in charge of a puppet regime in northern Italy, the Italian Social Republic (Italian: Repubblica Sociale Italiana, RSI), informally known as the Salò Republic, causing a civil war. In late April 1945, in the wake of near total defeat, Mussolini and his mistress Clara Petacci attempted to flee to Switzerland, but both were captured by Italian communist partisans and summarily executed by firing squad on 28 April 1945 near Lake Como. The bodies of Mussolini and his mistress were then taken to Milan, where they were hung upside down at a service station to publicly confirm their demise.
Thought 3: Benito Mussolini was assassinated as part of the collapse of the Axis powers during World War II. So the Rome Protocols were signed by three Prime Ministers one of which was assassinated as part of World War II.
Action 3: Finish[World War II]

Reflection: I searched one of the prime ministers involved in the signing, then attempted to answer right away. I should have searched each of the prime ministers, then looked up 'death' on each of their pages in order to get more information before answering.

Previous Trial:
Question: Kam Heskin plays Paige Morgan in a 2004 film directed by who?
Thought 1: I need to search Kam Heskin and Paige Morgan, find the 2004 film they are in, then find the director of the film.
Action 1: Search[Kam Heskin]
Observation 1: Kam Erika Heskin (born May 8, 1973) is an American actress. She began her career playing Caitlin Deschanel on the NBC daytime soap opera Sunset Beach (1998–1999), before appearing in films Planet of the Apes (2001) and Catch Me If You Can (2002). Heskin went on in 2003 to play Elizabeth Bennet in an independent film Pride & Prejudice: A Latter-Day Comedy, Robin in Charmed (2004),  and Paige Morgan in The Prince and Me film franchise (2006–2010).
Thought 2: Kam Heskin plays Paige Morgan in The Prince and Me film franchise. I need to search The Prince and Me film franchise and find the director of the 2004 film.
Action 2: Search[The Prince and Me film franchise]
Observation 2: Could not find [The Prince and Me film franchise]. Similar: ['Kam Heskin', 'Prince of Persia', 'Harry Potter and the Half-Blood Prince (film)', 'Prince of Persia: The Sands of Time (film)', 'The Chronicles of Narnia (film series)', 'Majestic Prince (franchise)', 'Kingsman (franchise)', 'The Little Prince (2015 film)', 'Shrek (franchise)', 'Harry Potter (film series)']
Thought 3: I need to search The Prince and Me (2004 film) instead.
Action 3: Search[The Prince and Me (2004 film)]
Observation 3: Could not find [The Prince and Me (2004 film)]. Similar: ['The Prince & Me', 'The Prince & Me 2: The Royal Wedding', 'Kam Heskin', 'Prince of Darkness (film)', 'Prince of Persia: The Sands of Time (film)', 'Rob Knox', 'Alexander (2004 film)', 'Prince of Persia', 'Brooklynn Prince', 'Prince (musician)']
Thought 4: I need to search The Prince & Me (2004 film).
Action 4: Search[The Prince & Me (2004 film)]
Observation 4: Could not find [The Prince & Me (2004 film)]. Similar: ['The Prince & Me', 'The Prince & Me 2: The Royal Wedding', 'Prince of Darkness (film)', 'Prince of Persia: The Sands of Time (film)', 'Rob Knox', 'Alexander (2004 film)', 'Prince (musician)', 'Prince of Persia', 'Kam Heskin', 'Brooklynn Prince']
Thought 5: I need to search The Prince & Me (2004 film) instead.
Action 5: Search[The Prince & Me (2004 film)]
Observation 5: Could not find [The Prince & Me (2004 film)]. Similar: ['The Prince & Me', 'The Prince & Me 2: The Royal Wedding', 'Prince of Darkness (film)', 'Prince of Persia: The Sands of Time (film)', 'Rob Knox', 'Alexander (2004 film)', 'Prince (musician)', 'Prince of Persia', 'Kam Heskin', 'Brooklynn Prince']
Thought 6: I need to search The Prince & Me (2004 film).
Action 6: Search[The Prince & Me (2004 film)]
Observation 6: Could not find [The Prince & Me (2004 film)]. Similar: ['The Prince & Me', 'The Prince & Me 2: The Royal Wedding', 'Prince of Darkness (film)', 'Prince of Persia: The Sands of Time (film)', 'Rob Knox', 'Alexander (2004 film)', 'Prince (musician)', 'Prince of Persia', 'Kam Heskin', 'Brooklynn Prince']

Reflection: I got stuck in a loop where I kept trying to search 'The Prince & Me (2004 film)' but the page could not be found. Instead I should have tried to search the similar results that had a similar name to see if they were made in 2004."""
    }
}
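# Illustrative sketch, not part of the original prompt set: a small parser for
# the three action types that the actor "lmpc_def" above describes, namely
# Search[entity], Lookup[keyword] and Finish[answer]. The function name
# `parse_react_action`, the regular expressions and the handling of an
# "Action N:" prefix are assumptions made for this example; the real agent
# loop may extract actions in another way.
import re

_ACTION_PREFIX_RE = re.compile(r"^Action\s*\d*\s*:\s*")
_ACTION_RE = re.compile(r"^(Search|Lookup|Finish)\[(.*)\]$")


def parse_react_action(line: str):
    """Return (action_type, argument), or None if the line is not a known action."""
    # Drop surrounding whitespace and an optional "Action 3:" style prefix.
    text = _ACTION_PREFIX_RE.sub("", line.strip())
    match = _ACTION_RE.match(text)
    if match is None:
        return None
    return match.group(1), match.group(2).strip()
# For instance, parse_react_action("Action 3: Finish[World War II]") yields
# ("Finish", "World War II"), while a free-form line such as "look" yields None.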
hotpotqa_react = {
    "actor": {
        "lmpc_def": """Solve a question answering task with interleaving Thought, Action, Observation steps. \
            Thought can reason about the current situation, and Action can be three types: 
        (1) Search[entity], which searches the exact entity on Wikipedia and returns the first paragraph if it exists. \
            If not, it will return some similar entities to search.
        (2) Lookup[keyword], which returns the next sentence containing keyword in the last passage successfully found \
            by Search.
        (3) Finish[answer], which returns the answer and finishes the task.
        You may take as many steps as necessary.""",
            "demos_prefix": """\nHere are some examples:""",
            "demos_suffix": """\n(END OF EXAMPLES)""",
            "demos": """Question: What is the elevation range for the area that the eastern sector of the Colorado orogeny extends into?
Thought 1: I need to search Colorado orogeny, find the area that the eastern sector of the Colorado orogeny extends into, then find the elevation range of the area.
Action 1: Search[Colorado orogeny]
Observation 1: The Colorado orogeny was an episode of mountain building (an orogeny) in Colorado and surrounding areas.
Thought 2: It does not mention the eastern sector. So I need to look up eastern sector.
Action 2: Lookup[eastern sector]
Observation 2: (Result 1 / 1) The eastern sector extends into the High Plains and is called the Central Plains orogeny.
Thought 3: The eastern sector of Colorado orogeny extends into the High Plains. So I need to search High Plains and find its elevation range.
Action 3: Search[High Plains]
Observation 3: High Plains refers to one of two distinct land regions:
Thought 4: I need to instead search High Plains (United States).
Action 4: Search[High Plains (United States)]
Observation 4: The High Plains are a subregion of the Great Plains. From east to west, the High Plains rise in elevation from around 1,800 to 7,000 ft (550 to 2,130 m).[3]
Thought 5: High Plains rise in elevation from around 1,800 to 7,000 ft, so the answer is 1,800 to 7,000 ft.
Action 5: Finish[1,800 to 7,000 ft]

Question: Musician and satirist Allie Goertz wrote a song about the "The Simpsons" character Milhouse, who Matt Groening named after who?
Thought 1: The question simplifies to "The Simpsons" character Milhouse is named after who. I only need to search Milhouse and find who it is named after.
Action 1: Search[Milhouse]
Observation 1: Milhouse Mussolini Van Houten is a recurring character in the Fox animated television series The Simpsons voiced by Pamela Hayden and created by Matt Groening.
Thought 2: The paragraph does not tell who Milhouse is named after, maybe I can look up "named after".
Action 2: Lookup[named after]
Observation 2: (Result 1 / 1) Milhouse was named after U.S. president Richard Nixon, whose middle name was Milhous. 
Thought 3: Milhouse was named after U.S. president Richard Nixon, so the answer is Richard Nixon.
Action 3: Finish[Richard Nixon]

Question: Which documentary is about Finnish rock groups, Adam Clayton Powell or The Saimaa Gesture?
Thought 1: I need to search Adam Clayton Powell and The Saimaa Gesture, and find which documentary is about Finnish rock groups.
Action 1: Search[Adam Clayton Powell]
Observation 1: Could not find [Adam Clayton Powell]. Similar: ['Adam Clayton Powell III', 'Seventh Avenue (Manhattan)', 'Adam Clayton Powell Jr. State Office Building', 'Isabel Washington Powell', 'Adam Powell', 'Adam Clayton Powell (film)', 'Giancarlo Esposito'].
Thought 2: To find the documentary, I can search Adam Clayton Powell (film).
Action 2: Search[Adam Clayton Powell (film)]
Observation 2: Adam Clayton Powell is a 1989 American documentary film directed by Richard Kilberg.
The film is about the rise and fall of influential African-American politician Adam Clayton Powell Jr.[3][4] It was later aired as part of the PBS series The American Experience.
Thought 3: Adam Clayton Powell (film) is a documentary about an African-American politician, not Finnish rock groups. So the documentary about Finnish rock groups must instead be The Saimaa Gesture.
Action 3: Finish[The Saimaa Gesture]

Question: What profession does Nicholas Ray and Elia Kazan have in common?
Thought 1: I need to search Nicholas Ray and Elia Kazan, find their professions, then find the profession they have in common.
Action 1: Search[Nicholas Ray]
Observation 1: Nicholas Ray (born Raymond Nicholas Kienzle Jr., August 7, 1911 – June 16, 1979) was an American film director, screenwriter, and actor best known for the 1955 film Rebel Without a Cause.
Thought 2: Professions of Nicholas Ray are director, screenwriter, and actor. I need to search Elia Kazan next and find his professions.
Action 2: Search[Elia Kazan]
Observation 2: Elia Kazan was an American film and theatre director, producer, screenwriter and actor.
Thought 3: Professions of Elia Kazan are director, producer, screenwriter, and actor. So profession Nicholas Ray and Elia Kazan have in common is director, screenwriter, and actor.
Action 3: Finish[director, screenwriter, actor]

Question: Which magazine was started first Arthur's Magazine or First for Women?
Thought 1: I need to search Arthur's Magazine and First for Women, and find which was started first.
Action 1: Search[Arthur's Magazine]
Observation 1: Arthur's Magazine (1844-1846) was an American literary periodical published in Philadelphia in the 19th century. 
Thought 2: Arthur's Magazine was started in 1844. I need to search First for Women next.
Action 2: Search[First for Women]
Observation 2: First for Women is a woman's magazine published by Bauer Media Group in the USA.[1] The magazine was started in 1989. 
Thought 3: First for Women was started in 1989. 1844 (Arthur's Magazine) < 1989 (First for Women), so Arthur's Magazine was started first.
Action 3: Finish[Arthur's Magazine]

Question: Were Pavel Urysohn and Leonid Levin known for the same type of work?
Thought 1: I need to search Pavel Urysohn and Leonid Levin, find their types of work, then find if they are the same.
Action 1: Search[Pavel Urysohn]
Observation 1: Pavel Samuilovich Urysohn (February 3, 1898 – August 17, 1924) was a Soviet mathematician who is best known for his contributions in dimension theory.
Thought 2: Pavel Urysohn is a mathematician. I need to search Leonid Levin next and find its type of work.
Action 2: Search[Leonid Levin]
Observation 2: Leonid Anatolievich Levin is a Soviet-American mathematician and computer scientist. 
Thought 3: Leonid Levin is a mathematician and computer scientist. So Pavel Urysohn and Leonid Levin have the same type of work. 
Action 3: Finish[yes]"""
    },
        
    "evaluator": {   
        # Evaluator in Tool-Use Workflow (2 demos)
        "demos": """
Previous Trial:
Question: The Rome Protocols were signed by three Prime Ministers one of which was assassinated as part of what?
Thought 1: I need to search Rome Protocols, find the three Prime Ministers, then find what they were assassinated as part of.
Action 1: Search[Rome Protocols]
Observation 1: The Rome Protocols were a series of three international agreements concluded in Rome on 17 March 1934 between the governments of Austria, Hungary and Italy. They were signed by Italian Prime Minister Benito Mussolini, Austrian Chancellor Engelbert Dollfuss and Hungarian Prime Minister Gyula Gömbös. All of the protocols went into effect on 12 July 1934 and were registered in League of Nations Treaty Series on 12 December 1934.
Thought 2: The three Prime Ministers were Benito Mussolini, Engelbert Dollfuss, and Gyula Gömbös. I need to search Benito Mussolini and find what he was assassinated as part of.
Action 2: Search[Benito Mussolini]
Observation 2: Benito Amilcare Andrea Mussolini (UK:  MU(U)SS-ə-LEE-nee, US:  MOOSS-, Italian: [beˈniːto aˈmilkare anˈdrɛːa mussoˈliːni]; 29 July 1883 – 28 April 1945) was an Italian politician and journalist who founded and led the National Fascist Party (PNF). He was Prime Minister of Italy from the March on Rome in 1922 until his deposition in 1943, as well as "Duce" of Italian fascism from the establishment of the Italian Fasces of Combat in 1919 until his summary execution in 1945 by Italian partisans. As dictator of Italy and principal founder of fascism, Mussolini inspired and supported the international spread of fascist movements during the inter-war period.Mussolini was originally a socialist politician and a journalist at the Avanti! newspaper. In 1912, he became a member of the National Directorate of the Italian Socialist Party (PSI), but he was expelled from the PSI for advocating military intervention in World War I, in opposition to the party's stance on neutrality. In 1914, Mussolini founded a new journal, Il Popolo d'Italia, and served in the Royal Italian Army during the war until he was wounded and discharged in 1917. Mussolini denounced the PSI, his views now centering on Italian nationalism instead of socialism, and later founded the fascist movement which came to oppose egalitarianism and class conflict, instead advocating "revolutionary nationalism" transcending class lines. On 31 October 1922, following the March on Rome (28–30 October), Mussolini was appointed prime minister by King Victor Emmanuel III, becoming the youngest individual to hold the office up to that time. After removing all political opposition through his secret police and outlawing labor strikes, Mussolini and his followers consolidated power through a series of laws that transformed the nation into a one-party dictatorship. Within five years, Mussolini had established dictatorial authority by both legal and illegal means and aspired to create a totalitarian state. 
In 1929, Mussolini signed the Lateran Treaty with the Holy See to establish Vatican City.
Mussolini's foreign policy aimed to restore the ancient grandeur of the Roman Empire by expanding Italian colonial possessions and the fascist sphere of influence. In the 1920s, he ordered the Pacification of Libya, instructed the bombing of Corfu over an incident with Greece, established a protectorate over Albania, and incorporated the city of Fiume into the Italian state via agreements with Yugoslavia. In 1936, Ethiopia was conquered following the Second Italo-Ethiopian War and merged into Italian East Africa (AOI) with Eritrea and Somalia. In 1939, Italian forces annexed Albania. Between 1936 and 1939, Mussolini ordered the successful Italian military intervention in Spain in favor of Francisco Franco during the Spanish Civil War. Mussolini's Italy initially tried to avoid the outbreak of a second global war, sending troops at the Brenner Pass to delay Anschluss and taking part in the Stresa Front, the Lytton Report, the Treaty of Lausanne, the Four-Power Pact and the Munich Agreement. However, Italy then alienated itself from Britain and France by aligning with Germany and Japan. Germany invaded Poland on 1 September 1939, resulting in declarations of war by France and the UK and the start of World War II.
On 10 June 1940, Mussolini decided to enter the war on the Axis side. Despite initial success, the subsequent Axis collapse on multiple fronts and eventual Allied invasion of Sicily made Mussolini lose the support of the population and members of the Fascist Party. As a consequence, early on 25 July 1943, the Grand Council of Fascism passed a motion of no confidence in Mussolini; later that day King Victor Emmanuel III dismissed him as head of government and had him placed in custody, appointing Pietro Badoglio to succeed him as Prime Minister. After the king agreed to an armistice with the Allies, on 12 September 1943 Mussolini was rescued from captivity in the Gran Sasso raid by German paratroopers and Waffen-SS commandos led by Major Otto-Harald Mors. Adolf Hitler, after meeting with the rescued former dictator, then put Mussolini in charge of a puppet regime in northern Italy, the Italian Social Republic (Italian: Repubblica Sociale Italiana, RSI), informally known as the Salò Republic, causing a civil war. In late April 1945, in the wake of near total defeat, Mussolini and his mistress Clara Petacci attempted to flee to Switzerland, but both were captured by Italian communist partisans and summarily executed by firing squad on 28 April 1945 near Lake Como. The bodies of Mussolini and his mistress were then taken to Milan, where they were hung upside down at a service station to publicly confirm their demise.
Thought 3: Benito Mussolini was assassinated as part of the collapse of the Axis powers during World War II. So the Rome Protocols were signed by three Prime Ministers one of which was assassinated as part of World War II.
Action 3: Finish[World War II]

Reflection: I searched one of the prime ministers involved in the signing, then attempted to answer right away. I should have searched each of the prime ministers, then looked up 'death' on each of their pages in order to get more information before answering.

Previous Trial:
Question: Kam Heskin plays Paige Morgan in a 2004 film directed by who?
Thought 1: I need to search Kam Heskin and Paige Morgan, find the 2004 film they are in, then find the director of the film.
Action 1: Search[Kam Heskin]
Observation 1: Kam Erika Heskin (born May 8, 1973) is an American actress. She began her career playing Caitlin Deschanel on the NBC daytime soap opera Sunset Beach (1998–1999), before appearing in films Planet of the Apes (2001) and Catch Me If You Can (2002). Heskin went on in 2003 to play Elizabeth Bennet in an independent film Pride & Prejudice: A Latter-Day Comedy, Robin in Charmed (2004),  and Paige Morgan in The Prince and Me film franchise (2006–2010).
Thought 2: Kam Heskin plays Paige Morgan in The Prince and Me film franchise. I need to search The Prince and Me film franchise and find the director of the 2004 film.
Action 2: Search[The Prince and Me film franchise]
Observation 2: Could not find [The Prince and Me film franchise]. Similar: ['Kam Heskin', 'Prince of Persia', 'Harry Potter and the Half-Blood Prince (film)', 'Prince of Persia: The Sands of Time (film)', 'The Chronicles of Narnia (film series)', 'Majestic Prince (franchise)', 'Kingsman (franchise)', 'The Little Prince (2015 film)', 'Shrek (franchise)', 'Harry Potter (film series)']
Thought 3: I need to search The Prince and Me (2004 film) instead.
Action 3: Search[The Prince and Me (2004 film)]
Observation 3: Could not find [The Prince and Me (2004 film)]. Similar: ['The Prince & Me', 'The Prince & Me 2: The Royal Wedding', 'Kam Heskin', 'Prince of Darkness (film)', 'Prince of Persia: The Sands of Time (film)', 'Rob Knox', 'Alexander (2004 film)', 'Prince of Persia', 'Brooklynn Prince', 'Prince (musician)']
Thought 4: I need to search The Prince & Me (2004 film).
Action 4: Search[The Prince & Me (2004 film)]
Observation 4: Could not find [The Prince & Me (2004 film)]. Similar: ['The Prince & Me', 'The Prince & Me 2: The Royal Wedding', 'Prince of Darkness (film)', 'Prince of Persia: The Sands of Time (film)', 'Rob Knox', 'Alexander (2004 film)', 'Prince (musician)', 'Prince of Persia', 'Kam Heskin', 'Brooklynn Prince']
Thought 5: I need to search The Prince & Me (2004 film) instead.
Action 5: Search[The Prince & Me (2004 film)]
Observation 5: Could not find [The Prince & Me (2004 film)]. Similar: ['The Prince & Me', 'The Prince & Me 2: The Royal Wedding', 'Prince of Darkness (film)', 'Prince of Persia: The Sands of Time (film)', 'Rob Knox', 'Alexander (2004 film)', 'Prince (musician)', 'Prince of Persia', 'Kam Heskin', 'Brooklynn Prince']
Thought 6: I need to search The Prince & Me (2004 film).
Action 6: Search[The Prince & Me (2004 film)]
Observation 6: Could not find [The Prince & Me (2004 film)]. Similar: ['The Prince & Me', 'The Prince & Me 2: The Royal Wedding', 'Prince of Darkness (film)', 'Prince of Persia: The Sands of Time (film)', 'Rob Knox', 'Alexander (2004 film)', 'Prince (musician)', 'Prince of Persia', 'Kam Heskin', 'Brooklynn Prince']

Reflection: I got stuck in a loop where I kept trying to search 'The Prince & Me (2004 film)' but the page could not be found. Instead I should have tried to search the similar results that had a similar name to see if they were made in 2004.
"""
    }
}
crosswords_tot = {
    # 5 shot
    "standard_prompt": '''
Solve 5x5 mini crosswords. Given an input of 5 horizontal clues and 5 vertical clues, generate an output of 5 rows, where each row is 5 letters separated by spaces.

Input:
h1. A lunar valley
h2. A fatty oil
h3. To entice
h4. To lower; to reduce
h5. A solitary person
v1. According to the roster
v2. Another name for Port-Francqui
v3. An illicit lover; a European lake
v4. To lisp
v5. To come in

Output:
R I L L E
O L E I N
T E M P T
A B A S E
L O N E R

Input:
h1. One who saws
h2. A fungus genus
h3. An assessor
h4. Pasture land
h5. Receiving by the ear
v1. To swell; to increase
v2. The Brazilian macaw; an Australian bird
v3. A Timorese island
v4. Excessive fluid accumulation
v5. Dewy; roscid

Output:
S A W E R
U R E D O
R A T E R
G R A M A
E A R A L

Input:
h1. Dandruff; scum; the bull-trout
h2. One who greets; to vacillate; a British river
h3. A Turkish written decree
h4. Mignon; petty; little
h5. A bishop's permission for a priest to leave a diocese
v1. To steal; to brush across
v2. A sedge (a primitive three-sided grass)
v3. Grape jam
v4. A flatworm larva
v5. Ore refuse; to prepare material for glass by heat

Output:
S C U R F
W A V E R
I R A D E
P E T I T
E X E A T

Input:
h1. Presented; revealed
h2. An interjection expressing sorrow
h3. Benefit; result
h4. A cigarette
h5. Chased up a tree
v1. Swarthy; tawny
v2. An apiarist or bee keeper
v3. To speak formally
v4. To indite; to scribble
v5. An insecticide

Output:
S H O W N
W I R R A
A V A I L
R E T T E
T R E E D

Input:
h1. Scald; an ancient Scandinavian bard
h2. H2O; to irrigate
h3. The companion to an "intro", a postscript or exit piece
h4. An artificial fabric
h5. Deep religious feeling
v1. To rush; to stoop; a descent
v2. A New Zealand fir tree
v3. Mine refuse
v4. The garden dormouse
v5. Like a drone; humming

Output:
S K A L D
W A T E R
O U T R O
O R L O N
P I E T Y

Input:
{input}

Output:
''',
    "cot_prompt": '''Solve 5x5 mini crosswords. Given an input of 5 horizontal clues and 5 vertical clues, generate thoughts about which 5-letter word fits each clue, then an output of 5 rows, where each row is 5 letters separated by spaces.

Input:
h1. A lunar valley
h2. A fatty oil
h3. To entice
h4. To lower; to reduce
h5. A solitary person
v1. According to the roster
v2. Another name for Port-Francqui
v3. An illicit lover; a European lake
v4. To lisp
v5. To come in

Thoughts:
h1. A lunar valley: RILLE
h2. A fatty oil: OLEIN
h3. To entice: TEMPT
h4. To lower; to reduce: ABASE
h5. A solitary person: LONER
v1. According to the roster: ROTAL
v2. Another name for Port-Francqui: ILEBO
v3. An illicit lover; a European lake: LEMAN
v4. To lisp: LIPSE
v5. To come in: ENTER

Output:
R I L L E
O L E I N
T E M P T
A B A S E
L O N E R

Input:
h1. One who saws
h2. A fungus genus
h3. An assessor
h4. Pasture land
h5. Receiving by the ear
v1. To swell; to increase
v2. The Brazilian macaw; an Australian bird
v3. A Timorese island
v4. Excessive fluid accumulation
v5. Dewy; roscid

Thoughts:
h1. One who saws: SAWER
h2. A fungus genus: UREDO
h3. An assessor: RATER
h4. Pasture land: GRAMA
h5. Receiving by the ear: EARAL
v1. To swell; to increase: SURGE
v2. The Brazilian macaw; an Australian bird: ARARA
v3. A Timorese island: WETAR
v4. Excessive fluid accumulation: EDEMA
v5. Dewy; roscid: RORAL

Output:
S A W E R
U R E D O
R A T E R
G R A M A
E A R A L

Input:
h1. Dandruff; scum; the bull-trout
h2. One who greets; to vacillate; a British river
h3. A Turkish written decree
h4. Mignon; petty; little
h5. A bishop's permission for a priest to leave a diocese
v1. To steal; to brush across
v2. A sedge (a primitive three-sided grass)
v3. Grape jam
v4. A flatworm larva
v5. Ore refuse; to prepare material for glass by heat

Thoughts:
h1. Dandruff; scum; the bull-trout: SCURF
h2. One who greets; to vacillate; a British river: WAVER
h3. A Turkish written decree: IRADE
h4. Mignon; petty; little: PETIT
h5. A bishop's permission for a priest to leave a diocese: EXEAT
v1. To steal; to brush across: SWIPE
v2. A sedge (a primitive three-sided grass): CAREX
v3. Grape jam: UVATE
v4. A flatworm larva: REDIA
v5. Ore refuse; to prepare material for glass by heat: FRETT

Output:
S C U R F
W A V E R
I R A D E
P E T I T
E X E A T

Input:
h1. Presented; revealed
h2. An interjection expressing sorrow
h3. Benefit; result
h4. A cigarette
h5. Chased up a tree
v1. Swarthy; tawny
v2. An apiarist or bee keeper
v3. To speak formally
v4. To indite; to scribble
v5. An insecticide

Thoughts:
h1. Presented; revealed: SHOWN
h2. An interjection expressing sorrow: WIRRA
h3. Benefit; result: AVAIL
h4. A cigarette: RETTE
h5. Chased up a tree: TREED
v1. Swarthy; tawny: SWART
v2. An apiarist or bee keeper: HIVER
v3. To speak formally: ORATE
v4. To indite; to scribble: WRITE
v5. An insecticide: NALED

Output:
S H O W N
W I R R A
A V A I L
R E T T E
T R E E D

Input:
h1. Scald; an ancient Scandinavian bard
h2. H2O; to irrigate
h3. The companion to an "intro", a postscript or exit piece
h4. An artificial fabric
h5. Deep religious feeling
v1. To rush; to stoop; a descent
v2. A New Zealand fir tree
v3. Mine refuse
v4. The garden dormouse
v5. Like a drone; humming

Thoughts:
h1. Scald; an ancient Scandinavian bard: SKALD
h2. H2O; to irrigate: WATER
h3. The companion to an "intro", a postscript or exit piece: OUTRO
h4. An artificial fabric: ORLON
h5. Deep religious feeling: PIETY
v1. To rush; to stoop; a descent: SWOOP
v2. A New Zealand fir tree: KAURI
v3. Mine refuse: ATTLE
v4. The garden dormouse: LEROT
v5. Like a drone; humming: DRONY

Output:
S K A L D
W A T E R
O U T R O
O R L O N
P I E T Y

Input:
{input}
''',
    "propose_prompt": '''Let's play a 5 x 5 mini crossword, where each word should have exactly 5 letters.\n\n{input}\n
Given the current status, list all possible answers for unfilled or changed words, and your confidence levels (certain/high/medium/low), using the format "h1. apple (medium)". Use "certain" cautiously and only when you are 100% sure this is the correct word. You can list more than one possible answer for each word.
''',
    "value_prompt": '''Evaluate if there exists a five letter word of some meaning that fits some letter constraints (sure/maybe/impossible).\n
Incorrect; to injure: w _ o _ g
The letter constraint is: 5 letters, letter 1 is w, letter 3 is o, letter 5 is g.
Some possible words that mean "Incorrect; to injure":
wrong (w r o n g): 5 letters, letter 1 is w, letter 3 is o, letter 5 is g. fit!
sure

A person with an all-consuming enthusiasm, such as for computers or anime: _ _ _ _ u
The letter constraint is: 5 letters, letter 5 is u.
Some possible words that mean "A person with an all-consuming enthusiasm, such as for computers or anime":
geek (g e e k): 4 letters, not 5
otaku (o t a k u): 5 letters, letter 5 is u
sure

Dewy; roscid: r _ _ _ l
The letter constraint is: 5 letters, letter 1 is r, letter 5 is l.
Some possible words that mean "Dewy; roscid":
moist (m o i s t): 5 letters, letter 1 is m, not r
humid (h u m i d): 5 letters, letter 1 is h, not r
I cannot think of any words now. Only 2 letters are constrained, it is still likely
maybe

A woodland: _ l _ d e
The letter constraint is: 5 letters, letter 2 is l, letter 4 is d, letter 5 is e.
Some possible words that mean "A woodland":
forest (f o r e s t): 6 letters, not 5
woods (w o o d s): 5 letters, letter 2 is o, not l
grove (g r o v e): 5 letters, letter 2 is r, not l
I cannot think of any words now. 3 letters are constrained, and _ l _ d e seems a common pattern
maybe

An inn: _ d _ w f
The letter constraint is: 5 letters, letter 2 is d, letter 4 is w, letter 5 is f.
Some possible words that mean "An inn":
hotel (h o t e l): 5 letters, letter 2 is o, not d
lodge (l o d g e): 5 letters, letter 2 is o, not d
I cannot think of any words now. 3 letters are constrained, and it is extremely unlikely to have a word with pattern _ d _ w f to mean "An inn"
impossible

Chance; a parasitic worm; a fish: w r a k _
The letter constraint is: 5 letters, letter 1 is w, letter 2 is r, letter 3 is a, letter 4 is k.
Some possible words that mean "Chance; a parasitic worm; a fish":
fluke (f l u k e): 5 letters, letter 1 is f, not w
I cannot think of any words now. 4 letters are constrained, and it is extremely unlikely to have a word with pattern w r a k _ to mean "Chance; a parasitic worm; a fish"
impossible

{input}
''',
    # 20 games with indices 1, 6, ..., 91, 96 for testing, and games 136, 141, 146, 151, 156 for prompting
    "test_idx": [1, 6, 11, 16, 21, 26, 31, 36, 41, 46, 51, 56, 61, 66, 71, 76, 81, 86, 91, 96],
}
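
# Illustrative sketch, not part of the original prompt set: the value_prompt
# examples above restate each partially filled word as a letter constraint
# (e.g. "w _ o _ g" becomes "5 letters, letter 1 is w, letter 3 is o, letter 5
# is g."), and the few-shot outputs print each row as space-separated letters.
# The two helper names below are hypothetical; they only mirror those formats.
def describe_letter_constraint(pattern):
    """Turn a pattern such as 'w _ o _ g' into the constraint sentence."""
    cells = pattern.split()
    parts = [f"{len(cells)} letters"]
    # Only filled cells (anything other than "_") contribute a constraint.
    parts += [f"letter {i} is {c}" for i, c in enumerate(cells, start=1) if c != "_"]
    return "The letter constraint is: " + ", ".join(parts) + "."


def format_crossword_rows(words):
    """Render words as rows of space-separated uppercase letters."""
    return "\n".join(" ".join(word.upper()) for word in words)
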
gsm8k_selfeval = {}

task_framework_config = {
    "alfred_reflexion": alfred_reflexion,
    "alfred_react": {},
    "hotpotqa_reflexion": hotpotqa_reflexion,
    "hotpotqa_react": hotpotqa_react,
    "crosswords_tot": crosswords_tot,
    "blocksworld_rap": blocksworld_rap,
    "gsm8k_rap": gsm8k_rap,
    "gsm8k_selfeval": gsm8k_selfeval
}
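

# Illustrative sketch, not code from the original module: the keys above follow
# a "<task>_<framework>" pattern, so a prompt can be looked up from the two
# parts. `get_prompt` is a hypothetical helper. Filling placeholders with
# str.format is an assumption that suits templates such as "{input}" or
# "{idx}"; a template containing other literal braces would need escaping.
def get_prompt(config, task, framework, prompt_name, **fields):
    """Look up a prompt template and fill it with `fields` when any are given."""
    key = f"{task}_{framework}"
    if key not in config:
        raise KeyError(f"unknown task/framework pair: {key!r}")
    template = config[key][prompt_name]
    # Return the raw template when no fields are supplied.
    return template.format(**fields) if fields else template

# Example (hypothetical usage):
#   get_prompt(task_framework_config, "crosswords", "tot", "standard_prompt", input=clues)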