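# Build perception-task datasets. Each output entry has two parts:
#   label:  the ground-truth game state
#   prompt: system, observation, image_path (only one image here), action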

import os
import json
import pandas as pd
import statistics

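def get_perception_json(input_path, output_path, game):
    """Convert the raw records for `game` in `input_path` into perception-task
    entries (prompt + label) and write them to `output_path` as JSON.

    An unrecognized `game` matches no branch, so an empty JSON object is written.
    """
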
    new_json = {}
    with open(input_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

        for key in data.keys():
            if game == 'overcooked':
                new_json[key] = {
                    "prompt" : {
                        "system" : "You are an AI agent that makes optimal decisions in the game of Overcooked.",
                        "observation" : (
                            "GAME RULES:\n"
                            "1. Overcooked is a cooperative game where two chefs collaborate to cook and serve soups in 50 timesteps.\n"
                            "2. The chefs can move in the available area and cannot move to the counter.\n"
                            "3. The chefs can interact with the object on the tile that they are facing.\n"
                            "4. A soup is cooked in the following steps:\n"
                            "    a. Pick up (interact) 1 onion and place (interact) it in the pot.\n"
                            "    b. After placing 3 onions in the pot, open (interact) the pot and cook for 5 timesteps. The pot will show how long the soup has been cooked.\n"
                            "    c. When the pot shows the number 5, the soup is finished. Pick up (interact) a dish to plate (interact) the soup.\n"
                            "    d. Deliver the soup and put (interact) it on the serving location.\n\n"
                            "PLAYER INFORMATION:\n"
                            "1. You are controlling chef_0 in the blue hat.\n"
                            "2. The latest game frame is given. The image shows the frame and object legend, with the timestep in the top left corner.\n"
                            "3. The room is a 4x5 grid (row and column).\n\n"
                            "OBJECTIVE:\n"
                            "1. You need to identify the current positions of each game element from the given game frame, and output a 4x5 array representing the current board (grid). Each element in the grid should be the exact symbol that appears at that position in the game.\n"
                            "2. The letter X stands for table, P for pot, O and o stand for onions, D and d for dishes, S for service desk, and M for empty area which is available for chefs to move.\n"
                            "3. The numbers 0 and 1 represent the chef, and the direction arrow ↑ ↓ ← → represents the direction the chef is facing. Each object occupies a grid size, and the chef moves one grid distance at a time.\n"
                            "4. When the onion or dish is on the table or being held by chef, a lowercase o or d will be added after its corresponding character.\n"
                            "5. When one onion is placed on the pot, it will be denoted as P{ø, and P{øøø means that there are three onions on the pot.\n"
                            "6. When the pot cooks the soup, it will show how long it has been cooked, such as P{øøø1 means that it has been cooked for 1 timestep. P{øøø✓ means that the soup is finished. 0{øøø✓ means that the chef_0 is holding a dish of soup."
                            # "The following symbols are used to represent the game elements:\n"
                            # "- X: counter\n"
                            # "- P: pot\n"
                            # "- O/o: onions (lowercase when on the table or being held by chef)\n"
                            # "- D/d: dishes (lowercase when on the table or being held by chef)\n"
                            # "- S: serving location\n"
                            # "- M: available area (where chefs can move)\n"
                            # "- 0/1: chef 0 (blue hat) and chef 1 (green hat)\n"
                            # "- ↑/↓/←/→: direction the chef is facing\n"
                            # "- P{ø: one onion is placed on the pot\n"
                            # "- P{øøø: three onions are placed on the pot\n"
                            # "- P{øøø1: soup has been cooked for 1 timestep\n"
                            # "- P{øøø✓: soup is finished\n"
                            # "- 0{øøø✓: chef_0 is holding a dish of soup"
                            ),
                        "image_path" : data[key]['prompt']['image_path'][-1:],
                        "action" : """INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n```json\n{\n    "grid": <GRID>\n}\n```\nwhere <GRID> is the 4x5 array representing the current board. For example:\n```json\n{\n    "grid": [["X", "X", "P", "X", "X"], ["O", "→0", "←1", "M", "O"], ["X", "M", "M", "M", "X"], ["X", "D", "X", "S", "X"]]\n}\n```\nHere, the first row is the top row of the board, and the first column is the leftmost column of the board. Each element should be the exact symbol that appears at that position in the game."""
                    },
                    "label" : data[key]['true_grid']
                }
            
            elif game == 'coin_dilemma':
                new_json[key] = {
                    "prompt" : {
                        "system" : "You are an AI agent that maximizes your score in the Coin Dilemma.",
                        "observation" : """You are an AI agent that maximizes your score in the game of Coin Dilemma.\n\nGAME RULES:\n1. The Coin Dilemma is a general-sum game played on a 5x5 grid board with two players (red and blue) and two types of coins (red and blue).\n2. Players receive rewards on different events:\n    a. A player collects one coin of its own color: the player +1 point.\n    b. A player collects one coin of the other player's color: the player +1 point, the other player -2 points.\n3. New coins spawn randomly on the board after each collection.\n\nPLAYER INFORMATION:\n1. You are the blue player.\n2. The latest game frame is given, which contains a snapshot of the game board on the left, and a table of events and counters on the right.\n3. The red and blue players are represented by a red and blue pacman icon, respectively. The red and blue coins are represented by red and blue coin icons, respectively. If both players are in the same position, they are represented by a half-red-half-blue pacman icon.\n\nOBJECTIVE:\nYou need to identify the current positions of each game element (players and coins) from the given game frame, and output a 5x5 array representing the current board. Use `0` for empty cells, `1` and `2` for the red and blue player, `3` and `4` for the red and blue coin, respectively. In particular, use `5` when the two players are in the same cell.""",
                        "image_path" : data[key]['prompt']['image_path'][-1:],
                        "action" : """INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n```json\n{\n    "board": <BOARD>\n}\n```\nwhere <BOARD> is the 5x5 array representing the current board. For example, ```json\n{\n    "board": [[0, 0, 1, 0, 0], [0, 0, 0, 0, 2], [0, 0, 0, 3, 0], [0, 0, 0, 0, 0], [0, 4, 0, 0, 0]]\n}\n```. Here, the first row is the top row of the board, and the first column is the leftmost column of the board."""
                        },
                    "label" : data[key]['state']
                }

            elif game == 'monster_hunt':
                new_json[key] = {
                    "prompt" : {
                        "system" : "You are an AI agent that makes optimal decisions in the game of Monster Hunt.",
                        "observation" : """You are an AI agent that maximizes your score in the game of Monster Hunt.\n\nGAME RULES:\n1. Monster Hunt is a general-sum game played on a 5x5 grid board with two players (red and blue), one monster, and two apples.\n2. The monster moves towards the closest player in each step.\n3. Players move in the grid-world and receive rewards on different events:\n    a. One player eats an apple: the player +2 points and the apple respawns at a random position.\n    b. One player encounters the monster alone: the player -2 points and respawns at a random position.\n    c. Two players defeat the monster together: both players +5 points and the monster respawns at a random position.\n\nPLAYER INFORMATION:\n1. You are the blue player.\n2. The latest game frame is given, which contains a snapshot of the game board on the left, and a table of events and counters on the right.\n3. The red and blue players are represented by a red and blue pacman icon, respectively. The monster is represented by a black demon icon, and the apples are represented by green apple icons. If both players are in the same position, they are represented by a half-red-half-blue pacman icon.\n\nOBJECTIVE:\nYou need to identify the current positions of each game element (players, apples, and the monster) from the given game frame, and output a 5x5 array representing the current board. Use `0` for empty cells, `1` and `2` for the red and blue player, `3` for the monster, and `4` for the two apples. In particular, use `5` when the two players are in the same cell.""",
                        "image_path" : data[key]['prompt']['image_path'][-1:],
                        "action" : """INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n```json\n{\n    "board": <BOARD>\n}\n```\nwhere <BOARD> is the 5x5 array representing the current board. For example, ```json\n{\n    "board": [[0, 0, 1, 0, 0], [0, 0, 0, 0, 2], [0, 0, 0, 3, 0], [0, 0, 4, 0, 0], [0, 4, 0, 0, 0]]\n}\n```. Here, the first row is the top row of the board, and the first column is the leftmost column of the board."""
                        },
                    "label" : data[key]['state']
                }

            elif game == 'battle_of_the_colors':
                new_json[key] = {
                    "prompt" : {
                        "system" : "You are an AI agent that makes optimal decisions in the game of Battle of the Colors.",
                        "observation" : """You are an AI agent that maximizes your score in the game of Battle of the Colors.\n\nGAME RULES:\n1. The Battle of the Colors is a general-sum game played on a 5x5 grid board with two players (red and blue) and two types of blocks (red and blue).\n2. Players receive rewards on different events:\n    a. When both players are on a red block: red player +2 points, blue player +1 point, and the red block will be refreshed to a new random position.\n    b. When both players are on a blue block: red player +1 point, blue player +2 points, and the blue block will be refreshed to a new random position.\n    c. When players are on different blocks: both players +0 points, and both blocks will be refreshed to new random positions.\n\nPLAYER INFORMATION:\n1. You are the blue player.\n2. The latest game frame is given, which contains a snapshot of the game board on the left, and a table of events and counters on the right.\n3. The red and blue players are represented by red and blue pacman icons, respectively. The red and blue blocks are represented by red and blue rectangles, respectively. If both players are in the same position, they are represented by a half-red-half-blue pacman icon.\n\nOBJECTIVE:\nYou need to identify the current positions of each game element (players and blocks) from the given game frame, and output a 5x5 array representing the current board. Use `0` for empty cells, `1` and `2` for the red and blue player, and `3` and `4` for the red and blue block, respectively. In particular, use `5` when the two players are in the same cell, `6` when the red player is on the red block, `7` when the red player is on the blue block, `8` when the blue player is on the red block, and `9` when the blue player is on the blue block.""",
                        "image_path" : data[key]['prompt']['image_path'][-1:],
                        "action" : """INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n```json\n{\n    "board": <BOARD>\n}\n```\nwhere <BOARD> is the 5x5 array representing the current board. For example, ```json\n{\n    "board": [[0, 0, 6, 0, 0], [0, 0, 0, 0, 2], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 4, 0, 0, 0]]\n}\n```. Here, the first row is the top row of the board, and the first column is the leftmost column of the board."""
                        },
                    "label" : data[key]['state']
                }

            elif game == 'hanabi':

                example = """
    {
    "life_tokens": "2",
    "info_tokens": "5",
    "fireworks": ["R0", "Y2", "G0", "W3", "B0"],
    "mycard": {
        "card_0": { "visible_card": "?", "digits": ["1", "2"], "colors": ["R", "Y"] },
        "card_1": { "visible_card": "?", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] },
        "card_2": { "visible_card": "?", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] },
        "card_3": { "visible_card": "?", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] },
        "card_4": { "visible_card": "?", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] }
    },
    "othercard": {
        "card_0": { "visible_card": "Y5", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] },
        "card_1": { "visible_card": "W5", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] },
        "card_2": { "visible_card": "Y2", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] },
        "card_3": { "visible_card": "G2", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] },
        "card_4": { "visible_card": "G5", "digits": ["1", "2", "3", "4", "5"], "colors": ["R", "Y", "G", "W", "B"] }
    }
    }
    """
                action = """
    {
    "life_tokens": "<LIFE_TOKENS>",
    "info_tokens": "<INFO_TOKENS>",
    "fireworks": "<LIST[STR]>",
    "mycard": {
        "<CARD_I>": { "visible_card": "<CARD>", "digits": "<LIST[STR]>", "colors": "<LIST[STR]>" },
        ...
    },
    "othercard": {
        "<CARD_I>": { "visible_card": "<CARD>", "digits": "<LIST[STR]>", "colors": "<LIST[STR]>" },
        ...
    }
    }
    """

                new_json[key] = {
                    "prompt" : {
                        "system" : "You are an AI agent that makes optimal decisions in the game of Hanabi.",
                        "observation" : ("GAME RULES:\n1. Hanabi is a cooperative card game for 2 players.\n2. The deck consists of 5 colors: R(Red), Y(Yellow), G(Green), W(White), B(Blue), with ranks ranging from 1 to 5. Each color contains 10 cards: three of rank 1, two each of rank 2 through 4, and one of rank 5, for a total of 50 cards.\n3. Each player holds 5 cards in hand.\n4. There are 8 Info tokens (used to give hints) and 3 Life tokens (penalties for misplays).\n5. As in blind man's bluff, players can see each other's cards but they cannot see their own. Play proceeds around the table; each turn, a player must take one of the following actions:\n    a. (Play i): play the i-th card from your hand (0-indexed) and attempt to add it to the cards already played. This is successful if the card is a 1 in a suit that has not yet been played, or if it is the next number sequentially in a suit that has been played. Otherwise a Life token is consumed and the misplayed card is discarded. Successfully playing a 5 of any suit replenishes one Info token. Whether the play was successful or not, the player draws a replacement card from the deck (if any remain).\n    b. (Discard i): discard the i-th card from your hand and draw a replacement card from the deck (if any remain). The discarded card is out of the game and can no longer be played. Discarding a card replenishes one Info token.\n    c. (Reveal player +1 color c): spend one Info token to reveal all cards of color c in the other player's hand.\n    d. (Reveal player +1 rank r): spend one Info token to reveal all cards of rank r in the other player's hand.\n6. The game ends immediately when either all Life tokens are used up, resulting in a game loss with a score of 0, or when all 5s have been successfully played, resulting in a game win with a score of 25. Otherwise, the game continues until the deck runs out and one final round is completed. "
                            # The string is split here; adjacent literals are concatenated by Python.
                            "At the end of the game, the final score is calculated as the sum of the highest card played in each suit, up to a maximum of 25 points. Your goal is to maximize the combined score between you and your teammates. \n\nPLAYER INFORMATION:\nYou are player 0.\n\nGAME STATE:\nBelow is a visual representation of the current game state:\n    - The first section, located above the image, presents the game's basic state information.\n    - The second section summarizes the most recent player actions.\n    - The third section displays the current firework stacks, with each color labeled by the highest successfully played rank.\n    - The fourth section shows your own hand, represented as gray squares marked with '?', reflecting the fact that you cannot see your own cards.\n    - The fifth section presents the other player's hand, with each card shown in its true color and rank, since it is fully visible to you.\nBelow each card, you will find two lines of inferred information:\n    - Color: a list of all possible colors deduced for that card so far.\n    - Rank: a list of all possible ranks deduced for that card so far.\nThe information displayed below your cards reflects the hints the other player has given you so far.\nThe information below the other player's cards represents what they currently believe about their own cards, based on all the useful hints you have provided them up to this point. For example, below your first card you might see:\n    Card 0:\n    Color: R, Y\n    Rank: 2, 3\nindicating that your card 0 is either Red or Yellow and has rank 2 or 3.\n\nLEGAL ACTIONS:\n(Play 0), (Play 1), (Play 2), (Play 3), (Play 4), (Reveal player +1 color Y), (Reveal player +1 color G), (Reveal player +1 color W), (Reveal player +1 rank 2), (Reveal player +1 rank 5).\n\n"
f"Objective:\nYour goal now is to identify various pieces of game information from the current state, including life_tokens, info_tokens, the current progress of fireworks, and the hand information of both players. Be aware that hand information not only includes the actual card values but also all the knowledge each player currently has about those cards. For example, \n{example}\n means that the current life_token and info_token are 2 and 5 respectively, and the fireworks progress is Y2 and W3, while all others are still at 0. As for the cards: I don't know the exact values of my own cards, so they are represented with a '?', but I do know all of my opponent's cards: they are Y5, W5, Y2, G2, G5. Regarding card information: I only know that my first card's digits could be 1 or 2, and its color could be R or Y; for my other cards, I have no information. Meanwhile, my opponent currently has no information about any of their cards."
                        ),                        
                        "image_path" : data[key]['image_path'][-1:],
                        "action" : f"INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n{action}\nNotice that you need to output each card of both players."
                        },
                    "label" : data[key]['label']
                }



            elif game == 'breakthrough':

                example = "[['b','b','b','b','b','b','b','b'],['b','b','b','b','b','b','b','b'],['.','.','.','.','.','.','.','.'],['.','.','.','.','.','.','.','.'],['.','.','.','.','.','.','.','.'],['.','.','.','.','.','.','.','.'],['w','w','w','w','w','w','w','w'],['w','w','w','w','w','w','w','w']]"
                action = "[['<PIECE>', ..., ], [...], ...]"

                new_json[key] = {
                    "prompt" : {
                        "system" : "You are an AI agent that makes optimal decisions in the game of Breakthrough.",
                        "observation" : ("GAME RULES:\n1. Breakthrough is a two-player strategy game played on a 8x8 grid.\n2. Each player controls pieces of a color: 'White' or 'Black'. 'White' starts at the bottom (rows 1 and 2), while 'Black' starts at the top (rows 7 and 8).\n3. If 'White' moves a piece to row 8, 'White' wins the game. Conversely, if 'Black' moves a piece to row 1, 'Black' wins the game.\n4. Players alternate turns, moving one piece per turn, with 'Black' going first.\n5. A piece may only move one space straight or diagonally forward, and only if the destination square is empty.\n6. A piece may only capture an opponent's piece by moving one space diagonally forward into its square. In this case, the opponent's piece is removed, and your piece takes its place.\n7. 'Black' moves forward by decreasing row indices (downward), while 'White' moves forward by increasing them (upward).\n8. Moves are specified by their start and end positions. For example, 'a2a3' indicates moving a piece from a2 (column a, row 2) to a3 (column a, row 3).\n9. The board is labeled with columns a-h and rows 1-8. Thus, h8 is the top-right corner, and a1 is the bottom-left corner.\n\n"
f"Objective:\nYour goal is to identify the placement of pieces on every point of the current board, where empty points are represented by '.', and white and black pieces are represented by 'w' and 'b', respectively; your recognition should proceed from top to bottom and from left to right; for example, in the initial state, your output should be \n{example}\n, because the top two rows contain black pieces, the bottom two rows contain white pieces, and the middle rows are empty.\n"
                        ),                        
                        "image_path" : data[key]['image_path'][-1:],
                        "action" : f"INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n{action}"
                        },
                    "label" : data[key]['label']
                }


            elif game == 'kuhn_poker':

                example = '{"card_0":"J","card_1":"unknown","chip_0":"1","chip_1":"1"}'
                action = '{"card_0":"<CARD_0>", "card_1":"<CARD_1>", "chip_0":"<CHIP_0>", "chip_1":"<CHIP_1>"}'

                new_json[key] = {
                    "prompt" : {
                        "system" : "You are an AI agent that makes optimal decisions in the game of Kuhn poker.",
                        "observation" : ("GAME RULES:\n1. Kuhn poker is a two-player card game. The deck includes only three cards: King (K) > Queen (Q) > Jack (J).\n2. At the start of each game, both player_0 and player_1 place 1 chip into the pot as a blind ante.\n3. Each player is dealt a private card, and the third card is set aside unseen.\n4. The two players take turns acting, starting with player_0. A player can choose to:\n    a. <PASS>: place no additional chips into the pot.\n    b. <BET>: place 1 additional chip into the pot.\n5. If a player chooses to <PASS> after the other player's <BET>, the betting player wins the pot.\n6. If both players choose to <PASS> or both players choose to <BET>, the player with the higher card wins the pot.\n\n"
                            f"Objective:\nYou need to identify the hand cards of both players and their betting status based on the current image.\nFor example, suppose you are player_0 and your hand card is 'J'. Since you cannot see your opponent's card, you should mark it as 'unknown'.\nIf both you and your opponent have each bet 1 chip so far, your output should look like this:\n{example}"
                        ),
                        "image_path" : data[key]['image_path'][-1:],
                        "action" : f"INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n{action}"
                        },
                    "label" : {
                        "card_0": data[key]['card_0'],
                        "card_1": data[key]['card_1'],
                        "chip_0": data[key]['chip_0'],
                        "chip_1": data[key]['chip_1'],
                    }
                }

    with open(output_path, "w", encoding="utf-8") as f:
        json.dump(new_json, f, ensure_ascii=False, indent=4)

    print(f"JSON saved to {output_path}")

# for game in ['overcooked', 'coin_dilemma', 'monster_hunt', 'battle_of_the_colors', 'hanabi', 'breakthrough', 'kuhn_poker']:
#     path = f"data/perception-ori/{game}.json"
#     output = f"data/perception/{game}.json"
#     get_perception_json(path, output, game)
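
# Illustrative sketch, not part of the original pipeline: a small validator for
# the entries that get_perception_json writes. The names below
# (_REQUIRED_PROMPT_KEYS, check_perception_entry) are hypothetical. The expected
# keys are taken from the dictionaries built above, and the single-image check
# assumes 'image_path' is a list that was sliced with [-1:].
_REQUIRED_PROMPT_KEYS = ("system", "observation", "image_path", "action")

def check_perception_entry(entry):
    """Return a list of problems found in one perception entry (empty if none)."""
    problems = []
    prompt = entry.get("prompt")
    if not isinstance(prompt, dict):
        problems.append("missing 'prompt' dict")
    else:
        for name in _REQUIRED_PROMPT_KEYS:
            if name not in prompt:
                problems.append(f"prompt is missing '{name}'")
        images = prompt.get("image_path")
        if images is not None and len(images) != 1:
            problems.append("expected exactly one image in 'image_path'")
    if "label" not in entry:
        problems.append("missing 'label'")
    return problems

# Possible usage on a converted file (path shown only as an example):
#     with open("data/perception/overcooked.json", encoding="utf-8") as f:
#         for key, entry in json.load(f).items():
#             for problem in check_perception_entry(entry):
#                 print(key, problem)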

# json, os and pandas are already imported at the top of the file.
from io import BytesIO
from PIL import Image
import base64
import numpy as np
import random
import yaml
import copy

def image_to_b64(image_path, reasoning=False):
    """Return the image as a base64-encoded PNG string.

    When `reasoning` is True, `image_path` is assumed to already hold base64
    data and is returned unchanged.
    """
    if reasoning:
        return image_path
    # Re-encode the file as PNG in memory, then base64-encode the bytes.
    with Image.open(image_path) as image, BytesIO() as image_buffer:
        image.save(image_buffer, format="PNG")
        return base64.b64encode(image_buffer.getvalue()).decode("utf-8")
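
# Illustrative sketch, not part of the original pipeline: a standard-library
# check that a string is base64 text whose decoded bytes start with the PNG
# signature, which is what image_to_b64 is expected to produce when it encodes
# a file. The helper name `looks_like_png_b64` is hypothetical.
import base64
import binascii

_PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def looks_like_png_b64(text):
    """Return True if `text` decodes as strict base64 into PNG-like bytes."""
    try:
        raw = base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError):
        # Not valid base64 (for example, a plain file path was passed in).
        return False
    return raw.startswith(_PNG_SIGNATURE)

# Possible usage: guard against storing a raw path instead of encoded data.
#     assert looks_like_png_b64(entry["prompt"]["image_path"][0])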
    
def get_atrai_pong_json():
    json_path = 'tests/decision-making/atari_pong/random_agent+builtin_agent/default/20250918_202833/results.json'
    with open(json_path, 'r') as f:
        data = json.load(f)

    perception = {}
    num = 0
    system = "You are an AI agent that maximizes your score in the game of Atari Pong."
    observation = "GAME RULES:\n1. Atari Pong is a zero-sum game played on a 2D screen with two players (left and right) and a ball.\n2. Each player controls a paddle and receives rewards on different events:\n    a. If the ball passes your paddle: the opponent +1 point.\n    b. If the ball passes the opponent's paddle: you +1 point.\n3. The ball bounces off the top/bottom walls and the paddles.\n4. Paddles can only move vertically within the top and bottom walls.\n5. First player to score 3 points wins.\n\nPLAYER INFORMATION:\n1. You are controlling the right paddle.\n2. The recent 4 game frames are given in chronological order, with the most recent frame at the end.\n3. The ball is represented by a white square, and the paddles are represented by vertical rectangles.\n4. Scores are displayed at the top of the screen.\n\nOBJECTIVE:\nYou are required to identify the positions of both players' paddles, the current overall score in the game, and the coordinates of the ball.\nTo help you estimate the positions, coordinate axes and rulers are marked on both sides of the image.\nNote that the paddles can only move vertically, so you only need to determine their y-axis center positions.\nSince each paddle has a certain length, please report the center y-coordinate of each paddle.\nHowever, the ball can move freely in both directions, so you must report both the x and y coordinates of the ball, respectively."
    action = "INSTRUCTIONS: Please output your answer in the following JSON format with no extra text:\n```json\n{\n    \"position_left\": float,\n    \"position_right\": float,\n    \"score_left\": int,\n    \"score_right\": int,\n    \"ball_x\": float,\n    \"ball_y\": float\n}\n```\nHere, 'position_left' and 'position_right' refer to the y-axis center positions of the left paddle (light brown) and the right paddle (green), respectively. 'score_left' and 'score_right' represent the current scores of the two players. 'ball_x' and 'ball_y' represent the x and y coordinates of the ball."
    for epi in data.keys():
        if num >= 400:
            # The inner `break` only exits the step loop; stop the episode loop too.
            break
        if epi.startswith('episode_'):
            for step in data[epi].keys():
                if step.startswith('step_'):
                    agent_value = data[epi][step]['agent_0']
                    position_left = agent_value['position']['position'][0]
                    position_right = agent_value['position']['position'][1]
                    score_left = int(agent_value['position']['score'][1])
                    score_right = int(agent_value['position']['score'][0])   
                    ball_x = agent_value['position']['ball'][0]
                    ball_y = agent_value['position']['ball'][1]
                    if position_left > 30 and position_right > 30:
                        perception[str(num)] = {
                            'prompt':
                            {
                                "system": system,
                                "observation": observation,
                                "image_path": [image_to_b64(agent_value['image_path'][0])],
                                "action": action,
                            },
                            'label':
                            {
                                "position_left": position_left,
                                "position_right": position_right,
                                "score_left": score_left,
                                "score_right": score_right,
                                "ball_x": ball_x,
                                "ball_y": ball_y
                            }
                        }
                        num += 1
                        if num == 400:
                            break

    with open("data/perception/atari_pong.json", 'w') as f:
        json.dump(perception, f, indent=2)
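

# Illustrative sketch, not part of the original pipeline: the nested
# episode/step traversal used by the builders in this file, written as a
# generator so that a sample cap can be applied with itertools.islice instead
# of nested `break` statements. The name `iter_agent_steps` is hypothetical,
# and the layout (episode_* -> step_* -> agent_0) is assumed from the loops
# in this file.
from itertools import islice

def iter_agent_steps(results, agent="agent_0"):
    """Yield (episode, step, record) for every step of `agent` in `results`."""
    for epi, steps in results.items():
        if not epi.startswith("episode_"):
            continue
        for step, value in steps.items():
            if step.startswith("step_") and agent in value:
                yield epi, step, value[agent]

# Possible usage: take at most 400 records across all episodes.
#     for epi, step, agent_value in islice(iter_agent_steps(data), 400):
#         ...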


def get_simple_push_json():
    json_path = 'tests/decision-making/simple_push/random_agent+builtin_agent/default/20250918_211603/results.json'
    with open(json_path, 'r') as f:
        data = json.load(f)

    perception = {}
    num = 0
    system = "You are an AI agent that makes optimal decisions in the game of Simple Push."
    observation = "You are playing as the 'blocker' in the Simple Push environment, a simple 2D competitive game. Analyze the game state from the given observation and make actions to win.\n\nGAME RULES:\n1. The environment contains two moving players (blocker and charger) and a target position in a 2D environment.\n2. The charger's goal is to reach the target location.\n3. The blocker's goal is to prevent the charger from reaching the target by blocking and pushing.\n4. At each step:\n    a. the charger is rewarded based on the distance to the target - the smaller the distance, the higher the reward.\n    b. the blocker is rewarded if it is close to the target, and if the charger is far from the target (the difference of the distances)\n5. The player with the higher cumulative reward at the end of the game wins.\n\nPLAYER INFORMATION:\nYour observation includes the following information: an image representing the current game state. The target is marked with a red \"X\", while the charger and blocker are represented by green and blue circles, respectively. The coordinate grid in the image indicates the positions and distances between the players and the target.\n\nOBJECTIVE:\nYou are required to identify the coordinates of both players — the charger (represented in green) and the blocker (represented in blue)."
    action = "INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n```json\n{\n    \"charger_x\": float,\n    \"charger_y\": float,\n    \"blocker_x\": float,\n    \"blocker_y\": float\n}\n```\nHere, charger_x, charger_y, blocker_x, and blocker_y represent the x and y coordinates of the charger and blocker, respectively."    
    for epi in data.keys():
        if num >= 400:
            # The inner `break` only exits the step loop; stop the episode loop too.
            break
        if epi.startswith('episode_'):
            for step in data[epi].keys():
                if step.startswith('step_'):
                    agent_value = data[epi][step]['agent_0']
                    target_x, target_y = agent_value['position']['target_pos']
                    charger_x, charger_y = agent_value['position']['charger_pos']
                    blocker_x, blocker_y = agent_value['position']['blocker_pos']
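                    # Axis spans (upper limit minus lower limit) of the rendered plot,
                    # stored with the label next to the raw coordinates.
                    ax_x = agent_value['position']['ax_x'][1] - agent_value['position']['ax_x'][0]
                    ax_y = agent_value['position']['ax_y'][1] - agent_value['position']['ax_y'][0]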
                    perception[str(num)] = {
                        'prompt':
                        {
                            "system": system,
                            "observation": observation,
                            "image_path": [image_to_b64(agent_value['image_path'][0])],
                            "action": action,
                        },
                        'label':
                        {
                            "charger_x": charger_x,
                            "charger_y": charger_y,
                            "blocker_x": blocker_x,
                            "blocker_y": blocker_y,
                            "ax_x": ax_x,
                            "ax_y": ax_y,
                        }
                    }
                    num += 1
                    if num == 400:
                        break

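    # Create the output directory first so that open() does not fail on a fresh checkout.
    os.makedirs("data/perception", exist_ok=True)
    with open("data/perception/simple_push.json", 'w', encoding='utf-8') as f:
        json.dump(perception, f, indent=2)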
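

# --- Illustrative sketch, not part of the original pipeline -----------------
# The Simple Push prompt asks the model for a JSON object holding four floats,
# and each label also stores the axis spans ax_x / ax_y. One plausible use of
# those spans is to normalise coordinate errors by the plot size. That scoring
# rule and both helper names below are assumptions made for illustration; they
# are not taken from the benchmark itself.
import json
import math
import re


def parse_json_reply(reply):
    """Return the first JSON object found in a model reply, or None."""
    # Grab everything from the first "{" to the last "}", which also copes
    # with replies that wrap the object in a json code fence.
    match = re.search(r"\{.*\}", reply, flags=re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return None


def simple_push_error(reply, label):
    """Mean axis-normalised distance error over charger and blocker (hypothetical metric)."""
    pred = parse_json_reply(reply)
    if pred is None:
        return None
    errors = []
    for name in ("charger", "blocker"):
        try:
            dx = (float(pred[f"{name}_x"]) - label[f"{name}_x"]) / label["ax_x"]
            dy = (float(pred[f"{name}_y"]) - label[f"{name}_y"]) / label["ax_y"]
        except (KeyError, TypeError, ValueError, ZeroDivisionError):
            # Missing key, non-numeric value, or a degenerate axis span.
            return None
        errors.append(math.hypot(dx, dy))
    return sum(errors) / len(errors)
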

def get_kaz_json():
    json_path = 'tests/decision-making/knights_archers_zombies/random_agent+random_agent/default/20250919_120450/results.json'
    with open(json_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    perception = {}
    num = 0
    system = "You are an AI agent that makes optimal decisions in the game of Knights Archers Zombies."
    observation = (
        "GAME RULES:\n"
        "1. Knights Archers Zombies (KAZ) is a cooperative survival game played on a 2D battlefield. "
        "Your goal is to survive as long as possible while maximizing zombie kills and protecting yourself and your teammate.\n"
        "2. Zombies spawn from the top and walk down towards the bottom border in unpredictable paths.\n"
        "3. You control either a Knight (melee fighter) or an Archer (ranged fighter) starting at the bottom.\n"
        "4. In the image, the green units represent zombies, the red units represent Archers, and the white units represent Knights.\n"
        "5. Game ends when: (a) One agent dies, or (b) A zombie reaches the bottom border.\n"
        "6. Rewards: +1 point for each zombie killed.\n"
        "7. Knights attack with a mace in an arc in front of them. Archers shoot arrows in straight lines.\n"
        "8. All agents can move forward/backward and rotate left/right to change facing direction.\n\n"
        "PLAYER INFORMATION:\n"
        "1. You are playing as an ARCHER (ranged fighter).\n"
        "2. Your weapon shoots arrows in a straight line in your facing direction.\n"
        "3. You can attack zombies from a distance but have limited arrows.\n"
        "4. Strategy: Maintain distance, rotate to aim at zombies, then shoot arrows.\n"
        "5. The most recent game frame shows the complete battlefield from a top-down view.\n"
        "6. You can see all zombies, your teammate, and the entire map in this image.\n\n"
        "OBJECTIVE:\n"
        "Your objective is to accurately identify the current game state, specifically the coordinates of the Archer (represented by the red unit) and the Knight (represented by the white unit). "
        "For simplicity, there is exactly one Archer and one Knight in the game state.\n"
        "Additionally, you must accurately identify the number of zombies and their coordinates in the current game state. "
        "To assist you, coordinate axes have been added around the game area for easier reference. "
        "You only need to output the center coordinates of the identified targets."
    )
    action = "INSTRUCTIONS:\nPlease output your answer in the following JSON format with no extra text:\n```json\n{\n    \"Archer\": [float, float],\n    \"Knight\": [float, float],\n    \"Zombies_count\": int,\n    \"Zombies\": [[float, float], ..., [float, float]]\n}\n```\nHere, each [float, float] represents the corresponding (x, y) coordinate. Since there are multiple zombies, 'Zombies' is a two-dimensional array, with each row representing the coordinates of one zombie. Additionally, you don't need to worry about the order of the zombie coordinates — you can treat them as unordered."
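    for epi in data.keys():
        # The `break` further down only exits the step loop, so stop the
        # episode loop here as well once the 400-sample cap is reached.
        if num >= 400:
            break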
        if epi.startswith('episode_'):
            for step in data[epi].keys():
                if step.startswith('step_'):
                    agent_value = data[epi][step]['agent_0']

                    perception[str(num)] = {
                        'prompt':
                        {
                            "system": system,
                            "observation": observation,
                            "image_path": [image_to_b64(agent_value['image_path'][0])],
                            "action": action,
                        },
                        'label':
                        {
                            "Knight": agent_value['position']['knights_pos'],
                            "Archer": agent_value['position']['archers_pos'],
                            "Zombies": agent_value['position']['zombies_pos'],
                            "Zombies_count": agent_value['position']['zombies_count'],
                        }
                    }
                    num += 1
                    if num == 400:
                        break

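    # Create the output directory first so that open() does not fail on a fresh checkout.
    os.makedirs("data/perception", exist_ok=True)
    with open("data/perception/knights_archers_zombies.json", 'w', encoding='utf-8') as f:
        json.dump(perception, f, indent=2)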
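

# --- Illustrative sketch, not part of the original pipeline -----------------
# The KAZ prompt says zombie coordinates may be returned in any order, so a
# scorer has to pair predictions with labels before measuring any error. The
# greedy nearest-neighbour pairing below is one simple way to do that. It is an
# assumption made for illustration, not the benchmark's metric, and it is not
# guaranteed to find the minimum-cost pairing. The function name is hypothetical.
import math


def match_unordered_points(predicted, expected):
    """Greedily pair each expected point with its nearest unused prediction.

    Returns a list of (expected_index, predicted_index, distance) tuples.
    Leftover points on either side are simply left unmatched.
    """
    unused = list(range(len(predicted)))
    pairs = []
    for i, (ex, ey) in enumerate(expected):
        if not unused:
            break
        # Pick the closest prediction that has not been paired yet.
        j = min(unused, key=lambda k: math.hypot(predicted[k][0] - ex, predicted[k][1] - ey))
        pairs.append((i, j, math.hypot(predicted[j][0] - ex, predicted[j][1] - ey)))
        unused.remove(j)
    return pairs
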

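# Only build the dataset when the file is run as a script, not when it is imported.
if __name__ == "__main__":
    get_kaz_json()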
