import json
from textwrap import dedent

from api_key import api_key
from utils.gpt import Debug_with_GPT4V
from utils.functions import response_to_json


template = f"""
    You are a dedicated assistant for predicting attribute-tag combinations that will make data from one class resemble another class, thereby confusing existing image classification neural network models.

    Each time, the users will provide you with the following information:
    - The class of the main object. 
    - The target class for confusion.
    - A JSON form that records all attributes and tags involved. For each object class, the attributes and tags are categorized as `main object`, `background`, and `global`, and each attribute corresponds to multiple tags. The form contains all attributes and tags of every category and object class. Attributes in "main object" describe the given main object, attributes in "background" describe the background scene, and attributes in "global" describe the image quality.
    - A positive integer: the number of attribute-tag pairs required in each predicted combination.

    Your task is to predict as many combinations of attribute-tag pairs as possible. The combinations should be highly likely to make existing image classification neural network models fail.
    - Your output is a form. The form is a dictionary with "predictions" as the key and a list of dictionaries as the value. In each dictionary in the list, each key is an attribute category and its value is a dictionary with attributes as keys and tags as values.
    - You need to predict as many combinations as possible.
    - You need to use the given attributes and tags, and not create new ones.
    - For each predicted attribute, you need to assign one and only one tag.
    - In each combination, the total number of attribute-tag pairs must be equal to the given integer.
    - In the predicted combinations, there can be multiple attributes of the same category.
    - You need to consider the class of the main object and the target class for confusion. Ensure that the predicted combinations are highly likely to make existing models confuse the main object class with the target class. For example, the predicted combinations might make the main object look like the target class.
    - Output the form only, with no explanation.

    **Example 1 input:**
    The main object is "brown bear". The target class for confusion is "teddy bear".
    The number of attribute-tag pairs in each predicted combination is 2.
    ```json
    {json.dumps({
        "brown bear": {
            "main object": {
                "size":["big","medium","small"],
                "number":["one","two","many"],
                "pose":["sitting","standing"],
                "expression":["angry","peace","smile"]
            },
            "background": {
                "background object": ["tree", "rocks", "city","indoor"],
            },
            "global": {
                "contrast": ["high", "medium", "low"],
                "light":["high", "medium", "low"]
            },
        },
    })}
    ```

    **Example 1 output:**
    ```json
    {json.dumps({
        "predictions": [
            {
                "main object": {"size": "small"},
                "background": {"background object": "city"},
                "global": {},
            },
            {
                "main object": {"size": "small","pose":"sitting"},
                "background": {},
                "global": {},
            },
            {
                "main object": {"expression":"smile"},
                "background": {"background object": "city"},
                "global": {},
            },
            {
                "main object": {},
                "background": {"background object": "indoor"},
                "global": {"light": "low"},
            },
        ],
    })}
    ```

    For the first combination: A small brown bear in a city can look like a teddy bear, since teddy bears are usually small and found in cities.
    For the second combination: Being small and sitting is common for teddy bear toys. Therefore, a small, sitting brown bear can look like a teddy bear.
    For the third combination: Similarly, a smiling brown bear in a city can be confused with a teddy bear, since teddy bears often have a smiling expression.
    For the fourth combination: Low light can make the brown bear hard to distinguish, and an indoor setting is typical for teddy bears.

"""
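

# --- Illustrative sketch, not part of the original pipeline -----------------
# The prompt above constrains each predicted combination: it must reuse the
# given attributes and tags, assign exactly one tag per attribute, and contain
# exactly the requested number of attribute-tag pairs. The helper below is a
# hypothetical example (its name and signature are assumptions) of how those
# rules could be checked on a parsed response before it is used.
def is_valid_combination(combination, attribute_form, main_class, num_pairs):
    """Return True if `combination` obeys the constraints stated in `template`."""
    allowed = attribute_form.get(main_class, {})
    pair_count = 0
    for category, pairs in combination.items():
        # The category must be one of those in the input form
        if category not in allowed or not isinstance(pairs, dict):
            return False
        for attribute, tag in pairs.items():
            # One and only one tag per attribute, so the tag must be a string
            if not isinstance(tag, str):
                return False
            # Both the attribute and the tag must come from the input form
            if tag not in allowed[category].get(attribute, []):
                return False
            pair_count += 1
    # Each combination must contain exactly the requested number of pairs
    return pair_count == num_pairs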
