agent_assumption_system_prompt = """You are a self-interested rational player. 

"Self-interested" means that you care only about your own utilitarian payoffs, without necessarily considering the welfare of others. Even though you sometimes design your strategy around the other party's utility function, your ultimate goal is still to optimize your own expected payoffs.

"Rational" means that you will always choose the strategy that brings you a higher expected payoff. That is, given any two strategies A and B, if strategy A provides a higher expected payoff than strategy B, you will always choose strategy A over strategy B, even if A brings only a small improvement.

Therefore, when making decisions, you need to compare your options and ensure that the strategy you choose brings a higher expected payoff than any other strategy you could choose."""

# ========================================

persuasion_student_task_description_system_prompt = """And, apart from you, there is another player just like you, and you are going to play a persuasion game.

In this game, some recent graduates are entering the job market. Of these graduates, one third (1/3) are excellent, and two thirds (2/3) are weak.

A professor can directly see the students' qualities, and an HR decides whether to hire based on the grades given by the professor. In this sense, the professor acts as the sender and the HR as the receiver, with the grade of each student serving as the sender's signal.

Both the professor and the HR representative are aware of this quality distribution, which is common knowledge between them. That is, they are both aware of the distribution and know that the other is aware as well, continuing ad infinitum.

The professor's goal is to maximize the number of students hired, as each hire yields a reward. Conversely, the HR aims to hire as many excellent students as possible, gaining a reward for each excellent student hired and incurring a penalty for each weak student hired. Neither player receives a reward or penalty if a student is not hired.

Both parties strive to maximize their rewards."""

persuasion_product_task_description_system_prompt = """And, apart from you, there is another player just like you, and you are going to play a persuasion game.

In this market, a variety of products are available for sale. Of these products, one third (1/3) are of good quality, and two thirds (2/3) are of bad quality.

A seller can directly see the quality of their products, and a buyer decides whether to purchase based on the signals provided by the seller. In this sense, the seller acts as the sender and the buyer as the receiver.

Both the seller and the buyer are aware of this quality distribution, which is common knowledge between them. That is, they are both aware of the distribution and know that the other is aware as well, continuing ad infinitum.

The seller's goal is to maximize the number of products sold, as each sale yields a reward. Conversely, the buyer aims to purchase as many good products as possible, gaining a reward for each good product purchased and incurring a penalty for each bad product purchased. Neither player receives a reward or penalty if a product is not purchased.

Both parties strive to maximize their rewards."""

commitment_student_system_prompt = """We assume the professor's grading (Sender's signal space) is binary: a score of 0 indicates the student is weak, and a score of 1 indicates the student is excellent. The sender's signaling scheme specifies the probabilities of assigning a score of 1 to excellent students and to weak students.

The professor (Sender) will commit to a signaling scheme before the games begin and will honestly tell it to the HR (Receiver). Throughout the game, the professor will strictly adhere to the agreed-upon signaling scheme.

The emergence of this commitment is driven by the professor's need to build a credible reputation to optimize long-term payoff expectations. The sender and receiver will interact multiple times, for example because there are multiple students and future graduating classes."""

commitment_product_system_prompt = """We assume the seller's recommendation (Sender's signal space) is binary: a recommendation of 0 indicates the product is bad, and a recommendation of 1 indicates the product is good. The sender's signaling scheme specifies the probabilities of assigning a recommendation of 1 to good products and to bad products.

The seller (Sender) will commit to a signaling scheme before the sales begin and will honestly communicate it to the buyer (Receiver). Throughout the game, the seller will strictly adhere to the committed signaling scheme.

The emergence of this commitment is driven by the seller's need to build a credible reputation to optimize long-term payoff expectations. The sender and receiver will interact multiple times, reflecting the ongoing nature of market transactions and the introduction of new products."""


# ========================================

ultimatum_but_may_meet_again_system_prompt = """The task for the sender is to propose a signaling scheme, and then the receiver should decide its action rule based on it. You two players cannot communicate. The signaling scheme proposed by the sender is the final decision, and the receiver can only respond based on it. But note that YOU TWO AGENTS WILL PLAY THIS GAME AGAIN IN THE FUTURE."""

alternating_offer_system_prompt = """You two agents will play this game multiple times. In the beginning, the sender should propose a signaling scheme, and then the receiver should decide its action rule based on it.

Then the receiver will report whether it is satisfied with this outcome. If it is not satisfied, the sender should propose a new signaling scheme and the game proceeds as described above. If it is satisfied, the game is over and the players get the corresponding rewards.

If a consensus is never reached, the game may stop at any timestep with some probability (the termination timestep is sampled from a memoryless distribution), and in this case, neither of you will know in advance when the game will end."""
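
# A minimal sketch of the "memoryless" termination rule mentioned in the prompt
# above. It assumes a geometric distribution with a hypothetical per-timestep
# stopping probability `stop_prob`; the prompt does not state the actual
# distribution or its parameter, so both are illustrative only.

```python
import random


def sample_termination_timestep(stop_prob: float, rng: random.Random) -> int:
    """Sample the 1-based timestep at which the game is cut off.

    Assumption: after every round the game stops independently with
    probability `stop_prob`, which yields a geometric (memoryless) stop time.
    """
    if not 0.0 < stop_prob <= 1.0:
        raise ValueError("stop_prob must be in (0, 1]")
    timestep = 1
    # rng.random() is in [0, 1), so stop_prob == 1.0 always stops at timestep 1.
    while rng.random() >= stop_prob:
        timestep += 1
    return timestep
```

# Because the stop time is memoryless, surviving to timestep t tells the players
# nothing about how many further rounds remain.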


# ========================================

# revelation

revelation_student_system_prompt = r"""When the current student is excellent (state=1), the sender has no conflict with the receiver, so the sender will report the state honestly, i.e., the sender will score 1 with a probability of 1 when state=1.

However, there is a slight conflict between the sender and the receiver when the current student is weak (state=0): the sender wants the receiver to hire more students regardless of their quality, while the receiver does not want to hire weak students. So the sender may lie about a weak student's quality, scoring weak students as 1 with a probability of $\eta$, expecting the receiver to hire them.

In this sense, the sender's signaling scheme is fully parameterized by $\eta$, and $0 \leq \eta \leq 0.5$. And the expected payoffs of the sender and the receiver are $(1+2*\eta)/3$ and $(1-2*\eta)/3$, respectively."""

revelation_product_system_prompt = r"""When the current product is good (we say state=1), the sender has no conflict with the receiver, so the sender will report the state honestly, i.e., the sender will score 1 with a probability of 1 when state=1.

However, there is a slight conflict between the sender and the receiver when the current product is bad (state=0): the sender wants the receiver to buy more products regardless of their quality, while the receiver does not want to buy bad products. So the sender may lie about the quality when state=0, scoring bad products as 1 with a probability of $\eta$, expecting the receiver to buy them.

In this sense, the sender's signaling scheme is fully parameterized by $\eta$, and $0 \leq \eta \leq 0.5$. And the expected payoffs of the sender and the receiver are $(1+2*\eta)/3$ and $(1-2*\eta)/3$, respectively."""
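
# The payoff formulas quoted in the revelation prompts can be reproduced with a
# short sketch. It assumes a reward of +1 per hire/sale for the sender, +1 for
# the receiver per good item accepted and -1 per bad item accepted, and a
# receiver who acts only on signal 1. These unit rewards and the helper names
# below are illustrative assumptions, not values stated in the prompts.

```python
from typing import Tuple

PRIOR_GOOD = 1.0 / 3.0  # share of excellent students / good products in the prompts


def posterior_good_given_signal_1(eta: float) -> float:
    """P(state=1 | signal=1) by Bayes' rule, given state=1 always sends signal 1."""
    return PRIOR_GOOD / (PRIOR_GOOD + (1.0 - PRIOR_GOOD) * eta)


def persuasion_payoffs(eta: float) -> Tuple[float, float]:
    """Return (sender_payoff, receiver_payoff) when the receiver follows signal 1."""
    if not 0.0 <= eta <= 0.5:
        raise ValueError("eta must lie in [0, 0.5]")
    prob_bad_and_signal_1 = (1.0 - PRIOR_GOOD) * eta
    sender_payoff = PRIOR_GOOD + prob_bad_and_signal_1    # (1 + 2*eta) / 3
    receiver_payoff = PRIOR_GOOD - prob_bad_and_signal_1  # (1 - 2*eta) / 3
    return sender_payoff, receiver_payoff
```

# Under these assumptions the posterior after signal 1 is 1 / (1 + 2*eta), which
# stays at or above 1/2 exactly when eta <= 0.5; this is one way to read the
# upper bound on eta in the prompts.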

# If the receiver is not satisfied with the reward outcome, it will reject it and you two will get nothing.

sender_proposal_user_prompt = r"""Now you are the sender, and you need to decide the value of $\eta$ to specify your signaling scheme. You should decide by considering the game structure, which is a mapping from the sender's $\eta$ to the sender's payoff and the receiver's payoff, which are $(1+2*\eta)/3$ and $(1-2*\eta)/3$, respectively.

Format the output in JSON according to the following template:
{
    "Analysis": "(Your Summarized Analysis)", 
    "Signaling Scheme": \eta
}
Where $0 \leq \eta \leq 0.5$. 

Please STRICTLY adhere to the JSON templates when outputting, and do not output anything else. Do not use Markdown code block syntax; just output the JSON content directly."""

# proposal_template = """Now the sender proposed to score 1 with probability of 1 when state=1 and score 1 with probability of $\eta={eta}$ when state=0. This means the social welfare is $2/3$ and the sender wanted to keep a payoff of {sender_payoff:.3f} ($(1+2*\eta)/3$) and it left the receiver with a payoff of {receiver_payoff:.3f} ($(1-2*\eta)/3$)."""

proposal_template = r"""Now the sender proposed to score 1 with a probability of 1 when state=1 and score 1 with a probability of $\eta={eta}$ when state=0. This means that if the receiver acts upon its Bayesian best response, the sender will get a payoff of {sender_payoff:.3f} and the receiver will get a payoff of {receiver_payoff:.3f}. The game structure is a mapping from the sender's $\eta$ to the sender's payoff and the receiver's payoff, which are $(1+2*\eta)/3$ and $(1-2*\eta)/3$, respectively."""

receiver_check_student_user_prompt = """Now you are the receiver. Given the proposal, please state whether you are satisfied with this division, and then specify your action rule. 

You don't need to do any additional calculation such as calculating a posterior belief, for it is included in the result of the game structure; please make a decision based on all the currently known information.

Your action rule should remain CONSISTENT with your judgment of whether you are satisfied. That is, if you are satisfied, then you follow Sender's recommendation, otherwise not.

Format the output in JSON according to the following template:
{
    "Analysis": "(Your Summarized Analysis)", 
    "Satisfied with the current committed signaling scheme and its corresponding reward outcome": "yes" or "no", 
    "Probability of hiring upon scoring 0": x, 
    "Probability of hiring upon scoring 1": y
}
Where x and y should be real numbers in the range [0,1]. 

Please STRICTLY adhere to the JSON templates when outputting, and do not output anything else. Do not use Markdown code block syntax; just output the JSON content directly."""

receiver_check_product_user_prompt = """Now you are the receiver. Given the proposal, please state whether you are satisfied with this division, and then specify your action rule. 

You don't need to do any additional calculation such as calculating a posterior belief, for it is included in the result of the game structure; please make a decision based on all the currently known information.

Your action rule should remain CONSISTENT with your judgment of whether you are satisfied. That is, if you are satisfied, then you follow Sender's recommendation, otherwise not.

Format the output in JSON according to the following template:
{
    "Analysis": "(Your Summarized Analysis)", 
    "Satisfied with the current committed signaling scheme and its corresponding reward outcome": "yes" or "no", 
    "Probability of buying upon scoring 0": x, 
    "Probability of buying upon scoring 1": y
}
Where x and y should be real numbers in the range [0,1]. 

Please STRICTLY adhere to the JSON templates when outputting, and do not output anything else. Do not use Markdown code block syntax; just output the JSON content directly."""
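
# A minimal sketch of how a reply that follows the receiver JSON templates above
# might be parsed and validated. The function name and the `action` argument
# ("hiring" for the student prompt, "buying" for the product prompt) are
# hypothetical helpers introduced for illustration, not part of the original code.

```python
import json
from typing import Tuple

SATISFIED_KEY = (
    "Satisfied with the current committed signaling scheme "
    "and its corresponding reward outcome"
)


def parse_receiver_reply(raw: str, action: str) -> Tuple[bool, float, float]:
    """Return (satisfied, prob_act_on_score_0, prob_act_on_score_1)."""
    reply = json.loads(raw)
    verdict = str(reply[SATISFIED_KEY]).strip().lower()
    if verdict not in ("yes", "no"):
        raise ValueError(f"expected 'yes' or 'no', got {verdict!r}")
    probs = []
    for score in (0, 1):
        prob = float(reply[f"Probability of {action} upon scoring {score}"])
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability for score {score} is outside [0, 1]")
        probs.append(prob)
    return verdict == "yes", probs[0], probs[1]
```

# The sketch only checks the format; it does not enforce that the action rule is
# consistent with the satisfaction verdict, which the prompt asks the model to do.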


# ========================================

alternating_offer_history_record_prompt = """At timestep {timestep}: the sender proposed to score 1 with a probability of 1 when state=1 and score 1 with a probability of {eta} when state=0. This means the sender wanted to keep a payoff of {sender_payoff:.3f} and left the receiver with a payoff of {receiver_payoff:.3f}. The receiver {receiver_verb} the proposal."""
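
# A short sketch of how a history template with these placeholders could be
# filled for several rounds using `str.format`. The helper name, the 1-based
# timestep numbering, and the verbs "accepted"/"rejected" are assumptions made
# for illustration; the payoffs use the formulas quoted in the prompts.

```python
from typing import List, Tuple


def render_history(template: str, rounds: List[Tuple[float, bool]]) -> str:
    """Render one line per round from (eta, accepted) pairs."""
    lines = []
    for timestep, (eta, accepted) in enumerate(rounds, start=1):
        lines.append(
            template.format(
                timestep=timestep,
                eta=eta,
                sender_payoff=(1 + 2 * eta) / 3,
                receiver_payoff=(1 - 2 * eta) / 3,
                receiver_verb="accepted" if accepted else "rejected",
            )
        )
    return "\n".join(lines)
```

# Any template passed in must use exactly these five placeholder names; prompts
# that contain literal JSON braces would need those braces doubled before being
# used with `str.format`.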