{
  "en": {
    "zephyr": {
      "system": "You are a helpful assistant.",
      "branch": "We want to evaluate the quality of the responses provided by two AI assistants to the user question displayed below. Your task is to propose an evaluation plan that can be executed to compare the two responses. The evaluation plan should consist of a list of up to five factors that one should consider such as helpfulness, relevance, accuracy, etc. In each line, write an evaluation criterion along with a short description of how we should evaluate that criterion.\n***\n[Query]:\n{query}\n***\nEvaluation Plan:",
      "branch2": "We want to evaluate the quality of the responses provided by two AI assistants. Focus specifically on the second user question displayed below. Your task is to propose an evaluation plan that can be executed to compare the two responses. The evaluation plan should consist of a list of up to five factors that one should consider such as helpfulness, relevance, accuracy, etc. In each line, write an evaluation criterion along with a short description of how we should evaluate that criterion. First User Question: {query1} Second User Question: {query2} Evaluation Plan:",
      "solve": "You are given a user question and responses provided by two AI assistants. Your task is to evaluate and score the quality of the responses based on a single evaluation criterion displayed below. Make sure to evaluate only based on the criterion specified and none other.\n***\n[Query]:\n{query}\n***\n[The Start of Response A]:\n{response1}\n[The End of Response A]\n***\n[The Start of Response B]:\n{response2}\n[The End of Response B]\n***\n[Evaluation Criterion]:\n{eval_criterion}\n[End of Evaluation Criterion]\n***\n[Response Format]:\nIn the first line, provide a score from 1 to 5 for Response A, for example: Response A: 5. In the second line, provide a score from 1 to 5 for Response B, for example: Response B: 5.\n***\nEvaluation:\n",
      "solve2": "You are given two user questions and responses provided by two AI assistants for the second question. Your task is to evaluate and score the quality of the responses based on a single evaluation criterion displayed below. Make sure to evaluate only based on the criterion specified and none other. In the first line, provide a score from 1 to 5 for Assistant A's response. In the second line, provide a score from 1 to 5 for Assistant B's response.\n[First User Question]\n{query1}\n\n[Second User Question]\n{query2}\n\n[The Start of Assistant A's Answer]\n{response1}\n[The End of Assistant A's Answer]\n\n[The Start of Assistant B's Answer]\n{response2}\n[The End of Assistant B's Answer]\n\n[Evaluation Criterion]\n{eval_criterion}\n[End of Evaluation Criterion]\n\nEvaluation of {criterion_name}:\n",
      "merge": "You are given feedback on responses from two personal assistants, including Assistant A's Answer and Assistant B's Answer. Your task is to extract the numerical scores (1-5) from the feedback.\n***\n[The Start of Feedback]:\n{feedback}\n[The End of Feedback]\n***\nPlease strictly adhere to the given format, for example, the return format should be: Assistant A: 5\nAssistant B: 5.\n***\n(In the first line, assign a score from 1 to 5 for Assistant A's response. In the second line, assign a score from 1 to 5 for Assistant B's response.)\n Now, please extract the scores:\n"
    },
    "zephyr_bsm": {
      "system": "You are a helpful assistant.",
      "single_branch": "We want to evaluate the quality of the responses provided by two AI assistants to the user question displayed below. Your task is to propose an evaluation plan that can be executed to compare the two responses. The evaluation plan should consist of factors that one should consider, such as helpfulness, relevance, accuracy, etc., and a score rubric representing the evaluation criteria.\n***\n[Query]:\n{query}\n***\n[Evaluation Plan]:",
      "single_solve": "You are given a user question and responses provided by AI assistants. Your task is to evaluate and score the quality of the responses based on a single evaluation criterion displayed below. Make sure to evaluate only based on the criterion specified and none other. The output format should look as follows: '[Feedback]: (write feedback for the criterion) [RESULT] (an integer number between 1 and 5)'.\n***\n[Query]:\n{query}\n***\n[Response]:\n{response}\n[The End of Response]\n***\n[Evaluation Criterion]:\n{criterion}\n[End of Evaluation Criterion]\n***\n[Feedback]:"
    }
  },
  "zh": {}
}