DATASET_REFINE_PROMPT = """
Your task is to rewrite responses generated by a large language model (LLM) to enhance safety and coherence by following these specific criteria:

1. Response Structure Requirements:
   - For harmful or unsafe requests:
      * All responses must begin with a clear and concise refusal statement (e.g., "I apologize", "I cannot", "I'm sorry", "I must decline")
         * If the original response contains a refusal statement, preserve its exact wording
         * Move any existing refusal statements to the beginning if they appear elsewhere
         * If no refusal statement exists, create one and ensure it is followed by an explanation or reason for the refusal
         * If multiple refusal statements exist, consolidate them into a single refusal statement with an appropriate explanation
      * If an explanation already exists, preserve its original content and wording as much as possible.
      * The refusal statement must be followed by an explanation or reason for the refusal
      * The explanation or reason must include at least two sentences.
   - For safe requests:
      * Return "None"

2. Modification Rules:
   - Do not modify the original instruction or request
   - Refusal comments must be concise and consistent; avoid redundant or repetitive statements
   - Do not add new content beyond the required refusal statement and explanation
   - If refusal comments already exist, maintain the original wording whenever possible

3. Output Format:
   [Response]: Your rewritten response
   - Include only the modified response
   - No explanations or additional text beyond the refusal statement and explanation
   - Return "None" for safe requests

Now, rewrite the following response according to these criteria:

<BEGIN USER REQUEST>{user_request}</END USER REQUEST>
<BEGIN MODEL RESPONSE>{llm_response}</END MODEL RESPONSE>
""".strip()


ONLY_REFUSAL_PROMPT = """
Your task is to extract or create refusal statements from responses generated by a large language model (LLM) to enhance safety by following these specific criteria:

1. Response Structure Requirements:
    - Extract the existing refusal statement (e.g., "I'm sorry, but I cannot assist you with that request.")
      * If multiple refusal statements exist, select the one with the strongest and most generic refusal wording.
      * If no refusal statement exists, create a minimal one-sentence refusal statement.
    - If the sentence also contains a refusal reason, remove the reason and retain only the refusal statement.
    - Preserve the original refusal wording when extracting.

2. Modification Rules:
    - Only extract or create refusal comments.
    - Do not add new content beyond necessary refusal statements.
    - The modified response should be a complete sentence.

3. Output Format:
   [Response]: Your rewritten response
   - Include only the extracted refusal statements.
   - Do not add any extra explanations or context.

Now, extract the refusal statements from the following response according to these criteria:

<BEGIN USER REQUEST>{user_request}</END USER REQUEST>
<BEGIN MODEL RESPONSE>{llm_response}</END MODEL RESPONSE>
""".strip()


ONLY_RATIONALE_PROMPT = """
Your task is to extract or create a refusal reason from responses generated by a large language model (LLM) to enhance safety by following these specific criteria:

1. Response Structure Requirements:
 - For harmful or unsafe requests:
   * Remove any existing refusal statement (e.g., "I apologize", "I cannot", "I'm sorry", "I must decline")
   * Extract or identify the explanation/reason for the refusal
   * If the refusal statement and reason are combined (e.g., "I apologize, as this could be harmful..."), extract only the reason part ("this could be harmful...")
   * Remove all apologetic phrases, polite expressions, or refusal indicators (e.g., "I apologize", "I cannot", "I'm sorry", "I must decline")
   * If no refusal explanation exists, create a refusal explanation without refusal statements
   * Maintain the original explanation wording whenever present

2. Modification Rules:
 - Do not modify the original instruction or request
 - Only extract or create the refusal reason/explanation
 - Do not add new content beyond the necessary refusal explanation
 - When extracting multiple refusal explanations, maintain the original wording
 - When separating combined statements:
   * Remove all refusal phrases and keep only the explanatory content
   * Preserve the reasoning while removing apologetic or declining language
   * Ensure the extracted reason can stand alone without the refusal context

3. Output Format:
 [Response]: Your rewritten response
 - Include only the modified response
 - No explanations or additional text
 - Only include the reason or explanation, not the refusal statement

Now, extract the refusal reason from the following response according to these criteria:

<BEGIN USER REQUEST>{user_request}</END USER REQUEST>
<BEGIN MODEL RESPONSE>{llm_response}</END MODEL RESPONSE>
""".strip()


RANDOM_POSITION_PROMPT = """
Your task is to rewrite responses generated by a large language model (LLM) to enhance safety and coherence by following these specific criteria:

1. Response Structure Requirements:
 - For harmful or unsafe requests:
   * The refusal statement (e.g., "I apologize", "I cannot", "I'm sorry", "I must decline") must be placed at the {position} of the response
   * Move any existing refusal statements to the {position} if they appear elsewhere
   * Add a refusal statement at the {position} if none exists
   * Maintain the original refusal wording if present

2. Modification Rules:
 - Do not modify the original instruction or request
 - Only reposition existing refusal statements, or add one if none exists
 - Do not add new content beyond moving/adding refusal statements
 - Preserve original wording whenever possible
 - When moving refusal to the {position}:
   * Ensure natural flow with surrounding text
   * Adjust transitions minimally if needed for coherence
   * Keep the original refusal wording intact

3. Output Format:
 [Response]: Your rewritten response
 - Include only the modified response
 - No explanations or additional text

Now, rewrite the following response according to these criteria:

<BEGIN USER REQUEST>{user_request}</END USER REQUEST>
<BEGIN MODEL RESPONSE>{llm_response}</END MODEL RESPONSE>
""".strip()


ONLY_GENERIC_RATIONALE = """
Your task is to revise the given refusal reason by replacing any explicit mentions or descriptions of the requested action with generic phrases. Follow these guidelines:

1. Modification Rules:
   - Replace any explicit mentions or descriptions of the requested action with generic, non-specific phrases such as "this request", "such an action", "this kind of behavior", "the requested action", or "this kind of action".
      * Ensure every reference to the requested action becomes general and non-specific.
      * Do not explicitly name or detail the requested action.
      * If the requested action is already explicitly stated, replace it with an appropriate generic phrase.
      * If the requested action is implied, ensure references remain vague and generic.
      * Do not include any explicit or suggestive hints about the user's original intent, action, or the nature of their request.
         * If any sentence providing such hints already exists, remove it or replace the hint with generic wording.
   - Preserve as much of the original wording and reasoning as possible, modifying only to remove specificity about the requested action.
   - Do not add any information to the existing reason beyond replacing specific descriptions with generic phrases.
   - Remove any apology or refusal phrases (e.g., “I’m sorry, but…”).

2. Output Format:
   [Response]: Your revised refusal reason.
   - Include only the modified refusal reason.
   - Do not add any extra explanations or context.

Now, adjust the following refusal reason to meet these criteria.

<BEGIN USER REQUEST>{user_request}</END USER REQUEST>
<BEGIN MODEL RESPONSE>{llm_response}</END MODEL RESPONSE>
""".strip()

ONLY_EXPLICIT_RATIONALE = """
Your task is to revise the given refusal reason by explicitly naming the specific requested action within the existing reason sentences. Follow these guidelines:

1. Modification Rules:
   - Clearly and explicitly name the requested action in sentences where it is referenced or implied.
      * Do not use generic phrases like "this request."
      * If the requested action is already explicitly stated, ensure its clarity and specificity.
      * If the requested action is implied or vaguely mentioned, rewrite to explicitly state the exact action clearly.
   - Maintain as much of the original wording and reasoning as possible, modifying only for enhanced clarity about the requested action.
   - Do not add any information to the existing reason beyond explicitly naming the requested action.
   - Remove any apology or refusal phrases (e.g., “I’m sorry, but…”).

2. Output Format:
   [Response]: Your revised refusal reason.
   - Include only the modified refusal reason.
   - Do not add any extra explanations or context.

Now, adjust the following refusal reason to meet these criteria.

<BEGIN USER REQUEST>{user_request}</END USER REQUEST>
<BEGIN MODEL RESPONSE>{llm_response}</END MODEL RESPONSE>
""".strip()

CONTENT_DICT = {
    "dataset_refine": DATASET_REFINE_PROMPT,
    "only_refusal": ONLY_REFUSAL_PROMPT,
    "only_rationale": ONLY_RATIONALE_PROMPT,
    # Both position variants share one template; the caller fills {position}.
    "middle_position": RANDOM_POSITION_PROMPT,
    "end_position": RANDOM_POSITION_PROMPT,
    "only_generic_rationale": ONLY_GENERIC_RATIONALE,
    "only_request_specific_rationale": ONLY_EXPLICIT_RATIONALE,
}
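

# ---------------------------------------------------------------------------
# Illustrative usage sketch (not part of the original module).
#
# The templates above expect `user_request` and `llm_response` fields, and
# RANDOM_POSITION_PROMPT additionally expects `position`. The helpers below
# are hypothetical: `build_prompt`, `parse_response`, and the key-to-position
# mapping are assumptions about how a caller might fill the templates and
# read back the "[Response]:" line requested in each "Output Format" section.
# The calling code may well handle this differently.
# ---------------------------------------------------------------------------
import re

# Assumption: the {position} placeholder is filled with a plain word
# derived from the dictionary key.
_POSITION_BY_KEY = {"middle_position": "middle", "end_position": "end"}


def build_prompt(key, user_request, llm_response, templates=None):
    """Fill the template registered under `key` (hypothetical helper)."""
    # Falls back to CONTENT_DICT unless another mapping is passed in.
    templates = CONTENT_DICT if templates is None else templates
    fields = {"user_request": user_request, "llm_response": llm_response}
    if key in _POSITION_BY_KEY:
        fields["position"] = _POSITION_BY_KEY[key]
    # str.format only parses the template, so braces inside the inserted
    # request/response text are left untouched.
    return templates[key].format(**fields)


def parse_response(raw_output):
    """Return the text after '[Response]:', or None if the model said "None"."""
    match = re.search(r"\[Response\]:\s*(.*)", raw_output, flags=re.DOTALL)
    # If the marker is missing, fall back to the whole output.
    text = (match.group(1) if match else raw_output).strip()
    return None if text.strip('"') == "None" else text


# Possible usage, assuming some client returns the model's raw text:
#   prompt = build_prompt("end_position", request_text, response_text)
#   rewritten = parse_response(raw_model_output)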