prompt_stage2 = """
Your available action types are: "none", "speak", "non-verbal communication", "action", "leave".
Note: You can "leave" this conversation if (1) you have achieved your social goals, (2) this conversation makes you uncomfortable, (3) you find it uninteresting or lose your patience, or (4) you want to leave for other reasons.

Please generate only a JSON string containing the action type and the argument.
Your action should follow the given format:
'{"properties": {"action_type": {"description": "whether to speak at this turn or choose to not do anything", "enum": ["none", "speak", "non-verbal communication", "action", "leave"], "title": "Action Type", "type": "string"}, "argument": {"description": "the utterance if choose to speak, the expression or gesture if choose non-verbal communication, or the physical action if choose action", "title": "Argument", "type": "string"}}, "required": ["action_type", "argument"], "title": "AgentAction", "type": "object"}'
This is a well-thought-out strategy: <strategy>
And the mode of your utterances is: <mode> (Goal-oriented utterances directly advance the task, while social-oriented ones maintain rapport and engagement.)
Please refer to the content, pay attention to its key points, and then take action:
"""
