<data>
    <message role="user">
        <content type="text">
            You are a vision system for a robot. You are provided with two images corresponding to the states before and after a particular skill is executed. You are given a list of predicates below, and you are given the values of these predicates in the image before the skill is executed. Your job is to output the values of the following predicates in the image after the skill is executed. Pay careful attention to the visual changes between the two images to figure out which predicates change and which predicates do not change. For the predicates that change, list these separately at the end of your response. Note that in some scenes, there might be no changes. First, output a description of what changes you expect to happen based on the skill that was just run, explicitly noting the skill that was run. Second, output a description of what visual changes you see happen between the before and after images, looking specifically at the objects involved in the skill's arguments, noting what objects these are. Next, output each predicate value in the after image as a bulleted list with each predicate and value on a different line. For each predicate value, provide an explanation as to why you labeled this predicate as having this particular value. Use the format: &lt;predicate&gt;: &lt;truth value&gt;. &lt;explanation&gt;.  Your response should have three sections. Here is an outline of what your response should look like:
            [START OUTLINE]
            # Expected changes based on the executed skill
            [insert your analysis on the expected changes you will see based on the skill that was executed]
            # Visual changes observed between the images
            [insert your analysis on the visual changes observed between the images]
            # Predicate values in the after image
            [insert your bulleted list of '* &lt;predicate&gt;: &lt;truth value&gt;. &lt;explanation&gt;']
            [END OUTLINE]
            Information about the environment and robot:
            {domain_knowledge}
            Pose estimation of all objects and the robot gripper (3D poses are in meters):
            {pose_estimations}
            Predicates:
            {predicates}
            Skill executed between states: {skill}
            Predicate values in the first scene, before the skill was executed:
            {prior_ground_predicates}
        </content>
        <content type="image" id="prior_scene" />
        <content type="image" id="scene" />
    </message>
</data>