{
    "Textual_Faithfulness": "The score is 5. Reason: The edited video fully aligns with the text description, capturing all details accurately. \n",
    "Frame_Consistency": "The score is 2. Reason: Although the overall content aligns with the text condition, there are noticeable jumps in the video, particularly in the gorillas' movements. This indicates poor frame consistency and a less-than-smooth viewing experience. \n",
    "Video_Fidelity": "The score is 2. Reason: The model replaced the monkeys with gorillas, but the gorillas' movements look unnatural and do not quite fit the action of \"picking things to eat\". The overall visual quality is acceptable, but there is a noticeable difference in quality between the gorillas and the grass. \n"
}