{
    "Textual_Faithfulness": "The score is 1. Reason: The animals in the video are gorillas, but they are not picking things to eat. \n",
    "Frame_Consistency": "The score is 1. Reason: The text condition requests a video of gorillas, but the video still shows monkeys. This complete mismatch between the text condition and the generated video indicates a lack of frame consistency, resulting in a nonsensical video. \n",
    "Video_Fidelity": "The score is 2. Reason: The video has significant distortion, the faces of the gorillas are poorly grafted on, and the movements are unnatural. \n"
}