{
  "root": {
    "name": "Change lecture number from 'Lecture 3' to 'Lecture 1'",
    "description": "Evaluates whether the agent successfully changed the lecture number from 'Lecture 3' to 'Lecture 1' in both the presentation title and slide footers, without making extraneous changes",
    "is_critical": false,
    "metadata": {},
    "children": [
      {
        "name": "Presentation title updated correctly",
        "description": "Verifies that 'Lecture 3' has been changed to 'Lecture 1' in the presentation title on the first slide",
        "is_critical": true,
        "metadata": {},
        "scorer": {
          "type": "function",
          "function_code": "def compute_score() -> tuple[str, float]:\n    import re\n    from pptx import Presentation\n\n    try:\n        # Load the modified presentation\n        prs = Presentation(modified_ppt_path)\n\n        if len(prs.slides) == 0:\n            return \"No slides found in presentation\", 0.0\n\n        # Get the first slide\n        first_slide = prs.slides[0]\n\n        # Prefer the title placeholder; fall back to any text shape mentioning 'Lecture'\n        title_text = \"\"\n        title_shape = first_slide.shapes.title\n        if title_shape is not None:\n            title_text = title_shape.text.strip()\n\n        # If there is no title placeholder, or it is empty, check all text shapes\n        if not title_text:\n            for shape in first_slide.shapes:\n                if hasattr(shape, 'text') and 'Lecture' in shape.text:\n                    title_text = shape.text.strip()\n                    break\n\n        if not title_text:\n            return \"No title text found on first slide\", 0.0\n\n        # Match whole lecture numbers so that 'Lecture 10' does not count as 'Lecture 1'\n        has_new = re.search(r'Lecture 1(?!\\d)', title_text) is not None\n        has_old = re.search(r'Lecture 3(?!\\d)', title_text) is not None\n\n        if has_old:\n            return f\"Title still contains 'Lecture 3': {title_text}\", 0.0\n        if has_new:\n            return f\"Title correctly updated to contain 'Lecture 1': {title_text}\", 1.0\n        return f\"Title does not contain 'Lecture 1': {title_text}\", 0.0\n\n    except Exception as e:\n        return f\"Error checking presentation title: {e}\", 0.0\n"
        }
      },
      {
        "name": "No extraneous changes made",
        "description": "Ensures that only the intended lecture number changes were made and no other content, formatting, animations, or transitions were modified",
        "is_critical": false,
        "metadata": {},
        "scorer": {
          "type": "function",
          "function_code": "def compute_score() -> tuple[str, float]:\n    from pptx import Presentation\n    import re\n\n    try:\n        # Check for extraneous changes using the diff data\n        issues = []\n\n        # Check for animation changes\n        if ppt_diff.added_animations or ppt_diff.removed_animations or ppt_diff.modified_animations:\n            issues.append(f\"Unexpected animation changes: {len(ppt_diff.added_animations)} added, {len(ppt_diff.removed_animations)} removed, {len(ppt_diff.modified_animations)} modified\")\n\n        # Check for transition changes\n        if ppt_diff.added_transitions or ppt_diff.removed_transitions or ppt_diff.modified_transitions:\n            issues.append(f\"Unexpected transition changes: {len(ppt_diff.added_transitions)} added, {len(ppt_diff.removed_transitions)} removed, {len(ppt_diff.modified_transitions)} modified\")\n\n        # Check for slide structure changes\n        if ppt_diff.added_slides or ppt_diff.removed_slides:\n            issues.append(f\"Unexpected slide structure changes: {len(ppt_diff.added_slides)} added, {len(ppt_diff.removed_slides)} removed\")\n\n        # Check for slide metadata changes that aren't related to our task\n        for old_slide, new_slide in ppt_diff.modified_slides:\n            if old_slide.slide_number == 1:\n                # The title text is expected to change on slide 1, but not its layout or element count\n                if old_slide.layout_type != new_slide.layout_type or old_slide.element_count != new_slide.element_count:\n                    issues.append(\"Unexpected changes to slide 1 layout or element count\")\n            else:\n                # For other slides, only footer-related changes should be allowed\n                if (old_slide.title != new_slide.title or\n                    old_slide.layout_type != new_slide.layout_type or\n                    old_slide.element_count != new_slide.element_count):\n                    issues.append(f\"Unexpected changes to slide {old_slide.slide_number} beyond footer updates\")\n\n        # Additional check: Load both presentations and verify only expected text changes\n        try:\n            original_prs = Presentation(original_ppt_path)\n            modified_prs = Presentation(modified_ppt_path)\n\n            if len(original_prs.slides) != len(modified_prs.slides):\n                issues.append(\"Number of slides changed unexpectedly\")\n            else:\n                # Check each slide for unexpected text changes\n                for i in range(len(original_prs.slides)):\n                    orig_slide = original_prs.slides[i]\n                    mod_slide = modified_prs.slides[i]\n\n                    # Get all text from both slides\n                    orig_texts = []\n                    mod_texts = []\n\n                    for shape in orig_slide.shapes:\n                        if hasattr(shape, 'text'):\n                            orig_texts.append(shape.text)\n\n                    for shape in mod_slide.shapes:\n                        if hasattr(shape, 'text'):\n                            mod_texts.append(shape.text)\n\n                    # For first slide, allow Lecture 3 -> Lecture 1 changes\n                    if i == 0:\n                        orig_combined = ' '.join(orig_texts)\n                        mod_combined = ' '.join(mod_texts)\n\n                        # Replace expected changes and see if anything else changed\n                        normalized_orig = re.sub(r'Lecture\\s*3', 'Lecture 1', orig_combined)\n                        if normalized_orig != mod_combined:\n                            # Check if it's just whitespace or minor formatting\n                            if normalized_orig.replace(' ', '').replace('\\n', '') != mod_combined.replace(' ', '').replace('\\n', ''):\n                                issues.append(\"Unexpected text changes on slide 1 beyond lecture number\")\n\n                    # For other slides, allow Lec 3.X -> Lec 1.X changes\n                    else:\n                        orig_combined = ' '.join(orig_texts)\n                        mod_combined = ' '.join(mod_texts)\n\n                        # Replace expected footer changes\n                        normalized_orig = re.sub(r'Lec\\s*3\\.(\\d+)', r'Lec 1.\\1', orig_combined)\n                        if normalized_orig != mod_combined:\n                            # Check if it's just whitespace or minor formatting\n                            if normalized_orig.replace(' ', '').replace('\\n', '') != mod_combined.replace(' ', '').replace('\\n', ''):\n                                issues.append(f\"Unexpected text changes on slide {i+1} beyond footer updates\")\n\n        except Exception as e:\n            issues.append(f\"Could not verify text changes: {str(e)}\")\n\n        if not issues:\n            return \"No extraneous changes detected - only intended lecture number updates made\", 1.0\n        else:\n            # Return partial score based on severity\n            severity_score = max(0.0, 1.0 - (len(issues) * 0.3))  # Deduct 0.3 per issue\n            return f\"Extraneous changes detected: {'; '.join(issues[:2])}{'...' if len(issues) > 2 else ''}\", severity_score\n\n    except Exception as e:\n        return f\"Error checking for extraneous changes: {str(e)}\", 0.5\n"
        }
      }
    ]
  },
  "metadata": {
    "task": "Change the lecture number from 'Lecture 3' to 'Lecture 1' both in the presentation title and in the slide footers for the remaining slides.\nSome context for rubric design: 1. critical node checking that Lecture 3 is changed to Lecture 1 in the presentation title on the first slide. 2. 1 non-critical node making sure that the lecture number in the footer of form Lec 3.<slide number> is changed to Lec 1.<slide number> for all slides starting from the second one. 3. 1 non-critical node/branch to make sure extraneous changes have not been made."
  }
}