Title: Bundesliga Broadcast Snippets — 5-Class Video Classification

Problem statement
Participants must build a model that classifies short football broadcast video snippets into one of five anonymous categories. Each category represents a distinct visual/audio style pattern (e.g., production/camera context) but is provided only as anonymized labels (A–E). You are given labeled training clips and unlabeled test clips. Predict the label for every test clip.

Why this is challenging
- Clips are 30s MP4 videos with substantial intra-class variation and inter-class overlap (camera angles, motion, lighting, overlays, crowd noise, etc.).
- Robust models must fuse spatiotemporal cues and can benefit from audio features.
- Strong performance typically requires video-specific preprocessing, frame sampling strategies, and temporal modeling.
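
One common frame sampling strategy is uniform temporal sampling: split the clip into equal segments and take the middle frame of each. The sketch below only computes frame indices and leaves decoding to whichever video library you prefer. The function name uniform_frame_indices and the 25 fps figure are illustrative assumptions, not properties of the dataset.

```python
def uniform_frame_indices(num_frames: int, num_samples: int) -> list[int]:
    """Return num_samples frame indices spread evenly over a clip.

    The clip is split into num_samples equal segments and the middle
    frame of each segment is chosen.
    """
    if num_frames <= 0 or num_samples <= 0:
        raise ValueError("num_frames and num_samples must be positive")
    segment = num_frames / num_samples
    # Clamp to the last valid index to guard against rounding at the end.
    return [min(num_frames - 1, int(segment * (i + 0.5))) for i in range(num_samples)]


# Hypothetical example: a 30-second clip at an assumed 25 fps has 750 frames.
indices = uniform_frame_indices(num_frames=750, num_samples=8)
```

Sampling a fixed number of frames per clip keeps the input size constant even if frame rates differ between videos; denser or random sampling may suit some models better.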

Data description
After running the provided preparation script, the competition data consists of:
- train_videos/: directory of training videos (MP4), renamed to prevent label leakage.
- test_videos/: directory of test videos (MP4), renamed to prevent label leakage.
- train.csv: training annotations with columns:
  - id: filename in train_videos/ (e.g., vid_000123.mp4)
  - label: one of the class IDs {A, B, C, D, E}
- test.csv: test listing with column:
  - id: filename in test_videos/ (e.g., vid_000456.mp4)
- sample_submission.csv: example submission with valid schema and random labels.
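
Because the CSVs contain bare filenames rather than paths, each id has to be joined with its video directory before a clip can be opened. A minimal sketch using only the standard library follows; the helper name load_annotations is hypothetical.

```python
import csv
from pathlib import Path


def load_annotations(csv_path, video_dir):
    """Read train.csv and pair each clip path with its label."""
    video_dir = Path(video_dir)
    with open(csv_path, newline="") as f:
        # Each row has the columns id (filename) and label (A to E).
        return [(video_dir / row["id"], row["label"]) for row in csv.DictReader(f)]


# Intended usage, assuming the files sit in the working directory:
# samples = load_annotations("train.csv", "train_videos")
```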

Notes
- All filenames are anonymized. No paths appear in the CSVs.
- The split is stratified by class to ensure all test labels are represented in the training set.
- The dataset includes all of the provided videos; no subsets are removed.
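
Since the train/test split is stratified by class, a local validation split can mirror it by sampling each class separately. The following is a sketch under assumptions: the helper name stratified_split, the 20% validation fraction, and the toy labels are all illustrative, and the preparation script's actual procedure is not shown here.

```python
import random
from collections import defaultdict


def stratified_split(labels, val_fraction=0.2, seed=0):
    """Split sample indices into train/validation lists, class by class."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    rng = random.Random(seed)
    train_idx, val_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        if len(members) > 1:
            # At least one validation clip, but always leave one for training.
            n_val = min(len(members) - 1, max(1, round(len(members) * val_fraction)))
        else:
            n_val = 0
        val_idx.extend(members[:n_val])
        train_idx.extend(members[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels for illustration only; real labels come from train.csv.
toy_labels = ["A"] * 10 + ["B"] * 5 + ["C"] * 5
train_idx, val_idx = stratified_split(toy_labels, val_fraction=0.2, seed=0)
```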

Evaluation
- Metric: Macro F1 score across the five labels {A, B, C, D, E}.
  - For each class, compute F1 = 2·precision·recall/(precision+recall) using hard labels; the final score is the unweighted mean of per-class F1.
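  - Because every class contributes equally to the mean regardless of how many clips it contains, the score is not dominated by frequent classes and rewards balanced performance across classes.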
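
For local validation, the metric described above can be reproduced in a few lines. This is a sketch, not the host's scorer: the function name macro_f1 is hypothetical, and scoring a class as 0 when it has no true or predicted samples is an assumption about the undefined case.

```python
CLASSES = ("A", "B", "C", "D", "E")


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 over the given classes."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        # 2*P*R/(P+R) simplifies to 2*TP/(2*TP+FP+FN).
        # Assumption: a class with TP = FP = FN = 0 scores 0.
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy example: per-class F1 is 2/3, 2/3, 1, 2/3, 0, so the mean is 0.6.
score = macro_f1(list("AABCDE"), list("ABBCDD"))
```

Note that never predicting a class (E in the toy example) costs a full fifth of the score, which is why balanced performance matters under this metric.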

Submission format
- A single CSV file with header:
  id,label
- Include exactly one row per id listed in test.csv.
- The label must be one of {A, B, C, D, E}.
- Example (see sample_submission.csv for a complete file):
  vid_000001.mp4,A
  vid_000002.mp4,C
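
A submission in this format can be written with the standard csv module. The helper name write_submission and the output filename in the usage comment are hypothetical.

```python
import csv


def write_submission(ids, labels, out_path):
    """Write predictions as a CSV with the header id,label."""
    if len(ids) != len(labels):
        raise ValueError("ids and labels must have the same length")
    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerow(["id", "label"])
        writer.writerows(zip(ids, labels))


# Intended usage, with ids taken from test.csv in their listed order:
# write_submission(test_ids, predicted_labels, "submission.csv")
```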

Rules and constraints
- Train only on the provided train_videos/ clips, plus any external resources the competition host allows (e.g., publicly available pretrained models, if permitted by the general platform rules). Predictions for the test set must be derived from the contents of test_videos/ alone.
- Do not attempt to infer labels via filenames or directory structure; files are anonymized to prevent leakage.

Files you will use
- train_videos/
- test_videos/
- train.csv
- test.csv
- sample_submission.csv

Scoring details
- Your submission is scored with Macro F1 computed against held-out ground truth.
- Submissions with missing/extra ids, duplicate ids, invalid labels, or malformed CSVs are rejected.
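
Since invalid files are rejected outright, it may be worth checking a submission locally before uploading. The sketch below mirrors the rejection reasons listed above; the host's actual validator is not specified, and the helper name submission_errors is hypothetical.

```python
import csv

VALID_LABELS = {"A", "B", "C", "D", "E"}


def submission_errors(submission_path, test_csv_path):
    """Return a list of problems found in a submission file (empty if none)."""
    with open(test_csv_path, newline="") as f:
        expected = {row["id"] for row in csv.DictReader(f)}
    with open(submission_path, newline="") as f:
        reader = csv.DictReader(f)
        if reader.fieldnames != ["id", "label"]:
            return [f"unexpected header: {reader.fieldnames}"]
        rows = list(reader)

    errors = []
    ids = [row["id"] for row in rows]
    seen = set(ids)
    if len(ids) != len(seen):
        errors.append("duplicate ids")
    if expected - seen:
        errors.append(f"{len(expected - seen)} missing ids")
    if seen - expected:
        errors.append(f"{len(seen - expected)} extra ids")
    bad = [row["label"] for row in rows if row["label"] not in VALID_LABELS]
    if bad:
        errors.append(f"{len(bad)} invalid labels")
    return errors


# Intended usage:
# problems = submission_errors("submission.csv", "test.csv")
```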


Good luck, and enjoy building robust video classifiers for football broadcast snippets!