Title: RSNA Pneumonia Tri-Task Challenge (Classification + Segmentation)

Overview
Build a model that simultaneously classifies chest X-rays into three categories and localizes pneumonia when present.
- Classes: Lung Opacity, Normal, No Lung Opacity / Not Normal
- Inputs: PNG chest X-ray images with tabular metadata (age, sex, modality, position, height, width)
- Outputs (prediction per image id): class (one of the three above) and a binary mask encoded via run-length encoding (RLE). For non–Lung Opacity predictions, the mask should be empty.

Why this is challenging
- Multi-signal learning: combine image features with metadata to improve generalization.
- Long-tail and ambiguity: "No Lung Opacity / Not Normal" is clinically heterogeneous.
- Precise localization required on positives; segmentation accuracy is scored only where ground truth has pneumonia.

Files provided
- train.csv: training annotations and metadata. Columns: id, class, age, sex, modality, position, height, width
- test.csv: test metadata. Columns: id, age, sex, modality, position, height, width
- train_images/: training PNG images (filenames correspond to id + .png)
- train_masks/: binary PNG masks for the training images (filenames correspond to id + .png). Masks may be empty for images whose class is not Lung Opacity.
- test_images/: test PNG images (filenames correspond to id + .png)
- sample_submission.csv: sample with random, valid labels and RLE masks. Use this as a format reference.
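
The layout above can be read with the standard library alone. The sketch below is illustrative only: the function name load_train_records and the data_dir argument are hypothetical, and it assumes train.csv has a header row with the column names listed above.

```python
import csv
from pathlib import Path


def load_train_records(data_dir):
    """Read train.csv and attach the image and mask path for each id.

    Assumes data_dir contains train.csv, train_images/ and train_masks/.
    All CSV values are kept as strings; convert age/height/width as needed.
    """
    data_dir = Path(data_dir)
    records = []
    with open(data_dir / "train.csv", newline="") as f:
        for row in csv.DictReader(f):
            # File names are the id plus a .png suffix.
            row["image_path"] = data_dir / "train_images" / f"{row['id']}.png"
            row["mask_path"] = data_dir / "train_masks" / f"{row['id']}.png"
            records.append(row)
    return records
```

Each record is a dict holding the CSV columns plus image_path and mask_path, so it can be passed straight to whichever image loader you prefer.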

Task
- Train on train.csv with the images in train_images/, optionally using train_masks/ as segmentation targets.
- For each id in test.csv, predict both:
  1) class (one of: Lung Opacity, Normal, No Lung Opacity / Not Normal)
  2) mask_rle (RLE string). If class != Lung Opacity, mask_rle should be empty.
- Submit a CSV with columns: id, class, mask_rle
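
One possible way to assemble the submission from per-image predictions is sketched below. The function name build_submission and the tuple layout of its input are assumptions made for illustration. The sketch blanks mask_rle whenever the predicted class is not Lung Opacity, following the rule above.

```python
import csv
import io

POSITIVE_CLASS = "Lung Opacity"


def build_submission(predictions):
    """Return submission CSV text from (id, class, mask_rle) tuples.

    Any mask attached to a non-Lung Opacity prediction is dropped.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["id", "class", "mask_rle"])
    for image_id, label, mask_rle in predictions:
        if label != POSITIVE_CLASS:
            mask_rle = ""  # negatives must carry an empty mask
        writer.writerow([image_id, label, mask_rle])
    return buffer.getvalue()
```

Writing the returned text to a file gives a CSV in the required column order.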

Evaluation
The leaderboard metric balances classification quality and positive-case localization quality.
- Classification: macro F1 across the three classes.
- Segmentation: mean Dice score computed only on test images whose ground-truth mask has at least one positive pixel (i.e., true class = Lung Opacity). Masks predicted for all other images do not enter the Dice term.
- Final score = 0.5 * macro_F1 + 0.5 * mean_Dice_on_positives. The score is clamped to [0, 1].
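
For local validation, the metric described above can be approximated as follows. This is a sketch, not the organizers' scoring code. It assumes that a class with no true and no predicted samples contributes an F1 of 0, that mean Dice is 0 when there are no positive images, and that masks are flat 0/1 sequences of equal length. The function names are hypothetical.

```python
CLASSES = ["Lung Opacity", "Normal", "No Lung Opacity / Not Normal"]


def macro_f1(y_true, y_pred, classes=CLASSES):
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # Assumption: an undefined F1 (denominator 0) counts as 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def dice(true_mask, pred_mask):
    """Dice coefficient between two flat binary masks."""
    inter = sum(1 for t, p in zip(true_mask, pred_mask) if t and p)
    total = sum(1 for t in true_mask if t) + sum(1 for p in pred_mask if p)
    return 2 * inter / total if total else 0.0


def final_score(y_true, y_pred, true_masks, pred_masks):
    """0.5 * macro F1 + 0.5 * mean Dice on positives, clamped to [0, 1]."""
    f1 = macro_f1(y_true, y_pred)
    # Only images with at least one ground-truth positive pixel are scored.
    dices = [dice(t, p) for t, p in zip(true_masks, pred_masks) if any(t)]
    mean_dice = sum(dices) / len(dices) if dices else 0.0
    return min(1.0, max(0.0, 0.5 * f1 + 0.5 * mean_dice))
```

Because the Dice term ignores negatives, a false-positive mask costs nothing in segmentation, but the wrong class label that goes with it still lowers macro F1.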

Notes and constraints
- All ids are anonymized; file names do not reveal labels.
- Images and masks align 1:1 by id in training; test has only images and metadata.
- Height/width are provided for convenience and for RLE decoding/encoding.
- RLE follows the common Kaggle convention: pixels are numbered starting at 1 in column-major order (top to bottom within a column, then left to right across columns).
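
A minimal sketch of this convention is given below. The text above fixes only the 1-indexing and the column-major order, so the space-separated "start length" pair format is an assumption based on common Kaggle usage; check it against sample_submission.csv. The function names rle_encode and rle_decode are hypothetical, and masks are plain nested lists indexed as mask[row][col].

```python
def rle_encode(mask):
    """Encode a binary mask (list of rows) as 'start length' pairs.

    Pixels are numbered from 1, walking down each column before moving
    to the next column (column-major). An empty mask encodes to "".
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    runs = []
    run_start = None
    run_len = 0
    pos = 0
    for col in range(width):
        for row in range(height):
            pos += 1
            if mask[row][col]:
                if run_start is None:
                    run_start, run_len = pos, 1
                else:
                    run_len += 1
            elif run_start is not None:
                runs.extend((run_start, run_len))
                run_start = None
    if run_start is not None:
        runs.extend((run_start, run_len))
    return " ".join(str(v) for v in runs)


def rle_decode(rle, height, width):
    """Decode 'start length' pairs into a height x width nested list."""
    mask = [[0] * width for _ in range(height)]
    values = [int(v) for v in rle.split()]
    for start, length in zip(values[0::2], values[1::2]):
        for pos in range(start - 1, start - 1 + length):
            col, row = divmod(pos, height)  # column-major position
            mask[row][col] = 1
    return mask
```

The height and width columns in the CSVs supply the shape that rle_decode needs.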

Submission format
- CSV with columns: id, class, mask_rle
- id must match exactly the ids in test.csv; one row per id; no duplicates.
- class must be one of: Lung Opacity, Normal, No Lung Opacity / Not Normal
- mask_rle is an RLE string; leave it empty when class is not Lung Opacity.
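
Before uploading, the rules above can be checked with a short script such as the sketch below. The function name validate_submission is hypothetical. The sketch reads the rules strictly: it assumes the column order must match exactly and treats a non-empty mask on a non-Lung Opacity row as an error.

```python
import csv
import io

VALID_CLASSES = {"Lung Opacity", "Normal", "No Lung Opacity / Not Normal"}


def validate_submission(csv_text, test_ids):
    """Return a list of human-readable problems; an empty list means OK."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != ["id", "class", "mask_rle"]:
        return ["columns must be exactly: id, class, mask_rle"]
    errors = []
    seen = set()
    for row in reader:
        image_id = row["id"]
        if image_id in seen:
            errors.append(f"duplicate id: {image_id}")
        seen.add(image_id)
        mask_rle = (row["mask_rle"] or "").strip()
        if row["class"] not in VALID_CLASSES:
            errors.append(f"invalid class for {image_id}: {row['class']!r}")
        elif row["class"] != "Lung Opacity" and mask_rle:
            errors.append(f"non-empty mask on negative row: {image_id}")
    expected = set(test_ids)
    for image_id in sorted(expected - seen):
        errors.append(f"missing id: {image_id}")
    for image_id in sorted(seen - expected):
        errors.append(f"unexpected id: {image_id}")
    return errors
```

Passing the text of your submission file and the id column of test.csv returns an empty list when every rule holds.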

Reproducibility
- The dataset split and files are produced deterministically by prepare.py.
- No original file paths appear in the CSVs; only ids are used to reference images/masks within the provided folders.

Good luck and happy modeling!
