Title: Dense Retail Product Detection (SKU110K)

Problem statement
Build an object detector that localizes every visible retail product in crowded shelf images. Each image may contain tens to hundreds of tightly packed, frequently overlapping objects, which makes both detection and precise localization challenging. There is a single target class ("product"). For every test image, predict a set of bounding boxes with confidence scores.

Files provided
- images/train: training images
- images/test: test images
- train.csv: annotations for the training images
- test.csv: list of image_ids for the test images
- sample_submission.csv: example submission file with randomly generated but valid predictions

Data format
- All image files are JPEGs. Each image_id in a CSV is the filename of an image in the corresponding folder (images/train for train.csv; images/test for test.csv and sample_submission.csv). Use only the filename in CSVs (e.g., img_000123.jpg), never a path.
- train.csv columns:
  - image_id: filename of a training image
  - boxes: whitespace-separated list of normalized bounding boxes in xyxy order (x_min y_min x_max y_max repeated). Coordinates are in [0,1]. An empty string means no annotated boxes.
- test.csv columns:
  - image_id: filename of a test image
- sample_submission.csv columns:
  - image_id: filename of a test image
  - PredictionString: whitespace-separated list of detections in the form: score x_min y_min x_max y_max repeated. Scores must be in [0,1]. Coordinates must be in [0,1]. The list may be empty.
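
To illustrate the boxes column described above, here is a minimal parsing sketch. The function name parse_boxes is hypothetical and not part of the competition files. It assumes the field arrives either as a string or, when the cell is empty, as a non-string placeholder (pandas, for example, reads an empty cell as NaN).

```python
def parse_boxes(boxes_field):
    """Parse a train.csv `boxes` field into a list of (x_min, y_min, x_max, y_max).

    Hypothetical helper: an empty string, or any non-string value such as the
    NaN that pandas produces for an empty cell, is treated as "no boxes".
    """
    if not isinstance(boxes_field, str) or not boxes_field.strip():
        return []
    values = [float(v) for v in boxes_field.split()]
    if len(values) % 4 != 0:
        raise ValueError(f"expected a multiple of 4 values, got {len(values)}")
    # Group the flat list into consecutive xyxy quadruples.
    return [tuple(values[i:i + 4]) for i in range(0, len(values), 4)]
```

For example, parse_boxes("0.1 0.2 0.3 0.4 0.5 0.5 0.9 1.0") yields two boxes, and parse_boxes("") yields an empty list.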

Submission format
Submit a CSV with header: image_id,PredictionString
- Each test image_id must appear exactly once.
- PredictionString contains zero or more detections. For each detection, output five numbers: confidence score, x_min, y_min, x_max, y_max. Values are space-separated on a single line per image_id. Example:
  img_000001.jpg,0.91 0.1200 0.3000 0.2200 0.3800 0.55 0.6000 0.1000 0.7200 0.2000
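
The format above could be produced along the following lines. This is a sketch, not an official writer: the helper names, the clamping to [0,1], the dropping of non-finite or zero-area detections, and the default decimal precision (chosen to mirror the example line) are all assumptions rather than stated requirements.

```python
import csv
import math


def format_prediction_string(detections, score_decimals=2, coord_decimals=4):
    """Build a PredictionString from (score, x_min, y_min, x_max, y_max) tuples.

    Assumptions (not mandated by the task text): values are clamped to [0, 1],
    and detections that are non-finite or have zero area are dropped.
    """
    parts = []
    for det in detections:
        if not all(math.isfinite(v) for v in det):
            continue  # non-finite values are invalid, so skip the detection
        score, x0, y0, x1, y1 = (min(max(v, 0.0), 1.0) for v in det)
        if x1 <= x0 or y1 <= y0:
            continue  # degenerate box
        parts.append(f"{score:.{score_decimals}f}")
        parts.extend(f"{c:.{coord_decimals}f}" for c in (x0, y0, x1, y1))
    return " ".join(parts)


def write_submission(rows, fileobj):
    """Write (image_id, detections) pairs as CSV to an open text file object."""
    writer = csv.writer(fileobj)
    writer.writerow(["image_id", "PredictionString"])
    for image_id, detections in rows:
        writer.writerow([image_id, format_prediction_string(detections)])
```

Because predictions are ranked by score, passing a larger score_decimals may be preferable in practice so that rounding does not create ties. When writing to disk, open the file with newline="" as the csv module recommends.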

Evaluation metric
Mean Average Precision (mAP) at IoU thresholds 0.50:0.95 (step 0.05), single class.
- For each IoU threshold t in {0.50, 0.55, ..., 0.95}, predictions from the entire test set are pooled and ranked by descending score. Each prediction is matched to at most one ground-truth box in its own image: the available box with the highest IoU, provided that IoU is >= t. Matched predictions count as true positives and unmatched ones as false positives; accumulating them down the ranking yields a precision-recall curve. Average Precision is the area under this curve, and the final score is the mean AP over all ten thresholds.
- Boxes are evaluated in normalized image coordinates [0,1] and must be in xyxy order.
- Degenerate boxes (zero area) are ignored. Non-finite values (NaN or infinity) are invalid.
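
For local validation, the metric described above can be approximated with the sketch below. It is not the official scorer. Two details are assumptions, because the description does not pin them down: "best available" is read as the highest-IoU ground-truth box not yet claimed by a higher-scoring prediction, and the area under the curve is computed with the monotone precision envelope (all-point interpolation). The function names are hypothetical.

```python
def iou(a, b):
    """IoU of two xyxy boxes in normalized coordinates."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def average_precision(preds, gts, thr):
    """AP at one IoU threshold.

    preds: {image_id: [(score, x_min, y_min, x_max, y_max), ...]}
    gts:   {image_id: [(x_min, y_min, x_max, y_max), ...]}
    """
    total_gt = sum(len(boxes) for boxes in gts.values())
    if total_gt == 0:
        return 0.0
    # Pool non-degenerate detections from all images and rank by score.
    ranked = sorted(
        ((d[0], img, d[1:]) for img, dets in preds.items() for d in dets
         if d[3] > d[1] and d[4] > d[2]),
        key=lambda r: r[0], reverse=True)
    matched = {img: [False] * len(boxes) for img, boxes in gts.items()}
    tp, curve = 0, []
    for rank, (_, img, box) in enumerate(ranked, start=1):
        best_iou, best_j = 0.0, -1
        for j, gt_box in enumerate(gts.get(img, [])):
            if not matched[img][j]:  # assumption: each ground truth is used once
                v = iou(box, gt_box)
                if v > best_iou:
                    best_iou, best_j = v, j
        if best_j >= 0 and best_iou >= thr:
            matched[img][best_j] = True
            tp += 1
        curve.append((tp / total_gt, tp / rank))  # (recall, precision)
    # Assumption: area under the monotone (right-to-left max) precision envelope.
    ap, max_prec = 0.0, 0.0
    for i in range(len(curve) - 1, -1, -1):
        recall, prec = curve[i]
        max_prec = max(max_prec, prec)
        prev_recall = curve[i - 1][0] if i > 0 else 0.0
        ap += (recall - prev_recall) * max_prec
    return ap


def mean_ap(preds, gts):
    """Mean AP over IoU thresholds 0.50, 0.55, ..., 0.95."""
    thresholds = [round(0.50 + 0.05 * i, 2) for i in range(10)]
    return sum(average_precision(preds, gts, t) for t in thresholds) / len(thresholds)
```

Scores from this sketch may differ slightly from the leaderboard if the official scorer uses a different matching or interpolation rule, so treat it as a relative signal when comparing models.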

Notes
- Filenames are anonymized. Do not rely on any naming pattern to infer labels.
- Coordinates are normalized; if your model produces pixel coordinates, divide x values by the image width and y values by the image height to convert.
- The training set may include images with zero or many objects; robust handling of heavy crowding and scale variation is key.
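
As a small illustration of the coordinate note above, pixel-space boxes can be converted as sketched below. The helper name normalize_box is hypothetical, and the clamping to [0,1] is an added safeguard for boxes that spill slightly past the image border, not a stated requirement.

```python
def normalize_box(box_px, width, height):
    """Convert a pixel-space xyxy box to normalized [0, 1] coordinates.

    x values are divided by the image width and y values by the image height.
    Clamping is an assumption added here to keep outputs inside [0, 1].
    """
    x0, y0, x1, y1 = box_px

    def clamp(v):
        return min(max(v, 0.0), 1.0)

    return (clamp(x0 / width), clamp(y0 / height),
            clamp(x1 / width), clamp(y1 / height))
```

For a 1000x500 image, normalize_box((100, 50, 300, 200), 1000, 500) gives approximately (0.1, 0.1, 0.3, 0.4).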
