# Sample PDFs

This directory contains a small subset of sample PDF documents from the MADQA benchmark.

## Size Constraint Notice

Because ICML limits supplementary material to **100MB**, the full PDF corpus (roughly 500MB or more) could not be included in this submission. Only 5 small sample documents are provided.

## Purpose

These sample PDFs are provided to:
- Demonstrate document types and formats in the benchmark
- Allow testing of the baseline code
- Show the quality and complexity of source documents
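
Before running the baseline code on these samples, it can help to confirm that the files are present and intact. The following is a minimal sketch, not part of the benchmark code: the function name `summarize_pdfs` is hypothetical, and it assumes the samples sit directly in this directory with a lowercase `.pdf` extension. It only checks that each file starts with the standard `%PDF-` header bytes and does not validate the rest of the file.

```python
from pathlib import Path

# Every PDF file begins with these bytes, followed by the version number.
PDF_MAGIC = b"%PDF-"


def summarize_pdfs(directory):
    """Return (name, size_in_bytes, has_pdf_header) for each *.pdf in directory.

    Hypothetical helper for illustration; it is not part of the baseline code.
    """
    rows = []
    for path in sorted(Path(directory).glob("*.pdf")):
        # Read only the first few bytes to check the header.
        with path.open("rb") as handle:
            header = handle.read(len(PDF_MAGIC))
        rows.append((path.name, path.stat().st_size, header == PDF_MAGIC))
    return rows


if __name__ == "__main__":
    # Assumes the script is run from the directory containing the samples.
    for name, size, ok in summarize_pdfs("."):
        status = "ok" if ok else "missing %PDF- header"
        print(f"{name}\t{size / 1e6:.2f} MB\t{status}")
```

Run from this directory, the script prints one line per sample with its size and whether the header check passed.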

## Full Dataset

The full PDF corpus will be made available upon publication. A complete evaluation requires the full document set, which must be obtained separately.

## Document Categories

The full dataset includes documents from these categories:
- **Form**: Application forms, registration documents
- **Invoice**: Commercial invoices, receipts
- **Letter**: Business correspondence, official letters
- **Poster**: Event announcements, promotional materials
- **Report**: Financial reports, technical documents
- **Guide**: Instruction manuals, reference guides
- **Scientific**: Research papers, academic documents
