ArgumentationQA (ArgQA) Dataset
(for Argumentation-based Multiple Choice Questions)

## 1. Overview ##################################

This dataset consists of multiple-choice questions (MCQs) derived from structured argument triplets. 
The triplets follow one of three logical structures: linear, convergent, or divergent.

Each MCQ is generated from these logical forms to evaluate reasoning over arguments.  
(Please refer to the paper for more details.)

The full dataset will be released when the paper is published.  
For now, we provide a small sample consisting of the development and validation sets, stored in the directory "example_dataset".  
Code for the construction pipeline is provided in the "code" directory.  

## 2. Format ##################################

Each MCQ entry is a map/dictionary with the following fields:

{
  "docID": string,               // Index that links the item to its source document in the original argument mining corpus.

  "instanceID": string,          // Globally unique identifier formed from the split label (dev, val, test) and a counter.

  "structure": string,           // The logical structure label, one of lin, conv, div for linear, convergent, or divergent arguments.

  "q_type": string,              // Question type code, for example "1.1" for proposition prediction.

  "context": array of 2 strings, // Two sentences that make up the argument fragment shown to the solver.

  "choices": array of 4 objects, // Each object has text and type.
}

where each element of "choices" has the form
{
  "text": string, // Answer text shown to the solver.
  "type": string  // Categorical label: "_" marks the single gold answer whose reasoning chain matches the target structure; "i", "ii", and "iii" mark distractor subtypes (for example simple backward, complex forward, complex linear).
}