---
language: 
- en
pretty_name: "FictionalQA Training Splits"
license: "mit"
source_datasets:
- original
language_creators:
- machine-generated
annotations_creators:
- machine-generated
task_categories:
- text-generation
- question-answering
task_ids:
- closed-domain-qa
- closed-book-qa
- open-book-qa
tags:
- fictional
- machine-generated
# dataset_info below was copied from the README.md that was autogenerated when the repo was created without one
dataset_info:
- config_name: event_split_fictions_webtext_train_ds_valratio0.33_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 3580687
    num_examples: 990
  download_size: 2015437
  dataset_size: 3580687
- config_name: event_split_fictions_webtext_train_ds_valratio0.33_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 410208660.11067194
    num_examples: 1984
  download_size: 62109538
  dataset_size: 410208660.11067194
- config_name: event_split_fictions_webtext_train_ds_valratio0.33_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 409918032.2108037
    num_examples: 1984
  download_size: 62088228
  dataset_size: 409918032.2108037
- config_name: event_split_fictions_webtext_val_ds_valratio0.33_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 1869058
    num_examples: 510
  download_size: 1059639
  dataset_size: 1869058
- config_name: event_split_fictions_webtext_val_ds_valratio0.33_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 217509833.88932806
    num_examples: 1052
  download_size: 33672097
  dataset_size: 217509833.88932806
- config_name: event_split_fictions_webtext_val_ds_valratio0.33_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 217355730.7891963
    num_examples: 1052
  download_size: 33659790
  dataset_size: 217355730.7891963
- config_name: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234
  features:
  - name: event_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 165800
    num_examples: 66
  download_size: 93539
  dataset_size: 165800
- config_name: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 410208660.11067194
    num_examples: 1984
  download_size: 62109538
  dataset_size: 410208660.11067194
- config_name: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 409918032.2108037
    num_examples: 1984
  download_size: 62088228
  dataset_size: 409918032.2108037
- config_name: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234
  features:
  - name: event_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 92239
    num_examples: 34
  download_size: 56923
  dataset_size: 92239
- config_name: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 217509833.88932806
    num_examples: 1052
  download_size: 33672097
  dataset_size: 217509833.88932806
- config_name: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 217355730.7891963
    num_examples: 1052
  download_size: 33659790
  dataset_size: 217355730.7891963
- config_name: fict_qa_cbqa_blind_inf_ex_dedup_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 784425
    num_examples: 3036
  download_size: 219896
  dataset_size: 784425
- config_name: fict_qa_cbqa_blind_inf_fuzzy_deduped_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 449231
    num_examples: 1716
  download_size: 152636
  dataset_size: 449231
- config_name: fict_qa_cbqa_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 1923066
    num_examples: 7500
  download_size: 310738
  dataset_size: 1923066
- config_name: fict_qa_cbqa_exact_deduped_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 821051
    num_examples: 3174
  download_size: 227246
  dataset_size: 821051
- config_name: fict_qa_cbqa_fuzzy_deduped_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 470785
    num_examples: 1797
  download_size: 158466
  dataset_size: 470785
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 20212345
    num_examples: 3036
  download_size: 4430802
  dataset_size: 20212345
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk10_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 627718494
    num_examples: 3036
  download_size: 96567022
  dataset_size: 627718494
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk10_seed1234_slim
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 1812864
    num_examples: 3036
  download_size: 407974
  dataset_size: 1812864
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk4_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 627273763
    num_examples: 3036
  download_size: 96530965
  dataset_size: 627273763
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk4_seed1234_slim
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 1368133
    num_examples: 3036
  download_size: 374450
  dataset_size: 1368133
- config_name: fict_qa_obqa_blind_inf_fuzzy_deduped_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 11453464
    num_examples: 1716
  download_size: 3023195
  dataset_size: 11453464
- config_name: fict_qa_obqa_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 50004143
    num_examples: 7500
  download_size: 6786079
  dataset_size: 50004143
- config_name: fict_qa_obqa_exact_deduped_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 21140111
    num_examples: 3174
  download_size: 4544172
  dataset_size: 21140111
- config_name: fict_qa_obqa_fuzzy_deduped_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  splits:
  - name: train
    num_bytes: 11993887
    num_examples: 1797
  download_size: 3131086
  dataset_size: 11993887
- config_name: fictions_webtext_ds
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 5449745
    num_examples: 1500
  download_size: 3075470
  dataset_size: 5449745
- config_name: fictsheets_webtext_ds
  features:
  - name: event_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 258039
    num_examples: 100
  download_size: 139496
  dataset_size: 258039
- config_name: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 3613479
    num_examples: 1000
  download_size: 2080867
  dataset_size: 3613479
- config_name: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 438948077.32608694
    num_examples: 2123
  download_size: 67948235
  dataset_size: 438948077.32608694
- config_name: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 438637087.89492756
    num_examples: 2123
  download_size: 67924562
  dataset_size: 438637087.89492756
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 4741750
    num_examples: 1300
  download_size: 2643772
  dataset_size: 4741750
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 580784337.8280632
    num_examples: 2809
  download_size: 90054978
  dataset_size: 580784337.8280632
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 580372859.1129776
    num_examples: 2809
  download_size: 90021109
  dataset_size: 580372859.1129776
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 4267272
    num_examples: 1200
  download_size: 2453570
  dataset_size: 4267272
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 548530027.8596838
    num_examples: 2653
  download_size: 85283004
  dataset_size: 548530027.8596838
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 548141400.935112
    num_examples: 2653
  download_size: 85251221
  dataset_size: 548141400.935112
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 4633181
    num_examples: 1300
  download_size: 2683047
  dataset_size: 4633181
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 554112504.5849802
    num_examples: 2680
  download_size: 86084022
  dataset_size: 554112504.5849802
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 553719922.5428195
    num_examples: 2680
  download_size: 86051234
  dataset_size: 553719922.5428195
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 3649798
    num_examples: 1000
  download_size: 2046011
  dataset_size: 3649798
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 309103803.8636364
    num_examples: 1495
  download_size: 48040887
  dataset_size: 309103803.8636364
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 308884807.5378788
    num_examples: 1495
  download_size: 48022852
  dataset_size: 308884807.5378788
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 4506979
    num_examples: 1200
  download_size: 2494963
  dataset_size: 4506979
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 518343301.8636364
    num_examples: 2507
  download_size: 80730442
  dataset_size: 518343301.8636364
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 517976061.8712121
    num_examples: 2507
  download_size: 80700893
  dataset_size: 517976061.8712121
- config_name: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 1836266
    num_examples: 500
  download_size: 1084829
  dataset_size: 1836266
- config_name: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 188770416.67391303
    num_examples: 913
  download_size: 28974333
  dataset_size: 188770416.67391303
- config_name: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 188636675.10507247
    num_examples: 913
  download_size: 28964175
  dataset_size: 188636675.10507247
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 707995
    num_examples: 200
  download_size: 450356
  dataset_size: 707995
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 46934156.17193676
    num_examples: 227
  download_size: 8383743
  dataset_size: 46934156.17193676
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 46900903.8870224
    num_examples: 227
  download_size: 8378646
  dataset_size: 46900903.8870224
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 1182473
    num_examples: 300
  download_size: 591584
  dataset_size: 1182473
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 79188466.1403162
    num_examples: 383
  download_size: 13137380
  dataset_size: 79188466.1403162
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 79132362.06488802
    num_examples: 383
  download_size: 13131092
  dataset_size: 79132362.06488802
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 816564
    num_examples: 200
  download_size: 428227
  dataset_size: 816564
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 73605989.41501977
    num_examples: 356
  download_size: 12239401
  dataset_size: 73605989.41501977
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 73553840.4571805
    num_examples: 356
  download_size: 12233767
  dataset_size: 73553840.4571805
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 1799947
    num_examples: 500
  download_size: 1048690
  dataset_size: 1799947
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 318614690.1363636
    num_examples: 1541
  download_size: 48916576
  dataset_size: 318614690.1363636
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 318388955.4621212
    num_examples: 1541
  download_size: 48897912
  dataset_size: 318388955.4621212
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: text
    dtype: string
  splits:
  - name: train
    num_bytes: 942766
    num_examples: 300
  download_size: 593706
  dataset_size: 942766
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234_mcq_topk10
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top10_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 109375192.13636364
    num_examples: 529
  download_size: 17474840
  dataset_size: 109375192.13636364
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234_mcq_topk4
  features:
  - name: event_id
    dtype: string
  - name: fiction_id
    dtype: string
  - name: question_id
    dtype: string
  - name: span_answer
    dtype: string
  - name: natural_answer
    dtype: string
  - name: input
    dtype: string
  - name: input_w_fiction
    dtype: string
  - name: input_w_fictsheet
    dtype: string
  - name: target
    dtype: string
  - name: target_span
    dtype: string
  - name: alt_targets
    sequence: string
  - name: losses
    sequence: float32
  - name: ems
    sequence: float32
  - name: accs
    sequence: float32
  - name: targets_by_loss_tgt
    sequence: string
  - name: targets_by_loss_l
    sequence: float64
  - name: targets_by_acc_tgt
    sequence: string
  - name: targets_by_acc_a
    sequence: float64
  - name: targets_by_acc_loss_tgt
    sequence: string
  - name: targets_by_acc_loss_a
    sequence: float64
  - name: targets_by_acc_loss_l
    sequence: float64
  - name: top_scored_eq_target
    dtype: bool
  - name: top4_scored_conts_target
    dtype: bool
  - name: target_idx
    dtype: int64
  - name: topk_choices
    sequence: string
  splits:
  - name: train
    num_bytes: 109297701.12878788
    num_examples: 529
  download_size: 17468250
  dataset_size: 109297701.12878788
configs:
- config_name: event_split_fictions_webtext_train_ds_valratio0.33_seed1234
  data_files:
  - split: train
    path: event_split_fictions_webtext_train_ds_valratio0.33_seed1234/train-*
- config_name: event_split_fictions_webtext_train_ds_valratio0.33_seed1234_mcq_topk10
  data_files:
  - split: train
    path: event_split_fictions_webtext_train_ds_valratio0.33_seed1234_mcq_topk10/train-*
- config_name: event_split_fictions_webtext_train_ds_valratio0.33_seed1234_mcq_topk4
  data_files:
  - split: train
    path: event_split_fictions_webtext_train_ds_valratio0.33_seed1234_mcq_topk4/train-*
- config_name: event_split_fictions_webtext_val_ds_valratio0.33_seed1234
  data_files:
  - split: train
    path: event_split_fictions_webtext_val_ds_valratio0.33_seed1234/train-*
- config_name: event_split_fictions_webtext_val_ds_valratio0.33_seed1234_mcq_topk10
  data_files:
  - split: train
    path: event_split_fictions_webtext_val_ds_valratio0.33_seed1234_mcq_topk10/train-*
- config_name: event_split_fictions_webtext_val_ds_valratio0.33_seed1234_mcq_topk4
  data_files:
  - split: train
    path: event_split_fictions_webtext_val_ds_valratio0.33_seed1234_mcq_topk4/train-*
- config_name: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234
  data_files:
  - split: train
    path: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234/train-*
- config_name: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234_mcq_topk10
  data_files:
  - split: train
    path: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234_mcq_topk10/train-*
- config_name: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234_mcq_topk4
  data_files:
  - split: train
    path: event_split_fictsheets_webtext_train_ds_valratio0.33_seed1234_mcq_topk4/train-*
- config_name: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234
  data_files:
  - split: train
    path: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234/train-*
- config_name: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234_mcq_topk10
  data_files:
  - split: train
    path: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234_mcq_topk10/train-*
- config_name: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234_mcq_topk4
  data_files:
  - split: train
    path: event_split_fictsheets_webtext_val_ds_valratio0.33_seed1234_mcq_topk4/train-*
- config_name: fict_qa_cbqa_blind_inf_ex_dedup_ds
  data_files:
  - split: train
    path: fict_qa_cbqa_blind_inf_ex_dedup_ds/train-*
- config_name: fict_qa_cbqa_blind_inf_fuzzy_deduped_ds
  data_files:
  - split: train
    path: fict_qa_cbqa_blind_inf_fuzzy_deduped_ds/train-*
- config_name: fict_qa_cbqa_ds
  data_files:
  - split: train
    path: fict_qa_cbqa_ds/train-*
- config_name: fict_qa_cbqa_exact_deduped_ds
  data_files:
  - split: train
    path: fict_qa_cbqa_exact_deduped_ds/train-*
- config_name: fict_qa_cbqa_fuzzy_deduped_ds
  data_files:
  - split: train
    path: fict_qa_cbqa_fuzzy_deduped_ds/train-*
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds
  data_files:
  - split: train
    path: fict_qa_obqa_blind_inf_ex_dedup_ds/train-*
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk10_seed1234
  data_files:
  - split: train
    path: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk10_seed1234/train-*
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk10_seed1234_slim
  data_files:
  - split: train
    path: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk10_seed1234_slim/train-*
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk4_seed1234
  data_files:
  - split: train
    path: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk4_seed1234/train-*
- config_name: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk4_seed1234_slim
  data_files:
  - split: train
    path: fict_qa_obqa_blind_inf_ex_dedup_ds_Llama-3-2-3B-Instruct_scored_rowlimNone_altlimNone_topk4_seed1234_slim/train-*
- config_name: fict_qa_obqa_blind_inf_fuzzy_deduped_ds
  data_files:
  - split: train
    path: fict_qa_obqa_blind_inf_fuzzy_deduped_ds/train-*
- config_name: fict_qa_obqa_ds
  data_files:
  - split: train
    path: fict_qa_obqa_ds/train-*
- config_name: fict_qa_obqa_exact_deduped_ds
  data_files:
  - split: train
    path: fict_qa_obqa_exact_deduped_ds/train-*
- config_name: fict_qa_obqa_fuzzy_deduped_ds
  data_files:
  - split: train
    path: fict_qa_obqa_fuzzy_deduped_ds/train-*
- config_name: fictions_webtext_ds
  data_files:
  - split: train
    path: fictions_webtext_ds/train-*
- config_name: fictsheets_webtext_ds
  data_files:
  - split: train
    path: fictsheets_webtext_ds/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234/train-*
  default: true
- config_name: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valct1_styleNone_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_styleblog_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylecorporate_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_styleencyclopedia_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylenews_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_train_ds_valctNone_stylesocial_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valct1_styleNone_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylecorporate_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_styleencyclopedia_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylenews_seed1234_mcq_topk4/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234_mcq_topk10
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234_mcq_topk10/train-*
- config_name: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234_mcq_topk4
  data_files:
  - split: train
    path: style_strat_doc_split_fictions_val_ds_valctNone_stylesocial_seed1234_mcq_topk4/train-*
---
# Training splits view of the FictionalQA dataset

# The FictionalQA dataset

- **Repository:** Omitted
- **Paper:** Omitted

### Dataset Description

This dataset is a derivative of the main dataset [hf.co/datasets/XXXX-7/fictionalqa](XXXX). Please see that dataset's README for a detailed description of the assets.

The dataset splits (configs) provided here are the exact ones materialized and used in the experiments for the associated paper. The primary purpose of this repository is transparency: it is meant to help readers understand the experimental results in the paper. As such, the config names are deliberately verbose so that they are explicit and self-describing. Please see the experimental section of the paper for the splitting process used. The names of certain columns, such as `text`, `input`, and `response`, are chosen to align with the column names commonly expected in LLM training codebases for pretraining on webtext and finetuning on instruction and response pairs.
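
Because the config names encode the split parameters, they can be picked apart programmatically. The following is a minimal sketch, not an official naming specification: the regular expression is inferred from the `style_strat_doc_split_*` config names listed in this card's metadata, and the helper names `parse_style_split_config` and `load_config` are hypothetical. Note that, per the metadata, every config (including the `val` ones) is stored under a split named `train`.

```python
import re

# Pattern inferred from the config names in this card's metadata (an assumption,
# not a documented naming scheme). It only covers the style_strat_doc_split_* family.
_STYLE_SPLIT = re.compile(
    r"^style_strat_doc_split_fictions_(?P<part>train|val)_ds"
    r"_valct(?P<valct>None|\d+)"
    r"_style(?P<style>[A-Za-z]+)"
    r"_seed(?P<seed>\d+)"
    r"(?:_mcq_topk(?P<topk>\d+))?$"
)


def parse_style_split_config(name):
    """Return the parameters encoded in a config name, or None if it does not match."""
    m = _STYLE_SPLIT.match(name)
    if m is None:
        return None
    d = m.groupdict()
    return {
        "part": d["part"],
        "valct": None if d["valct"] == "None" else int(d["valct"]),
        "style": None if d["style"] == "None" else d["style"],
        "seed": int(d["seed"]),
        "mcq_topk": int(d["topk"]) if d["topk"] else None,
    }


def load_config(repo_id, config_name):
    """Load one config from the Hub.

    `repo_id` is a placeholder for this repository's Hub id, which is omitted
    in this card. Every config here exposes a single split named "train".
    """
    from datasets import load_dataset  # imported lazily; requires the `datasets` package

    return load_dataset(repo_id, name=config_name, split="train")
```

For example, `parse_style_split_config("style_strat_doc_split_fictions_val_ds_valctNone_styleblog_seed1234_mcq_topk10")` yields the `val` part of the `blog` style split in its 10-choice MCQ form.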

One notable inclusion in this repository is the multiple choice formatted versions of the fictional Q&A pairs presented in the main dataset. The alternate answers used for multiple choice evaluation were constructed post hoc, so they are part of this derived view of the data rather than of the main dataset. Future work could integrate the multiple choice formatting more natively into the question generation stage of the pipeline. Please see the relevant section in the paper for a description of the multiple choice construction process.
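
As a rough illustration of how a row from one of the `*mcq*` configs might be rendered as a lettered multiple choice prompt, consider the sketch below. The column names `input`, `topk_choices`, and `target_idx` are taken from this card's metadata, but the assumption that `target_idx` indexes the correct answer within `topk_choices` is ours, and the prompt layout is illustrative rather than the format used in the paper. The helper name `format_mcq` is hypothetical.

```python
from string import ascii_uppercase


def format_mcq(row):
    """Render one MCQ row as a lettered prompt and return (prompt, answer_letter).

    Assumes row["target_idx"] is the position of the correct answer inside
    row["topk_choices"]; verify this against the data before relying on it.
    """
    lines = [row["input"]]
    for letter, choice in zip(ascii_uppercase, row["topk_choices"]):
        lines.append(f"{letter}. {choice}")
    prompt = "\n".join(lines)
    answer_letter = ascii_uppercase[row["target_idx"]]
    return prompt, answer_letter


# Toy row invented for illustration; it is not taken from the dataset.
toy_row = {
    "input": "Which option is correct?",
    "topk_choices": ["first", "second", "third", "fourth"],
    "target_idx": 2,
}
prompt, answer = format_mcq(toy_row)
```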

### Supported Tasks

This dataset supports language model training experiments of various kinds. The fiction documents and structured fictsheets can both be used as plain-text documents in a pretraining setting, and the question and answer pairs can be used as instruction and response pairs for finetuning-style experiments. Because the data is synthetic and fictional, measurements of language modeling performance and question answering performance on it are largely uninfluenced by other training data. This makes it well suited to studying memorization, knowledge acquisition, unlearning, and many other topics in a controlled manner, even on top of pretrained base models.

#### Multiple choice question evals

As part of the experiments, we use the `*mcq*` configs provided in this dataset as source data for EleutherAI's lm-eval-harness. The task definitions required to run the MCQ tests in the harness are provided as a set of YAML files at the relative path `lm_eval/tasks/fictional_qa` in the generation repo (linked above). This directory needs to be copied into a checkout of the lm-eval-harness repository before the tasks can be run.
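
The copy step can be scripted. The sketch below assumes local checkouts of both repositories and assumes that the task directory should land at the same relative path, `lm_eval/tasks/fictional_qa`, inside the harness checkout; the function name `install_fictional_qa_tasks` and both path arguments are hypothetical.

```python
import shutil
from pathlib import Path

# Relative path of the task definitions, as stated in this card.
TASK_SUBDIR = Path("lm_eval") / "tasks" / "fictional_qa"


def install_fictional_qa_tasks(generation_repo, harness_repo):
    """Copy the fictional_qa task YAMLs from the generation repo into a harness checkout.

    The destination mirrors the source's relative path, which is an assumption
    about where the harness expects task definitions to live.
    """
    src = Path(generation_repo) / TASK_SUBDIR
    dst = Path(harness_repo) / TASK_SUBDIR
    if not src.is_dir():
        raise FileNotFoundError(f"task definitions not found at {src}")
    dst.parent.mkdir(parents=True, exist_ok=True)
    # dirs_exist_ok lets the copy be re-run to refresh existing files (Python 3.8+).
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst
```

Calling `install_fictional_qa_tasks("path/to/generation-repo", "path/to/lm-evaluation-harness")` returns the destination directory, which can then be inspected before running the harness.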


### Citation

```bibtex
Omitted
```