{"@context":{"@language":"en","@vocab":"https://schema.org/","citeAs":"cr:citeAs","column":"cr:column","conformsTo":"dct:conformsTo","cr":"http://mlcommons.org/croissant/","data":{"@id":"cr:data","@type":"@json"},"dataBiases":"cr:dataBiases","dataCollection":"cr:dataCollection","dataType":{"@id":"cr:dataType","@type":"@vocab"},"dct":"http://purl.org/dc/terms/","extract":"cr:extract","field":"cr:field","fileProperty":"cr:fileProperty","fileObject":"cr:fileObject","fileSet":"cr:fileSet","format":"cr:format","includes":"cr:includes","isEnumeration":"cr:isEnumeration","isLiveDataset":"cr:isLiveDataset","jsonPath":"cr:jsonPath","key":"cr:key","md5":"cr:md5","parentField":"cr:parentField","path":"cr:path","personalSensitiveInformation":"cr:personalSensitiveInformation","recordSet":"cr:recordSet","references":"cr:references","regex":"cr:regex","repeated":"cr:repeated","replace":"cr:replace","sc":"https://schema.org/","separator":"cr:separator","source":"cr:source","subField":"cr:subField","transform":"cr:transform","wd":"https://www.wikidata.org/wiki/"},"alternateName":"","conformsTo":"http://mlcommons.org/croissant/1.0","license":{"@type":"sc:CreativeWork","name":"Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0)","url":"https://creativecommons.org/licenses/by-nc-nd/4.0/"},"distribution":[{"contentUrl":"https://www.kaggle.com/api/v1/datasets/download/yimengfuyao/detectiumfire","contentSize":"95.52 GB","encodingFormat":"application/zip","@id":"archive.zip","@type":"cr:FileObject","name":"archive.zip","description":"Archive containing all the contents of the DetectiumFire dataset"},{"includes":"*.mp4","containedIn":{"@id":"archive.zip"},"encodingFormat":"video/mp4","@id":"video-mp4_fileset","@type":"cr:FileSet","name":"video/mp4 files","description":"video/mp4 files contained in 
archive.zip"},{"includes":"*.(jpg|jpeg)","containedIn":{"@id":"archive.zip"},"encodingFormat":"image/jpeg","@id":"image-jpeg_fileset","@type":"cr:FileSet","name":"image/jpeg files","description":"image/jpeg files contained in archive.zip"},{"includes":"*.png","containedIn":{"@id":"archive.zip"},"encodingFormat":"image/png","@id":"image-png_fileset","@type":"cr:FileSet","name":"image/png files","description":"image/png files contained in archive.zip"},{"includes":"*.avi","containedIn":{"@id":"archive.zip"},"encodingFormat":"video/x-msvideo","@id":"video-x-msvideo_fileset","@type":"cr:FileSet","name":"video/x-msvideo files","description":"video/x-msvideo files contained in archive.zip"},{"includes":"*.json","containedIn":{"@id":"archive.zip"},"encodingFormat":"application/json","@id":"application-json_fileset","@type":"cr:FileSet","name":"application/json files","description":"application/json files contained in archive.zip"},{"includes":"*.txt","containedIn":{"@id":"archive.zip"},"encodingFormat":"text/plain","@id":"text-plain_fileset","@type":"cr:FileSet","name":"text/plain files","description":"text/plain files contained in archive.zip"},{"includes":"*.webp","containedIn":{"@id":"archive.zip"},"encodingFormat":"image/webp","@id":"image-webp_fileset","@type":"cr:FileSet","name":"image/webp files","description":"image/webp files contained in archive.zip"}],"keywords":["subject \u003E arts and 
entertainment"],"isAccessibleForFree":true,"isLiveDataset":true,"includedInDataCatalog":{"@type":"sc:DataCatalog","name":"Kaggle","url":"https://www.kaggle.com"},"creator":{"@type":"sc:Person","name":"Yimengfuyao~","url":"/yimengfuyao","image":"https://storage.googleapis.com/kaggle-avatars/thumbnails/default-thumb.png"},"publisher":{"@type":"sc:Organization","name":"Kaggle","url":"https://www.kaggle.com/organizations/kaggle","image":"https://storage.googleapis.com/kaggle-organizations/4/thumbnail.png"},"thumbnailUrl":"https://storage.googleapis.com/kaggle-datasets-images/new-version-temp-images/default-backgrounds-99.png-9145440/dataset-card.png","dateModified":"2025-10-22T06:59:40.643","datePublished":"2025-10-16T15:47:53.0470099","@type":"sc:Dataset","name":"DetectiumFire","url":"https://www.kaggle.com/datasets/yimengfuyao/detectiumfire","description":"# \uD83D\uDD25 DetectiumFire Dataset\n\nDetectiumFire is a large-scale, multi-modal dataset designed to advance fire understanding in both traditional computer vision and modern vision-language tasks. 
It provides high-quality real and synthetic fire data, detailed annotations, and human preference feedback for training and evaluating object detectors, diffusion models, and vision-language models (VLMs).\n\nThe dataset contains:\n\n- Real-world fire and non-fire images and videos\n\n- Synthetic fire images generated via diffusion models (SFT and RLHF/DPO)\n\n- Human preference annotations for fine-tuning image generation models\n\n- Rich metadata including fire severity, environment, and descriptive captions\n\nThis dataset supports tasks such as fire detection, visual reasoning, synthetic data generation, and safety-critical AI development.\n\n## \uD83D\uDCC1 Dataset Structure\n\n\u0060\u0060\u0060python\nDetectiumFire/\n\u251C\u2500\u2500 preference_dataset/\n\u251C\u2500\u2500 real_images/\n\u251C\u2500\u2500 real_video/\n\u2514\u2500\u2500 synthetic_images/\n\u0060\u0060\u0060\n\n## preference_dataset/\n\nThis folder contains human preference data used to fine-tune diffusion models through Reinforcement Learning from Human Feedback (RLHF). The annotations are stored in the preference.json file and are critical for aligning generated images with human judgment across multiple quality dimensions.\n\n### \uD83D\uDCC4 preference.json\n\nEach entry in the JSON file represents a preference comparison between two images generated from the same text prompt. The format for each entry is as follows:\n\n\u0060\u0060\u0060json\n{\n  \u0022prompt\u0022: \u0022a fire truck is on the street with smoke coming out of it\u0022,\n  \u0022image1\u0022: \u0022img3/00000-1749640795.png\u0022,\n  \u0022image2\u0022: \u0022img2/00000-3687853181.png\u0022,\n  \u0022preference\u0022: 2,\n  \u0022reason\u0022: \u0022Preference: 2 Justification: 1. **General Preference**: The second image is more visually appealing and convincing... 
[truncated]\u0022\n}\n\u0060\u0060\u0060\n\n### Field Descriptions\n\n- prompt: A natural language text string used as input to the diffusion model to generate both images in the pair.\n\n- image1 / image2: File paths pointing to the two generated images being compared. These images are located in the corresponding subfolders under preference_dataset/.\n\n- preference: An integer value (1 or 2) indicating which image is preferred by human annotators based on the provided prompt.\n\n- reason: A natural language explanation justifying the preference. Justifications typically address three criteria: General Preference \u2013 overall plausibility and visual coherence. Visual Appeal \u2013 realism, artistic quality, and flame fidelity. Prompt Alignment \u2013 how accurately the image reflects the input prompt.\n\n### Usage\n\nThis preference data is used to train reward models or to directly optimize diffusion models via methods like Diffusion-DPO. The fine-grained human justifications also offer interpretability for analyzing generation quality trade-offs.\n\n\n## real_images/\n\nThis folder contains real-world fire-related and non-fire images with corresponding annotations and metadata. 
It is divided into two subfolders:\n\n- real_fire/: Images that contain fire.\n\n- real_non_fire/: Challenging negative samples without fire, used for robustness and false-positive analysis.\n\n### \uD83D\uDD25 real_fire/\n\nThis folder contains real fire images collected from various sources, organized for training and evaluation of object detection and multi-modal reasoning models.\n\n**Folder Structure**\n\n\u0060\u0060\u0060python\nreal_fire/\n\u251C\u2500\u2500 images/          # Fire images (.jpg or .png)\n\u251C\u2500\u2500 labels/          # YOLO-format bounding box annotations (.txt)\n\u2514\u2500\u2500 fire_prompts.json  # Metadata and fire descriptions\n\u0060\u0060\u0060\n\n#### \uD83E\uDDFE fire_prompts.json\n\nThis file provides detailed annotations and metadata for each fire image. Each entry in the JSON file follows this format:\n\n\u0060\u0060\u0060json\n{\n  \u0022image\u0022: \u0022msg5430551134-14167_jpg.rf.1fdbd3d8e3053b8b93f36844e46b8d71.jpg\u0022,\n  \u0022source\u0022: \u0022iot_device_detectium\u0022,\n  \u0022answer\u0022: \u0022The image shows a man and a woman in an office setting... Presence of People: Yes, the man and woman are visible in the scene.\u0022,\n  \u0022fire_prompt\u0022: \u0022A small flame is burning from a lighter held in a person\u0027s hand, indoors, with minor severity.\u0022,\n  \u0022fire_type\u0022: \u0022Indoor_lighter_flame\u0022\n}\n\u0060\u0060\u0060\n\n\uD83D\uDD11 **Field Descriptions**:\n\n- image: Filename of the fire image, located in real_fire/images/.\n\n- source: The origin of the image, as categorized in Appendix C.1 of the paper. Possible values include: web_search, iot_device_detectium, FIRE, Forest Fire, and FireNET. 
\n\n- answer: Raw descriptive caption generated by GPT-4o, containing structured observations about: Scene context, Objects on fire (if any), Fire severity level, Affected area, and Presence of people.\n\n- fire_prompt: Final edited, human-verified fire prompt used for text-to-image generation and fine-tuning diffusion models.\n\n- fire_type: Detailed taxonomy label corresponding to the fire type, following the hierarchical categorization described in Appendix C.4.\n\n\uD83D\uDCE6 **Usage**:\n\nThe YOLO-format labels in labels/ align with the filenames in images/ and can be directly used for training fire detection models (e.g., YOLOv11).\n\nThe metadata (fire_prompt, fire_type, and answer) supports tasks like vision-language reasoning, fire severity classification, and diffusion model fine-tuning.\n\n### \uD83D\uDEAB real_non_fire/\n\nThis folder contains images that do not contain fire but may include visually similar artifacts such as sunsets, strong lighting in dark environments, smoke or fog, and red/orange objects.\n\nThese serve as challenging negative samples for improving the false-positive resistance of fire detection systems.\n\n## real_video/\n\nThis folder contains real-world fire and non-fire video clips, curated to support fire detection, classification, and video understanding tasks. 
All videos are manually verified and segmented to include meaningful content.\n\n### \uD83D\uDCC1 Folder Structure:\n\n\u0060\u0060\u0060python\nreal_video/\n\u251C\u2500\u2500 fire/         # Videos containing visible fire\n\u2514\u2500\u2500 non_fire/     # Videos without fire\n\u0060\u0060\u0060\n\n### \uD83D\uDD25 fire/\n\nContains short video clips (\u226510 seconds) with clearly visible fire scenes.\n\nIncludes a broad range of real-world fire scenarios: indoor/outdoor, low/high severity, controlled/uncontrolled.\n\nVideos are curated from web sources as described in Appendix C.1 of the paper.\n\nDesigned for training and evaluating video-based fire detection and temporal reasoning systems.\n\n### \uD83D\uDEAB non_fire/\n\nContains challenging negative video clips where no actual fire is present.\n\nUseful for robustness testing and minimizing false alarms in video-based fire detection models.\n\n### \uD83D\uDCE6 Usage:\n\nThe video clips can be used for: fire video classification, temporal object detection and tracking, training video diffusion models (in future versions), and building fire-aware video-language models.\n\n\uD83D\uDD27 Note: Metadata and bounding box annotations for videos are not provided in this version, but will be included in future dataset releases.\n\n## synthetic_images/\n\nThis folder contains synthetically generated fire images produced using diffusion models fine-tuned on DetectiumFire prompts. 
It includes training-ready datasets and evaluation outputs across different training strategies.\n\n### \uD83D\uDCC1 Folder Structure:\n\n\u0060\u0060\u0060python\nsynthetic_images/\n\u251C\u2500\u2500 dpo_stable_diff_v15/\n\u2502   \u2514\u2500\u2500 train/               # Images \u002B YOLO annotations from DPO fine-tuning\n\u251C\u2500\u2500 stable_diff_v15/\n\u2502   \u2514\u2500\u2500 train/               # Images \u002B YOLO annotations from supervised fine-tuning (SFT)\n\u2514\u2500\u2500 evaluation/\n    \u251C\u2500\u2500 dpo_output/          # Evaluation images from the DPO fine-tuned model\n    \u251C\u2500\u2500 original_output/     # Evaluation images from the base (not fine-tuned) model\n    \u251C\u2500\u2500 sft_output/          # Evaluation images from the SFT fine-tuned model\n    \u2514\u2500\u2500 prompt.txt           # Prompts used to generate the evaluation images\n\u0060\u0060\u0060\n\n### \uD83D\uDD27 dpo_stable_diff_v15/train/\n\nSynthetic fire images generated using Diffusion-DPO, a preference-based fine-tuning method.\n\nIncludes YOLO-format bounding box annotations, allowing direct use for fire object detection training.\n\nCaptures high prompt alignment, visual realism, and stylistic consistency based on human preference signals.\n\n### \uD83D\uDCD8 stable_diff_v15/train/\n\nImages generated via supervised fine-tuning (SFT) of Stable Diffusion v1.5.\n\nAlso includes YOLO annotations.\n\nOffers more diversity but slightly less prompt alignment than DPO-generated data.\n\n### \uD83E\uDDEA evaluation/\n\nThis folder contains outputs from the three models (original, SFT, and DPO) evaluated on a shared prompt set.\n\n#### \uD83D\uDCC4 prompt.txt\n\nContains a list of prompts used to generate the evaluation images.\n\nPrompts describe diverse fire scenes and are held out from the training set to ensure unbiased comparison.\n\n#### \uD83D\uDCC2 Subfolders:\n\n- dpo_output/: Images generated using the DPO fine-tuned model.\n\n- sft_output/: Images from 
the SFT fine-tuned model.\n\n- original_output/: Images from the pre-trained Stable Diffusion v1.5 base model, with no fine-tuning.\n\nThese outputs are used to compare visual fidelity, prompt alignment, and preference ratings in LLM-based evaluations (see paper Section 4.2 and Appendix E.4\u2013E.5)."}