# CURT-Point: A Two-Stage Curriculum and Heuristic Gate for High-Precision Change-Point Detection

This repository contains the supplementary material for our ICLR 2026 conference paper submission, "Deep Stratigraphic inference: A Two-Stage Training Curriculum and Heuristic Gate for High-Precision Change Point Detection". The primary component is a self-contained tutorial notebook that demonstrates the core methodology of our proposed **CURT-Point** framework.

---

## ⚠️ Important Note on Reproducibility and Performance

Due to proprietary constraints on data extraction, we provide only a representative snippet of Dataset C with the code submission in the supplementary documents. The primary purpose of this submission is to provide **conceptual reproducibility** and to allow for the verification of our code and pipeline.

The included **tutorial notebook (`Tutorial_CURT_point.ipynb`)** and the **sample data (`sample_data/`)** are designed to execute our entire pipeline end-to-end without errors on a small scale.

**Key Points:**

*   The numerical results generated from running this notebook **will not match** the results in the paper.
*   Due to the extremely small size of the sample data, simpler models like the Heuristic-Based Method may appear to perform better within the notebook. This is an artifact of the low-data regime.
*   On the full, real-world datasets, as demonstrated conclusively in our paper's main results (Table 1), the **Two-Stage End-to-End model and our final Hybrid System are decisively superior.**


**Analysis of results on the sample dataset**
* On the sample data, the Heuristic baseline performs better than the Two-Stage E2E method. The main reason is that the classification backbone on which the Two-Stage E2E model is built is itself not a specialist. The regression stage applies a soft-argmax to the classification logits to produce a more precise result, but this does not help in this scenario because there is too little data to train the backbone.

* However, our final model with the Heuristic gate achieves the best performance by detecting the outliers: its recall is similar to that of the Heuristic baseline, and the median has decreased by half. This indicates that, when the classification logits are clearly distributed, the model provides more precise change-points.
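
To make the role of the soft-argmax concrete, the following is a minimal, generic sketch and not the notebook's implementation. The function name `soft_argmax`, the `temperature` parameter, and the example logits are assumptions chosen for illustration. It computes the softmax-weighted mean index, which stays close to the true peak when the logits are sharply separated and drifts toward the middle of the window when they are not.

```python
import math


def soft_argmax(logits, temperature=1.0):
    """Return the softmax-weighted mean index of `logits` as a float.

    Hypothetical helper for illustration; `temperature` must be positive.
    """
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(logits)
    weights = [math.exp((x - peak) / temperature) for x in logits]
    total = sum(weights)
    # Expected index under the softmax distribution over positions.
    return sum(i * w for i, w in enumerate(weights)) / total


# Sharply separated logits: the estimate stays close to index 1.
well_separated = [0.0, 8.0, 0.0, 0.0, 0.0]
# Weakly separated logits: the estimate is pulled toward the window centre.
poorly_separated = [0.0, 0.5, 0.0, 0.0, 0.0]

print(soft_argmax(well_separated))
print(soft_argmax(poorly_separated))
```

Both inputs have their hard argmax at index 1, so the gap between the two printed values shows how much the estimate depends on well-trained, clearly separated logits.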

---

## 🚀 Getting Started

The recommended way to explore our work is to open and run the `Tutorial_CURT_point.ipynb` notebook. It contains all the necessary source code for the models and training procedures, along with step-by-step explanations.

### 1. Environment Setup

Install the required packages either by installing the core dependencies with pip or by running the `Install.ipynb` notebook.
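
After installation, a quick check like the one below can confirm that the libraries are importable before opening the tutorial. This is a sketch only: the helper `missing_modules` is hypothetical, and the import names in `CORE_MODULES` are assumed from the libraries named under Core Dependencies in this README.

```python
from importlib.util import find_spec

# Import names assumed for the libraries listed under "Core Dependencies".
CORE_MODULES = ["torch", "fastai", "pandas", "numpy", "matplotlib"]


def missing_modules(names):
    """Return the entries of `names` that cannot be located for import."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(CORE_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All core dependencies are importable.")
```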

### 2. Run the Tutorial Notebook

## Project Structure

*   `Tutorial_CURT_point.ipynb`: The main file containing all code, explanations, and a demonstration of the full training and evaluation pipeline on sample data.
*   `sample_data/`: A small, representative snippet of our most complex dataset (Dataset C) to ensure the notebook is runnable.
*   `Install.ipynb`: A notebook that installs the required packages with pip.
*   `LICENSE`: License for the code and data.
*   `README.md`: This file.

## Core Dependencies

Our implementation relies on the following major libraries:

*   **PyTorch:** For building and training the neural networks.
*   **fastai:** For the high-level training loop and data loading abstractions.
*   **pandas:** For data manipulation and the rolling-mean heuristic.
*   **NumPy:** For numerical operations.
*   **matplotlib:** For plotting and visualizations.
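
This README does not describe the rolling-mean heuristic itself, so the following is only a generic illustration of how a rolling-mean change-point detector can work, not the method used in the notebook. The function name `rolling_mean_change_point`, the `window` parameter, and the toy signal are assumptions. The sketch uses only the standard library rather than pandas so that it runs without extra dependencies.

```python
def rolling_mean_change_point(values, window):
    """Return the index where the mean of the next `window` samples differs
    most from the mean of the previous `window` samples.

    Hypothetical helper for illustration only.
    """
    if window < 1 or len(values) < 2 * window:
        raise ValueError("need window >= 1 and at least 2 * window samples")
    best_index, best_gap = None, -1.0
    for i in range(window, len(values) - window + 1):
        left = sum(values[i - window:i]) / window    # trailing mean
        right = sum(values[i:i + window]) / window   # leading mean
        gap = abs(right - left)
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index


# Toy signal with a level shift starting at index 5.
signal = [1.0, 1.1, 0.9, 1.0, 1.05, 3.0, 3.1, 2.9, 3.0, 3.05]
print(rolling_mean_change_point(signal, window=3))
```

The returned index is the first sample of the new regime under this convention; a larger `window` smooths noise at the cost of blurring closely spaced changes.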

---

## Data Availability

The full, curated datasets used in this study are proprietary and cannot be made publicly available. The raw data was sourced from public repositories maintained by the respective US states (Colorado, Wyoming) and the Norwegian Petroleum Directorate. The provided `sample_data/` is a small, anonymized snippet of the preprocessed Dataset C, intended solely for demonstrating the code's functionality.