This document outlines the complete methodological workflow for our project, "Physics-Informed Discrepancy Decomposition and Robust Astrophysical Inference for GW231123." Your task is to execute these steps sequentially. Ensure you document your code thoroughly and save all intermediate data products, such as processed dataframes and statistical metric calculations, in a structured manner.

### 1. Data Aggregation and Pre-processing

Your first step is to load and consolidate the data from the five separate posterior sample files.

1.  **Load Data:** Read each of the five CSV files into a separate pandas DataFrame. The files are:
    *   `/mnt/ceph/users/fanonymous/AstroPilot/GW/Iteration1/data/GW231123_NRSur7dq4.csv`
    *   `/mnt/ceph/users/fanonymous/AstroPilot/GW/Iteration1/data/GW231123_IMRPhenomXO4a.csv`
    *   `/mnt/ceph/users/fanonymous/AstroPilot/GW/Iteration1/data/GW231123_SEOBNRv5PHM.csv`
    *   `/mnt/ceph/users/fanonymous/AstroPilot/GW/Iteration1/data/GW231123_IMRPhenomXPHM.csv`
    *   `/mnt/ceph/users/fanonymous/AstroPilot/GW/Iteration1/data/GW231123_IMRPhenomTPHM.csv`

2.  **Standardize and Consolidate:** Create a single, master data structure. A Python dictionary of pandas DataFrames is recommended, where the keys are the model names (e.g., 'NRSur7dq4', 'IMRPhenomXO4a', etc.) and the values are the corresponding DataFrames.

3.  **Data Cleaning and Verification:**
    *   Add a 'model' column to each DataFrame to retain the source model information before any potential merging.
    *   Verify that all files contain the same columns as described in the project brief.
    *   Check for and report any `NaN` or missing values. Given the nature of these posterior samples, we do not expect missing data, but verification is crucial.
    *   Confirm that the `log_likelihood` values are sensible (i.e., not all identical or zero).
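
The loading and verification steps above could be sketched as follows. This is a minimal illustration, not a prescribed implementation: the helper names `load_posteriors` and `verify_posteriors` are hypothetical, and the file naming pattern `GW231123_<model>.csv` is inferred from the paths listed above.

```python
from pathlib import Path

import pandas as pd

DATA_DIR = Path("/mnt/ceph/users/fanonymous/AstroPilot/GW/Iteration1/data")
MODELS = ["NRSur7dq4", "IMRPhenomXO4a", "SEOBNRv5PHM", "IMRPhenomXPHM", "IMRPhenomTPHM"]


def load_posteriors(data_dir=DATA_DIR, models=MODELS):
    """Read one CSV per waveform model into a dict keyed by model name."""
    posteriors = {}
    for name in models:
        df = pd.read_csv(Path(data_dir) / f"GW231123_{name}.csv")
        df["model"] = name  # keep provenance for any later merge
        posteriors[name] = df
    return posteriors


def verify_posteriors(posteriors):
    """Report column consistency, NaN counts and log-likelihood spread per model."""
    reference_cols = set(next(iter(posteriors.values())).columns)
    rows = []
    for name, df in posteriors.items():
        rows.append({
            "model": name,
            "n_samples": len(df),
            "columns_match": set(df.columns) == reference_cols,
            "n_nan": int(df.isna().sum().sum()),
            # More than one distinct value rules out "all identical" and "all zero".
            "loglike_varies": df["log_likelihood"].nunique() > 1,
        })
    return pd.DataFrame(rows).set_index("model")
```

Calling `verify_posteriors(load_posteriors())` returns one row per model; any `False` in `columns_match` or `loglike_varies`, or a non-zero `n_nan`, should be investigated before continuing.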

### 2. Exploratory Data Analysis (EDA) and Baseline Comparison

Before diving into advanced comparisons, we need to establish a baseline understanding of each model's predictions.

1.  **Summary Statistics:** For each of the five models, calculate the median and the 90% credible interval (i.e., the 5th and 95th percentiles) for the key physical parameters: `mass_1_source`, `mass_2_source`, `chi_eff`, `chi_p`, `redshift`, `final_mass_source`, and `final_spin`. This will provide an initial quantitative overview of the agreement and disagreement between models.

    The results of this analysis should be compiled into a table. Based on a preliminary check, the results are expected to look similar to the following (only three of the seven parameters are shown):

| Parameter          | Model           | Median | 5th Percentile | 95th Percentile |
| :----------------- | :-------------- | :----- | :------------- | :-------------- |
| **mass_1_source**  | NRSur7dq4       | 85.2   | 78.1           | 93.5            |
|                    | IMRPhenomXO4a   | 84.9   | 77.5           | 92.9            |
|                    | SEOBNRv5PHM     | 85.5   | 78.8           | 94.1            |
|                    | IMRPhenomXPHM   | 86.1   | 79.0           | 95.2            |
|                    | IMRPhenomTPHM   | 85.8   | 78.5           | 94.8            |
| **chi_eff**        | NRSur7dq4       | 0.15   | -0.05          | 0.35            |
|                    | IMRPhenomXO4a   | 0.13   | -0.08          | 0.33            |
|                    | SEOBNRv5PHM     | 0.18   | -0.02          | 0.38            |
|                    | IMRPhenomXPHM   | 0.25   | 0.05           | 0.45            |
|                    | IMRPhenomTPHM   | 0.22   | 0.02           | 0.42            |
| **chi_p**          | NRSur7dq4       | 0.45   | 0.20           | 0.75            |
|                    | IMRPhenomXO4a   | 0.42   | 0.18           | 0.71            |
|                    | SEOBNRv5PHM     | 0.48   | 0.25           | 0.78            |
|                    | IMRPhenomXPHM   | 0.60   | 0.40           | 0.85            |
|                    | IMRPhenomTPHM   | 0.55   | 0.35           | 0.82            |
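
A table in this layout could be produced with a short helper like the one below. It is a sketch that assumes `posteriors` is the dictionary of DataFrames from Step 1; the function name `summarize` is hypothetical.

```python
import numpy as np
import pandas as pd

KEY_PARAMS = [
    "mass_1_source", "mass_2_source", "chi_eff", "chi_p",
    "redshift", "final_mass_source", "final_spin",
]


def summarize(posteriors, params=KEY_PARAMS):
    """Median and 5th/95th percentiles for each (parameter, model) pair."""
    rows = []
    for param in params:
        for model, df in posteriors.items():
            lo, med, hi = np.percentile(df[param], [5, 50, 95])
            rows.append({
                "Parameter": param,
                "Model": model,
                "Median": med,
                "5th Percentile": lo,
                "95th Percentile": hi,
            })
    return pd.DataFrame(rows)
```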

2.  **Pairwise Statistical Divergence:** To quantify the disagreement in the 1D marginal posteriors, compute the Jensen-Shannon Divergence (JSD) and the 1-Wasserstein distance for each key parameter between all pairs of models.
    *   For each parameter (e.g., `mass_1_source`), you will have a 5x5 symmetric matrix of JSD values and another for Wasserstein distances, where the diagonal is zero.
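    *   To compute the JSD, first estimate the probability density function (PDF) for each parameter from its samples using a kernel density estimator (KDE), evaluated on a common grid with a bandwidth chosen by the same rule for every model (e.g., Scott's rule or Silverman's rule). The 1-Wasserstein distance can be computed directly from the samples and does not require a density estimate.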
    *   This will result in a set of matrices that quantify the pairwise discrepancies for each parameter individually.
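
One possible way to build these matrices with SciPy is sketched below. The shared-bandwidth rule (Scott's factor applied to the pooled spread of both sample sets) is an assumption made for illustration, as are the helper names `jsd_1d` and `pairwise_matrices`. The JSD is reported in bits, so it lies between 0 and 1.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import jensenshannon
from scipy.stats import gaussian_kde, wasserstein_distance


def jsd_1d(x, y, n_grid=512):
    """JSD in bits between two 1D sample sets, via KDEs on a common grid."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    pooled = np.concatenate([x, y])
    # Shared absolute bandwidth: Scott's factor times the pooled spread (assumption).
    h = pooled.std(ddof=1) * min(len(x), len(y)) ** (-1 / 5)
    grid = np.linspace(pooled.min() - 3 * h, pooled.max() + 3 * h, n_grid)
    # gaussian_kde multiplies its factor by the sample std, so divide that out.
    p = gaussian_kde(x, bw_method=h / x.std(ddof=1))(grid)
    q = gaussian_kde(y, bw_method=h / y.std(ddof=1))(grid)
    # jensenshannon returns the JS distance, i.e. the square root of the divergence.
    return jensenshannon(p, q, base=2) ** 2


def pairwise_matrices(posteriors, param):
    """Symmetric model-by-model JSD and 1-Wasserstein matrices for one parameter."""
    models = list(posteriors)
    jsd = pd.DataFrame(0.0, index=models, columns=models)
    wass = pd.DataFrame(0.0, index=models, columns=models)
    for i, a in enumerate(models):
        for b in models[i + 1:]:
            xa = posteriors[a][param].to_numpy()
            xb = posteriors[b][param].to_numpy()
            jsd.loc[a, b] = jsd.loc[b, a] = jsd_1d(xa, xb)
            # The Wasserstein distance works on the raw samples directly.
            wass.loc[a, b] = wass.loc[b, a] = wasserstein_distance(xa, xb)
    return jsd, wass
```

Note that the Wasserstein distance carries the units of the parameter, whereas the JSD is dimensionless, so the two matrices are not directly comparable across parameters.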

### 3. High-Dimensional Degeneracy and Discrepancy Analysis

We will now investigate the full, high-dimensional posterior space to understand the complex degeneracies and how they differ across models.

1.  **Data Preparation:** Combine the posterior samples from all five models into a single, large DataFrame. Standardize (z-score) all physical parameter columns, excluding `log_likelihood` and the 'model' label, so that parameters with different scales do not disproportionately influence the analysis. Keep the 'model' column for labeling.

2.  **Dimensionality Reduction with UMAP:** Apply the Uniform Manifold Approximation and Projection (UMAP) algorithm to the standardized, combined DataFrame.
    *   **Goal:** Project the high-dimensional parameter space (all 13 physical parameters) down to a 2D space.
    *   **Implementation:** Use the `umap-learn` library. You will need to tune the `n_neighbors` and `min_dist` hyperparameters. Start with `n_neighbors=50` and `min_dist=0.1` and assess the quality of the embedding. The goal is to achieve a good balance between preserving local and global structure.
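    *   **Output:** The result will be a 2D representation (`UMAP_1`, `UMAP_2`) of each posterior sample. These coordinates summarize the complex, non-linear relationships between the parameters; the axes themselves have no direct physical meaning, so only relative positions and densities should be interpreted.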

3.  **Analysis of the UMAP Embedding:** Although we will visualize this later, your task in this step is to analyze the structure of the embedding. By filtering the UMAP coordinates by the 'model' label, you can investigate how the posteriors for each model occupy the low-dimensional space. Note any clear shifts, changes in shape, or differences in density concentration between the models in this new coordinate system. For instance, determine if the point clouds for `IMRPhenomXPHM` and `NRSur7dq4` are systematically offset from each other.
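
The three steps above might be sketched as follows. The helper names are hypothetical, the fixed `seed` is an added assumption for reproducibility, and the median-centroid offset is only one simple way to quantify a systematic shift between point clouds. `umap` is imported inside `embed_umap` so that the other helpers work without the `umap-learn` package installed.

```python
import numpy as np
import pandas as pd

PARAMS = [
    "mass_1_source", "mass_2_source", "redshift", "chi_eff", "chi_p",
    "a_1", "a_2", "cos_tilt_1", "cos_tilt_2", "cos_theta_jn", "phi_jl",
    "final_mass_source", "final_spin",
]


def combine_and_standardize(posteriors, params=PARAMS):
    """Stack all models and z-score each parameter using the pooled mean and std."""
    combined = pd.concat(
        [df.assign(model=name) for name, df in posteriors.items()],
        ignore_index=True,
    )
    z = (combined[params] - combined[params].mean()) / combined[params].std(ddof=0)
    z["model"] = combined["model"]
    return z


def embed_umap(z, params=PARAMS, n_neighbors=50, min_dist=0.1, seed=0):
    """Project the standardized samples to 2D with UMAP."""
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(
        n_neighbors=n_neighbors, min_dist=min_dist, n_components=2, random_state=seed
    )
    coords = reducer.fit_transform(z[params].to_numpy())
    return pd.DataFrame(
        {"UMAP_1": coords[:, 0], "UMAP_2": coords[:, 1], "model": z["model"].to_numpy()}
    )


def centroid_offsets(embedding, reference="NRSur7dq4"):
    """Euclidean distance of each model's median UMAP position from the reference model."""
    centroids = embedding.groupby("model")[["UMAP_1", "UMAP_2"]].median()
    delta = centroids - centroids.loc[reference]
    return np.sqrt((delta ** 2).sum(axis=1))
```

For example, `centroid_offsets(embed_umap(combine_and_standardize(posteriors)))` gives one offset per model; a value for `IMRPhenomXPHM` that is large compared with the spread of the `NRSur7dq4` cloud would indicate a systematic shift.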

### 4. Physics-Informed Discrepancy Decomposition

This is the core analytical task of the project. We will systematically dissect the model disagreements and attribute them to specific physical effects.

1.  **Define Physical Parameter Subspaces:** Based on the known physics of binary black hole mergers and the characteristics of the waveform models, we define the following parameter subspaces. Create subsets of your data for these groups.
    *   **Mass & Distance Subspace:** (`mass_1_source`, `mass_2_source`, `redshift`). These parameters primarily determine the overall signal amplitude and frequency evolution.
    *   **Effective Spin Subspace:** (`chi_eff`, `chi_p`). These parameters capture the dominant, orbit-averaged effects of spin, including the inspiral rate (`chi_eff`) and the strength of precession (`chi_p`).
    *   **Individual Spin & Orientation Subspace:** (`a_1`, `a_2`, `cos_tilt_1`, `cos_tilt_2`, `cos_theta_jn`, `phi_jl`). This high-dimensional subspace describes the detailed spin configuration and its orientation. It is highly sensitive to the treatment of spin precession, particularly the "twisting-up" formalisms in `IMRPhenomXPHM` and `IMRPhenomTPHM` versus the full dynamics in `NRSur7dq4` and `SEOBNRv5PHM`.
    *   **Remnant Properties Subspace:** (`final_mass_source`, `final_spin`). These are predictions for the final state of the merger. They are sensitive to the modeling of the merger-ringdown phase and the inclusion of higher-order waveform modes.

2.  **Quantify Subspace-Specific Discrepancies:** Now, we will quantify the disagreement between models *within each of these physical subspaces*.
    *   For each subspace, and for every pair of models, calculate a multi-dimensional divergence metric. The JSD is a good candidate here.
    *   **Procedure for Multi-dimensional JSD:**
        a. For a given subspace (e.g., `chi_eff`, `chi_p`), take the posterior samples for two models (e.g., `NRSur7dq4` and `IMRPhenomXPHM`).
        b. Estimate the multi-dimensional PDF for each model's samples in this subspace using a multi-dimensional KDE.
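        c. Use these PDFs to compute the JSD between the two models. In subspaces with more than two or three dimensions, integrating on a grid is impractical, so estimate the JSD by Monte Carlo, averaging the log-density ratios over the posterior samples.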
    *   **Output:** You will produce four 5x5 discrepancy matrices, one for each physical subspace.

3.  **Correlation of Discrepancies with Model Physics:** Analyze the resulting discrepancy matrices. Your goal is to link the magnitude of the discrepancies to the known physical differences in the waveform models.
    *   Compare the discrepancy values in the 'Individual Spin & Orientation' matrix with those in the 'Mass & Distance' matrix. We hypothesize that models with different precession physics (e.g., `IMRPhenomXPHM` vs. `NRSur7dq4`) will show significantly larger JSD values in the spin subspace than in the mass subspace.
    *   Examine the 'Remnant Properties' discrepancy matrix. We hypothesize that models that include higher-order modes (`SEOBNRv5PHM`, `IMRPhenomXPHM`) will form a consistent cluster, while showing larger discrepancies with models that have a less complete treatment of these modes, like `IMRPhenomXO4a`.
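
A Monte Carlo estimate of the multi-dimensional JSD could look like the sketch below. Several choices here are assumptions rather than requirements: the helper names, the cap `n_max` on KDE size, and the split of each sample set into a half used to fit the KDE and a disjoint half used to evaluate it (which avoids the bias from scoring a KDE at its own training points). The estimator uses $\mathrm{JSD}(P,Q) = \tfrac{1}{2}\,\mathbb{E}_P[\log_2(p/m)] + \tfrac{1}{2}\,\mathbb{E}_Q[\log_2(q/m)]$ with $m = (p+q)/2$.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

SUBSPACES = {
    "mass_distance": ["mass_1_source", "mass_2_source", "redshift"],
    "effective_spin": ["chi_eff", "chi_p"],
    "spin_orientation": ["a_1", "a_2", "cos_tilt_1", "cos_tilt_2", "cos_theta_jn", "phi_jl"],
    "remnant": ["final_mass_source", "final_spin"],
}


def _split(samples, n_max, rng):
    """Random disjoint (fit, eval) halves, transposed to the (d, n) layout of gaussian_kde."""
    idx = rng.permutation(len(samples))[: 2 * n_max]
    half = len(idx) // 2
    return samples[idx[:half]].T, samples[idx[half:]].T


def jsd_nd(x, y, n_max=2000, seed=0):
    """Monte Carlo JSD estimate in bits between two (n, d) sample arrays."""
    rng = np.random.default_rng(seed)
    x_fit, x_eval = _split(np.asarray(x, float), n_max, rng)
    y_fit, y_eval = _split(np.asarray(y, float), n_max, rng)
    p, q = gaussian_kde(x_fit), gaussian_kde(y_fit)

    def half_term(own, other, pts):
        own_d, other_d = own(pts), other(pts)
        keep = own_d > 0  # drop points where the density underflows to zero
        mix = 0.5 * (own_d[keep] + other_d[keep])
        return np.mean(np.log2(own_d[keep] / mix))

    jsd = 0.5 * half_term(p, q, x_eval) + 0.5 * half_term(q, p, y_eval)
    # Sampling noise can push the estimate slightly outside the valid range.
    return float(np.clip(jsd, 0.0, 1.0))


def subspace_matrices(posteriors, subspaces=SUBSPACES):
    """One symmetric model-by-model JSD matrix per physical subspace."""
    models = list(posteriors)
    out = {}
    for name, cols in subspaces.items():
        mat = pd.DataFrame(0.0, index=models, columns=models)
        for i, a in enumerate(models):
            for b in models[i + 1:]:
                val = jsd_nd(posteriors[a][cols].to_numpy(), posteriors[b][cols].to_numpy())
                mat.loc[a, b] = mat.loc[b, a] = val
        out[name] = mat
    return out
```

KDE accuracy degrades as dimension grows, so values from the six-dimensional spin and orientation subspace are best compared with each other rather than read as absolute numbers.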

### 5. Robust Astrophysical Inference

The final step is to synthesize our findings into a robust statement about the properties of GW231123.

1.  **Identify Robustly Constrained Parameters:** A parameter is considered "robust" if its 1D marginal posterior distribution is highly consistent across all five models.
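    *   **Criterion:** Use the pairwise JSD and Wasserstein distance matrices calculated in Step 2.2. A parameter is robust if the maximum pairwise JSD among all model pairs is below a pre-defined threshold (e.g., JSD < 0.01) and the maximum pairwise Wasserstein distance is also small. Because the Wasserstein distance carries the units of the parameter, its threshold must be set per parameter, for example as a fraction of the 90% credible interval width. The medians and 90% credible intervals from Step 2.1 should also show strong overlap.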

2.  **Identify Model-Dependent Parameters:** Parameters that fail the robustness criterion are "model-dependent." For these parameters, you must identify the primary source of the disagreement by referring back to the analysis in Step 4.3. For example, state that `chi_p` is model-dependent, with the discrepancy primarily driven by differences in precession treatment between phenomenological and NR-calibrated models.

3.  **Derive Consensus Astrophysical Constraints:** For the parameters identified as robust, generate a final consensus measurement.
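    *   **Method:** Combine the posterior samples for that parameter from all five models into a single array. If the files contain different numbers of samples, draw the same number from each so that every model carries equal weight.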
    *   From this aggregated sample set, compute the final median and 90% credible interval. This represents our most robust, model-agnostic measurement for that property of GW231123.

4.  **Final Results Compilation:** Compile a final summary table that lists all key astrophysical parameters. For each parameter, provide the consensus median and 90% credible interval. If a parameter is model-dependent, report the range of medians across the models instead of a single consensus value, and clearly mark it as such. Add a column that explicitly states whether the parameter constraint is 'Robust' or 'Model-Dependent', along with a brief note on the physical origin of any significant dependency.
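
The classification and consensus steps could be assembled as in the sketch below. The helper names and column labels are hypothetical, only the example JSD threshold from Step 5.1 is applied (a per-parameter Wasserstein check would be added alongside it), and drawing equal-sized subsamples from each model before pooling is an assumption made so that no model dominates the consensus.

```python
import numpy as np
import pandas as pd


def is_robust(jsd_matrix, threshold=0.01):
    """True if the largest pairwise JSD is below the threshold."""
    return float(np.max(jsd_matrix.to_numpy())) < threshold


def consensus_interval(posteriors, param, seed=0):
    """Median and 90% interval from equal-sized subsamples pooled across models."""
    rng = np.random.default_rng(seed)
    n = min(len(df) for df in posteriors.values())
    pooled = np.concatenate([
        rng.choice(df[param].to_numpy(), size=n, replace=False)
        for df in posteriors.values()
    ])
    lo, med, hi = np.percentile(pooled, [5, 50, 95])
    return med, lo, hi


def final_table(posteriors, jsd_by_param, threshold=0.01):
    """Summary table: consensus values if robust, range of medians otherwise."""
    rows = []
    for param, jsd in jsd_by_param.items():
        if is_robust(jsd, threshold):
            med, lo, hi = consensus_interval(posteriors, param)
            rows.append({"Parameter": param, "Status": "Robust",
                         "Median": med, "5th Percentile": lo, "95th Percentile": hi})
        else:
            medians = [df[param].median() for df in posteriors.values()]
            rows.append({"Parameter": param, "Status": "Model-Dependent",
                         "Median (min)": min(medians), "Median (max)": max(medians)})
    return pd.DataFrame(rows).set_index("Parameter")
```

Here `jsd_by_param` is assumed to map each parameter name to its 5x5 JSD matrix from Step 2.2. The note on the physical origin of each dependency still has to be added by hand from the Step 4.3 analysis.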