\documentclass[twocolumn]{aastex631}

\usepackage{amsmath}
\usepackage{multirow}
\usepackage{natbib}
\usepackage{graphicx} 
\usepackage{aas_macros}

\begin{document}

The advent of gravitational-wave (GW) astronomy, initiated by the direct detection of binary black hole (BBH) mergers by the LIGO-Virgo-KAGRA (LVK) collaboration, has reshaped our understanding of the most energetic astrophysical events. By analyzing these GW signals, we can infer fundamental properties of black holes, explore their formation mechanisms, and probe the nature of spacetime under extreme conditions. However, the reliability and precision of these astrophysical interpretations depend critically on the theoretical waveform models used to describe the dynamics of merging compact objects. These models approximate the solutions of Einstein's field equations with varying levels of physical fidelity, ranging from highly accurate but computationally expensive numerical relativity (NR) simulations to more efficient analytical and phenomenological approximations.

The complexity of BBH dynamics, particularly for systems with high masses, significant spins, and orbital precession, necessitates the use of such approximate models. While NR simulations serve as invaluable benchmarks, their computational cost prohibits their routine use in large-scale parameter inference campaigns. Consequently, a diverse suite of semi-analytical and phenomenological models has been developed, each with its own trade-offs in computational efficiency, the inclusion of higher-order waveform modes, and the treatment of spin precession. This diversity among waveform models introduces model-dependent biases into the inferred astrophysical parameters. It poses a substantial challenge to deriving robust scientific conclusions, particularly for events like GW231123, which exhibits characteristics indicative of a high-mass and potentially highly precessing system.
For such events, the systematic uncertainties arising from waveform modeling can become a dominant factor, potentially outweighing the statistical uncertainties of the measurement itself. The core problem extends beyond merely observing discrepancies between models; it demands an understanding of \emph{why} these models yield different results and, crucially, which physical aspects of the source are most sensitive to the model approximations. Without a clear and systematic understanding of these dependencies, our capacity to confidently constrain astrophysical properties is severely hampered.

In this paper, we introduce and implement a physics-informed framework designed to systematically decompose and attribute discrepancies among multiple gravitational-wave waveform models for the GW231123 event \citep{li2025gw231123productsuccessivemergers}. We investigate five distinct waveform models: NRSur7dq4, IMRPhenomXO4a, SEOBNRv5PHM, IMRPhenomXPHM, and IMRPhenomTPHM \citep{li2025gw231123productsuccessivemergers}. Our central innovation, termed ``Physics-Informed Discrepancy Decomposition,'' goes beyond simple global comparisons of posterior distributions. Instead, we define a set of physically motivated parameter subspaces: the binary's mass and distance, the effective spin, the individual spin components and orbital orientation, and the properties of the final remnant black hole.
By quantifying multi-dimensional divergences within these targeted subspaces, we aim to establish direct correlations between the observed discrepancies in inferred parameters and known differences in the underlying physical approximations of the waveform models, such as their treatment of spin precession or the inclusion of higher-order waveform modes \citep{li2025gw231123productsuccessivemergers}.

To validate the approach, we first perform an exploratory data analysis. This includes comparing one-dimensional marginal posteriors using the Jensen-Shannon Divergence (JSD) and the 1-Wasserstein distance, which establishes a baseline of model agreement and disagreement. We then employ Uniform Manifold Approximation and Projection (UMAP) to visualize the high-dimensional degeneracies and discrepancies across the full parameter space, revealing how the different models occupy and cluster within this space. The subsequent physics-informed decomposition provides quantitative metrics of disagreement within each of our predefined physical subspaces. By systematically correlating these subspace-specific discrepancies with the known physical characteristics and approximation schemes of the waveform models, we identify which astrophysical properties of GW231123 are consistently constrained across all models and which remain highly sensitive to specific waveform model approximations. This methodology allows us to provide clear, interpretable insights into the robustness of astrophysical inferences and to derive a more reliable set of consensus constraints for GW231123.
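
For concreteness, the two one-dimensional metrics named above can be written in their standard textbook forms. The notation is introduced here for illustration only and is an assumption of this sketch: $p$ and $q$ denote the marginal posterior densities of a single parameter $x$ under two waveform models, $m = (p + q)/2$ is their equal-weight mixture, and $F_p$ and $F_q$ are the corresponding cumulative distribution functions. The Jensen-Shannon divergence is
\begin{equation}
\mathrm{JSD}(p \,\|\, q) = \frac{1}{2} D_{\mathrm{KL}}(p \,\|\, m) + \frac{1}{2} D_{\mathrm{KL}}(q \,\|\, m),
\end{equation}
where
\begin{equation}
D_{\mathrm{KL}}(p \,\|\, m) = \int p(x) \ln \frac{p(x)}{m(x)} \, dx
\end{equation}
is the Kullback-Leibler divergence. With the natural logarithm, the JSD is symmetric and bounded, $0 \le \mathrm{JSD} \le \ln 2$; with a base-2 logarithm the upper bound is 1 bit. In one dimension, the 1-Wasserstein distance reduces to
\begin{equation}
W_1(p, q) = \int_{-\infty}^{\infty} \left| F_p(x) - F_q(x) \right| \, dx ,
\end{equation}
which carries the units of $x$. The two metrics are complementary: the JSD saturates once two posteriors no longer overlap, whereas $W_1$ continues to grow with their separation. These definitions do not fix how the quantities are estimated from posterior samples (for example, by binning or kernel density estimation); that choice is left open in this sketch.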

\end{document}
                