## Results and Discussion

This section presents the results of applying the Quantum-Inspired Tensor Train (QITT) enhanced multi-scale substructure analysis, incorporating learned topological embeddings, to cosmological parameter estimation from dark matter halo merger trees. We evaluate the approach against several baseline methods and then interpret the learned representations and the QITT components.

### 1. Data Processing, Substructure Characterization, and Feature Engineering Summary

The initial dataset consisted of 1000 merger trees. Node features (`log10(mass)`, `log10(concentration)`, `log10(Vmax)`, `scale_factor`) were normalized using statistics derived from the training set (700 trees). For instance, the original means of `log10(mass)` and `scale_factor` were 11.14 and 0.37, respectively; all four features were transformed to have zero mean and unit standard deviation for subsequent processing.
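
The normalization step can be sketched as follows. This is a minimal illustration on synthetic data, not the pipeline's code: the function names, the array sizes, and every distribution parameter other than the two reported means (11.14 and 0.37) are assumptions chosen for the example.

```python
import numpy as np

def fit_standardizer(train_nodes):
    """Per-feature mean and std, computed on training-set nodes only."""
    mean = train_nodes.mean(axis=0)
    std = train_nodes.std(axis=0)
    std = np.where(std > 0, std, 1.0)  # guard against constant features
    return mean, std

def standardize(nodes, mean, std):
    """Apply training-set statistics to any split."""
    return (nodes - mean) / std

# Synthetic stand-in for node features. Column order:
# log10(mass), log10(concentration), log10(Vmax), scale_factor.
# Only the means 11.14 and 0.37 come from the text; the rest is arbitrary.
rng = np.random.default_rng(0)
loc = [11.14, 0.9, 2.0, 0.37]
scale = [0.8, 0.2, 0.3, 0.2]
train_nodes = rng.normal(loc=loc, scale=scale, size=(5000, 4))
test_nodes = rng.normal(loc=loc, scale=scale, size=(1000, 4))

mean, std = fit_standardizer(train_nodes)
train_z = standardize(train_nodes, mean, std)
test_z = standardize(test_nodes, mean, std)  # reuses training statistics
```

Fitting the statistics on the training split alone keeps validation and test information out of the preprocessing, so the held-out features are only approximately zero-mean and unit-variance.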

Substructures were identified within each merger tree by traversing from merger events and applying an adaptive threshold based on the 20th percentile of `log10(M_sub_progenitor / M_main_progenitor)` within each tree. This resulted in an average of 47.45 substructures per tree, with a median of 32 and a range from 2 to 563 substructures. For each substructure, a 10-dimensional physical feature vector was extracted, capturing properties such as mass ratios, merger scale factors, differences in normalized halo properties (concentration, Vmax) between merging progenitors, and intrinsic properties of the substructure branch (e.g., mean normalized concentration/Vmax, scale factor span, number of halos). The `num_halos_in_branch` feature, for example, had a mean of 21.4 and a standard deviation of 54.0 across all identified substructures.
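
The per-tree adaptive threshold can be illustrated with a short sketch. The text does not state which side of the threshold is retained, so the selection rule below (keep mergers at or above the 20th percentile) is an assumption, as are the function name and the example mass ratios.

```python
import numpy as np

def adaptive_ratio_threshold(log_mass_ratios, percentile=20.0):
    """Per-tree threshold: the given percentile of log10(M_sub / M_main)."""
    return float(np.percentile(log_mass_ratios, percentile))

# Hypothetical log10 mass ratios for the mergers of a single tree.
log_ratios = np.array([-3.0, -2.5, -2.0, -1.5, -1.0, -0.5])

threshold = adaptive_ratio_threshold(log_ratios)

# ASSUMPTION: mergers at or above the threshold seed a substructure.
selected = log_ratios >= threshold
```

Because the threshold is a percentile of each tree's own distribution, the number of retained mergers scales with the size of the tree, which is consistent with the wide range of substructure counts reported above.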

A GraphSAGE-based autoencoder was trained on all substructures from the training set (33,759 substructures) in a self-supervised manner to reconstruct node features. The encoder part, consisting of two SAGEConv layers (input 4 features -> 128 hidden -> 64 output), was then used to generate a 64-dimensional topological embedding for each substructure graph via global mean pooling of its node embeddings. The GNN training achieved a final average loss of approximately 0.00014 after 5 epochs.

For each merger tree, the physical features and the 64-dimensional topological embedding for each of its substructures were concatenated, forming a 74-dimensional feature vector per substructure (`D_feat_combined = 10 + 64`). These lists of substructure feature vectors were then padded or truncated to a fixed length of `MAX_N_SUB = 60` substructures per tree. Padding was achieved using a canonical null substructure representation (zero physical features and the GNN embedding of a single null node). This resulted in a tensor of shape (60, 74) for each merger tree.
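
A minimal sketch of the padding and truncation step is given below. The helper name `pad_or_truncate` is hypothetical, the null embedding is a random placeholder standing in for the GNN embedding of a single null node, and keeping the first `MAX_N_SUB` rows on truncation is an assumption, since the text does not specify the ordering.

```python
import numpy as np

MAX_N_SUB = 60
D_FEAT_COMBINED = 74  # 10 physical features + 64-dimensional embedding

def pad_or_truncate(sub_feats, null_row, max_n_sub=MAX_N_SUB):
    """Return a (max_n_sub, D) array: keep the first rows, pad with null_row."""
    sub_feats = np.asarray(sub_feats, dtype=float).reshape(-1, null_row.shape[0])
    out = np.tile(null_row, (max_n_sub, 1))  # start from all-null rows
    n_keep = min(len(sub_feats), max_n_sub)
    out[:n_keep] = sub_feats[:n_keep]
    return out

rng = np.random.default_rng(0)
# Null substructure: zero physical features plus a placeholder null embedding.
null_row = np.concatenate([np.zeros(10), rng.normal(size=64)])

small_tree = rng.normal(size=(2, D_FEAT_COMBINED))    # fewest substructures reported
large_tree = rng.normal(size=(563, D_FEAT_COMBINED))  # most substructures reported

small_padded = pad_or_truncate(small_tree, null_row)
large_truncated = pad_or_truncate(large_tree, null_row)
```

With a median of 32 substructures per tree, more than half of the trees are padded, so the choice of null representation affects a large share of the tensor entries.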

### 2. QITT Decomposition and Feature Generation

The (60, 74) feature tensor for each tree was then prepared for Quantum-Inspired Tensor Train (QITT) decomposition using the TensorLy library (backend: NumPy). The 74-dimensional feature space per substructure (`D_feat_combined`) was reshaped into two factors (2, 37), resulting in a 3rd-order tensor of shape (60, 2, 37) for each tree.

Tensor Train (TT) decomposition was applied to this 3rd-order tensor. The TT-ranks were selected via cross-validation on the validation set (150 trees) by optimizing the sum of RMSEs for \(\Omega_m\) and \(\sigma_8\) prediction using a Ridge regression model. The candidate internal ranks `r1` (connecting mode 0 of size 60 to mode 1 of size 2) and `r2` (connecting mode 1 of size 2 to mode 2 of size 37) were swept through values [2, 4, 6, 8]. The optimal ranks were found to be `r1=2` and `r2=2`, yielding a full TT-rank tuple of (1, 2, 2, 1). This rank configuration achieved the best sum RMSE of 0.0925 on the validation set during rank search.
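
For reference, the three-core TT format implied by these ranks can be written explicitly. The core symbols \(G_0, G_1, G_2\), the tensor symbol \(\mathcal{X}\), and the index names are notation introduced here for illustration:

\[
\mathcal{X}(i, j, k) \;\approx\; \sum_{\alpha_1 = 1}^{r_1} \sum_{\alpha_2 = 1}^{r_2} G_0(1, i, \alpha_1)\, G_1(\alpha_1, j, \alpha_2)\, G_2(\alpha_2, k, 1),
\qquad i = 1, \dots, 60,\; j = 1, 2,\; k = 1, \dots, 37 .
\]

With \(r_1 = r_2 = 2\), the cores \(G_0\), \(G_1\), and \(G_2\) have shapes (1, 60, 2), (2, 2, 2), and (2, 37, 1), respectively.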

The TT-cores resulting from this decomposition were flattened and concatenated to form a single feature vector for each merger tree. With the optimal ranks (1, 2, 2, 1) and tensor dimensions (60, 2, 37), the resulting QITT feature vector had a dimension of 202 (`1*60*2 + 2*2*2 + 2*37*1 = 120 + 8 + 74 = 202`). This QITT-derived feature vector served as the input for the final regression models.
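
The decomposition and flattening can be sketched in plain NumPy. This is a rank-truncated TT-SVD written for illustration, not the TensorLy call used in the pipeline; the function name and the synthetic low-rank tensor are assumptions.

```python
import numpy as np

def tt_svd_3way(x, r1, r2):
    """Rank-truncated TT-SVD of a 3rd-order tensor; returns three cores."""
    n0, n1, n2 = x.shape
    # First unfolding: mode 0 against modes (1, 2).
    u, s, vt = np.linalg.svd(x.reshape(n0, n1 * n2), full_matrices=False)
    r1 = min(r1, len(s))
    core0 = u[:, :r1].reshape(1, n0, r1)
    # Carry the singular values to the right and unfold again.
    rest = (s[:r1, None] * vt[:r1]).reshape(r1 * n1, n2)
    u, s, vt = np.linalg.svd(rest, full_matrices=False)
    r2 = min(r2, len(s))
    core1 = u[:, :r2].reshape(r1, n1, r2)
    core2 = (s[:r2, None] * vt[:r2]).reshape(r2, n2, 1)
    return core0, core1, core2

# Synthetic (60, 2, 37) tensor with exact TT-ranks (2, 2), for illustration only.
rng = np.random.default_rng(0)
a = rng.normal(size=(60, 2))
b = rng.normal(size=(2, 2, 2))
c = rng.normal(size=(2, 37))
tree_tensor = np.einsum("ia,ajb,bk->ijk", a, b, c)

cores = tt_svd_3way(tree_tensor, r1=2, r2=2)
features = np.concatenate([core.ravel() for core in cores])  # length 202
recon = np.einsum("aib,bjc,ckd->ijk", *cores)  # contract cores back together
```

In this left-to-right sweep the first two cores are built from orthonormal singular vectors, so the last core carries the overall scale of the tensor. The cores are also determined only up to sign and basis changes on the bond indices, which is worth keeping in mind when their flattened entries are used directly as regression features.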

### 3. Cosmological Parameter Estimation Performance

The efficacy of the QITT-derived features was evaluated by training Linear Regression, Random Forest, and XGBoost models to predict \(\Omega_m\) and \(\sigma_8\). These were compared against the following baseline feature sets (four were planned; B3 was not evaluated):
*   **B1_Aggregate:** 11 global aggregate features per tree (e.g., total mass, mean normalized node features).
*   **B2_RawSubPhys:** Flattened raw physical features from substructures (60 substructures * 10 physical features/substructure = 600 features).
*   **B3_GraphletCounts:** (Skipped due to implementation complexity in the automated pipeline).
*   **B4_FlatCombined:** Flattened combined physical and topological features from substructures (60 substructures * 74 features/substructure = 4440 features), without QITT decomposition.

All input features were standardized before training. Hyperparameters for Random Forest and XGBoost models were tuned using 5-fold cross-validation on the combined training and validation sets (850 trees), optimizing a custom scorer based on the negative sum of RMSEs for \(\Omega_m\) and \(\sigma_8\).
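
The custom objective can be written as a small function. This is a sketch under the assumption that the two targets are stored as columns of a single array; the function name `neg_sum_rmse` and the example values are hypothetical.

```python
import numpy as np

def neg_sum_rmse(y_true, y_pred):
    """Negative sum of per-target RMSEs; larger (closer to zero) is better."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmse_per_target = np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))
    return -float(rmse_per_target.sum())

# Columns: Omega_m, sigma_8 (made-up values for illustration).
y_true = np.array([[0.30, 0.80], [0.40, 0.90]])
y_pred = np.array([[0.30, 0.70], [0.40, 1.00]])
score = neg_sum_rmse(y_true, y_pred)  # Omega_m RMSE is 0, sigma_8 RMSE is 0.1
```

A function of this form could be wrapped with scikit-learn's `make_scorer` for use in a cross-validated search. Note that an unweighted sum lets the parameter with the larger error scale, here \(\sigma_8\), dominate the objective.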

#### 3.1. Overall Model Comparison

Table 1 summarizes the performance of all models on the test set (150 trees) in terms of Root Mean Squared Error (RMSE) and Coefficient of Determination (R²) for \(\Omega_m\) and \(\sigma_8\).

**Table 1: Test Set Performance of Regression Models for Cosmological Parameter Estimation**
| Model Configuration                | RMSE \(\Omega_m\) | RMSE \(\sigma_8\) | R² \(\Omega_m\) | R² \(\sigma_8\) |
|------------------------------------|-----------------|-----------------|---------------|---------------|
| QITT_LinearRegression              | 0.0246          | 0.0658          | 0.9231        | 0.6206        |
| QITT_RandomForest                  | 0.0320          | 0.0763          | 0.8696        | 0.4896        |
| QITT_XGBoost                       | 0.0303          | 0.0711          | 0.8834        | 0.5577        |
| B1_Aggregate_LinearRegression      | 0.0155          | 0.0654          | 0.9696        | 0.6257        |
| B1_Aggregate_RandomForest          | 0.0404          | 0.0750          | 0.7924        | 0.5070        |
| B1_Aggregate_XGBoost               | 0.0304          | 0.0759          | 0.8822        | 0.4951        |
| B2_RawSubPhys_LinearRegression     | 0.1711          | 0.1973          | -2.7180       | -2.4075       |
| B2_RawSubPhys_RandomForest         | 0.0648          | 0.0935          | 0.4672        | 0.2347        |
| B2_RawSubPhys_XGBoost              | 0.0553          | 0.0891          | 0.6109        | 0.3042        |
| B4_FlatCombined_LinearRegression   | 0.0404          | 0.1486          | 0.7928        | -0.9339       |
| B4_FlatCombined_RandomForest       | 0.0452          | 0.0826          | 0.7401        | 0.4024        |
| B4_FlatCombined_XGBoost            | 0.0377          | 0.0817          | 0.8194        | 0.4159        |

The performance comparison is also visualized in Figure S1 (RMSE Comparison, based on `model_comparison_rmse_1_TIMESTAMP.png`) and Figure S2 (R² Comparison, based on `model_comparison_r2_score_2_TIMESTAMP.png`).

#### 3.2. Performance of QITT-based Models

Among the models using QITT-derived features, **QITT_LinearRegression** somewhat unexpectedly performs best on both parameters: \(\Omega_m\) (RMSE 0.0246, R² 0.9231) and \(\sigma_8\) (RMSE 0.0658, R² 0.6206). The QITT_XGBoost model (RMSE \(\Omega_m\)=0.0303, R²=0.8834; RMSE \(\sigma_8\)=0.0711, R²=0.5577) and QITT_RandomForest (RMSE \(\Omega_m\)=0.0320, R²=0.8696; RMSE \(\sigma_8\)=0.0763, R²=0.4896) perform somewhat worse than the linear model on this feature set. This suggests that the QITT transformation, with the chosen ranks, may effectively linearize the relationship between the substructure information and the cosmological parameters, or that the 202 QITT features are already well suited to a linear fit.

#### 3.3. Comparison with Baselines

*   **B1_Aggregate (Aggregate Features):** The B1_Aggregate_LinearRegression model achieved the best test-set performance of all configurations for both \(\Omega_m\) (RMSE 0.0155, R² 0.9696) and \(\sigma_8\) (RMSE 0.0654, R² 0.6257). This indicates that simple global properties of the merger trees are highly informative, especially for \(\Omega_m\). The QITT_LinearRegression model nearly matches it for \(\sigma_8\) (RMSE 0.0658 vs. 0.0654) but has a clearly higher \(\Omega_m\) error (RMSE 0.0246 vs. 0.0155), so it does not surpass this simple baseline.

*   **B2_RawSubPhys (Raw Substructure Physical Features):** This baseline performed poorly. The B2_RawSubPhys_LinearRegression model yielded negative R² values, indicating it performed worse than a mean predictor. While Random Forest and XGBoost improved upon this, their R² values (e.g., XGBoost: \(\Omega_m\) R²=0.6109, \(\sigma_8\) R²=0.3042) were lower than those of the corresponding Random Forest and XGBoost models on the QITT, B1, and B4 feature sets. This highlights the difficulty of directly using high-dimensional, potentially noisy raw substructure features without further processing such as topological embedding or QITT.

*   **B4_FlatCombined (Flattened Combined Physical and Topological Features):** This baseline uses the same per-substructure information as the input to the QITT process (physical features + GNN topological embeddings) but simply flattens them into a very high-dimensional vector (4440 features). The B4_FlatCombined_XGBoost model (RMSE \(\Omega_m\)=0.0377, R²=0.8194; RMSE \(\sigma_8\)=0.0817, R²=0.4159) performed worse than the QITT_XGBoost model. This suggests that the QITT decomposition provides a more effective and compact representation of the (60, 74) tensor than simple flattening, leading to better generalization for the XGBoost model. The B4_FlatCombined_LinearRegression model also struggled, especially with \(\sigma_8\) (R²=-0.9339), likely due to the high dimensionality and multicollinearity.
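
The negative R² values above can be read directly from Table 1. Assuming the usual definition \(R^2 = 1 - \mathrm{RMSE}^2 / \mathrm{Var}(y)\) on the test set, each row implies a target standard deviation, and R² turns negative exactly when the RMSE exceeds it. The sketch below uses three \(\Omega_m\) rows from Table 1; the helper name is hypothetical.

```python
import math

def implied_target_std(rmse, r2):
    """Invert R^2 = 1 - RMSE^2 / Var(y) to recover the test-set std of y."""
    return rmse / math.sqrt(1.0 - r2)

# (RMSE, R^2) for Omega_m, copied from Table 1.
rows_omega_m = {
    "QITT_LinearRegression": (0.0246, 0.9231),
    "B1_Aggregate_LinearRegression": (0.0155, 0.9696),
    "B2_RawSubPhys_LinearRegression": (0.1711, -2.7180),
}

stds = {
    name: implied_target_std(rmse, r2)
    for name, (rmse, r2) in rows_omega_m.items()
}

# R^2 < 0 exactly when the model's RMSE exceeds the target's std.
b2_rmse = rows_omega_m["B2_RawSubPhys_LinearRegression"][0]
b2_worse_than_mean = b2_rmse > stds["B2_RawSubPhys_LinearRegression"]
```

All three rows imply nearly the same \(\Omega_m\) standard deviation, which is a quick consistency check on the table, and the B2 linear model's RMSE is roughly twice that value.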

#### 3.4. Statistical Significance

Paired t-tests were performed on the squared errors of the test set predictions to compare the QITT_XGBoost model (as a representative advanced QITT model) against the XGBoost models from key baselines.
*   **QITT_XGBoost vs. B1_Aggregate_XGBoost:**
    *   \(\Omega_m\): p-value = 0.9537. No significant difference.
    *   \(\sigma_8\): p-value = 0.1734. No significant difference.
    This suggests that for XGBoost, the QITT features do not offer a statistically significant improvement over simple aggregate features, which are already very powerful.

*   **QITT_XGBoost vs. B2_RawSubPhys_XGBoost:**
    *   \(\Omega_m\): p-value = 1.8866e-08. QITT_XGBoost is significantly better.
    *   \(\sigma_8\): p-value = 2.8041e-05. QITT_XGBoost is significantly better.
    This confirms that the feature engineering pipeline (including GNN embeddings and QITT) significantly improves upon using raw physical substructure features.

*   **QITT_XGBoost vs. B4_FlatCombined_XGBoost:**
    *   \(\Omega_m\): p-value = 0.0104. QITT_XGBoost is significantly better.
    *   \(\sigma_8\): p-value = 0.0014. QITT_XGBoost is significantly better.
    This is a crucial result, indicating that the QITT decomposition provides a statistically significant advantage over simply using the flattened high-dimensional feature tensors that include topological information. The QITT method effectively compresses and structures these features.
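
The test statistic behind these comparisons can be sketched as follows. The example uses synthetic predictions, not the pipeline's outputs, and the function name is hypothetical; only the sample size of 150 matches the test set.

```python
import numpy as np

def paired_t_statistic(sq_err_a, sq_err_b):
    """Paired t statistic on per-sample squared errors of two models."""
    d = np.asarray(sq_err_a, dtype=float) - np.asarray(sq_err_b, dtype=float)
    n = d.size
    t = d.mean() / (d.std(ddof=1) / np.sqrt(n))
    return float(t), n - 1  # statistic and degrees of freedom

# Synthetic example: model A has smaller errors than model B on 150 samples.
rng = np.random.default_rng(0)
y_true = rng.uniform(0.1, 0.5, size=150)
pred_a = y_true + rng.normal(scale=0.03, size=150)
pred_b = y_true + rng.normal(scale=0.06, size=150)

t_stat, dof = paired_t_statistic((y_true - pred_a) ** 2, (y_true - pred_b) ** 2)
```

A negative statistic means model A has the lower mean squared error. The two-sided p-value follows from a Student t distribution with `dof` degrees of freedom, which is what `scipy.stats.ttest_rel` returns alongside the statistic.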

#### 3.5. Predicted vs. True Values

Figure S7 (Pred vs True \(\Omega_m\), based on `pred_vs_true_QITT_XGBoost_Omegam_...png`) and Figure S8 (Pred vs True \(\sigma_8\), based on `pred_vs_true_QITT_XGBoost_sigma8_...png`) show the scatter plots of predicted versus true values for \(\Omega_m\) and \(\sigma_8\) using the QITT_XGBoost model. For \(\Omega_m\), the predictions align closely with the y=x line, reflecting the high R² value of 0.8834. The scatter is noticeably larger for \(\sigma_8\), consistent with the lower R² of 0.5577, indicating greater difficulty in constraining this parameter with the current feature set and models. No strong systematic biases are apparent in either plot.

### 4. Analysis of Learned Representations

#### 4.1. Topological Embeddings

The GNN-derived topological embeddings aim to capture structural information from the substructure graphs. Figure 1 (t-SNE of embeddings, based on `tsne_topological_embeddings_1_TIMESTAMP.png`) visualizes a t-SNE projection of 10,000 such 64-dimensional embeddings, subsampled from the training set substructures. The points are colored by the number of halos in their respective substructure branches (log scale).

The t-SNE plot reveals some diffuse clustering. While distinct, well-separated clusters are not immediately obvious, there are regions with higher densities of points. The color gradient, representing the number of halos, shows some coherence across the t-SNE space, suggesting that substructures with similar sizes (in terms of halo count) tend to be mapped to nearby regions in the embedding space. For example, regions with predominantly blue/purple hues (fewer halos) can be distinguished from regions with yellow/green hues (more halos). This indicates that the GNN has learned to encode information related to the size and extent of the substructures within its topological embeddings, which is a physically meaningful property. The range of halos in the plotted substructures was from 1 to 1178, with a median of 10.

#### 4.2. QITT Core Analysis

The QITT decomposition transforms the (60, 2, 37) tensor for each tree into three cores with shapes determined by the optimal ranks (1, 2, 2, 1): Core 0: (1, 60, 2), Core 1: (2, 2, 2), Core 2: (2, 37, 1). Figure 2 (TT-core magnitudes, based on `tt_core_magnitudes_2_TIMESTAMP.png`) shows the distribution of the signed element values within each of these three cores for an example tree tensor from the training set.

*   **Core 0 (Shape: (1, 60, 2)):** The values are concentrated around zero, ranging from approximately -0.16 to 0.65. This core interfaces the `MAX_N_SUB` dimension (60 substructures) with the first internal TT bond rank (2).
*   **Core 1 (Shape: (2, 2, 2)):** This small internal core also has values mostly near zero, with some elements approaching 1.0; its values range from approximately -0.03 to 1.0.
*   **Core 2 (Shape: (2, 37, 1)):** This core exhibits the widest range of values, from approximately -66.6 to 302.3. It connects the second internal TT bond rank (2) to the last dimension of the reshaped features (size 37). Part of this contrast may reflect a scaling convention rather than a physical signal: if the decomposition follows the standard left-to-right TT-SVD sweep, the earlier cores are built from orthonormal singular vectors and the final core absorbs the singular values, and hence the overall scale of the tensor. The large values therefore show where the tensor's norm is stored, and only secondarily which feature combinations along this dimension are significant.

The distributions suggest that the TT representation is not excessively sparse for these low ranks, and that Core 2 carries the elements with the largest magnitudes. The flattened and concatenated elements of these cores form the 202 features used by the regression models.

Feature importance plots for QITT_RandomForest (Figure S3, based on `feature_importances_QITT_RandomForest_...png`) and QITT_XGBoost (Figure S4, based on `feature_importances_QITT_XGBoost_...png`) show the top 20 most important features out of the 202 QITT-derived features. A direct physical interpretation of individual QITT features is challenging because each feature is an element from one of the TT-cores, representing a complex, compressed interaction term. However, these plots demonstrate that the models rely on a subset of these compressed features. Future work could involve attempting to deconstruct the high-importance QITT features back to understand which original substructure properties or interactions they represent. For comparison, feature importances for the B4_FlatCombined models (Figures S5 and S6) are also provided, which operate on a much larger, uncompressed feature space.

#### 4.3. Qualitative View of Substructures

Figure 3 (Example substructures, based on `example_substructure_graphs_3_TIMESTAMP.png`) provides a qualitative visualization of two example substructures extracted from the first training tree. Nodes are colored by their scale factor. The first example is a large substructure with 200 nodes, spanning a scale factor range from approximately 0.13 to 0.71. The second is a much smaller substructure with 12 nodes, existing over a narrower scale factor range (0.34 to 0.45). These visualizations illustrate the diversity in size and temporal extent of the substructures that are processed by the GNN to generate topological embeddings and subsequently contribute to the QITT features.

### 5. Discussion

#### 5.1. Efficacy of the QITT-Enhanced Approach

The primary goal was to assess if a QITT-enhanced multi-scale substructure analysis, incorporating learned topological embeddings, could effectively estimate cosmological parameters. The results present a nuanced picture.

The QITT-based models, particularly QITT_LinearRegression and QITT_XGBoost, demonstrated strong predictive power, achieving R² values for \(\Omega_m\) up to 0.9231 and for \(\sigma_8\) up to 0.6206. Crucially, the QITT_XGBoost model significantly outperformed the B4_FlatCombined_XGBoost model (which uses the same input information prior to QITT but in a flattened, uncompressed form) for both \(\Omega_m\) (p=0.0104) and \(\sigma_8\) (p=0.0014). This highlights the benefit of the QITT decomposition in creating a more compact and effective feature representation from the high-dimensional per-substructure data (physical + topological features). The QITT approach compresses 4440 features (60 substructures * 74 features/substructure) down to 202 features, a roughly 22-fold reduction, while improving predictive performance for the non-linear models.

However, the very simple B1_Aggregate_LinearRegression model, using only 11 global tree features, achieved the highest R² for both \(\Omega_m\) (0.9696) and \(\sigma_8\) (0.6257), the latter only marginally above QITT_LinearRegression (0.6206). This suggests that for the current dataset and feature engineering, global properties of the merger trees are extremely informative, especially for \(\Omega_m\). While the QITT approach processes much richer, fine-grained substructure information, it did not, in its current implementation, surpass this simpler baseline in terms of raw predictive accuracy on the test set. The QITT_XGBoost model was also not statistically different from the B1_Aggregate_XGBoost model.

#### 5.2. Role of Topological Information and Substructure Analysis

The inclusion of GNN-derived topological embeddings was intended to capture complex structural information beyond simple physical features of substructures. The t-SNE visualization (Figure 1) indicated that these embeddings do capture aspects related to substructure size (number of halos).

The significant outperformance of QITT_XGBoost over B2_RawSubPhys_XGBoost (which uses only physical substructure features, without topological embeddings or QITT on the combined set) strongly suggests that the combination of topological embeddings and the subsequent QITT processing is beneficial. The B2 baseline, relying on 600 raw physical features, struggled significantly, indicating that these features alone are either too noisy or not sufficiently expressive without further abstraction or combination. The GNN embeddings provide this abstraction for the topological aspect.

The comparison between QITT_XGBoost and B4_FlatCombined_XGBoost (both use topological embeddings) isolates the effect of QITT. The superior performance of QITT_XGBoost indicates that the tensor decomposition is a more effective way to handle the (`MAX_N_SUB`, `D_feat_combined`) structure than simple flattening, especially for subsequent non-linear models.

#### 5.3. Interpretation of Model Performance Differences

The strong performance of B1_Aggregate_LinearRegression, especially for \(\Omega_m\), implies that \(\Omega_m\) imprints a strong, relatively simple signal on the global characteristics of merger trees (like total mass, average node properties, or tree size). \(\sigma_8\), which relates to the amplitude of matter fluctuations, might be expected to influence more subtle aspects of substructure and merger histories, potentially explaining why it is generally harder to constrain (lower R² values across most models).

The QITT_LinearRegression model's surprising strength compared to QITT_RandomForest/XGBoost could imply that the QITT features (with ranks (1,2,2,1)) already lie in a space where linear relationships to the cosmological parameters dominate. Alternatively, the non-linear models may not have been able to exploit further non-linearities within this 202-dimensional QITT feature space without more extensive tuning or different architectures. The LinAlgWarnings during Ridge regression for rank cross-validation (Step 3 output) and the ill-conditioning/singularity warnings during some model fits (Step 4 output) suggest potential multicollinearity or numerical stability issues with some feature sets/ranks, which might affect model training and interpretation.

#### 5.4. Limitations

*   **Interpretability of QITT Features:** While QITT provides effective compression, the resulting features are elements of TT-cores and lack direct physical interpretability. Understanding precisely which physical interactions are captured requires further investigation.
*   **GNN Training:** The GNN was trained for a limited number of epochs (5) for pipeline efficiency. More extensive training or different self-supervised objectives might yield even more informative topological embeddings.
*   **Substructure Definition:** The definition of substructures and the choice of physical features, while physically motivated, are not exhaustive. Other definitions or features might capture different aspects of merger tree evolution.
*   **Dataset Size:** While 1000 trees provide a good starting point, larger datasets could enable the training of more complex models and potentially reveal more subtle relationships.
*   **Baseline Coverage and Computational Cost:** Baseline B3 (Graphlet Counts) was skipped because of its implementation complexity in the automated pipeline, so the baseline comparison is incomplete; a full comparison would ideally include it. Some baselines (like B4) involve very high-dimensional feature spaces, which can be computationally intensive for certain model types.

### 6. Conclusion and Future Outlook

This work has demonstrated a novel pipeline for cosmological parameter estimation from merger trees, leveraging multi-scale substructure analysis, GNN-based topological embeddings, and Quantum-Inspired Tensor Train decomposition. The QITT approach proved effective in compressing rich substructure information (physical and topological) into a lower-dimensional feature set that significantly outperformed models using raw or simply flattened substructure features. Specifically, QITT-based XGBoost showed statistically significant improvements over an XGBoost model using flattened combined features (including topology) without QITT.

While no QITT-enhanced model surpassed the simpler B1_Aggregate_LinearRegression baseline, which uses only global aggregate tree features, on either parameter, the QITT pipeline offers a framework for incorporating detailed substructure information. The results suggest that global tree properties are highly predictive. For capturing more subtle effects potentially related to \(\sigma_8\) or other parameters, the detailed substructure analysis may still hold promise, although the present results do not yet demonstrate such a gain.

Future directions include:
1.  **Enhanced Interpretability:** Developing methods to map important QITT core features back to understandable physical properties or interactions within substructures.
2.  **Advanced GNNs and QITT Integration:** Exploring more sophisticated GNN architectures, alternative self-supervised training tasks for topological embeddings, and end-to-end trainable models incorporating QITT layers.
3.  **Broader Cosmological and Astrophysical Applications:** Applying this framework to estimate a wider range of cosmological parameters, or to study galaxy formation processes by correlating QITT features with baryonic properties in hydrodynamical simulations.
4.  **Optimizing Substructure Definition and Feature Engineering:** Systematically exploring different criteria for identifying salient substructures and extracting a more comprehensive set of physical and morphological features.
5.  **Scaling to Larger Datasets:** Testing the methodology on next-generation simulation datasets to assess its robustness and potential for higher precision.

In summary, the combination of learned topological embeddings and QITT decomposition offers a powerful and promising avenue for extracting complex information from hierarchical structures like dark matter merger trees, with significant potential for advancing data-driven cosmology.