\documentclass[twocolumn]{aastex631}

\usepackage{amsmath}
\usepackage{multirow}
\usepackage{natbib}
\usepackage{graphicx} 
\usepackage{aas_macros}

\begin{document}

The objective of this study was to investigate the geometric structure of the 10-dimensional latent space generated by a PINN solving the 2D Burgers' equation, focusing on how different initial conditions are encoded within this space. Using Principal Component Analysis (PCA) and subspace similarity measures, we analyzed the latent vectors corresponding to 25 distinct initial conditions.

\subsection{Global structure of the latent space}

We first characterized the overall structure of the latent space by performing PCA on the entire collection of latent vectors generated across all spatial points, time steps, and the 25 initial conditions. This global analysis, as described in the Methods, treats all $101 \times 103 \times 25 = 260075$ latent vectors as a single dataset in $\mathbb{R}^{10}$. The variance explained by each principal component (PC) provides insight into the intrinsic dimensionality and dominant directions of variation within the aggregated latent representation.
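
For reference, a minimal statement of the quantities reported below, assuming the standard covariance-based formulation of PCA (the symbols $N$, $L_n$, $\bar{L}$, $\Sigma$, $\lambda_i$, and $r_i$ are notation introduced here for illustration and may differ from that used in the Methods), is
\begin{align}
\Sigma &= \frac{1}{N-1} \sum_{n=1}^{N} (L_n - \bar{L})(L_n - \bar{L})^{\top}, \\
r_i &= \frac{\lambda_i}{\sum_{j=1}^{10} \lambda_j},
\end{align}
where $N = 260075$, $L_n \in \mathbb{R}^{10}$ are the latent vectors, $\bar{L}$ is their mean, $\lambda_1 \geq \dots \geq \lambda_{10}$ are the eigenvalues of $\Sigma$, and $r_i$ is the fraction of the total variance explained by the $i$-th PC.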

The results of this global PCA reveal a significant concentration of variance in the leading principal components. As shown in the scree plot in Figure \ref{fig:global_pca_scree}, the first principal component (PC1) alone captures 60.12\% of the total variance. The second (PC2) and third (PC3) components capture an additional 23.44\% and 12.93\%, respectively. Cumulatively, the first three global PCs account for 96.48\% of the total variance. Including the fourth (1.30\%), fifth (1.17\%), and sixth (0.76\%) components brings the cumulative variance explained to 99.72\%. The remaining four components individually explain less than 0.3\% of the variance each.
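
As a consistency check on these figures, summing the rounded individual contributions (in percent) gives
\begin{align}
60.12 + 23.44 + 12.93 &= 96.49, \\
96.49 + 1.30 + 1.17 + 0.76 &= 99.72.
\end{align}
The first sum agrees with the quoted 96.48\% to within rounding of the individual entries, and the second leaves $100 - 99.72 = 0.28\%$ to be shared among the remaining four components, consistent with each of them explaining less than 0.3\%.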

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/global_pca_scree_plot_1_20250604-135555.png}
\caption{Scree plot showing the individual and cumulative explained variance from the global Principal Component Analysis of all latent vectors. The variance is highly concentrated in the first three components, which capture over 96\% of the variance, revealing the low-dimensional structure of the global latent space.}
\label{fig:global_pca_scree}
\end{figure}

This strong concentration of variance within the first six principal components indicates that the entire collection of latent vectors, despite residing in a 10-dimensional space, effectively occupies a much lower-dimensional subspace. The vast majority ($>99\%$) of the variability observed in the latent representations across all tested physical states and initial conditions is captured by a 6-dimensional linear subspace. This suggests that the PINN learns an efficient overall encoding, in which the complex dynamics across different conditions are constrained to a relatively low-dimensional manifold within the full latent space.

\subsection{Intrinsic dimensionality of per-initial-condition manifolds}

Next, we investigated the structure of the latent space corresponding to individual initial conditions. For each of the 25 initial conditions ($IC_k$, $k=0, \dots, 24$), we performed PCA independently on the $101 \times 103 = 10403$ latent vectors $\{L(x_i, t_j, IC_k)\}$ associated with that specific condition. This analysis aims to characterize the intrinsic dimensionality and shape of the latent point cloud representing the PINN's encoding of the solution for a fixed initial state.

The results show a remarkable consistency across all 25 initial conditions. As shown by the average scree plot in Figure \ref{fig:per_ic_scree} and the intrinsic dimensionality distribution in Figure \ref{fig:intrinsic_dim}, for every IC exactly three principal components were required to explain over 95\% of the variance within its corresponding latent point cloud. Quantitatively, the average cumulative variance explained by the first three per-IC principal components is 97.48\%, with a very low standard deviation (0.15\%). The average variances explained by the first, second, and third per-IC PCs were 59.61\%, 23.72\%, and 14.15\%, respectively. The variance captured by the fourth per-IC PC and beyond drops sharply, with the average variance for the fourth PC being below 2\%.
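
The intrinsic dimensionality used here can be written compactly as follows. This is a sketch of the 95\% threshold criterion, with $\lambda^{(k)}_i$ (the $i$-th largest eigenvalue of the covariance of the latent vectors of $IC_k$) and $d_k$ introduced as illustrative notation:
\begin{equation}
d_k = \min \left\{ m : \frac{\sum_{i=1}^{m} \lambda^{(k)}_i}{\sum_{j=1}^{10} \lambda^{(k)}_j} > 0.95 \right\}.
\end{equation}
With the average values quoted above, the first two components give $59.61 + 23.72 = 83.33\%$, which is below the threshold, while adding the third gives $83.33 + 14.15 = 97.48\%$, consistent with $d_k = 3$.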

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/per_ic_avg_scree_plot_2_20250604-135804.png}
\caption{Average explained variance by principal components for the latent space of each initial condition, averaged across 25 initial conditions. Blue bars show the average individual explained variance per component; the red line shows the average cumulative explained variance with standard deviation (shaded). This analysis reveals that the latent representation for each initial condition is consistently low-dimensional, with the first three components capturing nearly 97.5\% of the variance on average.}
\label{fig:per_ic_scree}
\end{figure}

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/intrinsic_dim_dist_plot_2_20250604-135804.png}
\caption{Distribution of the intrinsic dimensionality for the latent representations of each of the 25 initial conditions (ICs). Intrinsic dimensionality is defined as the minimum number of principal components required to capture over 95\% of the variance for each IC's latent vectors. The plot shows that all 25 ICs result in latent manifolds with an intrinsic dimensionality of 3.}
\label{fig:intrinsic_dim}
\end{figure}

These findings strongly suggest that, for any given initial condition within the tested set, the PINN's latent representation of the spatiotemporal solution $\{L(x,t)\}$ forms an effectively 3-dimensional structure embedded in the 10-dimensional latent space. The high percentage of variance captured by the leading three components indicates that these structures are well-approximated by 3-dimensional affine manifolds (shifted linear subspaces), exhibiting limited non-linear deviations from this linear approximation within the scope of the tested conditions. This implies that the network has learned a consistent, low-dimensional basis for representing the state of the system over space and time for a fixed initial condition.

\subsection{Geometric arrangement of manifold centroids}

To understand how the latent representations differ across initial conditions, we analyzed the geometric arrangement of the centroids $C_k$ of the per-IC latent point clouds. Each centroid $C_k$ is a 10-dimensional vector representing the mean position of the latent manifold for initial condition $IC_k$. We collected these 25 centroid vectors and performed PCA on this $(25 \times 10)$ matrix.
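
Explicitly, and assuming that each centroid is the unweighted mean over all space-time sample points of a given IC (the symbols $M$ and $\bar{C}$ are introduced here for illustration),
\begin{equation}
C_k = \frac{1}{M} \sum_{i,j} L(x_i, t_j, IC_k), \qquad M = 10403.
\end{equation}
The centroid PCA then amounts to diagonalizing the $10 \times 10$ covariance matrix of the centered centroids $C_k - \bar{C}$, where $\bar{C} = \frac{1}{25} \sum_{k=0}^{24} C_k$.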

The results of this centroid PCA are striking, as shown in the scree plot in Figure \ref{fig:centroid_pca_scree}. The first principal component of the centroids (CPC1) explains an overwhelming 99.86\% of the total variance in the centroid positions. The second component (CPC2) explains only 0.10\%, and the third (CPC3) explains 0.02\%.

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/centroid_pca_scree_plot_3_20250604-135958.png}
\caption{Scree plot showing the variance explained by principal components of the initial condition (IC) centroids. The first principal component captures over 99\% of the variance, indicating that the centroids are arranged along an effectively one-dimensional structure in the latent space.}
\label{fig:centroid_pca_scree}
\end{figure}

Projecting the centroids onto their principal components, as depicted in Figure \ref{fig:centroid_pca_2d} (2D projection) and Figure \ref{fig:centroid_pca_3d} (3D projection), reveals that they form an almost perfectly linear arrangement in the latent space. The centroids corresponding to initial conditions indexed 0 through 24 are ordered sequentially along this dominant, nearly one-dimensional direction defined by CPC1.

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/centroid_pca_scatter_2D_3_20250604-135958.png}
\caption{Initial condition (IC) manifold centroids projected onto their first two principal components (CPC1 and CPC2). Each point represents the centroid for a specific IC, labeled and colored by its index (0--24). The points form a clear, near-linear trajectory predominantly along CPC1, indicating that changing the IC primarily translates the corresponding latent manifold along a dominant direction.}
\label{fig:centroid_pca_2d}
\end{figure}

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/centroid_pca_scatter_3D_3_20250604-135958.png}
\caption{Three-dimensional scatter plot showing the projection of the 25 per-initial condition (IC) latent manifold centroids onto their first three principal components (CPC1, CPC2, and CPC3). Each point represents the centroid for a unique initial condition and is colored according to its corresponding IC index (0 to 24). The plot demonstrates that the centroids are arranged along a predominantly one-dimensional path, strongly aligned with CPC1, indicating that the primary effect of varying the initial condition is to translate the latent manifold along a specific direction.}
\label{fig:centroid_pca_3d}
\end{figure}

This finding is crucial: it indicates that the primary effect of changing the initial condition within this ensemble is to translate the entire 3D latent manifold corresponding to that condition along a specific, nearly one-dimensional path within the 10-dimensional latent space. This suggests that the PINN encodes the difference between initial conditions predominantly as a shift in the mean position of the learned solution manifold.

\subsection{Comparison of manifold orientations}

While the centroids reveal the translational differences between the manifolds, we also investigated whether the orientation or ``shape'' of the 3D manifolds changes across initial conditions. For each $IC_k$, the per-IC PCA yields a set of principal vectors $\{v_{k1}, v_{k2}, v_{k3}\}$ spanning the approximate 3D affine manifold. We compared these principal subspaces across different initial conditions.
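
In this picture, and as a sketch rather than a statement of a fitted model, each latent vector of $IC_k$ is approximated by its projection onto the affine subspace through the centroid $C_k$ (the coefficients $a_{km}$ are notation introduced here):
\begin{align}
L(x_i, t_j, IC_k) &\approx C_k + \sum_{m=1}^{3} a_{km}(x_i, t_j)\, v_{km}, \\
a_{km}(x_i, t_j) &= v_{km}^{\top} \left[ L(x_i, t_j, IC_k) - C_k \right],
\end{align}
where the $v_{km}$ are orthonormal. The residual of this approximation carries the variance not explained by the first three per-IC components, roughly 2.5\% on average.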

We quantified the similarity between the 3-dimensional principal subspaces spanned by $\{v_{k1}, v_{k2}, v_{k3}\}$ for pairs of initial conditions ($IC_k$, $IC_l$) using a similarity score based on principal angles, namely the average squared cosine of the principal angles between the two subspaces. As shown in the heatmap in Figure \ref{fig:subspace_similarity}, the average similarity score across all pairs of initial conditions was 0.986, with a standard deviation of 0.014 and a minimum of 0.954. A score of 1 corresponds to identical subspaces, so these values indicate that the subspaces are nearly parallel.
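
One way to write this score, consistent with the average squared cosine of the principal angles described in the caption of Figure \ref{fig:subspace_similarity}, is the following. The matrix $V_k$ and the symbols $S_{kl}$ and $\theta_m$ are illustrative notation, and the columns of $V_k$ are assumed to be orthonormal:
\begin{equation}
S_{kl} = \frac{1}{3} \sum_{m=1}^{3} \cos^2 \theta_m = \frac{1}{3} \left\| V_k^{\top} V_l \right\|_F^2,
\end{equation}
where $V_k = [v_{k1}, v_{k2}, v_{k3}] \in \mathbb{R}^{10 \times 3}$ and the $\cos \theta_m$ are the singular values of $V_k^{\top} V_l$. By construction $0 \leq S_{kl} \leq 1$, with $S_{kl} = 1$ if and only if the two subspaces coincide and $S_{kl} = 0$ if they are mutually orthogonal.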

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/subspace_similarity_heatmap_4_20250604-140218.png}
\caption{Subspace similarity between 3D latent manifolds for different initial conditions. The heatmap shows the average squared cosine of the principal angles between the subspaces spanned by the top three principal components for each pair of initial conditions ($IC_k$ and $IC_l$). High values (bright yellow) indicate strong alignment. The consistently high similarity across all pairs demonstrates that the 3D latent manifolds associated with different initial conditions are highly parallel.}
\label{fig:subspace_similarity}
\end{figure}

To further understand the subtle variations in orientation, we performed PCA separately on the set of first principal vectors $\{v_{k1}\}_{k=0}^{24}$, the set of second principal vectors $\{v_{k2}\}_{k=0}^{24}$, and the set of third principal vectors $\{v_{k3}\}_{k=0}^{24}$ across all initial conditions. As shown in Figure \ref{fig:dot_product_heatmaps} (dot product heatmaps) and Figure \ref{fig:orientation_pca} (PCA of principal vectors), for the set of first principal vectors $\{v_{k1}\}$, the first PC explained 85.45\% of their variance. For $\{v_{k2}\}$, the first PC explained 80.04\%. Most notably, for $\{v_{k3}\}$, the first PC explained 97.67\% of the variance. This indicates that the variations in the orientations of the principal axes of the 3D manifolds are themselves structured and predominantly low-dimensional: the third principal axes trace out a nearly one-dimensional path in the space of orientation vectors as the initial condition index changes, while the first and second axes vary mostly, though not exclusively, along a single direction.
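
The pairwise alignment shown in Figure \ref{fig:dot_product_heatmaps} can be expressed, for each axis $m \in \{1, 2, 3\}$ and with $D^{(m)}_{kl}$ introduced here as illustrative notation, as
\begin{equation}
D^{(m)}_{kl} = \left| v_{km}^{\top} v_{lm} \right|,
\end{equation}
where the absolute value removes the sign ambiguity inherent in principal vectors, since $v_{km}$ and $-v_{km}$ describe the same axis.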

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/dot_product_heatmaps_4_20250604-140218.png}
\caption{Heatmaps show the absolute dot product between corresponding principal vectors (PC1, PC2, PC3) from per-initial condition PCA for all pairs of initial conditions. High values (yellow) indicate strong alignment. The plots demonstrate substantial alignment across initial conditions, particularly for PC3, indicating that the 3D latent manifolds for different initial conditions are largely parallel.}
\label{fig:dot_product_heatmaps}
\end{figure}

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/pca_of_principal_vectors_4_20250604-140218.png}
\caption{Principal Component Analysis (PCA) of the sets of per-initial condition (IC) principal vectors. Top row shows scree plots for the collection of first ($v_{k1}$), second ($v_{k2}$), and third ($v_{k3}$) per-IC principal vectors across all 25 ICs, indicating high variance capture by the first component in each set. Bottom row shows the 2D projection of these vector sets onto their respective first two principal components, colored by IC index, revealing a structured, low-dimensional variation in the orientation of the 3D per-IC manifolds.}
\label{fig:orientation_pca}
\end{figure}

In summary, the 3D latent manifolds are not only translated versions of each other but also exhibit a very high degree of parallelism. The minor deviations in their orientations are systematic and follow a simple, low-dimensional pattern correlated with the initial condition index.

\subsection{Relationship between per-initial-condition structures and global structure}

Finally, we related the geometrically characterized per-IC manifolds to the overall structure of the global latent space. The global PCA identified a 6-dimensional subspace capturing 99.72\% of the total variance (Figure \ref{fig:global_pca_scree}). We projected the centered latent vectors $(L_{k} - C_k)$ for each initial condition $k$ onto this 6D global principal subspace. As shown in Figure \ref{fig:variance_in_global_subspace}, on average, 99.66\% of each individual IC's intrinsic variance (the variance within its 3D manifold) was captured by this 6D global subspace, with a minimum capture of 99.24\%. This confirms that the individual 3D manifolds are almost entirely embedded within the common, higher-dimensional subspace occupied by the entire dataset.
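
One plausible reading of this captured fraction, stated here as an assumption since the exact normalization is not spelled out in this section, is
\begin{equation}
f_k = \frac{\sum_{i,j} \left\| U_6^{\top} \delta^{(k)}_{ij} \right\|^2}{\sum_{i,j} \left\| \delta^{(k)}_{ij} \right\|^2},
\end{equation}
where $\delta^{(k)}_{ij} = L(x_i, t_j, IC_k) - C_k$ and $U_6 \in \mathbb{R}^{10 \times 6}$ is an illustrative symbol for the matrix whose orthonormal columns are the first six global principal directions. Under this definition, $f_k = 1$ would mean that the point cloud of $IC_k$ lies entirely within the global 6D subspace once its centroid is removed.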

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/variance_captured_by_global_subspace_5_20250604-140503.png}
\caption{Percentage of the intrinsic variance for each initial condition (IC) latent manifold captured by the 6-dimensional global principal subspace. The consistently high values demonstrate that the individual 3D manifolds are effectively embedded within this common global subspace.}
\label{fig:variance_in_global_subspace}
\end{figure}

Furthermore, we projected the per-IC centroids $C_k$ onto the global principal components. This analysis, visualized in Figure \ref{fig:projected_centroids_2d} (2D projection) and Figure \ref{fig:projected_centroids_3d_global} (3D projection), showed that the trajectory of the centroids aligns strongly with the first global principal component (Global PC1). The initial condition index (0--24) maps almost linearly to the position along Global PC1. This indicates that the dominant mode of variation in the entire latent space (Global PC1) is directly associated with the primary way the initial conditions are encoded, namely as translations of the latent manifold along this direction.
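
Schematically, and with all symbols below introduced only for illustration, the projected centroid coordinates and their near-linear dependence on the IC index can be written as
\begin{align}
z_k &= U^{\top} (C_k - \bar{L}), \\
z_{k,1} &\approx \alpha + \beta k,
\end{align}
where the columns of $U$ are the global principal directions, $\bar{L}$ is the global mean of all latent vectors, $z_{k,1}$ is the coordinate of centroid $k$ along Global PC1, and $\alpha$ and $\beta$ stand for the intercept and slope of a linear fit whose values are not reported here.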

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/projected_centroids_2D_5_20250604-140503.png}
\caption{Projection of per-initial condition latent manifold centroids onto the first two global principal components. Each point, labeled and colored by initial condition index, reveals a near-linear arrangement predominantly along the first global component. This indicates that the PINN encodes variations due to initial conditions primarily by translating the corresponding latent manifolds along a structured, low-dimensional trajectory within the global latent space.}
\label{fig:projected_centroids_2d}
\end{figure}

\begin{figure}[htbp]
\centering
\includegraphics[width=0.5\textwidth]{../input_files/plots/projected_centroids_3D_5_20250604-140503.png}
\caption{Centroids of the 25 per-initial condition (IC) latent manifolds projected onto the first three global principal components (PCs). Points are colored by IC index (0--24). The centroids form a near-linear path, primarily along Global PC1, indicating that different initial conditions primarily translate the latent manifolds along this dominant direction in the global latent space.}
\label{fig:projected_centroids_3d_global}
\end{figure}

These results highlight a hierarchical structure: a global 6D subspace accommodates all learned representations. Within this subspace, each specific initial condition selects a 3D affine manifold whose position is determined by a translation along a nearly 1D path strongly aligned with the global PC1. The orientation of this 3D manifold is remarkably consistent across ICs, with subtle, structured, low-dimensional variations.
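
This hierarchy is consistent with the standard decomposition of a total covariance into within-group and between-group parts. As a sketch, using $1/N$-normalized covariances and the illustrative symbols $\Sigma_{\mathrm{glob}}$, $\Sigma_k$, and $\bar{L}$, equal sample counts per IC give
\begin{align}
\Sigma_{\mathrm{glob}} &= \frac{1}{25} \sum_{k=0}^{24} \Sigma_k \nonumber \\
&\quad + \frac{1}{25} \sum_{k=0}^{24} (C_k - \bar{L})(C_k - \bar{L})^{\top},
\end{align}
where $\Sigma_{\mathrm{glob}}$ is the covariance of all latent vectors about their global mean $\bar{L}$ and $\Sigma_k$ is the covariance of the latent vectors of $IC_k$ about the centroid $C_k$. The first term collects the within-IC structure of the 3D manifolds, and the second the spread of the centroids.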

\subsection{Synthesis and interpretation}

The collective findings from our geometric analysis provide a clear picture of how the PINN structures its latent space to represent solutions of the 2D Burgers' equation across varying initial conditions. The latent space is not an entangled, unstructured high-dimensional point cloud but rather exhibits a highly organized geometric structure.

For a given initial condition, the network learns a representation that effectively lies on a 3-dimensional affine manifold within the 10-dimensional latent space. This intrinsic dimensionality is strikingly consistent across all 25 tested initial conditions, as shown in Figure \ref{fig:intrinsic_dim}. The primary effect of changing the initial condition is not to drastically alter the structure or dimensionality of this manifold, but rather to translate it within the latent space. These translations occur along a well-defined, nearly one-dimensional path (Figures \ref{fig:centroid_pca_2d}, \ref{fig:centroid_pca_3d}), which is itself strongly aligned with the dominant direction of variation in the overall latent space (Figures \ref{fig:projected_centroids_2d}, \ref{fig:projected_centroids_3d_global}). Moreover, the orientation of these 3D manifolds is remarkably similar across different initial conditions, indicating they are nearly parallel (Figure \ref{fig:subspace_similarity}). The subtle variations in their orientation are not random but follow a structured, low-dimensional pattern related to the initial condition (Figure \ref{fig:orientation_pca}).

This suggests that the PINN has learned a form of disentangled representation. The network appears to separate the influence of the initial condition from the intrinsic spatiotemporal evolution of the solution. The intrinsic dynamics for a fixed initial condition are encoded within the 3D structure of the manifold, while the specific initial condition primarily acts as a parameter that translates this fundamental 3D structure in the latent space. This organization is highly efficient; instead of learning 25 distinct, unrelated high-dimensional structures, the network leverages a common 3D ``template'' and uses a simple, low-dimensional transformation (translation and minor orientation adjustment) to adapt it for different initial conditions. This geometric simplicity in the latent space provides valuable insights into the network's internal encoding mechanisms, suggesting that the PINN captures the essential physics in a structured and interpretable manner, at least within this learned latent representation.

\subsection{Limitations and future directions}

While the findings reveal a surprisingly simple and structured latent space geometry, it is important to consider potential limitations and avenues for future research. Our analysis relies heavily on PCA, which is a linear technique. Although the high variance capture suggests that affine manifolds are good approximations, non-linear manifold learning techniques could potentially uncover finer, non-linear structures within the 3D manifolds or in the arrangement of centroids and orientations. The study was conducted for a fixed viscosity parameter; exploring how the latent space structure changes with varying viscosity would be a crucial extension, providing insights into how the PINN encodes physical parameters beyond initial conditions. A larger and more diverse set of initial conditions could further test the observed low-dimensional nature of the centroid path and orientation variations, potentially revealing more complex patterns if the range of initial conditions were significantly expanded. Furthermore, correlating the specific characteristics of the initial conditions (e.g., amplitude, frequency content) with their positions along the centroid trajectory and their manifold orientations would give deeper physical meaning to the learned latent structure. Finally, investigating whether similar structured latent spaces are learned by PINNs for other types of PDEs or with different network architectures is essential to assess the generalizability of these findings.

\end{document}
                