\input{figs/main/overview}
\section{The \Ours{} Framework}

Sparse-view 3DGS often exhibits structural fragility, texture inconsistency, and loss of fine details due to limited supervision. We introduce TAS-GS, a framework that enhances 3DGS through explicit \emph{topological}, \emph{appearance}, and \emph{semantic} priors. TAS-GS incorporates a topology-aware graph regularizer to eliminate floaters and repair holes, a GNN-based appearance propagation module for texture refinement in under-constrained areas, and a semantic modulator that emphasizes rare and boundary regions to recover fine details and underrepresented classes. All components support decoupled training and seamless integration, simultaneously ensuring real-time rendering and enhanced reconstruction quality under sparse-view conditions.

% \subsection{Notation}
% \label{sec:notation}
For clarity, we summarize key symbols and operators in the appendix (Tables~\ref{tab:notation} and~\ref{tab:complexity-notation}). This includes Gaussian attributes (position, scale, rotation, opacity, spherical harmonics coefficients), graph construction operators, neighborhood relations, and loss formulations. Complexity-related symbols are detailed separately.

\subsection{Topology-Aware Graph Regularization }
\label{sec:method:topo}

Sparse-view reconstructions often suffer from disconnected components, floating Gaussians, and structural gaps.  The objective of the topology-aware graph regularizer is to enforce geometric consistency by pruning spurious structures and repairing missing regions, thereby providing a stable foundation for subsequent appearance refinement.

We enrich the learnable Gaussian set $\mathcal{S}=\{\mathcal{G}_i\}_{i=1}^N$ with a topology-aware graph that supports pruning and hole repair before appearance refinement. Each Gaussian $\mathcal{G}_i$ has center $\mathbf{x}_i\in\mathbb{R}^3$ and covariance $\Sigma_i\in\mathbb{R}^{3\times3}$. We connect node $i$ to $k$ neighbors using a directional Mahalanobis metric:
\begin{equation}
d_M(i,j)=(\mathbf{x}_j-\mathbf{x}_i)^\top \Sigma_i^{-1}(\mathbf{x}_j-\mathbf{x}_i), 
\qquad 
\mathcal{N}_k(i)=\arg\min_{j\neq i}^{(k)} d_M(i,j).
\label{eq:maha}
\end{equation}
The neighbor search is implemented in two stages. We first retrieve the top $K$ Euclidean candidates ($K=100$ by default), and then refine them with \eqref{eq:maha} to select $k$ neighbors. Symmetrization yields an undirected adjacency $A$. Connected components $\{C_m\}_{m=1}^M$ are then computed, and small components are removed:
\begin{equation}
\mathcal{R}=\{ i : i\in C_m,\; |C_m|<S_{\min}\}.
\label{eq:prune-small}
\end{equation}

\paragraph{Detecting geometric voids.}
We score candidate viewpoints by the fraction of pixels with low opacity and choose:
\begin{equation}
c^\star=\arg\max_{c}\;\#\{u : \alpha_c(u)<\tau_\alpha\}, 
\qquad \tau_\alpha=0.5.
\label{eq:best-view}
\end{equation}
We then render $\alpha_{c^\star}$ at native resolution and extract the 2D boundary with a $5\times5$ dilation:
\begin{equation}
\mathcal{B}=\mathcal{D}\!\left(\mathbf{1}[\alpha_{c^\star}<\tau_\alpha]\right)\setminus \mathbf{1}[\alpha_{c^\star}<\tau_\alpha].
\label{eq:boundary2d}
\end{equation}
Boundary pixels $u\in\mathcal{B}$ are unprojected to 3D points $\mathbf{y}(u)$ using camera intrinsics and pose $(K,R,t)$. Each point is mapped to its nearest Gaussian with a KD-tree, giving a boundary set $\mathcal{I}_\partial=\{\mathrm{nn}(\mathbf{y}(u))\}$. Component labels $L(i)\in\{1,\dots,M\}$ indicate whether a hole separates multiple components or lies within one.

\paragraph{Inter-component bridging.}  
When $\mathcal{I}_\partial$ contains at least two distinct component labels, we connect the two most frequent ones, $\ell_1$ and $\ell_2$, by inserting $B$ Gaussians along their shortest cross-component segment. Specifically, we select candidate points:
\[
\mathbf{p}\in\{\mathbf{x}_i: L(i)=\ell_1,\, i\in\mathcal{I}_\partial\},\qquad
\mathbf{q}\in\{\mathbf{x}_j: L(j)=\ell_2,\, j\in\mathcal{I}_\partial\},
\]
and choose the closest pair $(\mathbf{p}^\star,\mathbf{q}^\star)=\arg\min_{\mathbf{p},\mathbf{q}}\|\mathbf{p}-\mathbf{q}\|_2$.  
New Gaussians are then placed by linear interpolation:
\begin{equation}
\mathbf{x}^{(b)}_{\text{new}}=(1-t_b)\,\mathbf{p}^\star + t_b\,\mathbf{q}^\star,
\quad 
t_b=\frac{b}{B+1},\; b=1,\dots,B.
\label{eq:bridge}
\end{equation}

\paragraph{Intra-component filling.}  
When only a single label $\ell$ is present, we fill the hole interior using local tangent-plane estimates. For a boundary subset $\mathcal{J}\subset\{i\in\mathcal{I}_\partial:L(i)=\ell\}$, each $i\in\mathcal{J}$ collects $k$ neighbors and computes the centroid $\boldsymbol{\mu}_i$ and covariance:
\[
C_i=\frac{1}{k-1}\sum_{j\in\mathcal{N}_k(i)}(\mathbf{x}_j-\boldsymbol{\mu}_i)(\mathbf{x}_j-\boldsymbol{\mu}_i)^\top.
\]
The plane normal $\mathbf{n}_i$ is taken as the eigenvector corresponding to the smallest eigenvalue of $C_i$. A new Gaussian is then initialized as:
\begin{equation}
\mathbf{x}^{(i)}_{\text{new}}=\boldsymbol{\mu}_i,\qquad 
\mathbf{s}^{(i)}_{\text{new}}=\big[0.5\bar d_i,\;0.5\bar d_i,\;0.1\bar d_i\big],\qquad 
\bar d_i=\tfrac{1}{k-1}\sum_{j\in\mathcal{N}_k(i)}\|\mathbf{x}_j-\boldsymbol{\mu}_i\|_2,
\label{eq:fill}
\end{equation}
where the rotation is defined by aligning $\mathbf{n}_i$ with the $z$-axis and converted into a quaternion.

\paragraph{Atomic execution and protection.}
All edits are applied atomically at the end of each constraint stage. Gaussians referenced in \eqref{eq:prune-small} are pruned, while new Gaussians inherit spherical harmonics, opacity, scale, and rotation from their nearest neighbors, with opacity slightly reduced to ensure smooth integration. Inserted Gaussians are protected from pruning for $t_{\text{grace}}$ iterations. This procedure runs periodically (every $T_{\text{rasterize}}$ iterations) prior to the GNN stage and does not modify the rasterizer.

\subsection{GNN-based Appearance Propagation}
\label{sec:method:gnn}

To compensate for weak supervision in sparse views, we propagate appearance across a geometry-aware graph and apply visibility-adaptive residual updates to the opacity and SH coefficients. Given $\mathcal{S}=\{\mathcal{G}_i\}_{i=1}^{N}$ with centers $\mathbf{x}_i$, covariances $\Sigma_i$, opacities $o_i$, and stacked SH vectors $\mathbf{k}_i$, we downsample nodes $\mathcal{V}'$ with ratio $r$ and construct a $k$-nearest neighbor graph using the same procedure as in \S\ref{sec:method:topo}.

\paragraph{Graph features.}
Each node feature concatenates geometry and appearance:
\[
\mathbf{h}_i^{(0)}=[\mathbf{x}_i,\mathbf{s}_i,\mathbf{q}_i,o_i,\mathrm{vec}(\mathbf{k}_i)],\quad
\mathbf{e}_{ij}=[(\mathbf{x}_i-\mathbf{x}_j),\ \|\mathbf{x}_i-\mathbf{x}_j\|_2,\ \langle\mathbf{n}_i,\mathbf{n}_j\rangle],
\]
where $\mathbf{n}_i$ is the shortest-axis normal estimated from $\Sigma_i$. Features and edges are standardized per batch, and scales are clamped for numerical stability.

\paragraph{Message passing.}
We adopt a three-layer edge-conditioned GATv2~\citep{brody2022gatv2}. For layer $\ell$ and head $h$, the attention coefficient is defined as:
\begin{equation}
\alpha^{(h)}_{ij} = \mathrm{softmax}_{j\in\mathcal{N}(i)}
\!\left(\mathbf{a}^{(h)\!\top}\,
\sigma\!\big(W^{(h)}[\mathbf{h}_i^{(\ell)} \Vert \mathbf{h}_j^{(\ell)} \Vert U^{(h)}\mathbf{e}_{ij}]\big)\right),
\end{equation}
and the node update is given by:
\begin{equation}
\mathbf{h}_i^{(\ell+1)} = \phi\!\left(\big\Vert_{h=1}^{H} \sum_{j\in\mathcal{N}(i)} \alpha^{(h)}_{ij}\, V^{(h)}\mathbf{h}_j^{(\ell)}\right),
\end{equation}
where $\phi(\cdot)=\tanh(\cdot)$, $\mathcal{N}(i)$ denotes the neighborhood of node $i$, and $H$ is the number of attention heads.

\paragraph{Residual updates.}
The MLP head predicts $\widehat{\boldsymbol{\delta}}_i=[\Delta o_i,\Delta\mathbf{k}_i,score_i]$ for $i\in\mathcal{V}'$. Only the appearance-related residuals are updated through visibility-adaptive blending:
\begin{equation}
\beta = w_{\text{res}}\,(1-\bar v),\qquad
(o_i,\mathbf{k}_i) \leftarrow (o_i,\mathbf{k}_i) + \beta\,[\Delta o_i,\Delta\mathbf{k}_i],
\quad \text{if } v_i=0: \ \beta \leftarrow 2\beta,
\label{eq:blend}
\end{equation}
where $\mathbf{v}\in\{0,1\}^N$ is the visibility mask and $\bar v=N^{-1}\sum_i v_i$.

\paragraph{Survival score.}
After temperature scaling, ${score}'_i=\sigma(score_i/\tau_s)\in(0,1)$ serves as a pruning prior. A Gaussian is pruned only if $o_i<o_{\min}$ and ${score}'_i<\tau_{\text{prune}}$, preventing premature deletion of structurally important but temporarily transparent splats.

\paragraph{Visibility-aware reconstruction.}
Let $\mathbf{a}_i=[o_i,\mathrm{vec}(\mathbf{k}_i)]$ and $\widehat{\mathbf{a}}_i=\mathbf{a}_i+\widehat{\boldsymbol{\delta}}_i^{\text{app}}$. We apply a visibility-aware reconstruction loss on $\mathcal{V}'$:
\begin{align}
\mathcal{L}_{\text{vis}} &= \tfrac{1}{|\{i\in\mathcal{V}':v_i=1\}|}\sum_{i\in\mathcal{V}':v_i=1}
\|\widehat{\mathbf{a}}_i - \mathbf{a}_i\|_2^2,\\
\mathcal{L}_{\text{inv}} &= \tfrac{1}{|\{i\in\mathcal{V}':v_i=0\}|}\sum_{i\in\mathcal{V}':v_i=0}
\|\mathbf{a}_i - \mathrm{sg}(\widehat{\mathbf{a}}_i)\|_2^2,\\
\mathcal{L}_{\text{GNN}} &= 0.1\,\mathcal{L}_{\text{vis}} + 1.0\,\mathcal{L}_{\text{inv}}.
\end{align}
The invisible loss is teacher-free: predictions serve as fixed targets, which prevents collapse and encourages propagation into occluded regions. $\mathcal{L}_{\text{GNN}}$ is combined with the renderer’s photometric loss, and smoothness emerges naturally from message passing on the Mahalanobis graph without requiring an explicit Laplacian term.

\subsection{Semantic-Rarity and Boundary-Aware Modulation}
\label{sec:method:srb}

We incorporate semantic priors to strengthen supervision on pixels that are (i) semantically rare or (ii) located at semantic boundaries, both of which are typically underconstrained in sparse views. Pseudo-labels are obtained from a frozen segmentation model~\citep{jain2023oneformer}, producing $S:\Omega\to\{0,\ldots,C-1\}$. Classes with pixel counts below the 25th percentile, excluding background, form the rare set $\mathcal{R}$. Boundaries are detected by changes in the four-neighborhood of labels:
\[
\mathcal{B}=\{\mathbf{u}\in\Omega \mid S(\mathbf{u})\neq S(\mathbf{u}+\mathbf{e}_x)\ \text{or}\ S(\mathbf{u})\neq S(\mathbf{u}+\mathbf{e}_y)\},
\]
which emphasizes thin and occluding structures. Let $\hat{\mathbf{I}}$ denote the rendering and $\mathbf{I}$ the ground truth. We then define:
\[
\Omega_{\mathrm{rare}}=\{\mathbf{u}\in\Omega : S(\mathbf{u})\in\mathcal{R}\},\qquad
\Omega_{\mathrm{edge}}=\mathcal{B}.
\]

The SRB loss is defined as:
\begin{equation}
\label{eq:srb-loss}
\mathcal{L}_{\mathrm{SRB}}
=\lambda_{\mathrm{rare}}\tfrac{1}{|\Omega_{\mathrm{rare}}|}\sum_{\mathbf{u}\in\Omega_{\mathrm{rare}}}\|\mathbf{I}(\mathbf{u})-\hat{\mathbf{I}}(\mathbf{u})\|_1
+\lambda_{\mathrm{edge}}\tfrac{1}{|\Omega_{\mathrm{edge}}|}\sum_{\mathbf{u}\in\Omega_{\mathrm{edge}}}\|\mathbf{I}(\mathbf{u})-\hat{\mathbf{I}}(\mathbf{u})\|_1,
\end{equation}
and can be equivalently written as a single weighted pass:
\begin{equation}
\label{eq:srb-weighted}
\mathcal{L}_{\mathrm{SRB}}
=\tfrac{1}{|\Omega|}\sum_{\mathbf{u}\in\Omega} w(\mathbf{u})\,\|\mathbf{I}(\mathbf{u})-\hat{\mathbf{I}}(\mathbf{u})\|_1,
\end{equation}
where
\[
w(\mathbf{u})=
\begin{cases}
w_{\mathrm{rare}} & S(\mathbf{u})\in\mathcal{R},\\
w_{\mathrm{edge}} & \mathbf{u}\in\mathcal{B},\\
w_{\mathrm{base}} & \text{otherwise},
\end{cases}
\quad
w_{\mathrm{rare}}=w_{\mathrm{edge}}=1.0,\;
w_{\mathrm{base}}=0.2.
\]

The loss is activated after a short warm-up and combined with other training objectives.



\subsection{Optimization}
\label{sec:method:train}

Our optimization jointly optimizes two aspects: (i) Gaussian attributes $\mathcal{G}$ and (ii) GNN parameters for appearance propagation. Densification and pruning follow 3DGS~\citep{kerbl2023gaussiansplatting} but are guided by our graph propagation and semantic priors. This prevents uncontrolled growth while preserving rare structures, ensuring that the retained Gaussians remain both geometrically connected and semantically consistent.

The overall training objective is:
\begin{equation}
\label{eq:total-loss}
\mathcal{L}
= \mathcal{L}_{\mathrm{photo}}
+ \lambda_{\mathrm{opacity}}\,\mathcal{L}_{\mathrm{opacity}}
+ \lambda_{\mathrm{app}}\,\mathcal{L}_{\mathrm{GNN}}
+ \lambda_{\mathrm{sem}}\,\mathcal{L}_{\mathrm{SRB}},
\end{equation}
where $\mathcal{L}_{\mathrm{photo}}$ combines $\ell_1$ and SSIM losses, and a entropy-style sparsity regularizer:
\begin{equation}
\mathcal{L}_{\mathrm{opacity}}
= -\tfrac{1}{|\mathcal{V}|}\sum_{i\in\mathcal{V}}
\big[o_i \log(o_i) + (1-o_i)\log(1-o_i)\big],
\end{equation}
which encourages binary opacities for cleaner geometry. $\mathcal{L}_{\mathrm{GNN}}$ supervises appearance propagation, and $\mathcal{L}_{\mathrm{SRB}}$ emphasizes rare classes and boundary pixels. The weights $(\lambda_{\mathrm{opacity}},\lambda_{\mathrm{app}},\lambda_{\mathrm{sem}})$ balance accuracy, stability, and compactness. Training runs for $T$ iterations with periodic graph rebuilding (every $T_{\text{build}}$) and delayed activation of semantic priors after warm-up.
