% \vspace{-3mm}

\section{Limitations}

{The performance of learning on heterophilic graphs depends heavily on how information is aggregated across different neighborhoods, and label availability is crucial for guiding this aggregation. Without labels, a significant challenge for any graph SSL method, including \alg, arises when nodes with similar labels cannot be even approximately identified. For instance, on the Penn94 dataset, all SSL methods in Table~\ref{tab:benchmark_baselines}, including \alg, underperform supervised methods. This is due to the lack of correlation between node feature similarity and label similarity in this dataset. In Cora, nodes belonging to the same class exhibit on average 31\% higher feature cosine similarity than nodes from different classes, and in Chameleon this difference is 11\%. In Penn94, however, the difference is only 1.4\% on average, indicating that node features are highly similar across classes. Consequently, SSL methods, including \alg, struggle to learn high-quality node representations on this dataset. Despite this, \alg outperforms the other SSL baselines on Penn94, as shown in Table~\ref{tab:benchmark_baselines}.} \looseness=-1
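
One possible way to make the comparison above precise is sketched here; the symbols are illustrative assumptions and are not defined elsewhere in the paper. Let $\bar{s}_{\mathrm{same}}$ and $\bar{s}_{\mathrm{diff}}$ denote the average cosine similarity $x_u^\top x_v / (\|x_u\|\,\|x_v\|)$ between the features $x_u, x_v$ of node pairs with the same label and with different labels, respectively. Assuming the reported percentages are relative differences, the feature similarity gap could be written as
\[
\Delta = \frac{\bar{s}_{\mathrm{same}} - \bar{s}_{\mathrm{diff}}}{\bar{s}_{\mathrm{diff}}}.
\]
Under this reading, the values above would correspond to $\Delta \approx 0.31$ for Cora, $\Delta \approx 0.11$ for Chameleon, and $\Delta \approx 0.014$ for Penn94. If the percentages are instead absolute differences in cosine similarity, the denominator should be dropped.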

Additionally, on graphs where node features cannot differentiate the classes, \alg can face challenges. For instance, in the Actor dataset, nodes from different classes have similar connectivity patterns to other classes \citep{ma2106homophily}. In this case, the graph structure is not useful for classifying the nodes (and may even hurt performance), and relying only on node features achieves better performance. As shown in \citep{ma2106homophily,chen2022towards}, models such as MLP, which do not use any graph structure, can outperform GNN methods such as GCN and even H2GCN on the Actor dataset. HGRL uses an MLP as its encoder and does not leverage the graph structure; hence, it can slightly outperform \alg on Actor, but it is outperformed by \alg on the other datasets.

In Sec.~\ref{sec:feature_label}, we discuss the effectiveness of node features in inferring subgraphs, as well as their limitations when used directly for classification without incorporating the graph structure.

\begin{table}[!t]
\centering
\caption{Producing final representations with different graph filters. Low-pass filtered representations have the highest performance on both homophilic and heterophilic graphs.}\label{tab:final_output}\vspace{-2mm}
\begin{small}
\begin{tabular}{l|cccc}
\toprule
Filter & Cora & Citeseer & Chameleon & Squirrel \\
\midrule
LP & \textbf{84.1} $\pm$ 1.0 & \textbf{70.1} $\pm$ 0.8 & \textbf{50.9} $\pm$ 1.0 & \textbf{42.9} $\pm$ 2.6 \\
HP & 51.9 $\pm$ 2.9 & 36.2 $\pm$ 2.5 & 35.6 $\pm$ 3.1 & 29.8 $\pm$ 1.6 \\
LP+HP & 74.2 $\pm$ 2.0 & 57.6 $\pm$ 2.0 & 48.7 $\pm$ 1.0 & 40.8 $\pm$ 2.0 \\
\bottomrule
\end{tabular}
\end{small}
% \vspace{-1mm}
\end{table}

\vspace{-5mm}
\section{Conclusion}%\vspace{-1mm}
We proposed \alg, a contrastive learning framework that finds a homophilic and a heterophilic subgraph in a graph, applies high-pass and low-pass filters to the augmented subgraph views, and learns node representations by contrasting the filtered augmented views. This is particularly beneficial for graphs with heterophily. Through extensive experiments, we demonstrated that our proposed framework achieves up to a 7\% boost on graphs under heterophily and outperforms popular supervised graph learning methods by up to 10\%. \alg also provides comparable performance under homophily. We believe our work provides an important direction for future work on contrastive learning under heterophily.

\textbf{Acknowledgments.} This research was partially supported by the National Science Foundation CAREER Award 2146492.