% \section{Related Work}

% \vspace{0.1pt}
% \noindent\textbf{Related Work}
% % Prior work on hierarchical classifiers
% There are several class hierarchy-aware classifiers. 
% % DeVise
% DeViSE~\cite{frome2013devise} optimizes for maximum cosine similarity between image embeddings, extracted using a pretrained visual model, and label embeddings, generated using a pretrained Word2Vec~\cite{mikolov2013efficient} model on Wikipedia.
% % [4]
% Bertinetto et al.~\cite{bertinetto2020making} proposed hierarchy-sensitive loss adaptations to lessen the severity of errors, achieving it by reducing the hierarchical distance of top-$k$ predictions but increasing the top-1 error. %, a trade-off controllable by a hyperparameter.
% % [7]
% Chang et al.~\cite{chang2021your} highlighted that coarse class cross-entropy loss degrades accuracy at fine-grained levels. They disentangled coarse and fine-grained features by partitioning the feature space.
% % Wu
% %In \cite{wu2016learning}, Wu et al. jointly optimize a multi-task loss, with cross-entropy loss computed at each level of the hierarchy.
% % [HAF]
% Garg et al.~\cite{garg2022learning} proposed a feature learning method that considers class hierarchies. Using Jensen-Shannon divergence and geometric constraints, the classifier trains the hierarchical semantic organization.
% \begin{table}[]
% \centering
% \caption{Data distribution over the classes. The values in parentheses represent the number of extra test samples.}
% \label{tab:data}
% \resizebox{0.65\columnwidth}{!}{%
% \begin{tabular}{c|ccccccc|c}
% \hline
%            & TA     & TVA   & TSA    & HP    & SSL    & IP & LP  & $\sum$      \\ \hline
% Train      & 317    & 232   & 300    & 257   & 130    & 99 & 266 & 1,601    \\
% Validation & 69     & 51    & 65     & 55    & 29     & 21 & 57  & 347      \\
% Test       & 164(95) & 57(6) & 84(18) & 64(8) & 84(55) & 21 & 57  & 531(182) \\ \hline
% \end{tabular}%
% }
% \end{table}
% % Limitations
% Previous studies tried to exploit the class hierarchy in a coarse-to-fine manner; however, due to the lack of explicit specification of class priorities within the same hierarchy, hierarchical approaches are still worth exploring, particularly in relation to multiclass clinical WSI settings.


% 0626 Sol version


\vspace{0.15cm}
\noindent\textbf{Related Work} Several classifiers exploit the class hierarchy. DeViSE~\cite{frome2013devise} maximizes the cosine similarity between image embeddings extracted by a pretrained visual model and label embeddings generated by a Word2Vec~\cite{mikolov2013efficient} model pretrained on Wikipedia. Bertinetto et al.~\cite{bertinetto2020making} introduced hierarchy-sensitive loss adaptations that lessen the severity of errors by reducing the hierarchical distance of top-$k$ predictions, at the cost of a higher top-1 error. Chang et al.~\cite{chang2021your} showed that a cross-entropy loss on coarse classes degrades fine-grained accuracy, and partitioned the feature space to disentangle coarse and fine-grained features. Garg et al.~\cite{garg2022learning} proposed a hierarchy-aware feature learning method that uses Jensen--Shannon divergence and geometric constraints to learn the hierarchical semantic organization of the classes. These studies exploit the class hierarchy in a coarse-to-fine manner but do not explicitly specify priorities among classes within the same hierarchy. Hierarchical approaches therefore remain worth exploring, particularly for multiclass clinical WSI settings.