\section{Introduction}
Flight Trajectory Prediction (FTP) is an essential task in Air Traffic Control (ATC) and supports applications such as air traffic flow prediction \citep{Abadi6878453, LIN2019105113}, aircraft conflict detection \citep{AdeepGaussian}, and arrival time estimation \citep{WANG2018280}. Accurate FTP helps ensure the safety of air transportation and improves real-time airspace management \citep{LIN8846596, Shi9136843}. Generally, FTP tasks can be divided into three categories: long-term \citep{Jeong8190764, Runle7999617}, medium-term \citep{Yuan7554828, Chen2016ShortmediumtermPF}, and short-term \citep{huang2017short, duan2018unified}. Among them, short-term trajectory prediction has the greatest impact on ATC and is increasingly in demand in air transportation. In this paper, we focus on the short-term FTP task, which aims to predict future flight trajectories from historical observations.
 
In the ATC domain, multi-step trajectory prediction is more practically useful than single-step prediction \citep{LIN8846596}. Multi-step methods can be divided into Iterated Multi-Step (IMS) prediction and Direct Multi-Step (DMS) prediction. IMS-based methods \citep{Yan6972562, Zhang2023FlightTP, Guo2023FlightBERT} predict recursively: they learn a single-step model and iteratively feed the predicted values back as observations to forecast the next trajectory point. Because errors accumulate over steps and points are generated one at a time, methods of this type usually degrade in multi-step prediction and have poor real-time performance. By contrast, DMS-based methods \citep{wuhan2023bi, Guo2023FlightBERT++} generate all future trajectory points at once, which avoids the error accumulation problem and improves prediction efficiency. Therefore, this paper performs the short-term FTP task in a DMS manner.
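
To make the distinction concrete, the two schemes can be sketched as follows. The notation is introduced here only for illustration and is an assumption, not the notation of the cited works: $\mathbf{x}_t$ denotes the trajectory point at time step $t$, $L$ the number of observed steps, $H$ the prediction horizon, and $f_\theta$ and $g_\theta$ generic learned models. An IMS-based method applies a single-step model recursively,
\[
\hat{\mathbf{x}}_{t+1} = f_\theta(\mathbf{x}_{t-L+1}, \dots, \mathbf{x}_{t}), \qquad
\hat{\mathbf{x}}_{t+2} = f_\theta(\mathbf{x}_{t-L+2}, \dots, \mathbf{x}_{t}, \hat{\mathbf{x}}_{t+1}), \qquad \dots
\]
so that any error in $\hat{\mathbf{x}}_{t+1}$ is fed into every later step, whereas a DMS-based method produces all $H$ points in a single forward pass,
\[
(\hat{\mathbf{x}}_{t+1}, \dots, \hat{\mathbf{x}}_{t+H}) = g_\theta(\mathbf{x}_{t-L+1}, \dots, \mathbf{x}_{t}).
\]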

However, two main issues are not well addressed in existing works \citep{Yan6972562, Zhang2023FlightTP, wuhan2023bi, Guo2023FlightBERT++}, which limits trajectory prediction performance. The first issue is the negative impact on prediction accuracy caused by significant differences in data range. In general, longitude and latitude are expressed in degrees, whereas altitude is expressed in meters. Since one degree of latitude or longitude can span up to approximately 111 kilometers, the data range of longitude and latitude is extremely different from that of altitude. Some previous works \citep{LSTM8489734, CNN-LSTM9145522} directly applied normalization algorithms to scale variables into the same range, e.g., from 0 to 1. However, the actual prediction errors can still be large for FTP tasks when evaluated in the raw data range (as shown in Table 2). FlightBERT \citep{Guo2023FlightBERT} and FlightBERT++ \citep{Guo2023FlightBERT++} proposed a binary encoding (BE) representation that converts variables from rounded decimal numbers to binary vectors, which treats the FTP task as multiple binary classification problems. Although the BE representation avoids the vulnerability caused by normalization algorithms, it introduces a serious limitation: misclassifying a high-order bit in the binary vector leads to a large absolute error in the decimal value.
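
Both effects can be quantified with simple arithmetic; the specific numbers below are illustrative assumptions and are not taken from the cited works. Taking one degree as roughly $111$~km, a prediction error of $0.01^{\circ}$ in latitude, which may look small in normalized units, already corresponds to
\[
0.01 \times 111~\mathrm{km} \approx 1.11~\mathrm{km}.
\]
Similarly, if a value is encoded as an $n$-bit binary vector $(b_{n-1}, \dots, b_0)$ with decimal value
\[
v = \sum_{k=0}^{n-1} b_k \, 2^{k},
\]
then misclassifying the single bit $b_k$ changes the decoded value by exactly $2^{k}$ units. For a hypothetical $n = 11$, an error in the highest bit shifts the prediction by $2^{10} = 1024$ units, while an error in the lowest bit shifts it by only one unit, even though each is a single bit error.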
\begin{figure}[h]
\centering 
    \subfigure[original series]{  
\centering    
\includegraphics[width=0.95\linewidth]{figure/visualize_raw_series_v1.pdf}  
\label{fig:ori_traj}
}
    \subfigure[first-order difference series]{
\centering    
\includegraphics[width=0.95\linewidth]{figure/visualize_diff_series_v1.pdf}
\label{fig:diff_traj}
}
\captionsetup{font=small}
\caption{The original and first-order difference series in real-world flight trajectories.}   
\label{fig:traj}    
\end{figure}

The second issue is that real-world flight trajectories involve underlying temporal dependencies, yet most existing methods \citep{Shi9136843, Guo2023FlightBERT, Guo2023FlightBERT++} fail to reveal the hidden complex temporal variations and extract features at only a single time scale. As shown in Figure~\ref{fig:traj}, the original series of longitude and latitude are overly smooth and obscure the abundant temporal variations that become visible in the first-order difference series. Besides, the temporal variation patterns of longitude and latitude are quite distinct from those of altitude, which has an obvious global trend but suffers from intense local fluctuations. For example, slight turbulence can exert a significant influence on altitude but has a negligible effect on longitude and latitude. A single-scale model cannot simultaneously capture both local temporal details and global trends \citep{wu2022timesnet, wang2023micn}, which calls for a powerful multi-scale temporal modeling capacity. Furthermore, if the learned multi-scale temporal patterns are simply aggregated, the model fails to focus on the patterns that contribute most \citep{chen2023multi}. Meanwhile, it is essential to explore relationships across variables \citep{zhang2023crossformer, han2024capacity}, e.g., the velocity at the current time step directly affects the location at the next time step. Thus, scale-wise correlations and inter-variable relationships should be fully considered when modeling the multi-scale temporal patterns.

Based on the above analysis, this paper proposes a multi-scale patch network with differential coding (FlightPatchNet) to address these issues. Specifically, we use differential coding to encode the original values of longitude and latitude into first-order differences and retain the original values of the other variables as inputs. To account for the dependencies between nearby and distant time steps, we introduce a global temporal embedding to explore the correlations between time steps. Then, a multi-scale patch network is proposed to provide powerful and complete temporal modeling. The multi-scale patch network divides the trajectory series into patches of different sizes and uses stacked patch mixer blocks to capture global trends across patches and local details within patches. To further strengthen the multi-scale temporal modeling capacity, a multi-scale aggregator is introduced to capture scale-wise correlations and inter-variable relationships. Finally, FlightPatchNet ensembles multiple predictors to make direct multi-step forecasts, which benefits from complementary multi-scale temporal features and improves generalization ability. The main contributions are summarized as follows:
\begin{itemize}
    \item We utilize differential coding to effectively reduce the differences in data range and reveal the underlying temporal variations in real-world flight trajectories. Our empirical studies show that using differential values of longitude and latitude can greatly improve prediction accuracy.
    \item We propose FlightPatchNet to fully explore underlying multi-scale temporal patterns. A multi-scale patch network is designed to capture inter- and intra-patch dependencies under different time scales, and integrate multi-scale temporal features across scales and variables. 
    \item We conduct extensive experiments on a real-world dataset. The experimental results demonstrate that our proposed model significantly outperforms the most competitive baselines.
\end{itemize}
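
The differential coding named in the first contribution can be written explicitly. The notation is an assumption made here for illustration: let $\lambda_t$ and $\phi_t$ denote the longitude and latitude at time step $t$. Their first-order differences are
\[
\Delta\lambda_t = \lambda_t - \lambda_{t-1}, \qquad \Delta\phi_t = \phi_t - \phi_{t-1}.
\]
If a model outputs predicted differences $\Delta\hat{\lambda}_{t_0+i}$ for a horizon of $H$ steps after the last observed time step $t_0$, absolute positions can be recovered by cumulative summation,
\[
\hat{\lambda}_{t_0+h} = \lambda_{t_0} + \sum_{i=1}^{h} \Delta\hat{\lambda}_{t_0+i}, \qquad h = 1, \dots, H,
\]
and analogously for latitude. Because an aircraft typically moves only a small fraction of a degree between consecutive samples, the differences are far smaller in magnitude than the absolute coordinates, which is one way to see how differencing changes the data range of these two variables.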
  