LLM-DSK: A Domain-Specific Semantic Knowledge-Guided Ocean Environment Prediction Method Based on Large Language Models

Ning Song, Caichao Lv, Jie Nie, Min Ye, Enyuan Zhao, Jun Ma, Xiong Liu, Zhiqiang Wei

Published: 01 Jan 2025, Last Modified: 25 Jan 2026
Venue: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing
License: CC BY-SA 4.0
Abstract: Data-driven methods learn patterns of oceanic variable changes directly from data without relying on explicit modeling of complex physical processes based on specific assumptions. This approach addresses the limitations of traditional numerical methods, which are constrained by physical assumptions, parameterized processes, and dependencies on initial and boundary conditions. However, methods based solely on probabilistic statistics and ignoring the intrinsic characteristics of ocean systems struggle to capture the complex spatiotemporal dynamics of chaotic ocean systems. With the emergence of large language models (LLMs) in time-series analysis, researchers have discovered that pretrained LLMs can leverage rich domain-specific knowledge through prompt engineering to analyze complex temporal changes. Building on this insight, we propose LLM-DSK, a domain-specific semantic knowledge-guided ocean environment prediction model based on pretrained LLMs. LLM-DSK comprises three core modules: 1) a spatiotemporal feature extraction module that utilizes geographic data (e.g., latitude, longitude, wind fields, and land–sea boundaries) to extract key domain-relevant spatiotemporal features; 2) a semantic encoding module that employs an attention mechanism to align these features with the vocabulary of LLMs, enabling cross-modal alignment between oceanic and natural language domains to enrich semantic representations; and 3) an LLM-based prediction module driven by domain-specific prompts that integrate geographic information and statistical indicators. We validated LLM-DSK using remote sensing data (sea surface temperature) and reanalysis data (significant wave height), and the results demonstrate that LLM-DSK achieves superior predictive performance compared to state-of-the-art (SOTA) models.
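
As an illustration of the third module, the following is a minimal sketch of how a domain-specific prompt might combine geographic information with statistical indicators of the input window. The abstract does not give the actual prompt template, so the function names (`build_prompt`, `trend_label`), the prompt wording, the chosen statistics (minimum, maximum, mean, trend), and the sample coordinates and values are all assumptions made for illustration, not the authors' implementation.

```python
from statistics import mean


def trend_label(series):
    """Label the overall direction by comparing the last and first values."""
    delta = series[-1] - series[0]
    if delta > 0:
        return "upward"
    if delta < 0:
        return "downward"
    return "flat"


def build_prompt(variable, unit, lat, lon, series, horizon):
    """Assemble a text prompt from location metadata and summary statistics.

    Hypothetical template: the wording and the choice of fields are
    assumptions, not taken from the paper.
    """
    return (
        f"Dataset: {variable} ({unit}) at latitude {lat:.2f}, longitude {lon:.2f}. "
        f"Input statistics: min {min(series):.2f}, max {max(series):.2f}, "
        f"mean {mean(series):.2f}, trend {trend_label(series)}. "
        f"Task: predict the next {horizon} steps from the previous {len(series)} steps."
    )


# Made-up sea surface temperature window, for illustration only.
sst = [18.2, 18.4, 18.9, 19.1]
prompt = build_prompt("sea surface temperature", "degC", 36.05, 120.35, sst, 2)
print(prompt)
```

In a pipeline of the kind the abstract describes, a prompt like this would presumably be tokenized and supplied to the pretrained LLM together with the semantically encoded spatiotemporal features; how the two are combined is not specified in the abstract.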