Frequency-based Optimal Style Mix for Domain Generalization in Semantic Segmentation of Remote Sensing Images
Abstract—Supervised learning methods assume that training
and test data are sampled from the same distribution. However,
this assumption is not always satisfied in practical situations
of land cover semantic segmentation when models trained in
a particular source domain are applied to other regions. This
is because domain shifts caused by variations in location, time,
and sensor alter the distribution of images in the target domain
from that of the source domain, resulting in significant degradation
of model performance. To mitigate this limitation, domain
generalization has gained attention as a way of generalizing
from source domain features to unseen target domains. One
approach is style randomization, which enables models to learn
domain-invariant features through randomizing styles of images
in the source domain. Despite its potential, existing methods face
several challenges, such as inflexible frequency decomposition,
high computational and data preparation demands, slow speed
of randomization, and lack of consistency in learning. To address
these limitations, we propose a Frequency-based Optimal Style
Mix (FOSMix), which consists of three components: 1) Full
Mix enhances the data space by maximally mixing the style of
reference images into the source domain, 2) Optimal Mix keeps
the essential frequencies for segmentation and randomizes others
to promote generalization, and 3) consistency regularization
ensures that the model learns stably from differently styled images
with the same semantics. Extensive experiments under domain shifts
caused by variations in region and resolution, which test the model's
generalization ability, demonstrate that the proposed method
achieves superior segmentation performance in remote sensing.
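
To make the idea of frequency-based style mixing concrete, the following is a minimal NumPy sketch of a generic Fourier amplitude mix, not the FOSMix implementation itself. It assumes that "style" is carried by the amplitude spectrum and semantics by the phase, which is a common assumption in frequency-based style randomization. All function names (`full_mix`, `masked_mix`, `low_frequency_mask`), the blend weight `lam`, and the fixed circular mask are hypothetical; in particular, the fixed mask is only a stand-in for however the method decides which frequencies are essential for segmentation.

```python
import numpy as np


def mixed_amplitude(source, reference, lam):
    """Return the source spectrum and a lam-weighted blend of both amplitude spectra."""
    spec_src = np.fft.fft2(source, axes=(0, 1))
    spec_ref = np.fft.fft2(reference, axes=(0, 1))
    amp = (1.0 - lam) * np.abs(spec_src) + lam * np.abs(spec_ref)
    return spec_src, amp


def rebuild(amplitude, spec_src):
    """Combine an amplitude spectrum with the source phase and invert the FFT."""
    spectrum = amplitude * np.exp(1j * np.angle(spec_src))
    return np.real(np.fft.ifft2(spectrum, axes=(0, 1)))


def full_mix(source, reference, lam=1.0):
    """Mix the reference amplitude into every frequency of the source (assumed form)."""
    spec_src, amp = mixed_amplitude(source, reference, lam)
    return rebuild(amp, spec_src)


def low_frequency_mask(height, width, radius):
    """Hypothetical fixed mask: 1 inside a low-frequency disc, 0 elsewhere."""
    fy = np.fft.fftfreq(height) * height  # integer frequency indices per row
    fx = np.fft.fftfreq(width) * width    # integer frequency indices per column
    dist = np.sqrt(fy[:, None] ** 2 + fx[None, :] ** 2)
    return (dist <= radius).astype(float)[:, :, None]  # broadcast over channels


def masked_mix(source, reference, keep_mask, lam=1.0):
    """Keep the source amplitude where keep_mask is 1 and mix styles elsewhere."""
    spec_src, amp_mix = mixed_amplitude(source, reference, lam)
    amp = keep_mask * np.abs(spec_src) + (1.0 - keep_mask) * amp_mix
    return rebuild(amp, spec_src)
```

Under these assumptions, `full_mix(src, ref)` would play the role of maximal style mixing, while `masked_mix(src, ref, low_frequency_mask(h, w, r))` preserves a chosen band of source frequencies and randomizes the rest. A consistency term could then compare the model's predictions on the two outputs, since both share the same labels.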