ABSTRACT
Extracting coastlines from optical remote sensing images is important for coastal zone management, erosion monitoring, and intelligent ocean construction. However, the complexity of the nearshore marine environment makes it difficult to capture small-scale, detailed coastline information. In addition, numerous tidal flats, suspended sediments, and coastal biological communities further reduce segmentation accuracy, an effect that is particularly noticeable in medium- to high-resolution remote sensing image segmentation tasks. Most previous studies, based primarily on convolutional neural networks (CNNs) or traditional feature extraction methods, struggled with detailed pixel-level refinement and lacked a comprehensive understanding of the images under study. Therefore, we propose a new U-shaped deep learning model (STIRUnet) that combines the global modeling ability of the Swin Transformer with an improved CNN built on an inverted residual module. The proposed method is capable of global supervised feature learning and layer-by-layer feature extraction. We conducted sea-land segmentation experiments on the GF-HNCD and BSD remote sensing image datasets to validate its performance. The results indicate the following: 1) suspended sediments and coastal biological communities are major contributors to coastline blurring, and 2) the recovery of minute features (e.g., narrow watercourses and microscale artificial structures) effectively enhances edge details and leads to more realistic segmentation outcomes. These findings are important for the accurate extraction of sea-land information in complex marine environments, and they offer novel insights regarding mixed-pixel identification.