Enhanced U-Net Variants with Optimized Encoder-Decoder Architectures for High-Precision Biomedical Image Segmentation
Abstract
Medical image segmentation is critical for diagnosis and treatment planning, yet conventional U-Net models often struggle to capture complex spatial dependencies and multi-scale context, particularly in low-contrast or noisy data. To address these challenges, we propose an enhanced U-Net variant that integrates residual connections, attention gates, and multi-scale feature fusion. The encoder adopts ResNet-based feature extraction for richer contextual learning, while the decoder incorporates self-attention–guided upsampling and squeeze-and-excitation (SE) blocks to emphasize salient features. We evaluated the model on the ISIC 2018 (skin lesion) and BraTS (brain tumor) datasets, where it achieves significant improvements over the baseline U-Net: a Dice similarity coefficient of 91.6% (vs. 84.3%), an IoU of 88.7% (vs. 81.2%), precision of 93.1%, and recall of 90.8%, with inference time reduced by 12%. These findings demonstrate that the proposed architecture delivers more accurate and efficient biomedical image segmentation, especially for irregular anatomical structures.
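For concreteness, the SE blocks mentioned above follow the standard squeeze-and-excitation pattern of channel-wise recalibration. The PyTorch sketch below is a minimal illustration only; the class name SEBlock and the reduction ratio of 16 are our assumptions, as the abstract does not specify these details.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Minimal squeeze-and-excitation block: reweights channels by global context.
    The reduction ratio (default 16) is an assumed hyperparameter."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average pool to (B, C, 1, 1)
        self.fc = nn.Sequential(             # excitation: bottleneck MLP producing per-channel weights
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * w                         # rescale each channel by its learned weight

# Example: recalibrate a 64-channel decoder feature map
x = torch.randn(2, 64, 32, 32)
print(SEBlock(64)(x).shape)  # torch.Size([2, 64, 32, 32])
```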