Beyond Color: Advanced RGB-D data augmentation for robust semantic segmentation in crop farming scenes

Research output: Contribution to journal › Article › peer-review

Abstract

The emergence of smart farming in recent years has substantially increased the importance of artificial vision systems in crop production. Data augmentation is essential for developing robust semantic segmentation models when dealing with small datasets, such as in selective weed control. Due to advances in multi-modal data fusion, RGB-D image datasets contribute substantially to improving model performance. However, most data augmentation techniques primarily modify the color channels, often neglecting the depth channel. Addressing this gap, we introduce three methods for augmenting RGB-D images: RGB-D-Aug, Recompose3D, and Compose3D. We conducted experiments using ESANet, a multi-modal fusion network, tailored here for semantic segmentation of different plant species. RGB-D-Aug introduces artificial depth sensor noise in addition to commonly used geometric transformations and color variations. Recompose3D and Compose3D generate augmented RGB-D images and corresponding ground-truth labels by composing background images and a set of foreground plant snippets. Recompose3D rearranges plants from a given training image, while Compose3D employs all plant snippets available in the training dataset. In our experiments designed to evaluate generalization performance, we tested our three methods and compared them not only to the augmentation technique used in ESANet, which consists of geometric transformations and color channel variations, but also to an extended version of the Copy-Paste method, an image composition technique originally introduced for RGB images. All three of our proposed methods outperformed the ESANet augmentation. The image composition methods, Copy-Paste, Recompose3D, and Compose3D, performed significantly better, with Compose3D achieving the highest generalization performance of all methods tested. In addition to improving model robustness, Compose3D allows the creation of realistic agronomic image scenes. Our research is an important step towards developing robust and generalizable models for different applications in arable farming.
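
The two ideas named in the abstract, depth sensor noise and depth-aware image composition, can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the abstract does not specify the noise model or the composition rule, so the function names `add_depth_noise` and `paste_snippet`, the Gaussian-plus-dropout noise, the parameters `sigma` and `dropout`, and the simple nearer-pixel-wins occlusion test are hypothetical and are not the authors' implementation.

```python
import numpy as np


def add_depth_noise(depth, rng, sigma=0.01, dropout=0.02):
    """Perturb a depth map with Gaussian noise and random missing pixels.

    Assumed noise model (not taken from the paper): additive Gaussian
    noise with standard deviation `sigma`, plus a fraction `dropout` of
    pixels set to 0 to mimic invalid sensor readings.
    """
    noisy = depth + rng.normal(0.0, sigma, size=depth.shape)
    holes = rng.random(depth.shape) < dropout
    noisy[holes] = 0.0  # 0 marks a missing measurement in this sketch
    return np.clip(noisy, 0.0, None)


def paste_snippet(rgb, depth, label, snip_rgb, snip_depth, snip_mask,
                  class_id, top, left):
    """Paste a plant snippet into a scene with a per-pixel depth test.

    A snippet pixel is written only where the snippet mask is set and
    the snippet is closer to the camera than the existing scene, so
    color, depth, and label stay mutually consistent. Inputs are left
    unmodified; updated copies are returned.
    """
    h, w = snip_mask.shape
    rgb, depth, label = rgb.copy(), depth.copy(), label.copy()
    region = (slice(top, top + h), slice(left, left + w))
    visible = snip_mask & (snip_depth < depth[region])
    rgb[region][visible] = snip_rgb[visible]
    depth[region][visible] = snip_depth[visible]
    label[region][visible] = class_id
    return rgb, depth, label


# Toy usage: a flat background 1.0 units away and a 2x2 snippet.
rng = np.random.default_rng(0)
bg_rgb = np.zeros((4, 4, 3), dtype=np.uint8)
bg_depth = np.full((4, 4), 1.0)
bg_label = np.zeros((4, 4), dtype=np.int64)

snip_rgb = np.full((2, 2, 3), 255, dtype=np.uint8)
snip_depth = np.array([[0.5, 1.5], [0.5, 0.5]])
snip_mask = np.array([[True, True], [False, True]])

out_rgb, out_depth, out_label = paste_snippet(
    bg_rgb, bg_depth, bg_label, snip_rgb, snip_depth, snip_mask,
    class_id=2, top=1, left=1)
noisy_depth = add_depth_noise(out_depth, rng)
```

In this toy scene the snippet pixel at depth 1.5 lies behind the background and is therefore not pasted, which is the kind of occlusion handling that a color-only Copy-Paste cannot express. Under the same assumptions, a Recompose3D-style variant would draw snippets from the current training image only, while a Compose3D-style variant would draw them from a pool built over the whole training set.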

Original language: English
Article number: 111432
Pages (from-to): 111432
Journal: Computers and Electronics in Agriculture
Volume: 244
Publication status: Published - 1 Mar 2026

Keywords

  • Computer vision
  • Data augmentation
  • Precision agriculture
  • RGB-D semantic segmentation
  • Smart farming
  • Weed control
