Abstract
Monitoring wildlife in dense forests from aerial perspectives presents substantial challenges, especially under occlusion and limited visibility. We propose a novel multi-object tracking framework that combines thermal imagery with geospatial drone metadata to improve animal re-identification and tracking robustness. Built upon the Deep OC-SORT algorithm, our approach incorporates a custom thermal-specific re-identification model and two novel association stages: (1) a drone motion compensation stage that predicts object locations using drone movement, and (2) an object velocity estimation stage that leverages temporal dynamics and motion vectors. We benchmark our method using an annotated thermal UAV dataset containing about 20,000 images, primarily capturing deer and wild boar in forested environments in Austria. Our ablation study demonstrates that identity switches can be reduced by up to 11.4% compared to the baseline, especially in scenarios involving long-term occlusions and changing viewpoints. Overall, our results indicate that the fusion of visual and positional data enables more accurate and stable tracking, supporting more reliable wildlife monitoring and population assessment in ecologically challenging terrains.
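As a rough illustration of the two association stages named in the abstract, the sketch below predicts where a tracked animal should reappear in the next frame from the drone's displacement and an estimated ground velocity of the animal. It is not the authors' implementation. It assumes a nadir-pointing camera with image x pointing east and image y pointing south, flat terrain, and constant altitude between frames. The names `DronePose`, `ground_sampling_distance`, and `predict_pixel_position` are hypothetical, as are all parameter values.

```python
from dataclasses import dataclass


@dataclass
class DronePose:
    """Hypothetical drone position in a local metric frame (assumption)."""
    east_m: float
    north_m: float
    altitude_m: float


def ground_sampling_distance(altitude_m: float, focal_length_px: float) -> float:
    """Metres on the ground covered by one pixel for a nadir pinhole camera."""
    return altitude_m / focal_length_px


def predict_pixel_position(prev_px, prev_pose, curr_pose, focal_length_px,
                           object_velocity_mps=(0.0, 0.0), dt_s=0.0):
    """Predict the pixel position of an object in the current frame.

    prev_px: (x, y) pixel position in the previous frame.
    object_velocity_mps: assumed (east, north) ground velocity of the animal.
    """
    gsd = ground_sampling_distance(curr_pose.altitude_m, focal_length_px)

    # Drone motion compensation: a static object appears to move opposite
    # to the drone's own displacement.
    drone_east = curr_pose.east_m - prev_pose.east_m
    drone_north = curr_pose.north_m - prev_pose.north_m

    # Object velocity estimation: add the animal's own ground displacement.
    object_east = object_velocity_mps[0] * dt_s
    object_north = object_velocity_mps[1] * dt_s

    relative_east = object_east - drone_east
    relative_north = object_north - drone_north

    x, y = prev_px
    # Image y grows downwards (south), so northward motion decreases y.
    return (x + relative_east / gsd, y - relative_north / gsd)


if __name__ == "__main__":
    before = DronePose(east_m=0.0, north_m=0.0, altitude_m=50.0)
    after = DronePose(east_m=2.0, north_m=0.0, altitude_m=50.0)
    print(predict_pixel_position((320.0, 240.0), before, after, 1000.0))
```

In a tracker, a prediction like this could be used to gate or re-rank candidate detections after an occlusion; how the actual framework weights it against appearance features is not stated in the abstract.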
| Original language | English |
|---|---|
| Title of host publication | Proceedings of 5th Int. Workshop on Camera traps, AI, and Ecology (2025) |
| Publication status | Published - Sept 2025 |