Evaluating 2D and 3D Deep Learning Detection Technologies Using Time-of-Flight Sensors for Room Surveillance and Human Tracking

  • Sebastian Alexander Mück

    Student thesis: Master's Thesis

    Abstract

    This thesis investigates the efficacy of 2D and 3D detection technologies for room surveillance and human tracking using Time-of-Flight (ToF) sensors, focusing on the challenges
    presented by the sparse nature of low-cost ToF-generated point cloud data. The primary
    goal was to determine efficient methods for human detection within indoor settings, a
    key requirement for applications ranging from automated building management to enhanced security protocols.
    A comparative analysis of various detection models is conducted, including Ultralytics YOLOv8 and several frameworks within the OpenPCDet library such as CenterPoint, PointPillar, PV-RCNN, and SECOND. The depth data is parsed using a
    custom parser and then processed using different projection techniques, including 2D
    projections, 3D point clouds, and voxel grids, to evaluate their impact on the detection
    performance. The dataset is annotated and converted to the required formats
    by hand or with custom conversion scripts.
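
    As an illustration of the representations named above, the following minimal sketch back-projects a depth image into a 3D point cloud and then bins the points into an occupancy voxel grid. It is not the thesis's parser or conversion code: the pinhole camera model, the intrinsics (fx, fy, cx, cy), the function names, and the grid parameters are all assumptions made for the example.

```python
import numpy as np


def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image (in metres) to an (N, 3) point cloud.

    Assumes a simple pinhole model; fx, fy, cx, cy are hypothetical intrinsics.
    Pixels with depth <= 0 are treated as invalid and dropped.
    """
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel column / row indices
    valid = depth > 0
    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=1)


def voxelize(points, voxel_size, origin, grid_shape):
    """Mark every voxel that contains at least one point (boolean occupancy grid)."""
    idx = np.floor((points - np.asarray(origin)) / voxel_size).astype(int)
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    grid = np.zeros(grid_shape, dtype=bool)
    grid[tuple(idx[inside].T)] = True
    return grid


# Tiny synthetic depth frame with two valid pixels (illustrative values only).
depth = np.zeros((4, 4))
depth[0, 0] = 1.0
depth[2, 2] = 2.0
points = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=2.0, cy=2.0)
grid = voxelize(points, voxel_size=1.0, origin=(-2.0, -2.0, 0.0), grid_shape=(4, 4, 4))
```

    A 2D projection, in this picture, is simply the depth image itself (or a rendering of it) passed to an image detector, whereas the point cloud and voxel grid feed 3D detectors.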
    The results of this study demonstrate that 2D detection methods surpass 3D approaches in terms of precision and recall metrics across the dataset. This superiority is
    largely due to the increased accuracy and efficiency of 2D image-based detection algorithms when dealing with the noisy and sparse datasets characteristic of low-cost ToF
    sensors. There are indications that 3D detection models may offer advantages in very
    specific use cases, but their architectures are generally too complex to perform well
    on sparse data.
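
    For readers unfamiliar with the metrics mentioned above, the sketch below shows one common way to compute precision and recall for 2D bounding-box detections. It is an assumption-laden illustration, not the evaluation protocol of the thesis: the greedy matching, the IoU threshold of 0.5, the box format (x1, y1, x2, y2), and the function names are all chosen for the example, and predictions are assumed to be sorted by descending confidence.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def precision_recall(predictions, ground_truth, iou_threshold=0.5):
    """Greedily match each prediction to the best unmatched ground-truth box."""
    matched = set()
    true_positives = 0
    for pred in predictions:
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(ground_truth):
            if j in matched:
                continue
            score = iou(pred, gt)
            if score > best_iou:
                best_iou, best_j = score, j
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            true_positives += 1
    false_positives = len(predictions) - true_positives
    false_negatives = len(ground_truth) - true_positives
    precision = true_positives / (true_positives + false_positives) if predictions else 0.0
    recall = true_positives / (true_positives + false_negatives) if ground_truth else 0.0
    return precision, recall


# Illustrative boxes only: one correct detection, one false alarm, one missed person.
ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
predictions = [(0, 0, 10, 10), (50, 50, 60, 60)]
precision, recall = precision_recall(predictions, ground_truth)
```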
    Consequently, the thesis concludes that utilizing 2D projections of depth data is
    more practical and effective for accurately detecting humans with ToF sensors. This
    conclusion has implications for the deployment strategies of ToF sensors in real-world
    applications, advocating for 2D data processing techniques as the preferred approach
    for improved performance and cost-efficiency.
    Future work could focus on optimizing 3D detection models for low-resolution point
    clouds, particularly for applications like fall detection in elderly care, where the 3D
    capabilities of ToF cameras could enhance safety and emergency response.
    Date of Award: 2024
    Original language: English (American)
    Supervisors: Josef Langer & Rafael Müller
