Cheng Deqiang, Wang Yanchen, Yue Yang, Tian Liang, Zhai Jie, Kou Qiqi. A survey on multi-modal intelligent perception and hazard recognition techniques in mine safety[J]. Conservation and Utilization of Mineral Resources, 2026, 46(1): 1−13. DOI: 10.13779/j.cnki.issn1001-0076.2026.02.001

A Survey on Multi-Modal Intelligent Perception and Hazard Recognition Techniques in Mine Safety

With the advancement of the “dual-carbon” strategy and intelligent mine construction, mine safety production places increasingly high demands on reliable intelligent perception technologies. In underground environments characterized by insufficient lighting, dust, water mist, and confined spaces, single-modal perception methods based on visible images, infrared images, depth maps, or 3D point clouds often suffer from imaging degradation and feature loss. Their recognition stability is therefore limited, making it difficult to accurately perceive personnel behaviors, equipment operating states, and environmental hazards in high-risk mining scenarios. Multimodal perception, which leverages complementary information across modalities in texture, thermal, and geometric characteristics, offers an effective means of enhancing mine safety hazard recognition. This paper reviews the characteristics of multimodal data commonly used in mines and analyzes the strengths and limitations of visible cameras, infrared cameras, LiDAR, and depth cameras in information acquisition and feature representation. Three representative fusion paradigms (data-level, feature-level, and decision-level fusion) are then summarized along with their applicable scenarios. Recent progress in multimodal target detection and hazard recognition is reviewed, focusing on visible–infrared image fusion, visible image–point cloud fusion, and visible–depth image fusion. Taking coal mine scenarios as representative examples, typical applications are discussed to illustrate the engineering value of these techniques in personnel localization, equipment operation monitoring, and environmental condition perception. Finally, key challenges, such as annotation cost, cross-modal alignment, robustness, and lightweight deployment, are analyzed, and future directions toward mine automation, unmanned operations, and safety risk prediction are outlined. This paper aims to provide a reference and technical guidance for both theoretical studies and engineering applications of multimodal intelligent perception and recognition in mine safety.
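The three fusion paradigms summarized above can be illustrated with a minimal toy sketch. All arrays, weights, the gradient-based "feature", and the hazard threshold below are hypothetical placeholders chosen for illustration; they are not values or methods taken from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)
visible = rng.random((4, 4))   # toy visible-light intensity map
infrared = rng.random((4, 4))  # toy thermal intensity map (assumed aligned)

# Data-level fusion: combine raw, spatially aligned modalities before any
# processing, e.g. a simple weighted average of pixel intensities.
fused_data = 0.6 * visible + 0.4 * infrared

# Feature-level fusion: extract per-modality features first (here, a
# trivial gradient-magnitude map), then stack them for a downstream model.
def grad_feature(img):
    gy, gx = np.gradient(img)
    return np.hypot(gx, gy)

fused_features = np.stack([grad_feature(visible), grad_feature(infrared)])

# Decision-level fusion: each modality produces its own hazard score, and
# only the final decisions are combined (here, a max rule and a threshold).
score_vis = visible.mean()
score_ir = infrared.mean()
hazard = max(score_vis, score_ir) > 0.5
```

In practice, data-level fusion demands precise cross-modal registration, feature-level fusion trades alignment sensitivity for richer joint representations, and decision-level fusion is the most robust to modality failure since each sensor is processed independently.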
