Cheng Deqiang, Wang Yanchen, Yue Yang, Tian Liang, Zhai Jie, Kou Qiqi. A survey on multi-modal intelligent perception and hazard recognition techniques in mine safety[J]. Conservation and Utilization of Mineral Resources, 2026, 46(1): 1−13. DOI: 10.13779/j.cnki.issn1001-0076.2026.02.001

A Survey on Multi-Modal Intelligent Perception and Hazard Recognition Techniques in Mine Safety

With the advancement of the “dual-carbon” strategy and intelligent mine construction, mine safety production places increasingly high demands on reliable intelligent perception technologies. In underground environments characterized by insufficient lighting, dust, water mist, and confined spaces, single-modal perception methods based on visible images, infrared images, depth maps, or 3D point clouds often suffer from imaging degradation and feature loss. Their recognition stability is therefore limited, making it difficult to accurately perceive personnel behaviors, equipment operating states, and environmental hazards in high-risk mining scenarios. Multimodal perception, which leverages complementary information across modalities in texture, thermal, and geometric characteristics, offers an effective means of enhancing mine safety hazard recognition. This paper reviews the characteristics of multimodal data commonly used in mines and analyzes the strengths and limitations of visible cameras, infrared cameras, LiDAR, and depth cameras in information acquisition and feature representation. Three representative fusion paradigms (data-level, feature-level, and decision-level fusion) are then summarized along with their applicable scenarios. Recent progress in multimodal target detection and hazard recognition is reviewed, focusing on visible–infrared image fusion, visible image–point cloud fusion, and visible–depth image fusion. Taking coal mine scenarios as representative examples, typical applications are discussed to illustrate the engineering value of these techniques in personnel localization, equipment operation monitoring, and environmental condition perception. Finally, key challenges, such as annotation cost, cross-modal alignment, robustness, and lightweight deployment, are analyzed, and future directions toward mine automation, unmanned operations, and safety risk prediction are outlined. This paper aims to provide a reference and technical guidance for both theoretical studies and engineering applications of multimodal intelligent perception and recognition in mine safety.
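The three fusion paradigms summarized above can be illustrated with a minimal toy sketch. All arrays, weights, the gradient-based "feature", and the hazard threshold below are hypothetical placeholders chosen for illustration; they are not values or methods taken from the survey.

```python
import numpy as np

rng = np.random.default_rng(0)
visible = rng.random((4, 4))   # toy visible-light intensity map
infrared = rng.random((4, 4))  # toy thermal intensity map (assumed aligned)

# Data-level fusion: combine raw, spatially aligned modalities before any
# processing, e.g. a simple weighted average of pixel intensities.
fused_data = 0.6 * visible + 0.4 * infrared

# Feature-level fusion: extract per-modality features first (here, a
# trivial gradient-magnitude map), then stack them for a downstream model.
def grad_feature(img):
    gy, gx = np.gradient(img)
    return np.hypot(gx, gy)

fused_features = np.stack([grad_feature(visible), grad_feature(infrared)])

# Decision-level fusion: each modality produces its own hazard score, and
# only the final decisions are combined (here, a max rule and a threshold).
score_vis = visible.mean()
score_ir = infrared.mean()
hazard = max(score_vis, score_ir) > 0.5
```

In practice, data-level fusion demands precise cross-modal registration, feature-level fusion trades alignment sensitivity for richer joint representations, and decision-level fusion is the most robust to modality failure since each sensor is processed independently.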
