Abstract
Environmental monitoring of natural landscapes, coastal zones, and urban surfaces is essential for understanding climate change impacts, managing natural resources, and responding to environmental disasters. Traditional remote sensing approaches rely on single-modality satellite or aerial imagery, which provides 2D visual information but lacks the depth and thermal context needed to accurately characterize complex land surface conditions. Recent advances in multi-modal sensing, 3D surface reconstruction, and large language models offer new possibilities for automated, comprehensive environmental monitoring. This study proposes an AI-Driven Multi-Modal Environmental Monitoring System (AIMES) that integrates optical remote sensing imagery, infrared thermography, and LiDAR point cloud data through a deep learning-based fusion architecture. A 3D surface reconstruction module leverages stereo phase-measuring deflectometry with deep learning-enhanced phase unwrapping to produce high-resolution digital elevation models (DEMs) of monitored terrain. A multi-agent analysis module decomposes the post-reconstruction workflow into specialized tasks—land cover classification, thermal anomaly detection, change detection, and environmental impact assessment—each handled by a dedicated LLM-powered agent. The multi-agent design enables structured reasoning and context-aware analysis that goes beyond pixel-level classification to produce actionable environmental insights. Experiments conducted on three benchmark remote sensing datasets demonstrate that the proposed system achieves an average land cover classification accuracy of 90.2%, a thermal anomaly detection accuracy of 87.8%, and a change detection F1-score of 84.5%. The multi-agent impact assessment module achieves a semantic consistency rate of 81% with expert environmental scientist evaluations, while reducing average analysis time from 3.5 hours (manual) to 22 minutes (automated). This study validates the effectiveness of combining multi-modal deep learning 3D reconstruction with multi-agent collaborative analysis for scalable, accurate, and interpretable environmental monitoring.
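The multi-agent decomposition described above can be sketched as a simple orchestration loop. The following is a minimal illustration only: the agent names, the `AgentReport` type, and the stubbed `*_agent` functions are hypothetical placeholders standing in for the LLM-backed analysis steps, not the system's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical sketch: each specialized agent wraps one analysis task from the
# post-reconstruction workflow; the LLM backends are stubbed so the
# orchestration itself is runnable.

@dataclass
class AgentReport:
    task: str
    summary: str

def land_cover_agent(dem: dict) -> AgentReport:
    # In the real system this step would prompt an LLM with fused imagery + DEM.
    return AgentReport("land_cover", f"classified {dem['tiles']} tiles")

def thermal_agent(dem: dict) -> AgentReport:
    return AgentReport("thermal_anomaly", "no anomalies above threshold")

def change_agent(dem: dict) -> AgentReport:
    return AgentReport("change_detection", "surface change vs. baseline computed")

def impact_agent(reports: Dict[str, AgentReport]) -> AgentReport:
    # Impact assessment consumes the other agents' outputs: this is the
    # context-aware synthesis step that goes beyond pixel-level classification.
    joined = "; ".join(r.summary for r in reports.values())
    return AgentReport("impact_assessment", f"synthesis of: {joined}")

def run_pipeline(dem: dict) -> Dict[str, AgentReport]:
    # Independent per-task agents run first, then the impact agent aggregates.
    agents: Dict[str, Callable[[dict], AgentReport]] = {
        "land_cover": land_cover_agent,
        "thermal_anomaly": thermal_agent,
        "change_detection": change_agent,
    }
    reports = {name: fn(dem) for name, fn in agents.items()}
    reports["impact_assessment"] = impact_agent(reports)
    return reports

if __name__ == "__main__":
    for report in run_pipeline({"tiles": 128}).values():
        print(report.task, "->", report.summary)
```

The design choice mirrored here is that the three perception-level agents are independent (and could run in parallel), while the impact assessment agent is deliberately downstream so it can reason over their combined outputs.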
References
Huang, H., Tang, J., Liu, T., & Huang, M. (2026). Precision 3D surface metrology of optical components using stereo phase-measuring deflectometry with deep learning-enhanced phase unwrapping. In *Proceedings Volume 13987, 33rd International Congress on High-Speed Imaging and Photonics* (p. 1398704). SPIE. https://doi.org/10.1117/12.3093993
Huang, H., Yang, Y., & Zhu, Y. (2023). Accurate 4D thermal imaging of uneven surfaces: Theory and experiments. *International Journal of Heat and Mass Transfer*, 216, 124580. https://doi.org/10.1016/j.ijheatmasstransfer.2023.124580
Wang, S., Yu, Y., Feldt, R., & Parthasarathy, D. (2025). Automating a complete software test process using LLMs: An automotive case study. *arXiv preprint* arXiv:2502.04008. https://doi.org/10.1109/ICSE55347.2025.00211
