Research
Clustering Strategies For XAI with Correlated, High-Dimensional Geoscience AI Models
With the rapid deployment of high-resolution sensors and models, geospatial data is captured and generated at an extremely high rate. By automatically extracting information from large data volumes, machine learning is increasingly used to turn massive geospatial data into geoscience insights. High-dimensional raster data is widely used, and complex machine learning algorithms learn to recognize spatial and spatio-temporal patterns in it. These models may be used for critical decision-making or to aid scientific discovery. Complex models can learn highly non-linear relationships and achieve high performance, but their complexity makes it very difficult for users to determine how a model reached its decision. This has motivated the widespread adoption of explainable artificial intelligence (XAI) techniques that probe the model in various ways to explain how it works.
XAI methods are highly sensitive to correlations among input predictors. Proposed mitigations involve grouping correlated predictors and applying XAI to groups instead of individuals. However, there are major challenges to grouping the grid cells of geoscience rasters based on their correlation. These datasets are commonly high-dimensional with substantial autocorrelation. Conventional techniques for grouping correlated tabular features are rarely applicable.
The purpose of this research is to develop strategies for using data-driven clustering techniques to group raster data to improve the accuracy of XAI results. First, we describe the limitations of current approaches and identify XAI challenges related to raster-based geoscience models. We then develop a set of benchmarks so that we can quantitatively assess XAI methods in a variety of complex scenarios. Finally, we propose and evaluate methodologies for applying clustering and XAI techniques. These include a hierarchical clustering approach to automatically investigate multiple scales of patterns in high-dimensional data.
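The grouping idea can be illustrated with a minimal sketch. It is not the method developed in this research: it assumes a tiny toy dataset, a hypothetical `min_abs_corr` threshold, and permutation importance as a stand-in XAI method. Grid cells are joined whenever their absolute Pearson correlation reaches the threshold, which matches cutting a single-linkage hierarchical clustering at one level; varying the threshold would give coarser or finer groups. Each group is then permuted as a unit.

```python
import random
from itertools import combinations

def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5 if va > 0 and vb > 0 else 0.0

def correlation_groups(X, min_abs_corr):
    """Group cells (columns of X) that are linked by |r| >= min_abs_corr."""
    n_cells = len(X[0])
    cols = [[row[j] for row in X] for j in range(n_cells)]
    parent = list(range(n_cells))  # union-find forest over cell indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n_cells), 2):
        if abs(pearson(cols[i], cols[j])) >= min_abs_corr:
            parent[find(i)] = find(j)

    groups = {}
    for j in range(n_cells):
        groups.setdefault(find(j), []).append(j)
    return sorted(groups.values())

def grouped_permutation_importance(model, X, y, groups, rng):
    """Increase in mean squared error when each group is shuffled as a unit."""
    def mse(data):
        return sum((model(row) - t) ** 2 for row, t in zip(data, y)) / len(y)

    base = mse(X)
    scores = []
    for group in groups:
        order = list(range(len(X)))
        rng.shuffle(order)
        permuted = [list(row) for row in X]
        for dst, src in enumerate(order):
            for j in group:  # every cell in the group takes values from the same sample
                permuted[dst][j] = X[src][j]
        scores.append(mse(permuted) - base)
    return scores

# Toy data (an assumption): cells 0 and 1 are near-duplicates, as are cells 2 and 3.
rng = random.Random(0)
X = []
for _ in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    X.append([a, a + 0.05 * rng.gauss(0, 1), b, b + 0.05 * rng.gauss(0, 1)])

def model(row):
    # Stand-in "model" that only uses cells 0 and 1.
    return row[0] + row[1]

y = [model(row) for row in X]
groups = correlation_groups(X, min_abs_corr=0.9)
scores = grouped_permutation_importance(model, X, y, groups, rng)
print(groups, scores)
```

Permuting a whole group keeps the correlated cells consistent with each other, so the model is not evaluated on within-group combinations that never occur in the data.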
Publications
(2022) Importance of 3D convolution and physics on a deep learning coastal fog model Kamangir, H., Krell, E., Collins, W., King, S. A., & Tissot, P. Environmental Modelling & Software, 154, 105424.
Presentations
American Meteorological Society (AMS) 2023 Annual Meeting The Influence of Grouping Spatio-Temporal Features on Explainable Artificial Intelligence (XAI): A Case Study with FogNet, a 3D CNN for Coastal Fog Prediction. Krell, E. A., Kamangir, H., Collins, W. G., King, S. A., & Tissot, P. E. Slides
AI2ES Site-Wide Meeting 2023/03/01 Slides
TAI4ES Summer School 2021 2021/07/29 Slides Course information
Poster for TAMUCC's 2022 Spring Student Research Symposium The influence of grouping features on explainable artificial intelligence for a complex fog prediction deep learning model Krell, E., Kamangir, H., Friesand, J., Judge, J., Collins, W., King, S. A., & Tissot, P.
SCOTT: The Influence of Feature Grouping Schemes on Explainable AI for Geoscience AI Models (slides) A presentation made for Conrad Blucher Institute's Short Curious Open Tech Talk (SCOTT) Series.
DoD Workshop: The Influence of Feature Aggregation for Explainable AI for High-Dimensional Geoscience Applications Presented at the DoD Post Cloud Post-Processing and Verification Workshop at NCAR Foothills campus in Boulder, CO 2023/09/13 Slides
Poster for AMS 2024 in Baltimore, MD Using Grouped Features to Improve Explainable AI Results for Atmospheric AI Models that use Gridded Spatial Data and Complex Machine Learning Techniques Krell, E., Kamangir, H., Collins, W., King, S. A., Tissot, P., Mamalakis, A. & Ebert-Uphoff, I. Poster
Energy Efficient Path Planning for Autonomous Surface Vessels
Typical robot path planning optimizes for the shortest-distance path. For a robot boat, however, the influence of the environment (water currents, waves, etc.) may be of much greater concern. In this research, we explore multiple approaches to path planning for an autonomous surface vessel that take ocean current forecasts into consideration. These approaches include metaheuristic algorithms and game-theoretic techniques.
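To show why currents can change the preferred route, here is a minimal sketch using Dijkstra's algorithm on a small grid. It is not the metaheuristic or game-theoretic planner from this research. The energy model in `step_cost`, the `gain` parameter, and the toy current field are all assumptions made for illustration: a current aligned with a move lowers its cost, and an opposing current raises it.

```python
import heapq
import math

# 4-connected moves as (d_row, d_col).
MOVES = [(-1, 0), (1, 0), (0, -1), (0, 1)]

def step_cost(current, move, gain):
    """Hypothetical energy model: unit cost, scaled by the current along the move."""
    along = current[0] * move[0] + current[1] * move[1]
    return max(0.1, 1.0 - gain * along)  # floor keeps every step cost positive

def plan_path(currents, start, goal, gain=0.8):
    """Dijkstra over grid cells; gain=0.0 reduces to shortest-distance planning."""
    rows, cols = len(currents), len(currents[0])
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        for move in MOVES:
            nxt = (cell[0] + move[0], cell[1] + move[1])
            if not (0 <= nxt[0] < rows and 0 <= nxt[1] < cols):
                continue
            new_cost = cost + step_cost(currents[cell[0]][cell[1]], move, gain)
            if new_cost < best.get(nxt, math.inf):
                best[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost, nxt))
    if goal not in best:
        return None, math.inf
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return path[::-1], best[goal]

# Toy current field (an assumption): each cell holds a (row, col) current vector.
currents = [
    [(0.0, 1.0)] * 5,   # row 0: current flows toward increasing column
    [(0.0, -1.0)] * 5,  # row 1: current opposes travel toward the goal
    [(0.0, 0.0)] * 5,   # row 2: still water
]
start, goal = (1, 0), (1, 4)
shortest_path, shortest_cost = plan_path(currents, start, goal, gain=0.0)
efficient_path, efficient_cost = plan_path(currents, start, goal, gain=0.8)
print(shortest_path, shortest_cost)
print(efficient_path, efficient_cost)
```

In this toy field the shortest-distance route runs straight along row 1 against the current, while the energy-aware route detours through row 0 to ride the favorable current, even though it visits more cells.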
Publications
(2022) Autonomous surface vehicle energy-efficient and reward-based path planning using particle swarm optimization and visibility graphs Krell, E., King, S. A., & Carrillo, L. R. G. Applied Ocean Research 122 (2022): 103125.
(2020) Game Theoretic Potential Field for Autonomous Water Surface Vehicle Navigation Using Weather Forecasts Krell, E., Carrillo, L. R. G., King, S. A., & Hespanha, J. P. 2020 American Control Conference (ACC).
(2020) Autonomous Water Surface Vehicle Metaheuristic Mission Planning using Self-generated Goals and Environmental Forecasts Krell, E., King, S. A., & Carrillo, L. R. G. 2020 American Control Conference (ACC).
Repositories
conch Path planning software that uses A*, Dijkstra, and several metaheuristic algorithms (including particle swarm optimization and a genetic algorithm). There is an option to supply the planner with a raster containing water current forecasts that can be used for energy-efficient planning in the marine environment.
fujin Another path planning software that incorporates water currents for energy-efficient planning, but based on game theory. The optimization problem is modeled by treating the robot and environment as players in a zero-sum game where the robot wants to minimize energy consumption and the environment wants to maximize it. This assumption leads to conservative motion planning. However, the code is slow for higher-resolution environments since it does not use any kind of sampling (e.g. Markov decision making) to solve the Bellman equation.
whelk This repo has examples of extracting information from NetCDF water models (e.g. NECOFS, NGOFS) and storing them as a set of rasters. The water current rasters can be used as input to the conch and fujin path planners.
nir2watermap A very simple piece of code for converting NIR imagery into an occupancy grid for marine robot planning. Every grid cell is converted to either water or non-water. Because it relies on aerial imagery, bridges are detected as non-water, which can create a fictional obstacle for the robot boat since it could of course travel below the bridge.
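The water versus non-water conversion described for nir2watermap can be sketched as a simple threshold. This is an illustration only and may differ from what the repository actually does: the `water_max` threshold and the toy image are assumptions. It relies on the fact that water absorbs near-infrared light strongly and so appears dark in NIR imagery.

```python
def nir_to_occupancy(nir, water_max=50):
    """Return a grid of 0 (water, free) and 1 (non-water, obstacle).

    nir is a 2D list of NIR intensities; water_max is a hypothetical threshold.
    """
    return [[0 if value <= water_max else 1 for value in row] for row in nir]

# Toy NIR image (an assumption): dark water with a bright bridge across column 2.
nir_image = [
    [10, 12, 200, 11],
    [9, 14, 210, 13],
    [11, 10, 205, 12],
]
occupancy = nir_to_occupancy(nir_image)
for row in occupancy:
    print(row)
```

The bright column is marked as an obstacle in every row, which reproduces the bridge limitation noted above: the grid shows no passage even though a boat could travel underneath.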
Presentations
2020 American Control Conference (ACC) Game Theoretic Potential Field for Autonomous Water Surface Vehicle Navigation Using Weather Forecasts. Krell, E., Carrillo, L. R. G., King, S. A., & Hespanha, J. P. Slides
Machine Learning Pipeline for Detection and Prediction of PyroCbs
Under favorable atmospheric conditions, intense heat from a large and hot wildfire can generate deep, smoke-infused storms resembling conventional thunderstorms that are known as pyrocumulonimbus (pyroCb). PyroCbs are capable of releasing a large quantity of smoke particles into the lower stratosphere, often above the tropopause as well as above jet aircraft cruising altitudes by several kilometers. PyroCbs can be accompanied by strong and erratic inflow, potentially dangerous downbursts, and lightning strikes. These extreme events can increase fire spread rates and intensity, cause sudden changes in fire spread directions, and ignite additional fires. These storms are especially dangerous for firefighters and others involved in disaster response. In this research, we have developed a machine learning system for understanding and detecting the atmospheric potential of wildfires to produce pyroCbs, i.e. wildfire-driven thunderstorms. This is challenging because pyroCbs are extreme events: there were ~500 incidents during the 2013 – 2021 period. Machine learning typically struggles to learn from relatively few examples and highly imbalanced datasets. Our pipeline involves fusing input data sources to assemble a training dataset, applying feature selection and data balancing techniques, and training with several machine learning algorithms. Finally, we perform initial eXplainable AI (XAI) experiments to analyze learned model strategies.
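As one example of what a data balancing step can look like, the sketch below applies random oversampling to a toy dataset. The specific balancing techniques used in our pipeline are not described here, so this is a generic illustration: the function name, the class counts, and the feature values are all assumptions.

```python
import random

def oversample_minority(rows, labels, rng):
    """Randomly duplicate samples until every class matches the largest class."""
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    target = max(len(members) for members in by_class.values())

    out_rows, out_labels = [], []
    for label, members in by_class.items():
        # Draw extra samples with replacement from the under-represented class.
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for row in members + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels

# Toy dataset (an assumption): 95 non-events and 5 rare events.
rng = random.Random(0)
rows = [[rng.random(), rng.random()] for _ in range(100)]
labels = [0] * 95 + [1] * 5
balanced_rows, balanced_labels = oversample_minority(rows, labels, rng)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Oversampling of this kind should be applied to the training split only, so that duplicated rare events do not leak into validation or test data.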
Presentations
American Meteorological Society (AMS) 2023 Annual Meeting Development of a Machine Learning System for Detecting the Atmospheric Potential of Wildfire-driven Thunderstorms. Krell, E., Nguyen, C., Nachamkin, J., Peterson, D., Hyer, E., King, S. A., Tissot, P., Estrada, B., Tory, K. J., & Campbell, J.
Poster for TAMUCC's 2023 Spring Student Research Symposium Development of a machine learning system for detection of the atmospheric potential of wildfire-driven thunderstorms Krell, E., Nguyen, C., Nachamkin, J., Peterson, D., Hyer, E., King, S. A., Tissot, P., Estrada, B., Tory, K. J., & Campbell, J. Poster