Research

Clustering Strategies For XAI with Correlated, High-Dimensional Geoscience AI Models

With the rapid deployment of high-resolution sensors and models, geospatial data is captured and generated at an extremely high rate. Because it can automatically extract information from large data volumes, machine learning is increasingly used to turn massive geospatial data into geoscience insights. High-dimensional raster data is widely used to train complex machine learning algorithms that learn to recognize spatial and spatio-temporal patterns. These models may be used for critical decision-making or to aid scientific discovery. Complex models can learn highly non-linear relationships and achieve high performance, but their complexity makes it very difficult for users to determine how a model reached its decision. This concern has motivated the widespread adoption of explainable artificial intelligence (XAI) techniques that probe the model in various ways to explain how it works.

XAI methods are highly sensitive to correlations among input predictors. Proposed mitigations involve grouping correlated predictors and applying XAI to the groups instead of to individual predictors. However, grouping the grid cells of geoscience rasters based on their correlation poses major challenges: these datasets are commonly high-dimensional with substantial autocorrelation, so conventional techniques for grouping correlated tabular features are rarely applicable.

The purpose of this research is to develop strategies for using data-driven clustering techniques to group raster data to improve the accuracy of XAI results. First, we describe the limitations of current approaches and identify XAI challenges related to raster-based geoscience models. We then develop a set of benchmarks so that we can quantitatively assess XAI methods in a variety of complex scenarios. Finally, we propose and evaluate methodologies for applying clustering and XAI techniques. These include a hierarchical clustering approach to automatically investigate multiple scales of patterns in high-dimensional data.
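The idea of grouping correlated grid cells and then explaining groups rather than individual cells can be sketched in a few lines. The example below is only a toy illustration under stated assumptions, not the method developed in this research: `pearson`, `correlation_groups`, `grouped_permutation_importance`, `toy_model`, and the 0.9 threshold are all hypothetical names and values, the grouping is a single-linkage cut at one correlation threshold, and the "raster" is four synthetic cells with non-constant values.

```python
import random

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def correlation_groups(columns, threshold):
    """Single-linkage grouping: two cells share a group when a chain of
    pairs with |r| >= threshold connects them (union-find)."""
    parent = list(range(len(columns)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            if abs(pearson(columns[i], columns[j])) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(columns)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

def grouped_permutation_importance(model, rows, targets, groups, rng):
    """Increase in mean squared error when all cells of a group are
    shuffled together across samples (one shuffle per group)."""
    def mse(data):
        return sum((model(r) - t) ** 2 for r, t in zip(data, targets)) / len(targets)

    baseline = mse(rows)
    scores = []
    for group in groups:
        order = list(range(len(rows)))
        rng.shuffle(order)
        permuted = [list(r) for r in rows]
        for dst, src in enumerate(order):
            for cell in group:
                permuted[dst][cell] = rows[src][cell]
        scores.append(mse(permuted) - baseline)
    return scores

def toy_model(row):
    # Hypothetical stand-in for a trained model: uses cells 0 and 1 only.
    return row[0] + row[1]

# Synthetic data: cells 0-1 are near-duplicates, as are cells 2-3.
rng = random.Random(0)
a = [rng.gauss(0, 1) for _ in range(200)]
b = [rng.gauss(0, 1) for _ in range(200)]
columns = [a, [v + 0.1 * rng.gauss(0, 1) for v in a],
           b, [v + 0.1 * rng.gauss(0, 1) for v in b]]
rows = [list(r) for r in zip(*columns)]
targets = [toy_model(r) for r in rows]

groups = correlation_groups(columns, threshold=0.9)
scores = grouped_permutation_importance(toy_model, rows, targets, groups, rng)
print(groups, scores)
```

Lowering the threshold merges groups into coarser ones, which is one simple way to look at several scales. Real rasters have far more cells and strong spatial autocorrelation, so the pairwise loop here would not scale as written.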

Publications

(2022) Importance of 3D convolution and physics on a deep learning coastal fog model
Kamangir, H., Krell, E., Collins, W., King, S. A., & Tissot, P.
Environmental Modelling & Software, 154, 105424.

Presentations

American Meteorological Society (AMS) 2023 Annual Meeting
The Influence of Grouping Spatio-Temporal Features on Explainable Artificial Intelligence (XAI): A Case Study with FogNet, a 3D CNN for Coastal Fog Prediction.
Krell, E. A., Kamangir, H., Collins, W. G., King, S. A., & Tissot, P. E.
Slides

Poster for TAMUCC's 2022 Spring Student Research Symposium
The influence of grouping features on explainable artificial intelligence for a complex fog prediction deep learning model
Krell, E., Kamangir, H., Friesand, J., Judge, J., Collins, W., King, S. A., & Tissot, P.

SCOTT: The Influence of Feature Grouping Schemes on Explainable AI for Geoscience AI Models (slides)
A presentation made for Conrad Blucher Institute's Short Curious Open Tech Talk (SCOTT) Series.

DoD Workshop: The Influence of Feature Aggregation for Explainable AI for High-Dimensional Geoscience Applications
Presented at the DoD Cloud Post-Processing and Verification Workshop at the NCAR Foothills campus in Boulder, CO
2023/09/13
Slides

Poster for AMS 2024 in Baltimore, MD
Using Grouped Features to Improve Explainable AI Results for Atmospheric AI Models that use Gridded Spatial Data and Complex Machine Learning Techniques
Krell, E., Kamangir, H., Collins, W., King, S. A., Tissot, P., Mamalakis, A., & Ebert-Uphoff, I.
Poster

Energy Efficient Path Planning for Autonomous Surface Vessels

Typical robot path planning optimizes for the shortest path between start and goal. For a robot boat, however, the influence of the environment (water currents, waves, etc.) may be of much greater concern. In this research, we explore multiple approaches to path planning for an autonomous surface vessel that take ocean current forecasts into consideration. These approaches include metaheuristic algorithms and game-theoretic techniques.
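To make the difference between shortest-distance and current-aware planning concrete, here is a minimal sketch using Dijkstra's algorithm on a small grid. It is not taken from the planners in this project: `move_cost` and `plan` are hypothetical names, the 4-connected grid and unit ground speed are simplifications, and the energy proxy (squared speed through the water needed to hold a fixed speed over ground for one cell) is an assumption chosen only for illustration.

```python
import heapq

# 4-connected moves as (d_row, d_col).
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]

def move_cost(current, move, speed=1.0):
    """Hypothetical energy proxy for one move: squared speed relative to
    the water, given the current vector (row, col) at the source cell."""
    return ((speed * move[0] - current[0]) ** 2
            + (speed * move[1] - current[1]) ** 2)

def plan(currents, start, goal):
    """Dijkstra over grid cells; returns (total cost, path).
    Assumes the goal is reachable from the start."""
    n_rows, n_cols = len(currents), len(currents[0])
    best = {start: 0.0}
    parent = {}
    frontier = [(0.0, start)]
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        for move in MOVES:
            nxt = (cell[0] + move[0], cell[1] + move[1])
            if not (0 <= nxt[0] < n_rows and 0 <= nxt[1] < n_cols):
                continue
            new_cost = cost + move_cost(currents[cell[0]][cell[1]], move)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost, nxt))
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    return best[goal], path[::-1]

# Two-row channel: row 0 is calm, row 1 has a current flowing west,
# directly against an eastbound boat.
calm = [[(0.0, 0.0)] * 5 for _ in range(2)]
opposed = [[(0.0, 0.0)] * 5, [(0.0, -0.8)] * 5]

calm_cost, calm_path = plan(calm, (1, 0), (1, 4))
energy_cost, energy_path = plan(opposed, (1, 0), (1, 4))
print(calm_path)
print(energy_path)
```

With zero current every move costs the same, so the planner returns the straight path along row 1. With the opposing current, the cheaper plan detours through the calm row even though it covers more distance.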

Publications

(2020) Game Theoretic Potential Field for Autonomous Water Surface Vehicle Navigation Using Weather Forecasts
Krell, E., Carrillo, L. R. G., King, S. A., & Hespanha, J. P.
2020 American Control Conference (ACC).

(2020) Autonomous Water Surface Vehicle Metaheuristic Mission Planning using Self-generated Goals and Environmental Forecasts
Krell, E., King, S. A., & Carrillo, L. R. G.
2020 American Control Conference (ACC).

Repositories

conch
Path planning software that implements A*, Dijkstra, and several metaheuristic algorithms (including particle swarm optimization and a genetic algorithm). The planner can optionally be supplied with a raster of water current forecasts for energy-efficient planning in the marine environment.

fujin
Another path planner that incorporates water currents for energy-efficient planning, but based on game theory. The optimization problem is modeled as a zero-sum game between the robot and the environment, where the robot wants to minimize energy consumption and the environment wants to maximize it. This assumption leads to conservative motion planning. However, the code is slow for higher-resolution environments since it does not use any kind of sampling (e.g. Markov decision making) to solve the Bellman equation.

whelk
This repo has examples of extracting information from NetCDF water models (e.g. NECOFS, NGOFS) and storing them as a set of rasters. The water current rasters can be used as input to the conch and fujin path planners.

nir2watermap
A very simple piece of code for converting NIR imagery into an occupancy grid for marine robot planning. Every grid cell is converted to either water or non-water. Because it relies on aerial imagery, bridges are detected as non-water, which can create a fictional obstacle for the robot boat since it could of course travel below the bridge.
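The water/non-water conversion can be pictured as a simple threshold, since water absorbs near-infrared light strongly and appears dark in NIR imagery. The sketch below is an assumption about the general approach, not the nir2watermap code itself: `nir_to_occupancy`, the 0.1 reflectance threshold, and the toy image are hypothetical.

```python
def nir_to_occupancy(nir, water_threshold=0.1):
    """Map NIR reflectance values to an occupancy grid:
    0 = water (free for the boat), 1 = non-water (obstacle)."""
    return [[0 if value < water_threshold else 1 for value in row]
            for row in nir]

# Toy 4x3 image with made-up reflectance values.
nir = [
    [0.45, 0.40, 0.42],  # land
    [0.03, 0.05, 0.04],  # water
    [0.30, 0.28, 0.31],  # bridge deck as seen from above
    [0.04, 0.03, 0.05],  # water
]
grid = nir_to_occupancy(nir)
for row in grid:
    print(row)
```

The bridge row comes out as an obstacle that splits the two water rows, which is the limitation described above.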

Presentations

2020 American Control Conference (ACC)
Game Theoretic Potential Field for Autonomous Water Surface Vehicle Navigation Using Weather Forecasts.
Krell, E., Carrillo, L. R. G., King, S. A., & Hespanha, J. P.
Slides

Machine Learning Pipeline for Detection and Prediction of PyroCbs

Under favorable atmospheric conditions, intense heat from a large, hot wildfire can generate deep, smoke-infused storms that resemble conventional thunderstorms and are known as pyrocumulonimbus (pyroCb). PyroCbs are capable of releasing a large quantity of smoke particles into the lower stratosphere, often above the tropopause as well as above jet aircraft cruising altitudes by several kilometers. PyroCbs can be accompanied by strong and erratic inflow, potentially dangerous downbursts, and lightning strikes. These extreme events can increase fire spread rates and intensity, cause sudden changes in fire spread direction, and ignite additional fires. These storms are especially dangerous for firefighters and others involved in disaster response. In this research, we have developed a machine learning system for understanding and detecting the atmospheric potential of wildfires to produce pyroCbs, that is, wildfire-driven thunderstorms. This is challenging because pyroCbs are extreme events: there were ~500 incidents during the 2013–2021 period. Machine learning typically struggles to learn from relatively few examples and highly imbalanced datasets. Our pipeline involves fusing input data sources to assemble a training dataset, applying feature selection and data balancing techniques, and training with several machine learning algorithms. Finally, we perform initial eXplainable AI (XAI) experiments to analyze learned model strategies.
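As one illustration of what a data balancing step can look like, the sketch below duplicates minority-class samples at random until every class is the same size. This is a generic technique shown under assumptions, not necessarily the balancing method used in our pipeline: `random_oversample` and the toy labels are hypothetical.

```python
import random

def random_oversample(samples, labels, rng):
    """Randomly duplicate samples of smaller classes until every class
    has as many samples as the largest class."""
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(members) for members in by_class.values())

    out_samples, out_labels = [], []
    for label, members in by_class.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        for sample in members + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels

# Toy dataset: 2 positive (event) samples and 8 negative samples.
samples = list(range(10))
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
balanced_samples, balanced_labels = random_oversample(
    samples, labels, random.Random(0))
print(balanced_labels.count(1), balanced_labels.count(0))
```

A step like this would normally be applied to the training split only, so that duplicated events do not leak into validation or test data.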

Presentations

American Meteorological Society (AMS) 2023 Annual Meeting
Development of a Machine Learning System for Detecting the Atmospheric Potential of Wildfire-driven Thunderstorms.
Krell, E., Nguyen, C., Nachamkin, J., Peterson, D., Hyer, E., King, S. A., Tissot, P., Estrada, B., Tory, K. J., & Campbell, J.

Poster for TAMUCC's 2023 Spring Student Research Symposium
Development of a machine learning system for detection of the atmospheric potential of wildfire-driven thunderstorms
Krell, E., Nguyen, C., Nachamkin, J., Peterson, D., Hyer, E., King, S. A., Tissot, P., Estrada, B., Tory, K. J., & Campbell, J.
Poster