Chaopeng Shen mines data from in-situ probes and satellites that measure levels of topsoil moisture, feeds that data into machine-learning models he’s built in his Penn State lab, and “teaches” those models to predict rainfall—which in turn predicts future soil moisture levels.
“If we can identify what the soil moisture level is, we can overlay other kinds of information on top,” Shen says.
That’s exactly what happened in 2020, when an unprecedented locust infestation descended upon the Horn of Africa, threatening food security for millions of Africans. Penn State entomologist David Hughes, founder of PlantVillage, turned to Shen for help. “We provided the soil moisture information based on our model to David, and PlantVillage used it to help control the locusts,” says Shen, an associate professor of civil and environmental engineering. “Locusts like to lay their eggs in wet, sandy soil. We provided information that can help to identify the niches where they were likely to lay their eggs, and in those regions, PlantVillage was able to leverage that information.”
Soil moisture levels are critical to many natural processes, Shen says, in both the short and the long term. In forested areas, for example, soil moisture history can go a long way toward predicting the location, frequency, and scale of wildfires. Soil moisture information is key to irrigation scheduling and predicting crop yields, to forecasting floods and droughts, and to measuring soil erosion. Shen says that with large amounts of satellite data now readily available to researchers, it’s getting easier to enhance machine-learning models and cheaper to disseminate these models to a wider audience.
For his lab, the next step is to enhance water-related modeling by incorporating physical processes into machine-learning models. “Right now, the data we have is only monitoring top surface soil moisture,” Shen says, “so we need to incorporate actual physical principles of the water cycle into our machine-learning models and train them together. Our group is making a lot of innovations in connecting data-driven methods with physics to tell a more complete story of what’s going on in our system, not only about soil moisture but also how much water is stored in the groundwater aquifers and running in the rivers, and their quality.”