Published on January 29th, 2020 | by Michael Barnard
Finding Climate Change Needles In Climate Haystacks Using Machine Learning
Headlines were made recently when a study was published that found that it is now possible to identify the fingerprints of climate change from a single day’s weather results. Slipping under the radar was a study published a few weeks earlier which is discovering new patterns that align to climate fingerprints using machine learning. The differences between the two are instructive.
Let’s start with the attention-getting study. The paper’s title is startling and clear: Climate change now detectable from any single day of weather at global scale. The authors (Sebastian Sippel, Nicolai Meinshausen, Erich M. Fischer, Enikő Székely, and Reto Knutti) are affiliated with the Institute for Atmospheric and Climate Science, ETH Zurich in Switzerland, and related Swiss and ETH data and statistical research units. The study was published in Nature Climate Change, a leading journal in the space with a high impact factor of 19.81 (not as high as Nature Energy’s rather absurd impact factor of 54, but still one of the highest I’ve seen in a decade of assessing reliability of scientific studies on various subjects). The provenance of the study, in other words, is solid. It’s only a single study, and replication and validation are important, but the startling and original nature of the finding (a single day’s weather!) makes it noteworthy for an important journal.
The basic findings, as I said to Sippel in an email discussion, are mostly self-explanatory from the paper (it’s paywalled, but Sippel and co. provided me with a copy). They used historical and climate model data, did statistical regression analysis on them, and found that in recent years, the actual global weather from a single day was sufficient to spot the tell-tale signs of climate change in temperature and precipitation variances.
They looked at NCEP1 data of actual weather from around the world, carefully and transparently adjusted as new measurement technologies came online. They used CMIP5 climate prediction models, the UN IPCC standard models which have been validated six ways from Sunday using hindcasting and other validation techniques. They looked at a common historical period of actuals and hindcast data from 1951-1980 and a modern period of 2009-2018. Sippel pointed out that anomaly detection was based on CMIP5 hindcast models from 1870-1950. They assessed the statistical regression from mean in both periods to identify anomalies and plotted the results.
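To make the baseline idea concrete, here is a minimal Python sketch of measuring anomalies against a reference period. It is not the authors' method or data: the temperatures are synthetic, the 0.9 degree shift between periods is made up for illustration, and the helper names standardized_anomaly and share_beyond are my own.

```python
import random
import statistics

random.seed(0)

# Synthetic daily global-mean temperatures in deg C (NOT NCEP1 or CMIP5 data).
# The 0.9 degree offset between the two periods is an illustrative assumption.
baseline = [14.0 + random.gauss(0, 0.2) for _ in range(1000)]  # stands in for 1951-1980
modern = [14.9 + random.gauss(0, 0.2) for _ in range(1000)]    # stands in for 2009-2018

# Describe "normal" using the baseline period only.
base_mean = statistics.mean(baseline)
base_sd = statistics.stdev(baseline)


def standardized_anomaly(value):
    """Distance from the baseline mean, in baseline standard deviations."""
    return (value - base_mean) / base_sd


def share_beyond(values, threshold=2.0):
    """Fraction of days whose anomaly exceeds the threshold."""
    hits = sum(1 for v in values if standardized_anomaly(v) > threshold)
    return hits / len(values)


print("baseline days beyond 2 sd:", share_beyond(baseline))
print("modern days beyond 2 sd:  ", share_beyond(modern))
```

With a shift that large relative to day-to-day noise, nearly every modern day lands outside what the baseline period would call normal, which is the intuition behind detecting a signal in a single day.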
And the results are startling. It’s clear from both actuals and the climate models that there are many more temperature anomalies in recent years than in a 30-year climate period in the mid-20th Century. And it’s clear that the climate prediction model is almost exactly aligned with the actuals, once again providing validation of the model. Similar results were found for precipitation anomalies. As Katharine Hayhoe says, it’s Global Weirding.
As Sippel pointed out to me and sharp-eyed observers will note from the graph, it’s a very strong alignment, but not perfect. It’s well within bounds of uncertainty, but if this study gets legs, expect those opposed to climate action to cherry-pick the minor variance and try to capitalize on it. I haven’t looked, but I expect more ‘sophisticated’ climate change denialists have already attacked NCEP1 specifically for its bog-standard, transparent, and replicable adjustments. Denialism is even more predictable than the weather.
Sippel was careful to point out a few things that media had not quite got right previously. This is a detection model, not an attribution model. They see clear evidence of a signal, and it’s clear from innumerable other studies that there is no plausible explanation other than anthropogenic climate change, but they did not explore that linkage themselves. Frankly, it should be unnecessary to make this point when scientific confidence in human causation of climate change is now at 99.9999% and attribution models can now identify the fingerprints in specific major events such as hurricanes within days, but we do live in a world where FUD has been spread widely on this subject. On a related note, this is a global fingerprint, not an extreme events attribution model. And, of course, the model doesn’t mean that they can guarantee that running a day from, to pick a date at random, June 21st, 2024, will see the same fingerprint, as there are forcing functions such as La Niña and her little brother, and volcanic cooling, which are neither within the scope of the model nor predictable.
Various media had identified this as a machine learning study, but that’s inaccurate as well. The underlying work was done using advanced statistical ridge regression using the R language, one which is also used in machine learning studies, but not in this one. The line is blurry, however. Machine learning is often viewed as a new tool in the statistical toolkit, one which brings new capabilities but often replicates, extends, or is actually inferior to existing tools depending on the task. And language is somewhat blurring as older techniques for statistical analysis can and will often use the same language as machine learning studies.
I asked Sippel about this, and he said:
“The statistical model is based on ridge regression, a relatively widely use regularization technique. We referred to it as “statistical learning technique” in the paper, but it can be definitely considered as a standard statistical technique (but the ridge regression penalty for regularization in statistical models with many predictors is very often also used in machine learning techniques).”
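For readers who haven't met ridge regression, here is a small Python sketch of the idea Sippel describes: ordinary least squares plus a penalty on the size of the coefficients. This is a generic textbook form in plain Python, not the paper's R code; the synthetic data, the penalty values, and the function names solve, ridge_fit, and norm are all assumptions of mine.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ridge_fit(X, y, lam):
    """Coefficients minimizing ||y - X b||^2 + lam * ||b||^2 (no intercept)."""
    p = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]
    for i in range(p):
        xtx[i][i] += lam  # the ridge penalty is added on the diagonal
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


def norm(v):
    return math.sqrt(sum(c * c for c in v))


# Synthetic example: five predictors, only the first two actually matter.
random.seed(1)
TRUE = [2.0, -1.0, 0.0, 0.0, 0.0]
X = [[random.gauss(0, 1) for _ in TRUE] for _ in range(30)]
y = [sum(b * x for b, x in zip(TRUE, row)) + random.gauss(0, 0.1) for row in X]

for lam in (0.01, 1.0, 100.0):
    print(lam, round(norm(ridge_fit(X, y, lam)), 3))
```

The larger the penalty lam, the more the coefficients are pulled toward zero. That shrinkage is what keeps a model with many predictors from chasing noise, and it is the piece that statistics and machine learning share.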
Sippel also pointed me to another related study which didn’t make headlines, Viewing Forced Climate Patterns Through an AI Lens. The authors of this study (Elizabeth A. Barnes, James W. Hurrell, Imme Ebert‐Uphoff, Chuck Anderson, and David Anderson) are from the Department of Atmospheric Science, Colorado State University, in Fort Collins, Colorado, and related academic groups, with the exception of David Anderson, who is with Pattern Exploration LLC, an AI consulting firm in the same city. It’s published in the AGU’s Geophysical Research Letters, a long-standing journal with a respectable, and more normal, impact factor of 4.339.
Barnes et al. did apply machine learning approaches specifically, in an early attempt to see what value they might bring. They created global maps of historical and simulated future temperature and precipitation. They trained the neural net on actuals from a few years. They then had it predict, from the maps alone, the year in which each pattern it was shown occurred. This is a classic machine learning approach, but applied to a novel set of data.
And they found that this level of information was sufficient for their neural net to identify with good accuracy the year from the 1960s onward just based on the temperature and precipitation spatial distribution maps. Among other things, they are able to identify fingerprints of climate change almost 60 years ago based on global data.
They too use CMIP5, unsurprisingly. They also use the CESM model set, and former skeptic Richard Muller’s BEST model. That latter model is interesting in that Muller’s hypothesis of manual, inconsistent adjustment errors being the source of observed global warming was not only wrong, but actually the inverse of reality, in that warming after automated, consistent adjustments were made was slightly higher than before. I like to think of Muller’s study as one of the most unsatisfying investments the Koch brothers ever made, even more so than buying millions of dollars of counterfeited wine bottles.
The researchers used a relatively shallow neural net, one with few layers, as it gave good predictive value and more layers did not significantly increase the predictive value. They trained it on the entire 1920–2099 period for 500 iterations on 80% of the model simulations and then tested on the remaining 20%, once again, a standard training data vs test data split.
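The setup is simple enough to sketch in plain Python. What follows is a toy under loud assumptions, not the researchers' code: the "maps" are four synthetic grid cells with made-up warming trends, the network has a single hidden layer of three tanh units, and I read "500 iterations" as 500 passes over the training data. Every name in it is hypothetical.

```python
import math
import random

random.seed(42)

YEARS = list(range(1920, 2100, 10))   # coarse stand-in for the 1920-2099 period
CELL_TRENDS = [1.0, 0.5, 1.5, 0.8]    # made-up warming trend per grid cell
SPAN = 180.0


def make_simulation():
    """One synthetic 'model run': a list of (map, scaled_year) pairs."""
    run = []
    for year in YEARS:
        t = (year - 1920) / SPAN      # scale years to roughly [0, 1]
        cells = [trend * t + random.gauss(0, 0.05) for trend in CELL_TRENDS]
        run.append((cells, t))
    return run


# Split by simulation, not by sample: 80% of runs train, 20% are held out.
simulations = [make_simulation() for _ in range(10)]
random.shuffle(simulations)
cut = len(simulations) * 8 // 10
train_sims, test_sims = simulations[:cut], simulations[cut:]
train = [pair for run in train_sims for pair in run]
test = [pair for run in test_sims for pair in run]

# A shallow network: one hidden layer of tanh units and a linear output.
HIDDEN = 3
net = {
    "w1": [[random.gauss(0, 0.5) for _ in CELL_TRENDS] for _ in range(HIDDEN)],
    "b1": [0.0] * HIDDEN,
    "w2": [random.gauss(0, 0.5) for _ in range(HIDDEN)],
    "b2": 0.0,
}


def forward(cells):
    """Return the hidden activations and the predicted (scaled) year."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(net["w1"][j], cells)) + net["b1"][j])
        for j in range(HIDDEN)
    ]
    out = sum(net["w2"][j] * hidden[j] for j in range(HIDDEN)) + net["b2"]
    return hidden, out


def train_epoch(samples, lr=0.05):
    """One pass of stochastic gradient descent on squared error."""
    random.shuffle(samples)
    for cells, target in samples:
        hidden, out = forward(cells)
        err = out - target            # gradient of 0.5 * err**2 w.r.t. out
        for j in range(HIDDEN):
            grad_pre = err * net["w2"][j] * (1.0 - hidden[j] ** 2)
            net["w2"][j] -= lr * err * hidden[j]
            net["b1"][j] -= lr * grad_pre
            for i, x in enumerate(cells):
                net["w1"][j][i] -= lr * grad_pre * x
        net["b2"] -= lr * err


for _ in range(500):
    train_epoch(train)


def mean_abs_error_years(samples):
    errors = [abs(forward(cells)[1] - target) * SPAN for cells, target in samples]
    return sum(errors) / len(errors)


print(f"held-out error: {mean_abs_error_years(test):.1f} years")
```

Holding out whole simulations, rather than random individual years, matters: it checks that the network has learned a pattern that carries over to runs it has never seen, instead of memorizing the quirks of particular ones.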
The researchers haven’t found as novel a result as identifying climate impacts from a single day’s global weather, but they have replicated other studies using a novel approach, machine learning. Further, they proved that the approach has merit. As they say in the abstract:
“The results shown here are strongly suggestive of the potential power of machine learning for climate research.”
This is in line with the late 2019 call to arms from the machine learning community related to application of the technology to climate research and action. Section 7 of their paper, authored by Kelly Kochanski, outlines the emerging advantages of machine learning approaches in climate prediction and extreme weather event forecasting. The Barnes et al. paper is directly in line with the expectations of the global machine learning community.
In this tale of two studies, a couple of things are worth drawing out. First, more traditional statistical analysis techniques are far from tapped out in their ability to provide value. Second, machine learning approaches are increasingly on a continuum of tools available to researchers. They can find patterns that are much more difficult to predict or analyze, and find predictive value in massive data sets when humans can’t discern it. But humans have neural nets too, goopy ones between our ears, and the practice of hypothesis-test-assess continues to yield value without any intelligence of the artificial sort.