College of Engineering News • Iowa State University

Quantifying Cascading Failure

Around 2 p.m. on August 14, 2003, a 345-kilovolt overhead transmission line near Walton Hills, Ohio, sagged too close to a tree and shorted out. By 4 p.m., more than 50 million people were affected by one of the largest blackouts in history.

In September 2011, an Arizona Public Service employee performing a routine procedure at the North Gila substation near Yuma tripped off a 500-kilovolt line and began a series of failures that left more than 2 million people without power in the Southwest United States.

Both trigger events were small, seemingly inconsequential incidents. Both resulted in massive power outages by setting off an effect called cascading failure, a topic of considerable study for Ian Dobson, Arend J. and Verna V. Sandbulte Professor in Engineering.

“What happens is, a failure occurs somewhere and weakens the system a bit,” Dobson says. “On a bad day, something else happens. Usually it doesn’t, but on that day, let’s say, it does. If it’s a really bad day, then a third thing happens and the system becomes degraded. You’re in a situation where it’s more likely that the next failure is going to happen because the last failure already happened. That’s the idea of cascading failure.”

The failure of the Walton Hills line, a relatively minor occurrence given the size and scale of the power grid, reverberated through the network and helped cause a series of events that brought down a sizable chunk of the nation’s power infrastructure. The initial point of failure in Ohio shifted the power burden to other points down the line and made a malfunction in these points much more likely – a classic case of cascading failure.

“What we’re talking about is the big power grid that stretches from here to Florida and Maine and Canada – everything east of the Rockies is all connected together, all humming together,” Dobson says. “Everything in the power system is protected so it doesn’t fry when something goes wrong. Things can disconnect to protect the equipment, but if you disconnect enough things, you get a blackout.”

Those disconnects are usually the very thing keeping the grid from destroying itself during a large-scale cascading event. Failures in the grid are rare and typically unanticipated because, as Dobson says, everything that can be anticipated has usually already been integrated into the grid.

“Something trips out the line and the power system wobbles a little bit,” Dobson says. “Under normal operation you’ve already designed for normal faults. With anything that commonly goes wrong with the system, engineers and everyone in the utility industry rushes around and makes sure that it doesn’t happen again. Most common, understandable, or easy to figure out things are already mitigated. Unusual stuff – rare interactions, unusual combinations of things when the system is already degraded – is a lot harder to control.”

Dobson’s research goes beyond what can be anticipated and attempts to estimate the overall likelihood of large-scale blackouts, like the events in 2003 and 2011, by studying the interactions between various points in the system with mathematical models and simulations. In effect, Dobson is using models to simulate the “perfect storm” in the power grid, though he disputes the terminology.

“People always say ‘It was the perfect storm,’” Dobson says. “But these large blackouts happen because of the cascading effect. You’re never going to get 20 different independent failures to happen at the same time because that’s vanishingly unlikely. But if the first couple events make the next events more likely, then those events happen and make the next ones more likely – then you get those rare events happening. This is the typical way that large complicated systems have catastrophic failures, and it is not really a perfect storm.”
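Dobson’s arithmetic can be sketched in a few lines of Python. This is only an illustration: the probabilities below (a 1 percent chance for any single failure, and a 50 percent chance that one failure triggers the next) are hypothetical numbers, not figures from his research, and the function names are invented for this example.

```python
# Hypothetical numbers chosen only to illustrate the argument.

def prob_independent(p_single, n_failures):
    """Chance that n independent failures all occur, each with probability p_single."""
    return p_single ** n_failures


def prob_cascade(p_first, p_next, n_failures):
    """Chance of an n-failure chain in which every failure after the first
    occurs with probability p_next, given that the previous failure occurred."""
    return p_first * p_next ** (n_failures - 1)


if __name__ == "__main__":
    independent = prob_independent(0.01, 20)   # 20 unrelated 1-in-100 events
    cascade = prob_cascade(0.01, 0.5, 20)      # each failure makes the next a coin flip
    print(f"20 independent failures: {independent:.3g}")
    print(f"20-failure cascade:      {cascade:.3g}")
```

Under these assumed numbers, the chain of dependent failures is still rare, but it is many orders of magnitude more likely than 20 independent failures coinciding.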

Cascading failure is difficult to analyze because of the huge number of unanticipated variables. In other words, researchers don’t know what they don’t know. In addition, the dependence of individual failures on previous failures and their effect on subsequent failures creates an incredibly complex system of dependent variables. Large blackouts involve the failure of many interconnected components, each of which affects how components further down the line behave.

“Imagine you’re very, very tightly scheduled on a certain day,” Dobson says. “Then, things start getting delayed in the morning and things get worse and worse throughout the day. Because your first appointment was delayed, it’s more likely that the next one will be delayed. Pretty soon you start missing appointments altogether in the afternoon. That’s a very small example of cascading failure.”

There are a few common attributes, like critical loading, that researchers can look for when studying cases of cascading failure. A power grid’s critical loading is a point somewhere between a very low load and a very high load at which the risk of a blackout begins to increase sharply. It acts as a reference point for cascading failure: stay below it and the system will likely be fine; go above it and the likelihood of a blackout spikes.

“If a transmission line carrying its usual load fails, other lines can pick up the slack without much trouble,” he says. “But if the power grid as a whole is carrying a load that is above its critical loading, its burden has a much greater effect on the other lines. That’s something we look for.”
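The idea of a critical loading can be illustrated with a toy simulation. The sketch below is not one of Dobson’s models; it is a deliberately simplified load-transfer model in which every parameter (100 identical lines, a capacity of 1.0, the size of the initial disturbance, and the load shifted to surviving lines after each failure) is an assumption made for illustration.

```python
import random

CAPACITY = 1.0  # assumed: every line fails once its load exceeds this value


def cascade_size(n_lines, l_min, rng, disturbance=0.01, transfer=0.004):
    """Run one toy cascade and return how many lines failed.

    Each line starts with a random load between l_min and CAPACITY.
    A small disturbance is added to every line; each line that fails
    then shifts a fixed extra load onto every surviving line.
    """
    loads = [rng.uniform(l_min, CAPACITY) + disturbance for _ in range(n_lines)]
    failed = [False] * n_lines
    n_failed = 0
    while True:
        # Lines that are still in service but now overloaded.
        newly = [i for i in range(n_lines) if not failed[i] and loads[i] > CAPACITY]
        if not newly:
            return n_failed
        for i in newly:
            failed[i] = True
        n_failed += len(newly)
        # Survivors pick up the slack from the lines that just failed.
        extra = transfer * len(newly)
        for i in range(n_lines):
            if not failed[i]:
                loads[i] += extra


def mean_blackout_fraction(l_min, trials=200, n_lines=100, seed=0):
    """Average fraction of lines lost over many simulated cascades."""
    rng = random.Random(seed)
    total = sum(cascade_size(n_lines, l_min, rng) for _ in range(trials))
    return total / (trials * n_lines)


if __name__ == "__main__":
    for l_min in (0.0, 0.2, 0.4, 0.6, 0.8):
        mean_load = (l_min + CAPACITY) / 2
        fraction = mean_blackout_fraction(l_min)
        print(f"mean load {mean_load:.1f}: average fraction of lines lost {fraction:.2f}")
```

In this toy setup, each failure triggers roughly n_lines * transfer / (1 - l_min) further failures on average, so cascades tend to die out at light loading and run away once that ratio passes 1, which here happens near a mean load of 0.8. Real grids are far more complicated, but the sharp change in behavior around a threshold is the feature the critical loading describes.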

Dobson uses a number of models and power system simulations of cascading failure to develop risk analysis methods for the power grid. Much as businesses use risk analysis procedures to identify and assess potential shortcomings within a project or account, Dobson uses his models to quantify the size and cost of a blackout given data on the power grid and its internal interactions. His findings can eventually be used to recommend upgrades in the power grid and determine the value and necessity of those upgrades.

“There’s a difference between recommending power grid upgrades and recommending prudent and cost-effective power grid upgrades,” Dobson says. “We have to figure out the best places to upgrade and focus resources there.”