Refining the theory of data combination

The more data points a time series includes, the more reliable it becomes. But when time series are combined, there is a risk that errors will increase. Yet combining series makes it possible to detect rapidly shifting trends. Joakim Westerlund wants to build on the theory to sort out the challenges as well as the potential.

Joakim Westerlund

Professor of Economics

Wallenberg Academy Fellow, prolongation grant 2018

Institution:
Lund University

Research field:
Econometrics, primarily tools for analyzing panel data

“The more you dig, the more you see that things aren’t as you thought,” he says.

Westerlund is a professor and Wallenberg Academy Fellow at Lund University. His research field is econometrics, which is the use of statistics and other methods to analyze economic relationships. Two fundamental concepts in this field are time series and panel data. A time series is a sequence of observations of one variable over time, e.g. house prices, mortality rates or sales. When studying time series, analysts often look for a trend: the direction in which the curve is heading, and any sign that the direction is about to change. A panel is a combination of time series, such as data covering the same period for multiple companies or countries.

The more observations included in a time series, the more accurate the analysis based on it. Numerous data points make trends easier to discern, and the error potentially contributed by each individual point becomes less important. But analysis based on panel data must be adapted to the time series with the fewest observations. An example of this is data from multiple countries, some of which collect much less data than others.
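The link between series length and accuracy can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not a method taken from Westerlund's research: it assumes a straight-line trend with independent, normally distributed noise, and the function names (ols_slope, slope_spread) and parameter values are hypothetical. It estimates the trend slope many times for a given series length and reports how much the estimates vary.

```python
import random
import statistics


def ols_slope(y):
    """Least-squares slope of y regressed on the time index 0..T-1."""
    T = len(y)
    t_mean = (T - 1) / 2
    y_mean = sum(y) / T
    num = sum((t - t_mean) * (v - y_mean) for t, v in enumerate(y))
    den = sum((t - t_mean) ** 2 for t in range(T))
    return num / den


def slope_spread(T, true_slope=0.1, noise_sd=1.0, n_sims=500, seed=0):
    """Standard deviation of the estimated slope across simulated series.

    Assumption (illustrative only): each series is a linear trend plus
    independent Gaussian noise with standard deviation noise_sd.
    """
    rng = random.Random(seed)
    slopes = []
    for _ in range(n_sims):
        y = [true_slope * t + rng.gauss(0, noise_sd) for t in range(T)]
        slopes.append(ols_slope(y))
    return statistics.stdev(slopes)


if __name__ == "__main__":
    # Longer series should give a visibly smaller spread in the estimates.
    for T in (10, 50, 200):
        print(T, slope_spread(T))
```

Under these assumptions the spread of the slope estimate falls steeply as the series gets longer, which is why a panel limited to its shortest series loses so much precision.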

“The scientific literature usually requires that there be a large number of time series observations in panel data as well, but in practice there are often very few. So it’s as if theory and practice are two entirely different worlds, disconnected from each other. I want to develop the theory and methods, and make them more suited to the data that are actually used,” Westerlund says.

Useless and valuable panel data

A low number of time series observations creates major problems in panel data. With customary methods, any errors in the data accumulate.

“Intuitively, we should get a better result by including many time series, but if there is a low number of observations in each series, you might say the opposite applies. The accuracy is so poor that the results are completely useless. In the end, it’s often the case that no conclusions can be drawn at all. This is reflected in scientific articles when researchers are obliged to reduce the number of time series to get their methods to work, which seems quite pointless.”

Westerlund is now developing the theory so that it better describes what happens when there are few time series observations, and how this can be taken into account. This would be particularly useful in relation to extreme and short-lived events, such as when an economic “bubble” forms and bursts. This often happens quickly, making it extremely difficult to predict using time series methods. Westerlund believes that this is a situation where panel data can be put to good use:

“In times of crisis many time series often behave identically. For instance, when the stock exchange starts to fall, everyone reacts in the same way: sell, sell, sell. It ought to be possible to incorporate in the theory in some way that if time series behave in the same way, accuracy can be achieved despite a low number of observations, by increasing the number of time series. Essentially, it ought then to be possible to see the shift in the curve based on a single observation if it appears the same in numerous time series. This could give an early warning, potentially making it possible to pre-empt a deep crisis.”
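The intuition in this quote, that many series moving together can reveal a shift from a single observation, can be sketched with a cross-sectional average. This is a simplified illustration and not Westerlund's theory: it assumes that every series experiences the same shift in one period and that the remaining noise is independent across series. The function names and numbers are hypothetical.

```python
import math
import random
import statistics


def simulate_period(n_series, common_shift, noise_sd=1.0, seed=0):
    """One period's change in each of n_series series.

    Assumption (illustrative only): each change is a shift shared by all
    series plus independent Gaussian noise specific to that series.
    """
    rng = random.Random(seed)
    return [common_shift + rng.gauss(0, noise_sd) for _ in range(n_series)]


def cross_section_z(latest_changes):
    """z-statistic of the cross-sectional mean for a single period.

    The standard error shrinks with the square root of the number of
    series, so a shared shift stands out more clearly as series are added.
    """
    n = len(latest_changes)
    mean = statistics.fmean(latest_changes)
    std_err = statistics.stdev(latest_changes) / math.sqrt(n)
    return mean / std_err


if __name__ == "__main__":
    # Same shift, same single period, increasing number of series.
    for n in (4, 40, 400):
        changes = simulate_period(n, common_shift=-0.5)
        print(n, cross_section_z(changes))
```

With only a handful of series, a shift half the size of the noise is hard to tell from chance, while with several hundred series the same shift gives a large z-statistic from one period of data. If the noise is itself correlated across series, this simple standard error is too optimistic; handling such cases realistically is part of what the theory has to address.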

Calculations must be based on reality

Much of Westerlund’s work entails sitting with a pen and paper and making calculations based on real world problems. He says half-jokingly that econometrics makes life hard for itself compared with traditional mathematical analysis. Not only does the theory have to be correct given certain assumptions; those assumptions must also be realistic. And the proposed new methods must work better than existing ones.

“There’s quite a burden of proof to be met in order to get anywhere…”

“I don’t know what I would do without this grant. It enables me to teach less and focus on research. Interruptions and distractions make it difficult to get anywhere – you need to be able to sit down and ponder in peace.”

But it is precisely this kind of thorough groundwork that Westerlund likes. He was fond of mathematics at elementary school and high school, and got into econometrics mainly by chance. He chose the same program as his friends, discovered it was fun, and arrived at Lund University in the early 2000s just when large quantities of data were starting to become available, and there was a nascent interest in analyses of this kind.

“I’ve been working on problems like this for so long now that I’m starting to get a feeling for what works and what doesn’t. Right from the outset I felt that given certain assumptions, this would be possible – but the question now is how to put together assumptions that are actually realistic. There’s still an awful lot of work to do.”

Text Lisa Kirsebom
Translation Maxwell Arding
Photo Juni Westerlund