Monday, August 12, 2013

I have to look at the spreadsheet.

Picking up where we left off yesterday...

UE links to an article by Lars Christensen. "To try to illustrate the connection between the markets and NGDP," Christensen writes, "I have constructed a very simple index to track market expectations of future NGDP." He describes the index, shows a graph, and links to a spreadsheet.

The spreadsheet is great. For me, if I want to understand what the calculation is, I have to look at the spreadsheet. (So you know where this post is going.)

Christensen's spreadsheet is a FRED download Excel file containing "percent change from year ago" values for GNP, for the S&P 500 Stock Price Index, and for the Real Trade Weighted U.S. Dollar Index.

Oh that's funny, he used Gross National Product instead of GDP -- but labeled it Nominal GDP. I wonder if that affects his graph. Shouldn't be much difference, but it might be worth a look. If I get to it.

Anyway, the first four columns of the spreadsheet look to be direct from FRED. Christensen has relabeled the columns, which makes sense: "USD Index" does sound a lot better than "TWEXMPA_PC1". And it's clear (at least, if you're familiar with the FRED Excel format, it is clear) which data is which.

I like it. Maybe I'll start relabeling that way. Thanks, Lars!

Down at the bottom of each of those first four columns (except the date column), Lars figures the average value for the dataset and, below that, the standard deviation, using the STDEVPA() worksheet function. That function "Calculates the standard deviation based on the entire population," according to the OpenOffice help.
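For anyone who would rather see those two bottom-of-the-column numbers outside a spreadsheet, here is a minimal Python sketch. The values are made up for illustration and are not from Christensen's file. It assumes that, on a column holding only numbers, STDEVPA gives the same result as the ordinary population standard deviation (divide by n), which is what `statistics.pstdev` computes. The sample version (divide by n - 1) is shown only for comparison.

```python
from statistics import mean, pstdev, stdev

# Hypothetical percent-change values, NOT taken from Christensen's spreadsheet.
values = [1.0, 2.0, 3.0, 4.0, 5.0]

average = mean(values)

# Population standard deviation: divides by n.
# Assumed here to match STDEVPA on an all-numeric column.
population_sd = pstdev(values)

# Sample standard deviation: divides by n - 1. Shown only for comparison.
sample_sd = stdev(values)

print("average:      ", average)
print("population sd:", population_sd)
print("sample sd:    ", sample_sd)
```

The population figure always comes out a little smaller than the sample figure, and the gap shrinks as the column gets longer.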

I found myself checking the range of values provided to the AVERAGE() and STDEVPA() functions, doing the Thomas Herndon thing, making sure there were no obvious dumb errors on the sheet. Christensen's spreadsheet looks okay to me.

In the next three columns Christensen calculates "standardized" values. (This is what interests me. How these calcs are done, what effect they have, and why I might want to do the same thing sometimes.)

Lars figured each standardized value as the original value less the dataset average value, with the difference divided by the dataset standard deviation value.

If I have the dataset [1,2,3] the average value is 2. If I subtract 2 from each value in the dataset I get [-1,0,1] and I have centered the dataset on the zero level.

(I'm wondering why Christensen's graph doesn't show values that are centered on the zero level. Maybe that will become clear later on.)

If I then divide each number in [-1,0,1] by some value, I rescale the numbers in the dataset (smaller if the divisor is bigger than 1, larger if it is less than 1), but they stay in proportion. (They also stay centered on the zero level, unless I'm having a brain fart.)
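The two steps, subtract the average and then divide by the standard deviation, can be sketched in a few lines of Python using the [1,2,3] example. This is only a sketch of the calculation as described here, not Christensen's actual worksheet formulas. The function name `standardize` is my own, and I am assuming the population standard deviation, as with STDEVPA.

```python
from statistics import mean, pstdev

def standardize(values):
    """Subtract the average from each value, then divide by the
    population standard deviation (hypothetical helper for illustration)."""
    avg = mean(values)
    sd = pstdev(values)
    return [(v - avg) / sd for v in values]

data = [1, 2, 3]

# Step 1 alone: centering on the zero level.
centered = [v - mean(data) for v in data]

# Steps 1 and 2 together.
standardized = standardize(data)

print("centered:    ", centered)
print("standardized:", [round(z, 4) for z in standardized])
print("new average: ", mean(standardized))
print("new std dev: ", pstdev(standardized))
```

For [1,2,3] the standard deviation is about 0.82, so dividing by it pushes the end values out to roughly -1.22 and 1.22. The result is still centered on zero, and its own standard deviation comes out as 1.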

That's Lars Christensen's standardization calculation. Pretty simple. I can remember it. But what does it look like? ...I'm getting to that.

I assembled the data for download from FRED. All quarterly data.

Graph #1: The FRED Source Data
The red line is percent change from year ago, GNP. All of 'em are percent change from year ago. The red line almost entirely hides the blue line, which is GDP. Yeah, not much difference there.

The green line is the S&P 500. The orange line is the Dollar index. The GNP and GDP values start in 1948. The S&P 500 values start in 1958. The Dollar Index values start in 1974. Christensen's graph starts in 1990. (If I have the chance to victimize Christensen for anything, it will likely be for chopping off the early years.)

Actually I'm always on the lookout for changed behavior on graphs. For if finance was once small but is now large, one ought to be able to see evidence of that change and its consequences on lots of graphs. And as Christensen's graph in particular shows "a very simple index to track market expectations of future NGDP" -- and as expectations are perhaps a more significant force now than in the past -- we might be able to see the rise of expectations as a transition toward greater similarity in more recent times. This would be a neat thing to see.

You can see on Graph #1 that the green line has about twice the up-and-down spread that the orange line has; and that both have a great deal more spread than the red and hidden blue lines. I think dividing these series, each by its own standard deviation, will put the up-and-downs of the different series on a similar scale. I think that was Christensen's reason for dividing by the standard deviation number.
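A quick way to check that hunch before touching the real data: take two made-up series with very different spreads, divide each by its own standard deviation, and compare the spreads afterward. The numbers below are invented for illustration and are not FRED data. The helper name `divide_by_own_sd` is mine, and I am again assuming the population standard deviation.

```python
from statistics import pstdev

# Made-up series with very different spreads (NOT FRED data).
wide = [-20.0, 15.0, 30.0, -10.0, 25.0]
narrow = [-2.0, 1.0, 3.0, -1.0, 2.0]

def divide_by_own_sd(series):
    """Divide every value by the series' own population standard deviation."""
    sd = pstdev(series)
    return [v / sd for v in series]

wide_scaled = divide_by_own_sd(wide)
narrow_scaled = divide_by_own_sd(narrow)

print("spread before:", pstdev(wide), pstdev(narrow))
print("spread after: ", pstdev(wide_scaled), pstdev(narrow_scaled))
```

After the division both series have a standard deviation of 1 (give or take floating-point rounding), whatever their spreads were to begin with. The shapes of the ups and downs are unchanged; only the vertical scale is.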

So I have some things to look at, and some things to look for. Tomorrow, we look.
