Sunday, September 1, 2013

Pin the Tail on the Donkey

I came to Lars Christensen's "Markets are telling us where NGDP growth is heading" by way of "Examining the Case for NGDP Targeting" by Unlearning Economics, at Pieria.

Unlearning writes:

I copied this method of estimating NGDP expectations from Lars Christensen, a market monetarist. Christensen seems to take this graph as confirmation of his views, but in fact it shows the opposite of what he wants it to show.

I see neither confirmation nor its opposite in the graphs these guys show. I think it's funny that Christensen's Market Indicator has already been used as proof and disproof, and I'm still just checking the arithmetic. Funnier yet, I think the arithmetic is bad. I have had my doubts about it ever since I first noticed that odd fudge-factor subtraction in Lars Christensen's calculation.

But the clincher is the reverse-order thing...

Pick a number

10

Subtract 4

10 - 4 = 6

Divide by 3

6 / 3 = 2

Good. Now let's get the original number back. Last time we subtracted 4 and divided by 3. This time we'll add 4 and multiply by 3. Start with our final answer:

2

Add 4

2 + 4 = 6

Multiply by 3

6 * 3 = 18

And there is our original number, restored. Oh, wait a minute! We started with 10 and we ended up with 18. That's not right. What happened?

What happened is, we tried to backtrack and we reversed the steps, but we did not also reverse the sequence of those steps.

The first time we subtracted, and the second time we added. The first time we divided and the second time we multiplied. That's good.

But the first time, we subtracted first and divided second. When we backtrack we have to start with the last step and backtrack it first. The first time, we divided by 3 last. This time we have to multiply by 3 first. And then we can add 4.

2 * 3 = 6

6 + 4 = 10

And that gives us the number we started with.
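The pick-a-number exercise above can be checked in a few lines of Python. This is only a sketch of the arithmetic already shown; the function names are made up for illustration.

```python
def forward(x):
    """Subtract 4, then divide by 3."""
    return (x - 4) / 3


def wrong_reverse(y):
    """Undo each operation but keep the original sequence: add 4, then multiply by 3."""
    return (y + 4) * 3


def right_reverse(y):
    """Undo the last step first: multiply by 3, then add 4."""
    return y * 3 + 4


y = forward(10)
print(y)                 # 2.0
print(wrong_reverse(y))  # 18.0
print(right_reverse(y))  # 10.0
```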

Off topic, but this exercise shows how easy it is to screw up the math when you don't actually *DO* the math. That's what's wrong with those math-like models economists come up with all the time, that they never actually work out in a spreadsheet.

Here's a copy of Lars Christensen's spreadsheet with some notes I made in it.

Lars starts out by gathering his data.

Next, he "standardizes" each dataset by subtracting the dataset average and dividing the result by the standard deviation of the dataset. This is the stuff that fascinates me.
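For readers who want to see the "standardize" step outside a spreadsheet, here is a minimal Python sketch. The data values are invented, and I am assuming the population standard deviation (`statistics.pstdev`); the spreadsheet may well use the sample version instead, which would change the numbers slightly but not the idea.

```python
import statistics


def standardize(values):
    """Subtract the average, then divide by the standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population SD, not sample SD
    return [(v - mean) / sd for v in values]


# Made-up data, for illustration only.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
z = standardize(data)

# A standardized series is centered on zero with a standard deviation of one.
print(statistics.mean(z), statistics.pstdev(z))
```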

Then he takes one of his standardized datasets and subtracts it from another. He calls the result an "index". (But if you look at the spreadsheet, you may notice that the index column is NO LONGER STANDARDIZED. The column average is still zero, but the column standard deviation is now about 1.46. It is not equal to 1 because of the subtraction, which in my opinion is done at the wrong time.)
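Why the index stops being standardized can be seen in a small sketch. The difference of two standardized series still averages zero, but its standard deviation works out to sqrt(2 - 2r), where r is the correlation between the two series, so it equals 1 only in the special case r = 0.5. The two series below are invented, and population statistics are assumed throughout.

```python
import math
import statistics


def standardize(values):
    """Subtract the average, then divide by the (population) standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


# Two made-up series, for illustration only.
a = [1.0, 3.0, 2.0, 5.0, 4.0, 6.0]
b = [2.0, 1.0, 4.0, 3.0, 6.0, 2.0]

za, zb = standardize(a), standardize(b)
index = [x - y for x, y in zip(za, zb)]

# Correlation of the two series, computed from the standardized values.
r = statistics.mean([x * y for x, y in zip(za, zb)])

print(statistics.mean(index))    # still (about) zero
print(statistics.pstdev(index))  # not 1
print(math.sqrt(2 - 2 * r))      # matches the line above
```

By the same formula, a standard deviation of about 1.46 would go with a correlation slightly below zero, assuming the spreadsheet uses one standard deviation convention consistently.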

After that comes his Market Indicator calculation. He takes the index and adds to it the average of the NGDP dataset, to center the index on the NGDP dataset.

Oddly, then, from that sum he subtracts the arbitrary value 1.5. This subtraction will move the Market Indicator off-center and make it a bit low relative to the NGDP data.

The resulting value he then multiplies by the standard deviation of the NGDP data, and divides by the standard deviation of his index data. It gets a little gummy and confusing here in this final bit of the Market Indicator calculation. Lars is dividing the standard deviation (of the index) out of the index and out of the NGDP average and out of the arbitrary -1.5. That seems wrong to me: dividing the SD of the index out of anything other than the index just doesn't seem right. (But hey, it's Lars's Market Indicator.)

Then he multiplies by the standard deviation of the NGDP data. (As I pointed out before, it is really NGNP data.) He multiplies it into the index value, into the NGDP average (now, this has to be wrong!), and into the constant that he is subtracting. Awfully gummy stuff.

This is Lars Christensen's Market Indicator we're looking at here. Lars can figure it any way he wants. I don't mean to say he has anything "wrong". But there are some things that don't make sense to me, things that look like mistakes to me. Things that Unlearning Economics apparently didn't pick up when he used Lars's market indicator.

The thing that seals the deal for me is the "order of operations" problem. The Market Indicator calculation must be in error. I'm sure of it.

He begins beautifully, subtracting the average value first, and dividing by the standard deviation second.

He ends terribly, adding the average value (and a fudge factor) first, and multiplying by the standard deviation second. He has reversed the operations, but he has not reversed the order of operations. This is one gummy mess.
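To make the order-of-operations complaint concrete, here is a hedged Python sketch. The first calculation follows the steps as described above: add the NGDP average, subtract 1.5, then multiply by the NGDP standard deviation and divide by the index standard deviation. The second is one reading of the "reversed in the right order" version: rescale the index first, then add the average. All the numbers are invented, and the second formula is an assumption for illustration, not a copy of any column in the spreadsheet.

```python
import statistics

# Hypothetical inputs, chosen only for illustration.
index = [-2.0, -1.0, 0.0, 1.0, 2.0]  # stand-in for the index column (mean zero)
ngdp_mean = 5.0                       # assumed average of the NGDP growth data
ngdp_sd = 2.0                         # assumed standard deviation of that data
fudge = 1.5                           # the constant subtracted in the description

index_sd = statistics.pstdev(index)

# As described: shift first, then rescale everything, shift included.
as_described = [(v + ngdp_mean - fudge) * ngdp_sd / index_sd for v in index]

# Opposite order: rescale the index first, then shift by the average.
reordered = [v / index_sd * ngdp_sd + ngdp_mean for v in index]

print(statistics.mean(as_described), statistics.pstdev(as_described))
print(statistics.mean(reordered), statistics.pstdev(reordered))
```

In this toy case both versions end up with the NGDP standard deviation, but only the reordered one is centered on the NGDP average. The other is centered on (average - 1.5) * ngdp_sd / index_sd, because the scaling was applied to the shift as well as to the index.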

Suppose I take a time series graph from FRED, one that has the numbers up pretty high, not down near zero. I'll go with Capacity Utilization:

 Graph #1: Total Capacity Utilization
Suppose we want to take the blue line and move it down so that what's 80 now becomes zero, and what's 85 becomes 5, and like that. It's easy to do: All we have to do is subtract 80 from the original numbers. The formula in the upper border of the graph below shows this subtraction:

 Graph #2: Total Capacity Utilization Moved Down from 80 to Zero
The graph still looks the same. But the numbers on the vertical scale are different. Everything is 80 less than it was before.

Now suppose we notice that the blue line is almost entirely contained between the values 10 and -10, and let's say we want to change that. If we divide everything by 10, the line will be almost entirely contained between the values 1 and -1:

 Graph #3: Total Capacity Utilization Moved Down 80, then Divided by 10
But the blue line still looks the same.

That's the important thing. We didn't change the pattern of Total Capacity Utilization. We just moved it, and scaled it down some. There are reasons for doing such things, as Lars Christensen has shown.

The trouble with what I did here is that it's entirely arbitrary. I picked the value 80, basically because on Graph #1 there was a line at that level. I thought it would be neat to make that the zero level. And then after that, conveniently, the numbers 10 and -10 showed up on the vertical axis, so I just divided by 10.

Lars does something a little more sophisticated. Instead of subtracting 80, he figures the average value of the data points, and subtracts that value from the data. And instead of dividing by 10 because 10 is a nice round number, he figures the standard deviation of the data points and divides by that. Now you might say it's still arbitrary to do what Lars does, but if it is, it's arbitrary with principles.

And I don't think it's arbitrary. The average value tells us where the activity is taking place, and the standard deviation tells us how wide-ranging the activity is, but the pattern itself is completely independent of both those things -- as you noticed in the graphs above.

Now consider what happens when we reverse the operations but fail to reverse the order of operations. Adding 80 and then multiplying by 10 pushes the Total Capacity Utilization number up to around 800. That's a lot of utilization:

 Graph #4: Reversal of the Graph #3 Calcs in the Wrong Order (blue); Reversal of the Graph #3 Calcs in the Correct Order (green); Original Data for Comparison (red)
The red line down near 100 on the graph is the original data. The green line, hanging low with the red one, shows the result of correctly reversing the order of operations. The calculation for the green line is in the third line of the upper border. The first and fourth steps raise and lower the value by 80; the second and third change it by a factor of ten.

Reversed correctly, the numbers return to their original values, so the green and red lines overlap. (I made the red line a little wider than the green line so you can see both of them.) Here's the link to the FRED page for this graph, size large.
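The same comparison can be run on a handful of numbers instead of a FRED graph. The values below are made-up stand-ins for Capacity Utilization readings, not actual data; the 80 and the 10 are the arbitrary shift and scale used for Graphs #2 and #3.

```python
# Made-up readings in the general neighborhood of the series, not FRED data.
original = [78.0, 80.0, 85.0, 72.0]

# Graph #3 calculation: move down by 80, then divide by 10.
scaled = [(x - 80) / 10 for x in original]

# Wrong order: add 80 first, then multiply by 10 (the blue line).
wrong = [(s + 80) * 10 for s in scaled]

# Correct order: multiply by 10 first, then add 80 (the green line).
right = [s * 10 + 80 for s in scaled]

print(wrong)  # values up around 800
print(right)  # back to the original readings
```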

After figuring out what was bothering me, I went through Christensen's spreadsheet carefully. I ended up inserting a few columns into it and showing the calcs that make sense to me right there alongside the existing calculations. I recreated Lars's graph with an extra line that displays the result of my version of Lars's calculation.

 Graph #5: Lars's Graph Recreated, with my Version Added
Blue is the NGDP (or really, NGNP) data. Red is Christensen's calculation. Orange is his calculation with my modifications. Orange and red run close for the most part except briefly around 2003 and -- interestingly -- in the late 1990s.

However, Christensen's calculation makes use of a fudge factor. As noted above, he subtracts the arbitrary value 1.5 before the multiplication. (And oh, yeah, he has the order of operations wrong.) Removing that arbitrary value from his calculation gave me this graph:

 Graph #6: Graph 5 Repeated, with Lars's Fudge Factor Eliminated
Now Lars's Market Indicator (the red line) is significantly higher than both my version and the NGNP data.