## Sunday, October 26, 2014

### Simplicity

I drool over this crap, you know? I really do. It's far beyond what I can understand, to be sure. So I skip along the surface. I try to remember my limits, and always try to allow for the fact that there is more going on than I have figured out.

But look: If you can't explain it simply, you probably don't understand it.

I followed a Reddit link to "Lies that economics is built on" at Lars P. Syll.

Syll praises economist Peter Dorman and quotes a good chunk. Here's just a bit:

You may feel a gnawing discomfort with the way economists use statistical techniques. Ostensibly they focus on the difference between people, countries or whatever the units of observation happen to be, but they nevertheless seem to treat the population of cases as interchangeable—as homogenous on some fundamental level. As if people were replicants.

You are right, and this brief talk is about why and how you’re right, and what this implies for the questions people bring to statistical analysis and the methods they use.

I think that's crap.

Maybe I misunderstand completely, as Dorman writes of "statistical techniques" and it's a good bet I don't know what that is. But I think Dorman is attacking the macro in macroeconomics. Treating the population as "interchangeable" or "homogenous"...

I have no qualms about taking some total number -- debt or savings or output or whatever I happen to be looking at -- and dividing it by some other total number. If this other number happens to be population, then you could say I'm looking at debt or savings or output (or whatever) for the "average" person.

It doesn't invalidate the numbers simply because you can throw the word "average" at me. Dividing by a number is a way to provide context. That's what "GDP per capita" is, for crying out loud: GDP in the context of population. Or debt. I'm big on debt. So maybe I will look at debt per capita. You could say I'm looking at the average debt per person, and accuse me of high stupidity, and reject everything I've ever done because of it. But if you did, it would be YOU that was talking in terms of "the average" person. Not me.
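A minimal sketch of that arithmetic, for anyone who wants to see it spelled out. The two countries, their figures, and the `per_capita` helper are all made up for illustration; nothing here is real data.

```python
def per_capita(total: float, population: int) -> float:
    """Divide an aggregate by a head count to put it in context."""
    if population <= 0:
        raise ValueError("population must be positive")
    return total / population

# Hypothetical figures, chosen only to make the division easy to follow.
debt_a, pop_a = 50_000_000_000.0, 2_000_000     # small country
debt_b, pop_b = 200_000_000_000.0, 20_000_000   # large country

per_capita_a = per_capita(debt_a, pop_a)
per_capita_b = per_capita(debt_b, pop_b)

print(f"Country A: total ${debt_a:,.0f}, per capita ${per_capita_a:,.0f}")
print(f"Country B: total ${debt_b:,.0f}, per capita ${per_capita_b:,.0f}")
```

In this toy case Country B carries four times the total debt of Country A but less than half as much per person. The totals alone would not show that; the division does.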

Syll, by the way, presents a little cartoon graphic of somebody walking drunkenly on a busy highway, with this caption:

The state of the drunk at his AVERAGE position is alive

and this follow-up:

But the AVERAGE state of the drunk is DEAD

Don't just laugh at it. Try to figure out what it means. I think it's crap.
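Here is one possible reading of the cartoon, as a rough sketch. It assumes the center line is the only safe spot and the traffic lanes are not. The positions and the `is_alive` rule are invented for illustration and are not taken from Syll's graphic.

```python
from statistics import mean

def is_alive(position_m: float, safe_half_width_m: float = 0.5) -> bool:
    """Hypothetical rule: the drunk is safe only within half a meter of the center line."""
    return abs(position_m) <= safe_half_width_m

# Made-up positions in meters from the center line (negative means left of it).
positions = [-3.0, 2.5, -2.0, 3.0, -1.5, 1.0]

# "The state of the drunk at his AVERAGE position"
average_position = mean(positions)
state_at_average_position = is_alive(average_position)

# "The AVERAGE state of the drunk": the share of moments he is actually safe
share_of_moments_alive = mean([1.0 if is_alive(p) else 0.0 for p in positions])

print(f"Average position: {average_position:.1f} m from the center line")
print(f"Alive at the average position? {state_at_average_position}")
print(f"Share of moments spent somewhere safe: {share_of_moments_alive:.0%}")
```

With these numbers the average position is the center line, where the rule says he is fine, yet he never actually stands there. Under this reading, the caption is contrasting a function of the average with the average of the function.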

I followed Syll's link to Peter Dorman's "Regression Analysis and the Tyranny of Average Effects". From Dorman's opening:

What follows is a summary of a mini-lecture I gave to my statistics students this morning.  (I apologize for the unwillingness of Blogger to give me subscripts.)

You may feel a gnawing discomfort with the way economists use statistical techniques...

You are right...

Our point of departure will be a simple multiple regression model of the form

y = β0 + β1 x1 + β2 x2 + .... + ε

where y is an outcome variable, x1 is an explanatory variable of interest, the other x’s are control variables, the β’s are coefficients on these variables (or a constant term, in the case of β0), and ε is a vector of residuals.  We could apply the same analysis to more complex functional forms, and we would see the same things, so let’s stay simple.

I'm all for simple. The subscripts work if you put `<sub>` before and `</sub>` after the text you want subscripted. As for Dorman's formula, he makes it simple until the end where he says

ε is a vector of residuals.

I don't know what that means. A set of error values maybe, one for each β? No matter, I'm not doing the calculation.
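For what it's worth, in the usual textbook setup the residuals come one per observation (one per row of data), not one per β. Below is a rough sketch with made-up numbers and a single x. It uses ordinary least squares, which is an assumption here, since Dorman's summary does not name an estimation method.

```python
from statistics import mean

# Made-up observations: one explanatory variable x and one outcome y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

x_bar, y_bar = mean(x), mean(y)

# Ordinary least squares for y = b0 + b1*x + e (closed form for a single x).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# One residual per observation: actual y minus the fitted value.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

print(f"coefficients: b0={b0:.3f}, b1={b1:.3f}")
print(f"{len(residuals)} residuals for {len(x)} observations")
```

There are two coefficients here but five residuals, one for each data point. That list of leftovers is what "a vector of residuals" refers to in this sketch.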

Here's a skip along the surface of the rest of Dorman's post:

What question does this model answer?  It tells us the average effect that variations in x1 have on the outcome y...

This model is applied to a sample of observations...

Now what is permitted to differ across these observations?  Simply the values of the x’s and therefore the values of y and ε.  That’s it...

Thus measures of the difference between individual people or other objects of study are purchased at the cost of immense assumptions of sameness...

So what other methods are there that make fewer assumptions about the homogeneity of our study samples?  The simplest is partitioning subsamples...

When should you evaluate subsamples?  Whenever you can...

A different approach is multilevel modeling.  Here you accept the assumption that y, the x’s and structural methods are the same for everyone, but you permit the β’s to be different for different groups...

Third, you could get really radical and put aside the regression format altogether.  Consider principal components analysis...

In the end, statistical analysis is about imposing a common structure on observations in order to understand differentiation...

Simple enough for ya?

It's not for me. I prefer the simplicity of Keynes:

The National Dividend, as defined by Marshall and Professor Pigou, measures the volume of current output or real income and not the value of output or money-income. Furthermore, it depends, in some sense, on net output... But it is a grave objection to this definition for such a purpose that the community’s output of goods and services is a non-homogeneous complex which cannot be measured, strictly speaking, except in certain special cases...

The difficulty is even greater when, in order to calculate net output, we try to measure the net addition to capital equipment; for we have to find some basis for a quantitative comparison between the new items of equipment produced during the period and the old items which have perished by wastage... The problem of comparing one real output with another and of then calculating net output by setting off new items of equipment against the wastage of old items presents conundrums which permit, one can confidently say, of no solution.

Thirdly, the well-known, but unavoidable, element of vagueness which admittedly attends the concept of the general price-level makes this term very unsatisfactory for the purposes of a causal analysis, which ought to be exact.

Nevertheless these difficulties are ... “purely theoretical” in the sense that they never perplex, or indeed enter in any way into, business decisions and have no relevance to the causal sequence of economic events, which are clear-cut and determinate in spite of the quantitative indeterminacy of these concepts. It is natural, therefore, to conclude that they not only lack precision but are unnecessary...

In dealing with the theory of employment I propose, therefore, to make use of only two fundamental units of quantity, namely, quantities of money-value and quantities of employment.

Now, that's more like it.

Jazzbumpa said...

It doesn't invalidate the numbers simply because you can throw the word "average" at me. Dividing by a number is a way to provide context. That's what "GDP per capita" is, for crying out loud

No, it doesn't invalidate the number. But it limits its usefulness, and ought to be a reminder that caution is advised. It tells you nothing about the sub-populations, nor the distribution across the population.

And these are rather important details that are worth remembering whenever you're dealing with aggregated numbers.

Then, when you take the quotient of two aggregated numbers, you really need to focus on the fact that what you have calculated is an abstraction.

It's possible that the average number is representative of only a tiny subset of the population, or maybe even nobody at all.

What are we to make of that?

Yes, I suppose Dorman is attacking macro - or at least a certain approach to it.

What he's saying is NOT crap.

What it boils down to is that elegant math can lead to solutions that have little or no relevance to the real world.

I believe this is totally consistent with Keynes, who IIRC, did not use models.

Cheers!
JzB

The Arthurian said...

JzB: Yes, I suppose Dorman is attacking macro - or at least a certain approach to it. What he's saying is NOT crap. What it boils down to is that elegant math can lead to solutions that have little or no relevance to the real world. I believe this is totally consistent with Keynes, who IIRC, did not use models.

Keynes didn't use models; Wikipedia agrees:

"Keynes himself had included few formulas and no explicit mathematical models in his General Theory. For commentators such as economist Hyman Minsky, Keynes's limited use of mathematics was partly the result of his skepticism about whether phenomena as inherently uncertain as economic activity could ever be adequately captured by mathematical models."

I am vibrantly aware of the "elegant math" because I cannot follow any of it. And yeah, in my post where I skipped across the surface of Dorman's post, it is pretty obvious that he is considering elegant/irrelevant math as the problem.

However, Dorman's specific criticism (one I could grasp) is that the samples (or "populations") used in those statistical techniques are "homogenous". But Keynes went out of his way to make things "homogenous". I think what Keynes did is great and I'm on this again in the morning. But more than once I've seen people claim that it's not right to treat all labor as labor. "Homogenous" just seems to be a particularly unfortunate word for Peter Dorman to have used.

RE: aggregated numbers

Minsky, quoting Kaldor:

"a first approximation of Keynes's theory could be interpreted as 'focusing attention on the relationships between a limited number of strategic aggregates'."

Minsky and Kaldor both seem to find aggregate numbers useful. And Keynes. I'm in pretty good company.

Jazzbumpa said...

I'm not saying aggregate numbers aren't useful. I'm saying 1) you need to understand exactly what you're doing and why, and 2) be aware that this leaves a lot of rather important ground uncovered.

Cheers!
JzB