2009/10/03

Statisticians and the real world

What a strange world we live in.

We have McIntyre and acolytes saying in one breath:
1. It is not valid to sort samples before you analyse them. Briffa should have used all samples from the area, and then come to a conclusion.
2. Then they say the 10/12 Briffa trees should not have been included, as we have these 34 Schweingruber(?) trees, and look: no 20thC warming.
3. Someone then says that the Briffa trees should have been included.
4. McIntyre adds them and finds a smaller hockey stick.
5. McIntyre analyses the Briffa trees and finds a golden hockey stick tree which provides most of the late 20thC warming.
6. McIntyre then says this result is 8 sigma outside the normal and should not be included.
How do you reconcile statement 1 with statement 6?
In my view, if you are not allowed to sort for correlation between ring width and temperature over the period where we have instrumental records, then you are not allowed to sort at all.

Consider this scenario

At a junk sale you purchase a number of instruments that record various environmental parameters over time. None are very accurate, and you have no idea which parameter each one is measuring. You want to record temperature, so you set them all up in the same location. Some years later you can afford a calibrated temperature recorder, which you also set up in the same location.
Some of these instruments will have recorded sunlight, precipitation, soil nutrient levels, fungal spore levels, ambient temperature, and temperature of the soil 1 metre down.
If you want to know what the temperature was when you set up the first instruments, do you
a. normalise all readings of all instruments, then average them.
b. average them all without normalising.
c. compare the outputs from all instruments with the calibrated temperature recorder and throw out all that show no correlation, then normalise the remaining results and average them.
d. as c., but additionally throw out units deviating by significant amounts from the average.
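
As a rough illustration only, the four options could be sketched in Python along these lines. Everything named here is my own assumption for the sake of the sketch: the helper names, the correlation cut-off `min_r`, the outlier limit `max_dev`, and the made-up data at the bottom. None of it comes from any real tree-ring analysis.

```python
import math
import random
from statistics import mean, median, pstdev


def normalise(series):
    """Rescale a series to zero mean and unit standard deviation."""
    m, s = mean(series), pstdev(series)
    return [(x - m) / s for x in series]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    return mean(a * b for a, b in zip(normalise(xs), normalise(ys)))


def average(instruments):
    """Point-by-point mean across a list of equal-length series (option b)."""
    return [mean(values) for values in zip(*instruments)]


def screen(instruments, reference, overlap_start, min_r=0.5):
    """Keep instruments that track the calibrated recorder over the overlap.

    min_r is an arbitrary, hypothetical cut-off for "shows correlation".
    """
    return [s for s in instruments
            if pearson(s[overlap_start:], reference) >= min_r]


def option_a(instruments):
    """a. Normalise every instrument, then average."""
    return average([normalise(s) for s in instruments])


def option_c(instruments, reference, overlap_start, min_r=0.5):
    """c. Screen against the recorder, normalise the survivors, average."""
    kept = screen(instruments, reference, overlap_start, min_r)
    return average([normalise(s) for s in kept])


def option_d(instruments, reference, overlap_start, min_r=0.5, max_dev=2.0):
    """d. As c., then drop survivors that stray far from the survivors' mean.

    "Far" is taken here to mean an RMS deviation above max_dev times the
    median RMS deviation; that rule is an assumption, not a standard.
    """
    kept = screen(instruments, reference, overlap_start, min_r)
    scaled = [normalise(s) for s in kept]
    centre = average(scaled)
    devs = [mean((x - c) ** 2 for x, c in zip(s, centre)) ** 0.5
            for s in scaled]
    limit = max_dev * median(devs)
    return average([s for s, d in zip(scaled, devs) if d <= limit])


if __name__ == "__main__":
    # Made-up data: three instruments that respond to temperature with
    # different gains, and three that record something unrelated.
    random.seed(0)
    n, overlap_start = 100, 70
    temperature = [10 + 0.03 * i + math.sin(i / 5) for i in range(n)]
    thermal = [[g * t + random.gauss(0, 0.2) for t in temperature]
               for g in (0.5, 1.0, 2.0)]
    other = [[random.gauss(0, 1) for _ in range(n)] for _ in range(3)]
    instruments = thermal + other
    reference = temperature[overlap_start:]  # the calibrated recorder

    kept = screen(instruments, reference, overlap_start)
    print(len(kept), "of", len(instruments), "instruments pass the screen")
    print("first reconstructed value, option a:", option_a(instruments)[0])
    print("first reconstructed value, option d:",
          option_d(instruments, reference, overlap_start)[0])
```

Note that the sketch only keeps instruments with a positive correlation, and that the averaged result is in normalised units, so it would still have to be scaled against the calibrated recorder to read as degrees.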

Which of a., b., c. or d. is going to give you the best historic temperatures?

Personally, as an engineer and not a statistician, I would go with d., or with c. if there are too few instruments to find the outliers.
I realise that this is going to bias the results towards giving the same result as the calibrated instrument, but may I suggest this is exactly what you want.

It seems a statistician would go for a., as this would not bias the result to valid temperatures. I just cannot understand this.
