Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



Re: st: Why many things have Normal distribution


From   Steve Samuels <[email protected]>
To   [email protected]
Subject   Re: st: Why many things have Normal distribution
Date   Mon, 2 Sep 2013 12:15:34 -0400


Sam:

I agree with you about the invalidity of making unwarranted assumptions
and then tailoring measurements to fit those assumptions.

I originally quoted only the first sentence in Jeffreys (1938):

"The normal or Gaussian law of error rests partly on a particular
hypothesis about the nature of error, that the error of any individual
observation is the resultant of a large number of comparable and
independent components; and partly on comparison with frequencies in
actual series of observations."

The second sentence is: "Both arguments are defective."

I know little about how "intelligence" can be defined or measured, but I
would be surprised if the assumptions needed to apply the normal law are
credible. See the first two pages of Jeffreys's essay (link below).


Reference:

> Harold Jeffreys (1938) The Law of Error and the Combination of
> Observations Philosophical Transactions of the Royal Society of London.
> Series A, Mathematical and Physical Sciences Vol. 237, No. 777 (Apr. 14,
> 1938), pp. 231-271
Available at: http://rsta.royalsocietypublishing.org/content/237/777/231.full.pdf

Steve


On Sep 1, 2013, at 1:10 AM, Lucas wrote:

Very nice points.  However, I don't think I misread the thread.

Also, I don't think I neglected the law of errors, either, for the
same law is one way to undergird the idea that parameters will, upon
repeated random sampling, form a normal distribution.  What I would
say, though, is the following:

1) Nature is as it is.  We can endeavor to understand it, decompose it,
make models of it, and all this is fine.

2) Major problems are produced, however, when people assume nature is
like X, so when we study OUR phenomenon of interest we have to mash
and stretch and otherwise contort it so it looks like X.

3) I know of no better example of this than psychometrics, which postulates that:

a) Things in nature are normally distributed (which is wrong for much
of the natural world),

b) Intelligence is part of the natural world, so,

c) Intelligence is normally distributed, so,

d) Results of our tests of intelligence should be normally distributed, so,

e) If we make a test we should throw out or add in questions to make it
as close to normally distributed as possible, and,

f) Should that fail, we should "standardize" the raw scores so they can
be reported as normal.

Countless people, organizations, and communities have been materially
harmed by this completely wrong-headed way of proceeding.

You may be right about the original note; I'm not gonna go track it
down, I'll accept your claim.  My point, though, is that the
distribution of parameters, and the distribution of raw phenomena, are
two different things, and only damage occurs when people (or entire
fields, such as psychometrics) forget that. Once we remember that we
can confidently state--far more parameters are normally distributed
than there are "things in nature" that are normally distributed, a
fact which should discipline us to *not* require or expect things in
nature to be normally distributed.

Sam

On Sat, Aug 31, 2013 at 4:37 PM, Steve Samuels <[email protected]> wrote:
> Sam,
> 
> You are misreading the original post. It did not say "most" things in
> nature were normally distributed. But, more important, you appear
> unaware of the historical importance of the "normal law of error" in
> explaining natural phenomena.
> 
> "The importance of the normal curve stems primarily from the fact that
> the distributions of many natural phenomena are at least approximately
> normally distributed." (David Lane, accessed Aug 31, 2013)
> 
> Jeffreys (1938, p 231): "The normal or Gaussian law of error rests
> partly on a particular hypothesis about the nature of error, that the
> error of any individual observation is the resultant of a large number
> of comparable and independent components; and partly on comparison
> with frequencies in actual series of observations."
> 
> Hagen's Hypotheses (Rao, 1973, p. 161) were:
> 
> 1. An error is the sum of a large number of infinitesimal errors, all of
>    equal magnitude, due to different causes.
> 2. The different components of error are independent.
> 3. Each component of error has an equal chance of being positive or
>    negative.
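
The three hypotheses above can be tried out numerically. The following is only a rough sketch in Python: the number of components (400), the component size (0.05), the number of simulated errors (2000) and the seed are all arbitrary choices made for illustration, not values taken from Rao or Jeffreys. Each simulated error is a sum of equal-sized, independent pieces with random sign, and the shares falling within one and two standard deviations are compared with what a normal law would give.

```python
import math
import random
import statistics


def hagen_error(n_components, eps, rng):
    """One error: a sum of n_components independent pieces, each of
    magnitude eps and equally likely to be positive or negative."""
    return sum(eps if rng.random() < 0.5 else -eps for _ in range(n_components))


def simulate_errors(n_errors=2000, n_components=400, eps=0.05, seed=1):
    # All four defaults are illustrative assumptions, not values from the thread.
    rng = random.Random(seed)
    return [hagen_error(n_components, eps, rng) for _ in range(n_errors)]


def share_within(values, k):
    """Fraction of values within k standard deviations of their mean."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(abs(v - m) <= k * s for v in values) / len(values)


errors = simulate_errors()
theory_sd = 0.05 * math.sqrt(400)  # eps * sqrt(n_components)
print("sample sd:", round(statistics.pstdev(errors), 3), "theory:", theory_sd)
print("share within 1 sd:", share_within(errors, 1))  # normal law: about 0.683
print("share within 2 sd:", share_within(errors, 2))  # normal law: about 0.954
```

The agreement is only approximate, because the simulated errors sit on a discrete grid and the sample is finite. It also says nothing about whether real measurement errors satisfy the hypotheses in the first place.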
> 
> The observations mentioned in the book by Hampel et al. did indeed
> appear to have normal distributions. The surprise is that, on closer
> study, they did not.
> According to the book, Bessel noticed the non-normality but then
> proceeded to ignore it!
> 
> Don't think that the "normal law of error", though inexact, is out of
> date. I especially like Youden's illustration in a photograph of 50
> plants arranged by size (Youden, 1998, p 54). He states that the
> equation for the curve "is one of the great discoveries of science. Its
> importance in bringing meaning to collections of observations can hardly be
> overestimated."
> 
> Steve
> 
> References:
> 
> Harold Jeffreys (1938) The Law of Error and the Combination of
> Observations Philosophical Transactions of the Royal Society of London.
> Series A, Mathematical and Physical Sciences Vol. 237, No. 777 (Apr. 14,
> 1938), pp. 231-271
> 
> David Lane, History of the Normal Distribution
> http://onlinestatbook.com/2/normal_distribution/history_normal.html
> 
> Rao, C. Radhakrishna. 1973. Linear Statistical Inference and Its
> Applications (Second Edition). New York: John Wiley & Sons.
> 
> Youden, W. J. 1998. Experimentation and Measurement. Dover Publications.
> Mineola, NY
> 
> Steve
> 
> 
>> On Aug 31, 2013, at 11:18 AM, Lucas wrote:
>> 
>> I don't understand this thread. Most things in nature are not normally
>> distributed. What is normally distributed is a parameter estimate from
>> repeated random sampling from a population.
>> 
>> In the U.S., for example, the number of hands per person is not
>> normally distributed. The possibilities are 0, 1, and 2. The mean is
>> probably something like 1.8. If we drew 1,000 samples of 1,000 people,
>> the means from those samples would be or approach a normal
>> distribution. The normal distribution of the mean from those samples
>> would not signify that the distribution of hands per person is normal.
>> 
>> The known distribution of the means justifies use of standard tools of
>> inference (e.g., confidence interval calculation). It neither
>> signifies nor requires the underlying distribution of the phenomenon
>> to be normal.
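
The hands-per-person argument can be checked with a small simulation. This is a minimal sketch in Python, under loud assumptions: the probabilities 0.02, 0.16 and 0.82 for 0, 1 and 2 hands are invented solely so that the population mean comes out at the 1.8 guessed above, and the seed is arbitrary. It draws 1,000 samples of 1,000 people and looks at how the sample means behave.

```python
import math
import random
import statistics

HANDS = [0, 1, 2]
PROBS = [0.02, 0.16, 0.82]  # assumed, chosen only to give a mean of 1.8

POP_MEAN = sum(h * p for h, p in zip(HANDS, PROBS))
POP_SD = math.sqrt(sum((h - POP_MEAN) ** 2 * p for h, p in zip(HANDS, PROBS)))


def sample_means(n_samples=1000, n_people=1000, seed=2013):
    """Mean number of hands in each of n_samples random samples."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(HANDS, weights=PROBS, k=n_people))
        for _ in range(n_samples)
    ]


means = sample_means()
se = POP_SD / math.sqrt(1000)  # standard error of the mean for n_people = 1000

# Share of sample means within 1.96 standard errors of the population mean;
# if the means are roughly normal this should be close to 0.95.
coverage = sum(abs(m - POP_MEAN) <= 1.96 * se for m in means) / len(means)

print("mean of sample means:", round(statistics.fmean(means), 4))
print("sd of sample means:  ", round(statistics.stdev(means), 4), "vs", round(se, 4))
print("share within 1.96 se:", coverage)
```

The raw variable takes only the values 0, 1 and 2 and is strongly left-skewed, yet the sample means cluster symmetrically around the population mean, which is the distinction drawn in the message.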
>> 
>> Sam
>> 
> On Sat, Aug 31, 2013 at 6:37 AM, Yuval Arbel <[email protected]> wrote:
>> Steve and David,
>> 
>> If I come to think about it - and as David previously mentioned -
>> income, for example, is not normally distributed, no matter how much
>> we increase the sample:
>> 
>> If we take the big corporations, for example - we find that most of
>> the workers earn a minimum wage, whereas senior managers earn at least
>> ten times more. I would therefore anticipate that the distribution of
>> the income variable will be skewed to the right. This also corresponds
>> to the Pareto principle - that 80% of the wealth is concentrated among
>> 20% of the population.
>> 
>> One possible explanation - is the poverty trap: poor people remain
>> stuck without education or other means to get out of the trap -
>> because they get a subsistence wage.
>> 
>> On Sat, Aug 31, 2013 at 1:12 AM, Steve Samuels <[email protected]> wrote:
>>> 
>>> David,
>>> 
>>> Here is some empirical evidence: the book by Hampel et al. (1986, pp.
>>> 22-23) cites several investigators, starting with Bessel in 1818, who
>>> studied "very high quality" data sets. Most of the sets were
>>> longer-tailed than the normal and were well-approximated by
>>> t-distributions with 3-9 d.f. Slight skewness was also noted.
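
To give a feel for what "longer-tailed than the normal" means, here is a small Python sketch comparing two-sided tail probabilities of a standard normal with those of a t-distribution with 3 d.f., the heaviest-tailed end of the 3-9 range mentioned above. The choice of 3 d.f. and of the cut-offs 2, 3 and 4 is an assumption made for illustration; 3 d.f. is convenient because its distribution function has a simple closed form. The comparison is at the same number of scale units, without rescaling the t to unit variance.

```python
import math


def t3_cdf(x):
    """Distribution function of Student's t with 3 degrees of freedom."""
    u = x / math.sqrt(3.0)
    return 0.5 + (u / (1.0 + u * u) + math.atan(u)) / math.pi


def normal_cdf(z):
    """Distribution function of the standard normal."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def two_sided_tail(cdf, x):
    """P(|X| > x) for a distribution symmetric about zero."""
    return 2.0 * (1.0 - cdf(x))


for x in (2.0, 3.0, 4.0):
    print(
        "cut-off", x,
        "normal:", round(two_sided_tail(normal_cdf, x), 5),
        "t(3):", round(two_sided_tail(t3_cdf, x), 5),
    )
```

Values beyond 3 units are rare under the normal but far less rare under t with 3 d.f., so data of that kind will show more "outliers" than a normal fit predicts.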
>>> 
>>> Steve
>>> 
>>> 
>>> Reference:
>>> 
>>> Hampel, Frank, Elvezio Ronchetti, Peter Rousseeuw, and Werner Stahel.
>>> 1986. Robust Statistics: The Approach Based on Influence Functions
>>> (Wiley Series in Probability and Mathematical Statistics). New York:
>>> John Wiley and Sons.
>>> 
>>> Jeffreys, H. (1939,1961). Theory of Probability. Clarendon Press,
>>> Oxford
>>> 
>>> 
>>> 
>>> 
>>> On Aug 29, 2013, at 10:49 PM, David Hoaglin wrote:
>>> 
>>> Yuval,
>>> 
>>> The Central Limit Theorem (CLT) describes the behavior of the
>>> distribution of the sample mean as the sample size becomes large.  In
>>> order for the distribution of the sample mean to approach a normal
>>> distribution, the underlying distribution of the data must satisfy
>>> some conditions, but those conditions are not very stringent.  The CLT
>>> provides no information on how the underlying distribution behaves.
>>> One does, however, expect the behavior of samples to approach that of
>>> the underlying distribution (whatever that happens to be).
>>> 
>>> I would have no special expectations of the distribution of heights in
>>> a large army.  I would look at the actual distribution --- empirical
>>> evidence, rather than a thought experiment.  Apart from any attempts
>>> to avoid serving, one would expect recruiters to reject people who
>>> were too short and people who were too tall.  Also the actual
>>> distribution might be a mixture of components.  As I recall, in the
>>> 19th century Quetelet used a frequency distribution of the chest
>>> circumference of Scottish soldiers to illustrate a method of fitting a
>>> normal distribution.  In compiling the data he merged several
>>> components and made a variety of mistakes.
>>> 
>>> The outcomes of tossing an actual "fair" die depend on how carefully
>>> the die was manufactured.  Iversen et al. (1971) analyzed the results
>>> of a large number of throws of various types of dice.
>>> 
>>> You didn't say how you would use a normal distribution to approximate
>>> the outcomes of throwing a fair die.  The basic distribution is
>>> discrete, with six equally likely outcomes.
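
One way to make the point about the die concrete is to compare the exact distribution of the total of n fair dice with a normal approximation. The Python sketch below is one possible reading, not necessarily what the original question had in mind: it assumes an ideal die with probability 1/6 per face, builds the exact distribution of the total by repeated convolution, and measures the largest gap between the exact cumulative distribution and a normal one with the same mean and variance (using a continuity correction of 0.5). The function names are made up for this example.

```python
import math


def dice_sum_pmf(n_dice):
    """Exact distribution of the total of n_dice ideal six-sided dice."""
    pmf = {0: 1.0}
    for _ in range(n_dice):
        nxt = {}
        for total, p in pmf.items():
            for face in range(1, 7):
                nxt[total + face] = nxt.get(total + face, 0.0) + p / 6.0
        pmf = nxt
    return pmf


def normal_cdf(x, mu, sigma):
    """Distribution function of a normal with mean mu and sd sigma."""
    return 0.5 * math.erfc(-(x - mu) / (sigma * math.sqrt(2.0)))


def max_cdf_gap(n_dice):
    """Largest gap between the exact and the approximating normal CDF."""
    pmf = dice_sum_pmf(n_dice)
    mu = 3.5 * n_dice                        # mean of one die is 3.5
    sigma = math.sqrt(n_dice * 35.0 / 12.0)  # variance of one die is 35/12
    gap, cum = 0.0, 0.0
    for total in sorted(pmf):
        cum += pmf[total]
        gap = max(gap, abs(cum - normal_cdf(total + 0.5, mu, sigma)))
    return gap


for n in (1, 2, 5, 30):
    print(n, "dice: largest CDF gap =", round(max_cdf_gap(n), 4))
```

For a single die the normal curve is a poor description of six equally likely outcomes; the gap shrinks as more dice are added, which is a statement about totals (or means), not about the single throw.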
>>> 
>>> David Hoaglin
>>> 
>>> Iversen GR, Longcor WH, Mosteller F, Gilbert JP, Youtz C (1971). Bias
>>> and runs in dice throwing and recording: a few million throws.
>>> Psychometrika 36:1-19.
>>> 
>>> On Thu, Aug 29, 2013 at 5:38 PM, Yuval Arbel <[email protected]> wrote:
>>>> What about the central limit theorem? I was referring to physical
>>>> human features - such as height - and the example of Napoleon's army
>>>> candidates for draft. In an army of millions of soldiers - you would
>>>> expect a normal distribution of heights. The problem is that those who
>>>> tried to avoid the draft probably bribed somebody to record false
>>>> heights, shorter than the minimum required height. In this
>>>> case - you might get a skewed distribution of heights
>>> *
>>> *   For searches and help try:
>>> *   http://www.stata.com/help.cgi?search
>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>> *   http://www.ats.ucla.edu/stat/stata/
>>> 
>> 
>> 
>> 
>> --
>> Dr. Yuval Arbel
>> School of Business
>> Carmel Academic Center
>> 4 Shaar Palmer Street,
>> Haifa 33031, Israel
>> e-mail1: [email protected]
>> e-mail2: [email protected]
>> You can access my latest paper on SSRN at:  http://ssrn.com/abstract=2263398
>> You can access previous papers on SSRN at: http://ssrn.com/author=1313670