Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Why many things have Normal distribution
From
Joe Canner <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: Why many things have Normal distribution
Date
Sat, 31 Aug 2013 20:31:21 +0000
Yes, I would agree. I'm not sure that was the gist of the original question, but it's probably just as well that the conversation was turned to more practical applications, i.e., repeated sampling and the CLT.
As to categorizing the distributions in nature, that is probably like calculating the number of angels on the head of a pin. So, yes, we should probably move on to better uses of our time...
________________________________________
From: [email protected] [[email protected]] on behalf of Lucas [[email protected]]
Sent: Saturday, August 31, 2013 4:23 PM
To: [email protected]
Subject: Re: st: Why many things have Normal distribution
Well, can we agree that:
The number of distributions of parameter estimates upon repeated
sampling that are normally distributed dwarfs the number of
"distributions from nature" that are normally distributed?
And, I hope we can agree that inferential statistics is far more
concerned with the former than the latter. Alas, many take the former
claim to imply the latter.
Parenthetically, though, it is possible that more than half of the
"things" in nature are actually discrete, not continuous, which would
imply that any claim that most "things in nature" are normal is wrong.
For example, you indicate a few continuous physical phenomena. I
suspect one with more knowledge than I about biology could list just
as many discrete ones (e.g., number of hands, number of fingers,
number of thumbs, number of ears, . . ., number of chromosomes).
At any rate, my point was simply that analysts are usually more
concerned that the distribution of parameter estimates be normal than
with the distribution of the phenomena. It is easy to show the
difference with a discrete variable that cannot be normal even as the
mean (proportion) is normal on repeated sampling.
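A minimal simulation sketch of that point, using only the Python standard library. The population shares for 0, 1, and 2 hands are made-up illustrative numbers chosen to give a mean of 1.8; they are not data, and the function name sample_means is hypothetical. The variable itself can only take three values, yet the means of 1,000 samples of 1,000 people should fall within one and two standard errors of the population mean about as often as a normal distribution predicts (roughly 68% and 95%).

```python
import random
import statistics

def sample_means(values, weights, n_samples=1000, sample_size=1000, seed=12345):
    """Draw repeated random samples from a discrete population; return each sample's mean."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(values, weights=weights, k=sample_size))
        for _ in range(n_samples)
    ]

# ASSUMPTION: illustrative population shares for 0, 1, and 2 hands (not real data).
values = [0, 1, 2]
weights = [0.05, 0.10, 0.85]

# Population mean and variance of the discrete variable.
pop_mean = sum(v * w for v, w in zip(values, weights))
pop_var = sum(w * (v - pop_mean) ** 2 for v, w in zip(values, weights))

# Theoretical standard error of the mean for samples of 1,000.
sample_size = 1000
se = (pop_var / sample_size) ** 0.5

means = sample_means(values, weights, n_samples=1000, sample_size=sample_size)

# Share of sample means within 1 and 2 standard errors of the population mean.
within_1 = sum(abs(m - pop_mean) <= se for m in means) / len(means)
within_2 = sum(abs(m - pop_mean) <= 2 * se for m in means) / len(means)

print(f"population mean       : {pop_mean:.3f}")
print(f"mean of sample means  : {statistics.fmean(means):.3f}")
print(f"share within 1 s.e.   : {within_1:.3f}")
print(f"share within 2 s.e.   : {within_2:.3f}")
```

Changing the weights changes the shape of the underlying distribution but, for samples this large, should leave the behavior of the sample means looking much the same.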
Anyway, this has been fun, but I should get back to work.
Sam
On Sat, Aug 31, 2013 at 10:47 AM, Joe Canner <[email protected]> wrote:
> I'm not sure how one would go about proving a statement like "most things in nature are not normally distributed" (nor its opposite: "most things in nature *are* normally distributed").
>
> Of course "number of hands" is not normally distributed, being a discrete count variable. However, there are a lot of continuous variables in nature (height, weight, length, blood chemistry measures, etc.) that are in fact normal or at least close. Whether a distribution is slightly skewed or not (as per an earlier post) is less interesting (to me, at least) than why many distributions are (nearly) symmetric with mode near the median, rather than, say, exponential or uniform. I suspect this is what the original question was about, although I certainly can't speak for the person who posted it.
>
> I'm also not sure how income entered the discussion; I wouldn't call that a measurement from "nature".
>
> Joe
> ________________________________________
> From: [email protected] [[email protected]] on behalf of Lucas [[email protected]]
> Sent: Saturday, August 31, 2013 11:18 AM
> To: [email protected]
> Cc: Samuel Lucas
> Subject: Re: st: Why many things have Normal distribution
>
> I don't understand this thread. Most things in nature are not normally
> distributed. What is normally distributed is a parameter estimate from
> repeated random sampling from a population.
>
> In the U.S., for example, the number of hands per person is not
> normally distributed. The possibilities are 0, 1, and 2. The mean is
> probably something like 1.8. If we drew 1,000 samples of 1,000 people,
> the means from those samples would follow, or closely approach, a
> normal distribution. The normal distribution of the mean from those samples
> would not signify that the distribution of hands per person is normal.
>
> The known distribution of the means justifies use of standard tools of
> inference (e.g., confidence interval calculation). It neither
> signifies nor requires the underlying distribution of the phenomenon
> to be normal.
>
> Sam
>
> On Sat, Aug 31, 2013 at 6:37 AM, Yuval Arbel <[email protected]> wrote:
>> Steve and David,
>>
>> Come to think of it - and as David previously mentioned -
>> income, for example, is not normally distributed, no matter how much
>> we increase the sample size:
>>
>> If we take the big corporations, for example - we find that most of
>> the workers earn a minimum wage, whereas senior managers earn at least
>> ten times more. I would therefore anticipate that the income variable
>> distribution will be skewed to the right. This also corresponds to
>> the Pareto principle - that 80% of the wealth is concentrated among
>> 20% of the population.
>>
>> One possible explanation - is the poverty trap: poor people remain
>> stuck without education or other means to get out of the trap -
>> because they get a subsistence wage.
>>
>> On Sat, Aug 31, 2013 at 1:12 AM, Steve Samuels <[email protected]> wrote:
>>>
>>> David,
>>>
>>> Here is some empirical evidence: the book by Hampel et al. (1986, pp.
>>> 22-23) cites several investigators, starting with Bessel in 1818, who
>>> studied "very high quality" data sets. Most of the sets were
>>> longer-tailed than the normal and were well-approximated by
>>> t-distributions with 3-9 d.f. Slight skewness was also noted.
>>>
>>> Steve
>>>
>>>
>>> References:
>>>
>>> Hampel, Frank, Elvezio Ronchetti, Peter Rousseeuw, and Werner Stahel.
>>> 1986. Robust Statistics: The Approach Based on Influence Functions
>>> (Wiley Series in Probability and Mathematical Statistics). New York:
>>> John Wiley and Sons.
>>>
>>> Jeffreys, H. 1939, 1961. Theory of Probability. Oxford: Clarendon
>>> Press.
>>>
>>>
>>>
>>>
>>> On Aug 29, 2013, at 10:49 PM, David Hoaglin wrote:
>>>
>>> Yuval,
>>>
>>> The Central Limit Theorem (CLT) describes the behavior of the
>>> distribution of the sample mean as the sample size becomes large. In
>>> order for the distribution of the sample mean to approach a normal
>>> distribution, the underlying distribution of the data must satisfy
>>> some conditions, but those conditions are not very stringent. The CLT
>>> provides no information on how the underlying distribution behaves.
>>> One does, however, expect the behavior of samples to approach that of
>>> the underlying distribution (whatever that happens to be).
>>>
>>> I would have no special expectations of the distribution of heights in
>>> a large army. I would look at the actual distribution --- empirical
>>> evidence, rather than a thought experiment. Apart from any attempts
>>> to avoid serving, one would expect recruiters to reject people who
>>> were too short and people who were too tall. Also the actual
>>> distribution might be a mixture of components. As I recall, in the
>>> 19th century Quetelet used a frequency distribution of the chest
>>> circumference of Scottish soldiers to illustrate a method of fitting a
>>> normal distribution. In compiling the data he merged several
>>> components and made a variety of mistakes.
>>>
>>> The outcomes of tossing an actual "fair" die depend on how carefully
>>> the die was manufactured. Iversen et al. (1971) analyzed the results
>>> of a large number of throws of various types of dice.
>>>
>>> You didn't say how you would use a normal distribution to approximate
>>> the outcomes of throwing a fair die. The basic distribution is
>>> discrete, with six equally likely outcomes.
>>>
>>> David Hoaglin
>>>
>>> Iversen GR, Longcor WH, Mosteller F, Gilbert JP, Youtz C (1971). Bias
>>> and runs in dice throwing and recording: a few million throws.
>>> Psychometrika 36:1-19.
>>>
>>> On Thu, Aug 29, 2013 at 5:38 PM, Yuval Arbel <[email protected]> wrote:
>>>> What about the central limit theorem? I was referring to physical
>>>> human features - such as height - and the example of Napoleon's army
>>>> candidates for the draft. In an army of millions of soldiers - you would
>>>> expect a normal distribution of heights. The problem is that those who
>>>> tried to avoid the draft probably bribed somebody to record false
>>>> heights, shorter than the minimum required height. In this
>>>> case - you might get a skewed distribution of heights.
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/
>>>
>>
>>
>>
>> --
>> Dr. Yuval Arbel
>> School of Business
>> Carmel Academic Center
>> 4 Shaar Palmer Street,
>> Haifa 33031, Israel
>> e-mail1: [email protected]
>> e-mail2: [email protected]
>> You can access my latest paper on SSRN at: http://ssrn.com/abstract=2263398
>> You can access previous papers on SSRN at: http://ssrn.com/author=1313670