Dear Statalisters,
I need to illustrate the failure of the normal distribution, in terms of
linear scaling, to describe a distribution with fairly distinct
clusters.
To go to the extreme and make this more obvious, I consider the following
illustration:
Let us assume a sample of 1,000 observations distributed on the [-1, 1]
interval. Suppose that there is a large but smoothly distributed cluster of
900 observations taking values in the [-0.2, 1] interval, while the
remaining 100 observations lie in the [-1, -0.8] interval. The gap
[-0.8, -0.2] is then empty, so thirty percent of the linear scale (0.6 out
of a total range of 2) is used for non-existent values, and a fitted normal
density will place a large volume of probability mass in that gap.
(The example is merely to make clear the possibility of mis-scaling and how
it arises; I am not bothered about the obvious mixture of distributions.)
I need to generate the appropriate data and show this on a graph, e.g. a
histogram with a superimposed normal density (the graph itself is not a
problem).
I have been playing around with the -invnorm(uniform())- function to
generate the data, but I didn't even get near to what I want. Any
suggestions are gratefully appreciated!
many thanks in advance,
Dimitris
---------------------------------------------
Dimitris Christodoulou
Teaching and Research Associate
School for Business and Regional Development
University of Wales, Bangor
Hen Coleg
LL57 2DG Bangor
UK
e-mail: [email protected]
---------------------------------------------