Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: Re: st: Automatic fit of distribution
From
Richard Williams <[email protected]>
To
[email protected], [email protected]
Subject
Re: Re: st: Automatic fit of distribution
Date
Thu, 11 Jul 2013 13:04:56 -0500
Changing the subject slightly -- it is often recommended that you
examine your data, e.g. do graphs or whatever, run various
diagnostics. I am inclined to agree; indeed I always tell people to
start with assorted descriptive statistics before launching into
their high tech models. However, things like stepwise regression are
widely condemned. Again I am inclined to agree, but I have a hard
time explaining what exactly the difference is. In both cases, aren't
you looking at the data first and using that information to guide
your model building? By graphing the data first, couldn't that lead
to over-fitting, and run the risk that analysis with different data
would lead to different results? If, say, my visual examination or
diagnostics have led me to add squared terms or even use a different
statistical method, aren't my p values misleading? It seems like a
lot of the cautions and concerns raised with stepwise could also be
raised for approaches that are considered much more acceptable. My
instincts go with the conventional wisdom but I am not sure how I
would respond if pressed on these matters.
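One way to make the worry about misleading p values concrete is a small simulation. The sketch below is an illustration in Python, not something from this thread, and every name in it is hypothetical. It assumes pure-noise data: the outcome and all candidate predictors are independent standard normals, so any "significant" finding is a false positive. It compares testing one prespecified predictor with first looking over several candidates and then reporting only the best-looking one. The p value uses Fisher's z approximation for a correlation, which is an assumption of convenience rather than an exact test.

```python
import math
import random
from statistics import NormalDist


def pearson_r(x, y):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def naive_p_value(x, y):
    """Two-sided p value for zero correlation (Fisher z approximation)."""
    z = math.atanh(pearson_r(x, y)) * math.sqrt(len(x) - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n_candidates, n_obs=50, n_sims=1000, alpha=0.05, seed=12345):
    """Share of simulated studies whose best-looking predictor has p < alpha.

    Every predictor is noise, so each such "finding" is spurious.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        y = [rng.gauss(0, 1) for _ in range(n_obs)]
        p_values = []
        for _ in range(n_candidates):
            x = [rng.gauss(0, 1) for _ in range(n_obs)]
            p_values.append(naive_p_value(x, y))
        # "Look at the data first": report only the most promising candidate.
        if min(p_values) < alpha:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    print("one prespecified predictor:", false_positive_rate(1))
    print("best of 10 after looking:  ", false_positive_rate(10))
```

Under these assumptions the first rate should sit near the nominal 5%, while the second should be far higher (with 10 independent tests, 1 - 0.95^10 is about 40%). The same logic applies whether the "looking" is done by a stepwise routine or by eye, although informal looking is harder to count.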
At 11:29 AM 7/11/2013, David Hoaglin wrote:
Diagnostics are fine, but there is no substitute for looking at the
data (e.g., in well-chosen histograms and quantile-quantile plots).
Programs that rely on the sample skewness and kurtosis will be blind
to mixtures that show more than one mode, and the sensitivity of
sample moments to outliers makes those measures unsuitable for
diagnosing distribution shape.
Also, the process should take into account whether the data are
continuous or discrete.
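The point about sample moments can be illustrated with a short sketch. It is written in Python for illustration only, and the samples, seed, and function names are all hypothetical. It computes simple moment-based skewness and kurtosis for three simulated samples: a unimodal normal sample, a symmetric two-component mixture, and the normal sample with one wild value appended. It also tabulates crude bin counts, standing in for a histogram.

```python
import random


def skewness_and_kurtosis(data):
    """Moment-based skewness and kurtosis (a normal gives about 0 and 3)."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((v - mean) ** 2 for v in data) / n
    m3 = sum((v - mean) ** 3 for v in data) / n
    m4 = sum((v - mean) ** 4 for v in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2


def bin_counts(data, low, high, bins):
    """Counts per equal-width bin on [low, high); values outside are ignored."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in data:
        if low <= v < high:
            counts[min(int((v - low) / width), bins - 1)] += 1
    return counts


rng = random.Random(2013)
unimodal = [rng.gauss(0, 1) for _ in range(2000)]
# Equal mixture of N(-2, 1) and N(2, 1): two clear modes, yet symmetric.
bimodal = [rng.gauss(rng.choice((-2, 2)), 1) for _ in range(2000)]
# The unimodal sample plus a single extreme value.
with_outlier = unimodal + [25.0]

if __name__ == "__main__":
    samples = [("unimodal", unimodal), ("bimodal", bimodal), ("with outlier", with_outlier)]
    for name, sample in samples:
        skew, kurt = skewness_and_kurtosis(sample)
        print(f"{name:12s} skewness={skew:6.2f} kurtosis={kurt:7.2f}")
    # A crude text histogram of the mixture; each '#' is about 10 observations.
    for left_edge, count in zip(range(-6, 6), bin_counts(bimodal, -6, 6, 12)):
        print(f"{left_edge:3d} {'#' * (count // 10)}")
```

In this setup the mixture's skewness should come out near zero and its kurtosis low, which by itself is also what a flat, single-humped distribution would give, so the two modes only become visible in the bin counts. The single outlier, by contrast, should push both moments of an otherwise normal sample far from 0 and 3.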
David Hoaglin
On Thu, Jul 11, 2013 at 11:45 AM, Ariel Linden, DrPH
<[email protected]> wrote:
> I completely agree with Nick and Maarten that the user should do the work
> required to determine what type of distribution they are dealing with and go
> from there. However, it seems to me that there could be a program that
> "points the user in the right direction" after running a few simple
> diagnostics. For example, there are several programs already available to
> test for normality (i.e., -sktest-, -swilk-, -ksmirnov-). It would be rather
> straightforward to test for a Poisson distribution based on the variance =
> mean. It would get harder as we go to other distributions, or fall between
> choices...
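As a rough sketch of what a "variance = mean" check for count data might look like, the Python code below computes the index of dispersion (sample variance divided by sample mean, which is near 1 for Poisson data) and a parametric-bootstrap p value. This is one possible reading of the suggestion, not the poster's method and not an existing Stata command. The function names, the bootstrap approach, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def poisson_draw(rng, lam):
    """One Poisson variate by Knuth's multiplication method (small lam only)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def dispersion_index(counts):
    """Sample variance over sample mean; requires a nonzero mean."""
    n = len(counts)
    mean = sum(counts) / n
    variance = sum((c - mean) ** 2 for c in counts) / (n - 1)
    return variance / mean


def dispersion_p_value(counts, n_sims=500, seed=1):
    """Parametric-bootstrap p value for 'index of dispersion equals 1'.

    Simulates Poisson samples at the observed mean and asks how often their
    index lands at least as far from 1 as the observed index does.
    """
    rng = random.Random(seed)
    n = len(counts)
    lam = sum(counts) / n
    observed_gap = abs(dispersion_index(counts) - 1)
    simulated = []
    while len(simulated) < n_sims:
        sample = [poisson_draw(rng, lam) for _ in range(n)]
        if any(sample):  # skip all-zero samples, whose index is undefined
            simulated.append(dispersion_index(sample))
    extreme = sum(1 for d in simulated if abs(d - 1) >= observed_gap)
    return extreme / n_sims


rng = random.Random(7)
# Hypothetical data: Poisson counts with mean 3.
poisson_counts = [poisson_draw(rng, 3.0) for _ in range(200)]
# Hypothetical overdispersed counts: Poisson with a gamma-distributed rate
# (shape 1.5, scale 2, so the mean is still 3 but the variance is larger).
overdispersed = [poisson_draw(rng, rng.gammavariate(1.5, 2.0)) for _ in range(200)]

if __name__ == "__main__":
    for name, data in [("poisson", poisson_counts), ("overdispersed", overdispersed)]:
        print(name, round(dispersion_index(data), 2), dispersion_p_value(data))
```

A check like this can only flag departures from variance = mean. An index near 1 does not establish that the data are Poisson, which is in line with the cautions raised elsewhere in the thread about relying on summary diagnostics alone.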
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/
-------------------------------------------
Richard Williams, Notre Dame Dept of Sociology
OFFICE: (574)631-6668, (574)631-6463
HOME: (574)289-5227
EMAIL: [email protected]
WWW: http://www.nd.edu/~rwilliam