First off, consider -pwcorr-.
In terms of Stata code, looking inside the code for
-spearman- will reveal one recipe for doing it yourself,
using a call to -ttail()- with r(N) and r(rho) embedded
in the appropriate expression.
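
A minimal sketch of that recipe, assuming the usual t-based test of a zero correlation on N - 2 degrees of freedom (a reconstruction of the standard formula, not a copy of the -spearman- code). The auto data, the pair of variables, and the local names t and p are chosen purely for illustration:

```stata
sysuse auto, clear
quietly correlate price weight
* t statistic for H0: rho = 0, on N - 2 degrees of freedom
local t = r(rho) * sqrt(r(N) - 2) / sqrt(1 - r(rho)^2)
* two-sided P-value from the upper tail of the t distribution
local p = 2 * ttail(r(N) - 2, abs(`t'))
display "rho = " %6.4f r(rho) "   P = " %6.4f `p'
```

In a loop like the one quoted below, the same expression could be evaluated after each -corr- and posted alongside r(rho), provided the -postfile- declaration has a matching extra variable.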
That all rather overlooks the advice, from Professor
Sir Ronald Aylmer Fisher, no less, to do inferential calculations
for correlations on a transformed scale (Fisher's z, the
inverse hyperbolic tangent of r), as the sampling distribution
of r can be very skew. Several years ago
John Gleason published a bundle of little programs to
do that in the Stata Technical Bulletin. -search gleason,
author- will reveal references.
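
As a rough sketch of what working on the transformed scale involves (this is not Gleason's code, just the textbook calculation): atanh(r) is treated as approximately normal with standard error 1/sqrt(n - 3), and the limits are back-transformed with tanh(). The variables and the 95% level are assumptions made for the example:

```stata
sysuse auto, clear
quietly correlate price weight
local r = r(rho)
local n = r(N)
* Fisher's z: atanh(r), with approximate standard error 1/sqrt(n - 3)
local z  = atanh(`r')
local se = 1 / sqrt(`n' - 3)
* 95% limits on the z scale, back-transformed to the correlation scale
local lo = tanh(`z' - invnormal(0.975) * `se')
local hi = tanh(`z' + invnormal(0.975) * `se')
display "r = " %6.4f `r' "   95% CI: " %6.4f `lo' " to " %6.4f `hi'
```

The resulting interval is asymmetric around r, which is the point of the transformation when r is far from zero.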
The issue usually in mind when people want
P-values, though, is whether the correlation differs
detectably from zero, and under that null hypothesis the
reference sampling distribution will be symmetric,
or nearly so.
However, many people wouldn't put much weight
on those calculations to the extent that they
rely on an assumption that your data are bivariate
normal. That perhaps points to bootstrapping as
a way of getting a handle on uncertainty.
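
One way that might look, as a sketch only: resample observations, recompute the correlation each time, and read off a percentile interval. The number of replications, the seed, and the variables are arbitrary choices for the example:

```stata
sysuse auto, clear
* resample observations and recompute the correlation each time;
* reps() and seed() values are arbitrary
bootstrap rho = r(rho), reps(1000) seed(2803): correlate price weight
* percentile-based confidence interval from the bootstrap distribution
estat bootstrap, percentile
```

The percentile interval makes no appeal to bivariate normality, although it still assumes that the observations are independent draws.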
All that said, I find that if the sample size is large enough
to make a correlation worth calculating then the
issue of strength swamps that of significance.
Standards may differ between sciences but I
regard any correlation below 0.3 in magnitude as too weak
to be very interesting. Whether such a correlation
is also declared significant does not then interest me
much. Of course, that does not rule out such
a variable having a useful bit part in some
larger model together with other predictors,
but a display of correlations is not going to help
much with that kind of question.
I sometimes find that people want the P-value
for a correlation as a measure of strength
of relationship. That is unlikely to be your motivation,
but it is more common than one would hope, and
to me has the matter quite backwards as the
correlation itself is a measure of strength
of relationship (well, of linearity) and is
easier to think about than the P-value
(assuming always a scatter plot is in view).
Concretely, fire up
. sysuse auto
. pwcorr price-for, sig
and ponder the patterns. I see for example
that at this sample size (n = 74) a correlation of
0.3 is about significant at P = 0.01. Many
researchers would regard 0.01 as very good news
but a correlation of 0.3 does not carry much
predictability.
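
The arithmetic behind that reading can be checked directly, again assuming the usual t-based test on n - 2 degrees of freedom. The squared correlation is shown alongside as one rough gauge of predictability:

```stata
* two-sided P-value for r = 0.3 at n = 74, from the t statistic on n - 2 df
local r = 0.3
local n = 74
local p = 2 * ttail(`n' - 2, `r' * sqrt(`n' - 2) / sqrt(1 - `r'^2))
display "P = " %6.4f `p'
* share of variance accounted for by a linear fit
display "r^2 = " %4.2f `r'^2
```

The P-value comes out close to 0.01, while r^2 is 0.09, so less than a tenth of the variance is accounted for.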
Nick
Michael Lemay
> I created this dataset of correlations using the following syntax:
>
> foreach x in multi lit alarmr interr secr porchr signr bganyr gphysr betph vand {
> quietly: corr `x' i_ok if (country==11 | country==18 | country==19)
> local all "`r(rho)'"
> quietly: corr `x' i_ok if country==11
> local gr "`r(rho)'"
> quietly: corr `x' i_ok if country==18
> local pt "`r(rho)'"
> quietly: corr `x' i_ok if country==19
> local pl "`r(rho)'"
> post asa_corrtab ("`x'") (`all') (`gr') (`pt') (`pl')
> }
>
> It works fine for my purpose except that I would like to add the
> p-value for each correlation to the dataset. I thought it would be
> fairly easy to do. However, -corr- does not save the p-value, unlike
> other commands such as -tab-. How would I go about accessing the
> p-value with -corr-?