Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Comparing overlapping groups
From
Fred Wolfe <[email protected]>
To
[email protected]
Subject
Re: st: Comparing overlapping groups
Date
Wed, 3 Oct 2012 04:31:53 -0500
Thanks very much, David.
Fred
On Tue, Oct 2, 2012 at 10:05 AM, David Hoaglin <[email protected]> wrote:
> Dear Fred,
>
> If the 4 definitions were mutually exclusive subsets, you could use a
> regression that has indicator variables for FM2, FM3, and FM4 (the
> constant term would handle FM1, or you could include an indicator for
> FM1 and turn off the constant). The result would be equivalent to a
> one-way analysis of variance with 4 groups.
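>
> A minimal sketch of that mutually exclusive case in Stata, assuming a
> hypothetical variable fmgroup coded 1 to 4 (no such variable appears in
> the posted data, where the definitions overlap):
>
> ```stata
> * fmgroup is a hypothetical 1-4 code, one value per exclusive group
> * Indicators for groups 2-4; the constant is the mean of group 1
> regress phq_sss i.fmgroup
>
> * Same fit, written as four group means with the constant turned off
> regress phq_sss ibn.fmgroup, noconstant
>
> * One-way ANOVA; its F test matches the model F test of the first regression
> oneway phq_sss fmgroup
> ```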
>
> Since the definitions overlap (though you have not said how many of
> the overlaps are present in your data or the numbers of observations
> in the overlaps --- if all 2442 observations meet at least one of the
> 4 definitions, you could have as many as 15 subgroups), you could
> start with a regression model that has indicators for FM2, FM3, and
> FM4. The constant will give you an average for FM1, and the
> coefficients of the three indicators will give incremental effects,
> relative to FM1. The results may not be satisfactory, and they may be
> difficult to interpret. A better approach, along the lines of main
> effects and interactions, would also include indicators for each of
> the subsets that involve 2 or more of the definitions. Then, for
> example, you could get an estimate of the level of phq_sss among
> people who meet only FM1, an increment for people who meet both FM1
> and FM2, and further increments for people who meet FM1, FM2, and FM3
> and people who meet all 4 definitions.
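>
> The steps above might be sketched in Stata as follows. This is only an
> illustration under stated assumptions: wsp and mwsp are taken to be the
> 0/1 indicators for FM1 and FM2, and fm3 and fm4 are hypothetical names
> for the two indicators that were not shown.
>
> ```stata
> * Assumed 0/1 indicators: wsp (FM1), mwsp (FM2), fm3 and fm4 (hypothetical names)
>
> * Count how many overlap patterns actually occur, and how many
> * observations fall in each one
> egen pattern = group(wsp mwsp fm3 fm4), label
> tabulate pattern
>
> * Starting model: indicators for FM2, FM3, and FM4 only;
> * the constant is the mean of phq_sss when all three indicators are 0
> regress phq_sss i.mwsp i.fm3 i.fm4
>
> * Main effects plus all interactions among the four definitions;
> * Stata omits terms for patterns that do not occur in the data
> regress phq_sss i.wsp##i.mwsp##i.fm3##i.fm4
> ```
>
> The tabulation is worth running first, because the number of non-empty
> patterns determines how many interaction terms can be estimated.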
>
> I hope this discussion is helpful.
>
> David Hoaglin
>
> On Tue, Oct 2, 2012 at 10:06 AM, Fred Wolfe
> <[email protected]> wrote:
>> Dear Statalisters,
>>
>> I am analyzing a medical condition (FM) that has 4 different
>> definitions. A person can belong to 1 or more of the four
>> definition-based groups (FM1, FM2, FM3, FM4). There are 2442
>> observations.
>>
>> I am interested in the value of a dependent variable, phq_sss,
>> according to each group definition.
>>
>> For the first two definitions, I get these results
>>
>> . regress phq_sss i.wsp
>>
>>       Source |       SS       df       MS              Number of obs =    2442
>> -------------+------------------------------           F(  1,  2440) =  605.51
>>        Model |  7621.27967     1  7621.27967           Prob > F      =  0.0000
>>     Residual |  30711.1417  2440  12.5865335           R-squared     =  0.1988
>> -------------+------------------------------           Adj R-squared =  0.1985
>>        Total |  38332.4214  2441  15.7035729           Root MSE      =  3.5478
>>
>> ------------------------------------------------------------------------------
>>      phq_sss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>        1.wsp |   6.247731   .2538992    24.61   0.000      5.74985    6.745611
>>        _cons |   2.728905   .0751615    36.31   0.000     2.581518    2.876292
>> ------------------------------------------------------------------------------
>>
>> . regress phq_sss i.mwsp
>>
>>       Source |       SS       df       MS              Number of obs =    2442
>> -------------+------------------------------           F(  1,  2440) =  229.25
>>        Model |  3292.19831     1  3292.19831           Prob > F      =  0.0000
>>     Residual |  35040.2231  2440  14.3607472           R-squared     =  0.0859
>> -------------+------------------------------           Adj R-squared =  0.0855
>>        Total |  38332.4214  2441  15.7035729           Root MSE      =  3.7896
>>
>> ------------------------------------------------------------------------------
>>      phq_sss |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>>       1.mwsp |   10.37138   .6849863    15.14   0.000     9.028161    11.71459
>>        _cons |   3.144753   .0771774    40.75   0.000     2.993413    3.296093
>> ------------------------------------------------------------------------------
>>
>> There are two additional definitions that are not shown.
>>
>> So the difference between group members and non-members
>> in the two analyses above is:
>> wsp 6.2
>> mwsp 10.4
>> (there will be 2 other groups).
>>
>> My question is, how do I tell whether the results differ significantly
>> among the 4 groups, given the overlapping membership in the groups? I
>> have a feeling that some sort of permutation test is the way to get
>> such an answer. I'd appreciate suggestions.
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/