Re: st: trying to compare means and using xi and xi3 for survey data
Hitesh Chandwani <[email protected]>
[email protected]
Tue, 5 Jul 2011 11:05:42 -0400
Austin, Steven, and Maarten,
Thank you for all the input! Your comments cleared up a lot of my queries.
On Tue, Jul 5, 2011 at 10:21 AM, Steven Samuels <[email protected]> wrote:
> "Is this interpretation accurate?"
> Yes
> Steve
> On Jul 5, 2011, at 6:44 AM, Hitesh Chandwani wrote:
> Steven,
> I used the following commands:
> . char insured_pub_pvt_un[omit]2
> . xi: svy: regress totchg_num i.insured_pub_pvt_un
> And got the following output:
> i.insured_pub~n _Iinsured_p_0-4 (naturally coded; _Iinsured_p_2 omitted)
> (running regress on estimation sample)
> Survey: Linear regression
> Number of strata   =        75                  Number of obs      =    103817
> Number of PSUs     =       966                  Population size    = 469088.57
>                                                 Design df          =       891
>                                                 F(   3,    889)    =         .
>                                                 Prob > F           =         .
>                                                 R-squared          =    0.0106
> ------------------------------------------------------------------------------
>              |             Linearized
>   totchg_num |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
> -------------+----------------------------------------------------------------
> _Iinsured_~0 |  (dropped)
> _Iinsured_~1 |   6504.334   915.0348     7.11   0.000      4708.46    8300.209
> _Iinsured_~3 |  -3015.988   705.0121    -4.28   0.000    -4399.666    -1632.31
> _Iinsured_~4 |   1070.352   1961.327     0.55   0.585    -2779.007    4919.711
>        _cons |   13894.47   837.4082    16.59   0.000     12250.95       15538
> ------------------------------------------------------------------------------
> I think the fact that the "0" group was dropped again has something to
> do with the fact that all observations in this group have pweights set
> to zero. The way I interpret the output is that the coefficients are
> the differences in means between each of the other groups (1, 3, and
> 4, respectively) and the omitted group (group 2), with the
> corresponding t-statistics testing each difference against zero.
> Is this interpretation accurate?
> Regards,
> Hitesh
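The interpretation can be checked by hand against the -noomit- output quoted further down, where the coefficients are the group means themselves. A minimal sketch in Stata, using only the figures printed in the two tables (they are rounded, so agreement is only to about two decimal places):

```stata
* Differences between each group mean and the group 2 mean (13894.47),
* taken from the -xi, noomit- table; values are as printed, hence rounded.
* Group 1 minus group 2:
display %9.2f 20398.81 - 13894.47
* Group 3 minus group 2:
display %9.2f 10878.49 - 13894.47
* Group 4 minus group 2:
display %9.2f 14964.83 - 13894.47
```

These come to roughly 6504.34, -3015.98, and 1070.36, which agree with the reported coefficients 6504.334, -3015.988, and 1070.352 up to rounding. The constant, 13894.47, is the group 2 mean.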
> On Tue, Jul 5, 2011 at 7:30 AM, Hitesh Chandwani
> <[email protected]> wrote:
>> Hi Steven,
>> There is no evident coding error that I can see. If I use the
>> -,noomit- option, how do I interpret the results? The coefficients are
>> clearly the means, but what do the t-values indicate?
>> xi, noomit: svy: reg totchg_num i.insured_pub_pvt_un , nocons
>> (running regress on estimation sample)
>> Survey: Linear regression
>> Number of strata   =        75                  Number of obs      =    103817
>> Number of PSUs     =       966                  Population size    = 469088.57
>>                                                 Design df          =       891
>>                                                 F(   4,    888)    =         .
>>                                                 Prob > F           =         .
>>                                                 R-squared          =    0.1513
>> ------------------------------------------------------------------------------
>>              |             Linearized
>>   totchg_num |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>> -------------+----------------------------------------------------------------
>> _Iinsured_~0 |  (dropped)
>> _Iinsured_~1 |   20398.81   1171.304    17.42   0.000     18099.97    22697.64
>> _Iinsured_~2 |   13894.47   837.4082    16.59   0.000     12250.95       15538
>> _Iinsured_~3 |   10878.49   844.9702    12.87   0.000     9220.121    12536.85
>> _Iinsured_~4 |   14964.83   1801.761     8.31   0.000     11428.64    18501.02
>> ------------------------------------------------------------------------------
>> Regards,
>> Hitesh
>> On Tue, Jul 5, 2011 at 12:34 AM, Steven Samuels <[email protected]> wrote:
>>> I suspect a coding error.
>>> Suppose insure_cat is your original insurance variable. Have you looked at
>>> *******************************
>>> bys insure_cat: sum totchg_num
>>> *****************************
>>> Have you tabulated each insurance indicator against insure_cat?
>>> In any case, direct survey approaches are:
>>> ************************
>>> svy: mean totchg_num, over(insure_cat)
>>> xi, noomit: svy: reg totchg_num i.insure_cat, nocons //pre-Stata 11
>>> svy: reg totchg_num ibn.insure_cat, nocons //Stata 11 +
>>> ************************
>>> Steve
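Once the group means or group indicators are in the model, pairwise comparisons can be requested directly. A minimal sketch, assuming Stata 11 or later, data that have already been -svyset-, and Steve's placeholder name insure_cat for the insurance variable with 1, 2, and 3 among its levels; none of this was run on the data in this thread:

```stata
* Regression with category 2 as the base level (factor-variable notation).
svy: regress totchg_num ib2.insure_cat

* Joint adjusted Wald test of the category coefficients in the model.
testparm i.insure_cat

* Pairwise comparison of two non-base categories, e.g. 1 versus 3.
lincom 1.insure_cat - 3.insure_cat

* Alternatively, estimate the means and list the stored coefficient names.
* Those names differ across Stata versions and value labels, so check them
* before writing a -lincom- or -test- expression against the means.
svy: mean totchg_num, over(insure_cat)
matrix list e(b)
```

With the regression form, each coefficient is already a difference from the base category, so only comparisons between two non-base categories need -lincom-.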
>>> On Jul 4, 2011, at 5:02 PM, Hitesh Chandwani wrote:
>>> Hello Statalisters,
>>> I am using cost survey data and have 2 questions:
>>> 1) Comparison of means
>>> Using the svy: mean procedure, I can get means of cost for all
>>> categories of a particular variable. But since this variable is not
>>> dichotomous, using -test- or -lincom- as a postestimation command to
>>> compare the means doesn't yield any results. What I thought of was
>>> dummy coding the categories and then running a regression. Instead of
>>> manually creating dummy variables, I decided to use -xi-, which brings
>>> me to my next question.
>>> 2) -xi- and -xi3- will both omit one category as a reference
>>> category, which is fine. But, in my output, after omitting the first
>>> category, another category is indicated as (dropped). Moreover, there
>>> is still no value for the F-statistic.
>>> Firstly, is my approach correct? And secondly, why are 2 categories
>>> being dropped?
>>> (One explanation that I could come up with for the 2 dropped
>>> categories is that the pweight for the observations in the omitted
>>> category " _Iinsured_p_0" is set to zero and hence Stata needs to use
>>> another category as reference)
>>> The following is my syntax as well as output:
>>> xi: svy: regress totchg_num i.insured_pub_pvt_un
>>> i.insured_pub~n _Iinsured_p_0-4 (naturally coded; _Iinsured_p_0 omitted)
>>> (running regress on estimation sample)
>>> Survey: Linear regression
>>> Number of strata   =        75                  Number of obs      =    103817
>>> Number of PSUs     =       966                  Population size    = 469088.57
>>>                                                 Design df          =       891
>>>                                                 F(   3,    889)    =         .
>>>                                                 Prob > F           =         .
>>>                                                 R-squared          =    0.0106
>>> ------------------------------------------------------------------------------
>>>              |             Linearized
>>>   totchg_num |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
>>> -------------+----------------------------------------------------------------
>>> _Iinsured_~1 |   6504.334   915.0348     7.11   0.000      4708.46    8300.209
>>> _Iinsured_~2 |  (dropped)
>>> _Iinsured_~3 |  -3015.988   705.0121    -4.28   0.000    -4399.666    -1632.31
>>> _Iinsured_~4 |   1070.352   1961.327     0.55   0.585    -2779.007    4919.711
>>>        _cons |   13894.47   837.4082    16.59   0.000     12250.95       15538
>>> ------------------------------------------------------------------------------
>>> . test _Iinsured_p_1 _Iinsured_p_2 _Iinsured_p_3 _Iinsured_p_4
>>> Adjusted Wald test
>>> ( 1) _Iinsured_p_1 = 0
>>> ( 2) _Iinsured_p_2 = 0
>>> ( 3) _Iinsured_p_3 = 0
>>> ( 4) _Iinsured_p_4 = 0
>>> Constraint 2 dropped
>>> F( 3, 889) = 23.78
>>> Prob > F = 0.0000
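The same joint test can be requested without listing each indicator. A minimal sketch, assuming the -xi- regression above is the most recent estimation and the indicators carry the _Iinsured_p_ stub shown in the output:

```stata
* Adjusted Wald test of all -xi- indicators in the model. The wildcard picks
* up every _Iinsured_p_* term; a dropped term should be reported as a dropped
* constraint, as in the -test- output above.
testparm _Iinsured_p_*

* Comparison of two non-reference groups, e.g. group 1 versus group 3.
lincom _Iinsured_p_1 - _Iinsured_p_3
```

The -lincom- line covers comparisons that the coefficient table does not show directly, since each coefficient there is a contrast with the reference group only.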
>>> Any help in understanding this issue will be greatly appreciated.
>>> Regards,
