Re: st: Proportional Independent Variables
From: Joerg Luedicke <[email protected]>
To: [email protected]
Subject: Re: st: Proportional Independent Variables
Date: Thu, 28 Feb 2013 01:43:00 -0500
I should have added that this assumes that the omitted variable has an
effect of zero. If the effect of the omitted variable is non-zero, then
the estimates for the included variables are each shifted by the effect
of the omitted predictor. This follows because the shares sum to one:
substituting cnsx5 = 1 - (cnsx1 + ... + cnsx4) into the model turns the
coefficient on cnsx5 into the constant and each remaining slope into
b_i - b_5. For example, if the effect of cnsx1 were 0.1 and the fifth
variable (cnsx5) also had an effect of 0.1, then the estimate for cnsx1
would be zero (in expectation) when fitting the model without cnsx5.
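To make the algebra concrete, here is a minimal sketch along the lines
of the simulation quoted below (it assumes the cnsx1-cnsx5 variables
from that code are already in memory, and the 0.1 coefficient on cnsx5
is just an illustrative choice):

*--------------------------------------------
// Give cnsx5 a non-zero effect and compare the full and reduced fits
set seed 5678
capture drop e y
gen e = rnormal()
gen y = 0.1*cnsx1 + 0.2*cnsx2 + 0.3*cnsx3 + ///
    0.4*cnsx4 + 0.1*cnsx5 + e

// Full model: all five shares, no constant (the shares sum to one,
// so a constant would be collinear with them)
reg y cnsx1 cnsx2 cnsx3 cnsx4 cnsx5, noconstant

// Reduced model: cnsx5 omitted; each slope now centers on b_i - b_5
// (so the cnsx1 slope centers on 0) and the constant centers on b_5
reg y cnsx1 cnsx2 cnsx3 cnsx4
*--------------------------------------------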
Joerg
On Thu, Feb 28, 2013 at 12:39 AM, Joerg Luedicke
<[email protected]> wrote:
> When unsure about things like these, it is always a good idea to run a
> bunch of simulations with fabricated data. Below is some code for
> checking the consistency of OLS estimates, based on the described setup.
> First, we generate 5 variables containing uniform random variates on
> the range [0,1), and rescale them so that they sum to one for each
> observation. Then, we set up a program to feed to Stata's -simulate-,
> and finally inspect the results. You can change the sample size, the
> number of variables, and the parameter values to more closely resemble
> your problem at hand.
>
> The amount of bias does indeed look negligible to me, confirming Nick
> Cox's impression. Efficiency might be a different story though...
>
> Joerg
>
> *--------------------------------------------
> // Generate data
> clear
> set obs 500
> set seed 1234
>
> forval i = 1/5 {
>     gen u`i' = runiform()
> }
>
> egen su = rowtotal(u*)
> gen wu = 1/su
>
> forval i = 1/5 {
>     gen cnsx`i' = u`i'*wu
> }
>
> keep cnsx*
>
> // Set up program for -simulate-
> program define mysim, rclass
>
>     cap drop e y
>     gen e = rnormal()
>     gen y = 0.1*cnsx1 + 0.2*cnsx2 + ///
>             0.3*cnsx3 + 0.4*cnsx4 + e
>     reg y cnsx1 cnsx2 cnsx3 cnsx4
>
>     forval i = 1/4 {
>         local b`i' = _b[cnsx`i']
>         return scalar b`i' = `b`i''
>     }
>
> end
>
> // Run simulations
> simulate b1=r(b1) b2=r(b2) b3=r(b3) b4=r(b4), ///
>     reps(10000) seed(4321) : mysim
>
> // Results
> sum
> *--------------------------------------------
>
>
> On Wed, Feb 27, 2013 at 3:40 PM, nick bungy
> <[email protected]> wrote:
>> Dear Statalist,
>>
>> I have a continuous dependent variable and a set of 20 independent
>> variables that are percentage based, with the condition that these
>> variables must sum to 100% for each observation. The data are
>> cross-sectional only.
>>
>> I am aware that interpreting the coefficients from a standard OLS fit
>> will be misleading: an increase in one of the 20 variables must be
>> offset by a decrease in one or more of the other 19 variables.
>>
>> Is there an approach that gives consistent coefficient estimates for
>> these parameters while accounting for the proportionate decrease in
>> one or more of the other 19 variables?
>>
>> Best,
>>
>> Nick
>>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/