You seem to have changed the structure. You now have 8 vars: a, b, ..., h.
And now var5 depends on e through h, rather than a through d.
Let me assume that you want to use all the existing vars in varlist1 -- for
both collecting into a set of independent vars, and for creating an
additional composite variable. (I will call the composite variable compvar
rather than var5.)
I see no need to create a set of new variables; that will only waste space,
which seems to be scarce in this particular problem. Instead, just form a
new varlist that consists of only those that are actual variables (and not
completely missing). Thus it is a subset of varlist1.
Here's my suggestion, borrowing some ideas from what Nick wrote (which I
would not have thought of myself):
local varlist1 "a b c d e f g h" // or whatever you might have
local varlist2 "" // start empty, in case this is rerun
foreach x of local varlist1 {
    capture confirm var `x'
    if _rc==0 {
        * the variable exists; check whether it is entirely missing
        capture assert mi(`x')
        if _rc==0 {
            drop `x'
        }
        else {
            local varlist2 "`varlist2' `x'"
        }
    }
}
if trim("`varlist2'") ~= "" {
    egen compvar = eqany(`varlist2'), v(1)
    regress depvar `varlist2' compvar
}
----
One little point to remember: this picks up any variable that is not
completely missing. Thus, for example, if you have a million observations,
and a variable is nonmissing on only one of them, it will be included. But
the regression will be limited to only those observations that are
nonmissing on all variables.
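If you want to see how many observations would survive that limitation before running the regression, something like the following might help. It is only a sketch on made-up data: the variables a and b, their values, and the marker variable touse are my own illustration, not your data.

```stata
* Toy data (hypothetical): b is nonmissing on a single observation
clear
set obs 6
generate depvar = _n
generate a = cond(_n <= 5, _n, .)
generate b = cond(_n == 1, 1, .)
local varlist2 "a b"

* Flag the observations that are nonmissing on depvar and on every variable in varlist2
generate byte touse = 1
markout touse depvar `varlist2'   // sets touse to 0 where any listed variable is missing
count if touse                    // 1 here: only the first observation is complete
drop touse
```

Here b passes the "not completely missing" test, yet it cuts the usable sample down to one observation.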
It's not clear whether you want compvar to be 0 or missing when none of the
other vars equals 1. If it is to be made missing, then you also want to do...
replace compvar = . if compvar == 0
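And I note that, with compvar either 1 or missing, your regression is limited
to cases where at least one of the other independent variables is == 1
(observations with missing compvar are excluded). But it is puzzling why you
want compvar included among the independent vars, since it would then be
constant in the estimation sample. Perhaps you meant...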
regress depvar `varlist2' if compvar==1
Or perhaps I misinterpreted your structure.
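To make the 0-versus-missing question concrete, here is a small sketch on made-up data. The four rows are purely illustrative. I am also assuming a Stata release in which the egen function eqany() goes by the name anymatch(); if yours still has eqany(), substitute it.

```stata
* Toy data (hypothetical values, for illustration only)
clear
input e f g h
1 0 0 0
0 0 0 0
. 0 1 .
. . . .
end

* 1 if any of e-h equals 1, otherwise 0 (even when all four are missing)
egen compvar = anymatch(e f g h), values(1)
list

* Optional: turn the 0s into missing, as in the replace command above
replace compvar = . if compvar == 0
list
```

The first list shows compvar as 1, 0, 1, 0; after the replace, the second and fourth rows are missing and would drop out of any regression that includes compvar.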
----
Good luck with this.
-- David
At 07:29 PM 8/7/2003 -0400, you wrote:
Thanks Nick and David for the help.
David, I mean the exact correspondence: var1 for a, var2 for b, etc.
If a variable is absent, it would be preferable not to create it, since the
program is huge.
For example, if a does not exist or contains no observations, it would be
preferable not to create var1.
Even though we create new variables with missing values, they would be
irrelevant for my regressions.
For var5, if at least one of the variables exists, then I want to use it to
create var5. If not, it will not be created.
Nick, all the code is there. My intent is simple.
Let me rewrite my whole program below; my dependent variable is depvar (which is
common to all files).
local varlist1 "a b c d e f g h"
foreach x of local varlist1 {
    capture confirm var `x'
    if _rc==0 {
        capture assert mi(`x')
        if _rc==0 {
            drop `x'
        }
        else {
            g var1=a
            g var2=b
            g var3=c
            g var4=d
            g var5=.
            replace var5=1 if e==1| f==1| g==1| h==1
        }
    }
}
regress depvar var1 var2 var3 var4 var5 /*if one of them exists*/
The first suggestion of Nick seems good, but since I have a lot of variables
to create, it will be very difficult to rewrite the code for each of them.
I will try his second suggestion.
Best regards.
Amadou DIALLO,
AFTHD, The World Bank.
David Kantor
Institute for Policy Studies
Johns Hopkins University
410-516-5404