Hello, I am wondering if someone could help me with some data
analysis/regression work I am doing. I am trying to do propensity score
matching, but I first need to do logistic regression, and that is what I am
having trouble with. Suppose I have a model as follows: gpa (the
dependent variable), with sex and race as the independent variables.
So if I want to do logistic regression, do I take the log of all the
variables (dependent and independent) first and then do the regression?
For example, in Stata do I type 'regress gpa sex race' (using the log
of all the variables) to get regression results, or do I type 'logit gpa
sex race' (using the log of all the variables)? Also, in logistic
regression, are all the variables meant to be binary?
I ask because race is a categorical variable with several
categories. Do I generate a new variable that is, for example, 1 =
black and 0 otherwise? Basically, do I generate the dummy variable first and
then take the log of the dummy variable for logistic regression?
Part of my data is as follows:
GPA sex race
3.2 m black
3.5 f black
3.1 m hispanic
3.6 f white
3.2 f white
3.5 m asian
3.3 f hispanic
3.6 m white
where race is coded "1" = black, "2" = hispanic, "3" = white
Part of my Stata code is as follows:
gen black = race==1
gen female = sex=="f"
gen loggpa = log(gpa)
gen logsex = log(female)
gen lograce = log(black)
and then do something like:
logit loggpa logsex lograce
or
regress loggpa logsex lograce
Does this seem correct for doing logistic regression? Or can someone
show me if I must do it differently, in terms of Stata commands? I'm
just confused about when to generate the variable, when to take the log of
the variable, and when to run the regression.
Thanks, any help will be greatly appreciated,
Mike