The approach I generally use in such situations (in either SAS or Stata) is
to produce the output I want (create string variables to hold variable names
and formatted values) and concatenate it to an output dataset. In Stata,
use the postfile and post commands, now that they allow string variables.
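A minimal sketch of that pattern follows. It is an illustration, not a
tested solution for svytab: the file name results and the variable names
name1, value1, count, and percent1 are hypothetical, and unweighted counts
from the auto dataset shipped with Stata stand in for the survey estimates
you would actually post.

```stata
* Sketch only: results.dta and the variable names below are hypothetical.
* Unweighted counts from the auto data stand in for survey estimates.
sysuse auto, clear

* Open a results file with two string and two numeric variables
tempname memhold
postfile `memhold' str32 name1 str32 value1 double count double percent1 ///
    using results, replace

* Total number of observations, used as the denominator
quietly count
local total = r(N)

* Post one row per category of foreign (coded 0/1 in the auto data)
forvalues v = 0/1 {
    quietly count if foreign == `v'
    post `memhold' ("foreign") ("`v'") (r(N)) (r(N)/`total')
}

* Close the results file, then load and inspect it
postclose `memhold'
use results, clear
list
```

Each post call appends one observation to the results file, so the raw
data in memory are left alone until the final use. For real survey results
you would replace the count lines with whatever the svy command saves.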
Bryan Sayer
Statistician, SSS Inc.
-----Original Message-----
From: Rita Luk
To: '[email protected]'
Sent: 7/23/03 9:39 AM
Subject: st: output svymean
Hello all,
I am hoping that you can help me solve an issue that is causing me to
pull out what is left of my thinning hair. I normally use SAS, but am
required to use Stata for a current project.
I am trying to create a database of results produced from the survey
tabulation procedure (svytab). I have included the code that I would use
in SAS proc freq below, so that someone who is familiar with both
packages can see what I am up to. However, because of the complex survey
design I cannot use SAS for this project.
I plan on doing A LOT of single tabulations and cross tabulations and
want to produce an output dataset that might look like:
OBS  NAME1  VALUE1  NAME2  VALUE2  COUNT  PERCENT1  PERCENT2
1    sex    female  .      .       50     0.25      .
2    sex    male    .      .       150    0.75      .
3    sex    female  age    old     25     0.50      0.33
4    sex    female  age    young   25     0.50      0.20
5    sex    male    age    old     50     0.33      0.67
6    sex    male    age    young   100    0.67      0.80
As you can see, this dataset would result from running two different
tabulations. The first two observations would come from a single
variable tabulation of sex. The subsequent 4 observations (3-6) would
have resulted from a cross tabulation of the dichotomous variables age
and sex.
Ideally I want my program/macro to be able to handle any categorical
variables automatically regardless of the number of categories.
In Stata, by using several e() and matrix commands, I can get the
percents, and with a little data manipulation the counts and names, but
I cannot get the values in a variable form. In addition, I can only do
this for the crosstabs (svytab), but not for the single variable
frequencies, because the svyprop command does not seem to give me saved
estimates. Furthermore, the amount of coding needed just to get the
percents and counts seems excessive. There must be a quicker way. I am
doing something like:
* this gives me the row percentages in variable form (unfortunately attached to my raw database)
svytab sex age, row
matrix mrowpct=e(b)'
svmat mrowpct, name(rowpct)
* this gives me the col percentages in variable form (unfortunately attached to my raw database)
svytab sex age, col
matrix mcolpct=e(b)'
svmat mcolpct, name(colpct)
And that is just to get TWO of the variables in my output dataset!!!
There has got to be a simpler way, since in SAS just one option will
give a full output dataset of everything I need; all one needs to do (as
demonstrated below) is massage the output to look the way I want. Stata
must have an analogous feature!
I am not asking for someone to code this for me. I need to learn how to
use this program. I just need some hints as to how to do this, commands
etc. (i.e., is there a simple way to get output datasets other than
piecing together a whole bunch of matrices?).
Thanks for any help you can provide,
Charles
For those who want to see what I would do in SAS, to get an idea of
what I am doing here:
I would first create the macro (I think Stata users call these
programs?)
/**FOR SINGLE VARIABLE TABLES USE CODE=1, FOR CROSSTABS USE CODE=2**/
%macro outdata (var1, var2,code);
%if &code=1 %then %do;
proc freq data=mydata;
tables &var1 / out=predata;
run;
%end;
%if &code=2 %then %do;
proc freq data=mydata;
tables &var1*&var2 / out=predata outpct;
run;
%end;
data predata;
set predata;
rename &var1=VALUE1 &var2=VALUE2 pct_row=PERCENT1 pct_col=PERCENT2;
NAME1="&var1";
NAME2="&var2";
run;
data final;
set final predata;
run;
%mend;
Then run the macro on my data:
%outdata (sex, ,1);
%outdata (sex,age,2);
This program will work regardless of missing values, categories, or
number of categories to produce the output that I want.
In fact, almost all of the output that I want is created simply by
adding the "out=" option to the tables statement of the proc freq. The
rest of the program is simply changing the names.
Essentially, I am hoping that Stata has the equivalent of the "OUT="
option for its svy commands, but I cannot seem to find one.
Thanks,
Charles
_______________________________________________
J. Charles Victor BSc, MSc, PhD (candidate)
Department of Public Health Sciences
University of Toronto
Toronto, Ontario
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/