Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: RE: Check for coding of new variables using data patterns
From: "Thirukumaran, Caroline Pinto" <[email protected]>
To: [email protected]
Subject: Re: st: RE: Check for coding of new variables using data patterns
Date: Tue, 22 Oct 2013 15:19:36 -0400
Joe, I'm sorry about not mentioning the -collapse- code I used. Here it is:
. collapse (first) var1 var2 var3 var4 var5, by(newvar)
I tried several -collapse- statistics, including count and first,
but none of them gave me the output I needed.
The -groups- command suggested by David and the code provided by
Sergiy work perfectly! Thank you.
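
For anyone reading the archive who does not have the user-written -groups- command installed, a rough sketch using only official Stata commands might look like the following. It is not the code posted by David or Sergiy. It assumes the variables are named newvar and var1-var5 as in my original question; "check" and "Frequency" are illustrative names I have made up here.

```stata
* Sketch only; assumes newvar and var1-var5 exist as in the original question.
* "check" and "Frequency" are hypothetical names chosen for illustration.

* 1. Confirm that newvar equals the row sum of var1-var5.
*    rowtotal() treats missing values as 0.
egen check = rowtotal(var1 var2 var3 var4 var5)
assert newvar == check
drop check

* 2. List every observed pattern with its frequency, leaving the data intact.
*    -contract- keeps patterns containing missing values unless nomiss is given.
preserve
contract newvar var1 var2 var3 var4 var5, freq(Frequency)
list, noobs sepby(newvar)
restore
```

The -assert- stops with an error if any observation fails the check, and the -contract- listing is roughly what the SAS "/ list missing" output shows: one row per pattern plus a count.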
On Tue, Oct 22, 2013 at 2:48 PM, Joe Canner <[email protected]> wrote:
> Caroline,
>
> Yes, it would be nice to have something in Stata like the SAS "/ list" option. In the meantime, you don't say what exactly you did with -collapse- so it's hard to say why that doesn't work. If you have a variable (or can create one) that is nonmissing when var1-var5 are nonmissing you could do:
>
> . collapse (count) somevar, by (newvar var1-var5)
>
> Regards,
> Joe Canner
> Johns Hopkins University School of Medicine
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Thirukumaran, Caroline Pinto
> Sent: Tuesday, October 22, 2013 2:19 PM
> To: [email protected]
> Subject: st: Check for coding of new variables using data patterns
>
> Hi,
> Is there a way to check the coding of a newly created variable against the data patterns of the variables that were used to create it?
>
> As an example, newvar is a variable that has been created based on values of var1-var5. The code used for creating the newvar variable is as follows:
> egen newvar=rsum(var1 var2 var3 var4 var5)
>
> To check that newvar has been correctly coded, it would be helpful to have an output like the one below:
>
> newvar  var1  var2  var3  var4  var5  Frequency
> 0       0     0     0     0     0     10
> 1       0     0     0     0     1     20
> 1       0     1     0     0     0     40
> 2       0     0     1     1     0     70
> 2       0     1     1     0     0     80
> 3       2     0     1     0     0     110
> 3       1     0     1     0     1     120
> 4       1     0     1     2     0     130
>
> -collapse- gives acceptable output (though without the frequency count) only when var1-var5 are binary.
>
> I am using Stata 12.1 for Windows.
>
> I get the output tabulated above from SAS using the following code:
> proc freq data=abc;
>   tables newvar*var1*var2*var3*var4*var5 / list missing;
> run;
>
> Many thanks in advance,
> Caroline Thirukumaran
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/