Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Creating an id variable from one of each (string) observations in 6 variables
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Date: Thu, 20 Mar 2014 10:17:38 +0000

This could mean various things.
I recommend against the word "unique" here. Even permissive
dictionaries and style guides take "unique" to mean occurring once only.
Despite that, people using software often use "unique" to mean
"distinct"; I'd argue for the latter word in that case.
My guess is that some of this usage can be attributed to the Unix
command -uniq-, which collapses adjacent repeated lines, so that in
sorted input each value ends up occurring just once.
For more on this point, and more positively a bundle of related ideas, see

SJ-8-4  dm0042 . . . . . . . . . . . . Speaking Stata: Distinct observations
        (help distinct if installed) . . . . . . N. J. Cox and G. M. Longton
        Q4/08   SJ 8(4):557--568
        shows how to answer questions about distinct observations
        from first principles; provides a convenience command

http://www.stata-journal.com/sjpdf.html?articlenum=dm0042

and the -distinct- command it introduces.
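
As a rough illustration of the first-principles idea (a sketch only, not code taken from the column): tag one observation for each distinct value, then count the tags. The variable name ll1994 is an assumption; the quoted listing may spell it with capital I's, and Stata names are case-sensitive. The new variable names first and tag1994 are made up here.

```stata
* Sketch: count distinct non-missing values of one string variable.
* ll1994 is an assumed variable name; adjust it to your data.
bysort ll1994: gen byte first = (_n == 1) & !missing(ll1994)
count if first

* The same tagging via -egen-; by default tag() does not tag missing values.
egen byte tag1994 = tag(ll1994)
count if tag1994
```

Both counts should agree, since each tags exactly one observation per distinct non-missing value.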
Also, see -groups- (SSC), the tabulation command -tabm- in -tab_chi-
(SSC), -mrtab- (SJ) and the -egen- function -group()-. For example,

    egen which = group(ll????)

will give observations with identical responses to all six questions
the same identifier value.
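
For the per-period counts asked about in the original question, one possible route (a sketch under assumptions, not the only way) is to stack the six variables into long form and count each answer within each year. It assumes the variables are strings named ll1991, ll1994, ll1998, ll2002, ll2006 and ll2010, and that missing answers are stored as empty strings rather than a literal "."; the names obsno, year, answer_id and n are made up here.

```stata
* Assumed: string variables ll1991 ll1994 ll1998 ll2002 ll2006 ll2010.
gen long obsno = _n                  // row identifier needed by -reshape-
reshape long ll, i(obsno) j(year)    // one row per original row and year
drop if missing(ll)                  // add | ll == "." if "." is stored literally

* One identifier per distinct answer, pooled over all six periods.
egen long answer_id = group(ll), label

* Frequency of each distinct answer in each period.
contract answer_id ll year, freq(n)
list answer_id ll year n, sepby(answer_id)
```

On the wide data, note that -egen, group()- by default returns a missing identifier for any observation with a missing value on any of the variables; adding the missing option treats missing as a category of its own.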
Nick
On 20 March 2014 10:05, Jonas Klarin <[email protected]> wrote:
> Dear all,
>
> I have election survey data from six points in time. At each point in time, a couple of thousand different people answered a question. The answers are stored as text in six string variables. I would like to create an id variable containing each unique answer from the six variables and then count the number of times each unique answer is recorded for every time period. In other words, I would like to know how many times the respondents replied, e.g., Sysselsä for every time period (variable). Can someone help me with this?
>
> The data looks like this (IIxxxx are the variable names):
>
> II1994    II1998        II2002        II2006        II2010        II1991
> Sysselsä  Sysselsä      .             Sysselsä      .             .
> Familjep  Kulturfrågor  .             .             .             .
> Sveriges  Sjukvård      Hälso- o      Miljö/miljöv  .             .
> .         .             Sjukvård/sju  .             .             .
> .         .             Sjukvård      Äldrevår      Skatter/arbe  .
> …         …             …             …             …             …
>
> etc..
>
>
> Kind regards,
> Jonas Klarin
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/