Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: stacking unique values of several variables under one new variable
From
James Bernard <[email protected]>
To
[email protected]
Subject
Re: st: stacking unique values of several variables under one new variable
Date
Mon, 25 Feb 2013 21:20:48 +0800
Thanks a lot. Helpful as usual.
On Mon, Feb 25, 2013 at 4:44 PM, Nick Cox <[email protected]> wrote:
> For "unique" read "distinct".
>
> My code is very similar to Maarten's but I will post it nevertheless.
>
> If it's as simple as your example implies then you can do this:
>
> . gen long obs = _n
>
> . split technology , p(,)
> variables created as string:
> technology1 technology2
>
> . local k = r(nvars)
>
> . expand `k'
> (4 observations created)
>
> . forval j = 1/`k' {
> 2. bysort obs : replace technology = technology`j'[1] if _n == `j'
> 3. }
> (2 real changes made)
> (4 real changes made)
>
> . drop if missing(technology)
> (2 observations deleted)
>
> . replace technology = trim(technology)
> (2 real changes made)
>
> . drop technology?
>
> . duplicates drop technology, force
>
> Duplicates in terms of technology
>
> (1 observation deleted)
>
> . list
>
> +-------------------+
> | technology obs |
> |-------------------|
> 1. | Monoclonals 1 |
> 2. | Vaccines 2 |
> 3. | Adjuvant 3 |
> 4. | Vaccine 3 |
> 5. | Combinchem 4 |
> +-------------------+
>
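> Here's the code in one block:
>
> * tag each original observation
> gen long obs = _n
> * split on commas into technology1, technology2, ...
> split technology , p(,)
> local k = r(nvars)
> * make k copies of each observation, one per possible part
> expand `k'
> * within each original observation, copy j takes part j
> forval j = 1/`k' {
> bysort obs : replace technology = technology`j'[1] if _n == `j'
> }
> * drop copies that have no corresponding part
> drop if missing(technology)
> * remove any spaces left around the separator
> replace technology = trim(technology)
> drop technology?
> * keep one observation per distinct value
> duplicates drop technology, force
> list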
>
> Notes: Treating "Vaccines" and "Vaccine", and any similar variants, as
> the same value will need extra code.
>
> Maarten's code assumes that the separator is always ", ". I don't
> assume that there is always a space, so I have to trim spaces
> afterwards.
>
> Nick
>
> On Mon, Feb 25, 2013 at 6:15 AM, James Bernard <[email protected]> wrote:
>
>> I have been struggling with the following. I would appreciate your help.
>>
>> I have a variable ("Technology") that indicates the type(s) of technology
>> for each record. I want to aggregate the unique values of this
>> variable under one new variable, say, called "Type":
>>
>>
>> Technology
>> -------------------------
>> Monoclonals
>> Vaccines
>> Adjuvant, Vaccine
>> Combinchem, Monoclonals
>>
>> Now, I want to create a variable that stores the unique values:
>>
>> Type
>> -----------
>> Monoclonals
>> Vaccines
>> Adjuvant
>> Combinchem
>>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/