Re: st: reshaping key-value pair data
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: reshaping key-value pair data
Date: Wed, 2 Oct 2013 01:02:45 +0100
I guess you have a question: Is it possible ... [not "It is possible..."]
Here is one idea:
. clear
. input item str5 key str6 value
item key value
1. 1 color blue
2. 1 color red
3. 1 size XL
4. 2 color orange
5. 2 size S
6. end
. bysort item key (value): gen which = value[1]
. bysort item key (value): replace which = which[_n-1] + "@" + value if _n > 1
(1 real change made)
. by item key : replace which = which[_N]
(1 real change made)
. by item key : keep if _n == _N
(1 observation deleted)
. drop value
. reshape wide which, j(key) string i(item)
(note: j = color size)
Data                               long   ->   wide
-----------------------------------------------------------------------------
Number of obs.                        4   ->       2
Number of variables                   3   ->       3
j variable (2 values)               key   ->   (dropped)
xij variables:
                                  which   ->   whichcolor whichsize
-----------------------------------------------------------------------------
. split whichcolor, p(@)
variables created as string:
whichcolor1 whichcolor2
. renpfix which
. drop color
. list
     +-------------------------------+
     | item   size   color1   color2 |
     |-------------------------------|
  1. |    1     XL     blue      red |
  2. |    2      S   orange          |
     +-------------------------------+
Nick
[email protected]
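
An alternative sketch, not part of the message above: instead of concatenating values and splitting them after the reshape, number the values within each item-key group and fold that number into the j() variable. The names seq and name are made up for illustration. The sketch assumes that every key is a legal fragment of a Stata variable name, that rename with wildcards is available (Stata 12 or later), and that each item has at most one size.

```stata
clear
input item str5 key str6 value
1 color blue
1 color red
1 size  XL
2 color orange
2 size  S
end

* number the values within each item-key group, in alphabetical order
bysort item key (value): gen seq = _n

* build the future variable suffixes: color1, color2, size1
* (seq and name are hypothetical helper variables)
gen name = key + string(seq)
drop key seq

* one wide variable per distinct suffix
reshape wide value, i(item) j(name) string

* strip the stub, leaving color1 color2 size1
rename value* *

* assumption: only one size per item, so the suffix is not needed
rename size1 size

list
```

Because the counter is built with _n, this handles any number of values per key without a separator character that might also occur in the data.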
On 2 October 2013 00:31, Dimitriy V. Masterov <[email protected]> wrote:
> I have some data in an awkward key-value pair format:
>
> item  key    value
> 1     color  blue
> 1     color  red
> 1     size   XL
> 2     color  orange
> 2     size   S
>
> It is possible to reshape this data into something like this:
>
> item  color1  color2  size
> 1     blue    red     XL
> 2     orange          S
>
> The order for the values should be alphabetical, so blue before red.
>
> I tried the following:
>
> gen color = value if key=="color"
> gen size = value if key=="size"
>
> sort item key value
> collapse (firstnm) color1=color (lastnm) color2=color (firstnm) size, by(item)
>
> This mostly works, but it won't work for more than 2 values per key
> and orange appears twice for item 2.
>
> DVM
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/