Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Re: Combining multiple observations into one observation with multiple variables
From: Sven-Oliver Spieß <[email protected]>
To: [email protected]
Subject: Re: st: Re: Combining multiple observations into one observation with multiple variables
Date: Wed, 30 Jun 2010 10:57:25 +0200
Hi Conor
Generally 'reshape' would do that. Does the following example point in
the right direction?
===================================
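* Start from an empty dataset so the example is self-contained
clear
input id char
1 1
1 3
1 7
1 11
2 1
2 8
3 2
3 7
3 13
end
* Number the characteristics within each household: 1, 2, 3, ...
bysort id (char): gen count = _n
* One row per household, with variables char1, char2, ...
reshape wide char, i(id) j(count)
list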
===================================
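Note that the reshape above gives char1, char2, ... holding the characteristic codes themselves, not 0/1 dummies. If true dummies are wanted, one possible route is sketched below. It is only a sketch under assumptions: the variable names has1, has2, ... and indiv are made up for illustration, the individual-level data are typed in from the Dataset 1 table, and the merge m:1 syntax assumes Stata 11 or later.

```stata
* Household characteristics in long form (same example data as above)
clear
input id char
1 1
1 3
1 7
1 11
2 1
2 8
3 2
3 7
3 13
end

* One 0/1 indicator per characteristic code: has1, has2, ... (hypothetical names)
levelsof char, local(codes)
foreach c of local codes {
    generate byte has`c' = (char == `c')
}

* One row per household: the max of an indicator is 1 if any row had that code
collapse (max) has*, by(id)

tempfile hh
save `hh'

* Individuals within households (Dataset 1; "indiv" is a hypothetical name)
clear
input id indiv
1 1
1 2
1 3
2 1
2 2
3 1
3 2
end

* Many-to-one merge on household id (Stata 11 or later syntax)
merge m:1 id using `hh'
list, sepby(id)
```

Since the household file now has exactly one row per id, each individual should keep a single row and simply pick up the household's dummies. Run it as a do-file so the tempfile macro stays defined.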
Best,
Sven-Oliver
On Wed, Jun 30, 2010 at 09:06, Conor Hughes <[email protected]> wrote:
> Sorry, my tables got smushed:
> Dataset 1
> ----------------------------------------
> household id | individual id
> ----------------------------------------
> 1 | 1
> 1 | 2
> 1 | 3
> 2 | 1
> 2 | 2
> 3 | 1
> 3 | 2
>
> Dataset 2
> -----------------------------------------------------------
> household id | household characteristic id
> ------------------------------------------------------------
> 1 | 1
> 1 | 3
> 1 | 7
> 1 | 11
> 2 | 1
> 2 | 8
> 3 | 2
> 3 | 7
> 3 | 13
>
>
> On Wed, Jun 30, 2010 at 1:40 PM, Conor Hughes <[email protected]> wrote:
>> Hi All,
>> I have a couple of survey datasets that I need to merge, but they're
>> organized in an inconvenient way. The first is organized by
>> household, and individuals within the household. The second is only
>> organized by household. I'd like to do a many-to-one merge on
>> household, so as to preserve the individual id's. However, in the
>> second dataset, rather than adding household characteristics as
>> variables, it adds them as observations, e.g.:
>>
>> Dataset 1
>> -------------------------------------
>> household id | individual id
>> -------------------------------------
>> 1 | 1
>> 1 | 2
>> 1 | 3
>> 2 | 1
>> 2 | 2
>> 3 | 1
>> 3 | 2
>>
>> Dataset 2
>> -----------------------------------------------------------
>> household id | household characteristic id
>> -----------------------------------------------------------
>> 1 | 1
>> 1 | 3
>> 1 | 7
>> 1 | 11
>> 2 | 1
>> 2 | 8
>> 3 | 2
>> 3 | 7
>> 3 | 13
>>
>> I'd prefer, in the second dataset, to have one observation for each
>> household, including household characteristics as dummy variables. As
>> it is, the only way to get them together is via many-to-many merge,
>> which is foolish and doesn't work well, giving an output like
>> -------------------------------------------------------------------------------
>> household id | individual id | household characteristic id
>> -------------------------------------------------------------------------------
>> 1 | 1 | 1
>> 1 | 2 | 3
>> 1 | 3 | 7
>> 1 | 3 | 11
>> 2 | 1 | 1
>> 2 | 2 | 8
>> 3 | 1 | 2
>> 3 | 2 | 7
>> 3 | 2 | 13
>> Which messes up the first dataset, since it creates repeat
>> observations of individuals. Is there a graceful way of changing
>> the multiple observations per household in the second dataset to one
>> observation per household with characteristics represented as dummy
>> variables? Any help would be greatly appreciated. And please let me
>> know if I've described the situation poorly and you'd like
>> clarification.
>>
>> Cheers,
>> Conor
>>
>
>
>
> --
> Conor Hughes
> Mathematics and Economics
> University of Chicago 2011
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>