Re: st: Arranging variables across rows
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Arranging variables across rows
Date: Tue, 26 Jun 2012 23:30:59 +0100
-rowsort- is from the Stata Journal (SJ). I think you are correct: it does
not help here at all, nor does it purport to.
My main advice is to restructure to a long structure as fast as
possible. With the present wide structure, this will be only the first of
several awkward problems, and not even the most difficult of those.
I created some similar data and did this:
. list
     +-------------------------------------------------+
     |  A1    A2    A3    A4    B1    B2   B3   family |
     |-------------------------------------------------|
  1. | 101   102   103   104   102   104    .    alpha |
  2. | 201   202   203   204   203     .    .     beta |
     +-------------------------------------------------+
. keep family A*
. reshape long A, i(family)
(note: j = 1 2 3 4)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        2   ->       8
Number of variables                   5   ->       3
j variable (4 values)                     ->   _j
xij variables:
                           A1 A2 ... A4   ->   A
-----------------------------------------------------------------------------
. drop if _j == .
(0 observations deleted)
. drop _j
. gen treated = 0
. rename A person
. save Afile, replace
file Afile.dta saved
. use rowprob
. keep family B*
. reshape long B, i(family)
(note: j = 1 2 3)
Data                               wide   ->   long
-----------------------------------------------------------------------------
Number of obs.                        2   ->       6
Number of variables                   4   ->       3
j variable (3 values)                     ->   _j
xij variables:
                               B1 B2 B3   ->   B
-----------------------------------------------------------------------------
. rename B person
. drop if person == .
(3 observations deleted)
. drop _j
. gen treated = 1
. list
     +---------------------------+
     | family   person   treated |
     |---------------------------|
  1. |  alpha      102         1 |
  2. |  alpha      104         1 |
  3. |   beta      203         1 |
     +---------------------------+
. append using Afile
. collapse (max) treated, by(family person)
. list
     +---------------------------+
     | family   person   treated |
     |---------------------------|
  1. |  alpha      101         0 |
  2. |  alpha      102         1 |
  3. |  alpha      103         0 |
  4. |  alpha      104         1 |
  5. |   beta      201         0 |
     |---------------------------|
  6. |   beta      202         0 |
  7. |   beta      203         1 |
  8. |   beta      204         0 |
     +---------------------------+
Here's the code in case it's useful.
list
keep family A*
reshape long A, i(family)
* _j is never missing after -reshape long-; test A to drop empty slots
drop if A == .
drop _j
gen treated = 0
rename A person
save Afile, replace
use rowprob
keep family B*
reshape long B, i(family)
rename B person
drop if person == .
drop _j
gen treated = 1
list
append using Afile
collapse (max) treated, by(family person)
list
I think this is messier than a single -reshape- because you have
different numbers of A and B variables and they don't map on to one
another. There would be a -merge- solution as well, for sure.
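A minimal sketch of what such a -merge- route might look like, using the
example data above. It assumes that family uniquely identifies rows; the
names slot, D, wide and bfile are illustrative only. Unlike the -append-
and -collapse- route, it keeps the position of each A, so the result can be
reshaped back to wide with D1, D2, ... alongside A1, A2, ...

```stata
* Sketch only. Assumes family uniquely identifies rows; slot, D, wide and
* bfile are illustrative names. Run as one block so that the tempfile
* macros survive.
clear
input A1 A2 A3 A4 B1 B2 B3 str5 family
101 102 103 104 102 104 . "alpha"
201 202 203 204 203 . . "beta"
end
tempfile wide bfile
save `wide'

* Treated persons: one observation per (family, A)
keep family B*
reshape long B, i(family) j(slot)
drop if missing(B)
rename B A
drop slot
duplicates drop family A, force
gen byte D = 1
save `bfile'

* Family members, keeping the position (slot) of each A
use `wide', clear
keep family A*
reshape long A, i(family) j(slot)
merge m:1 family A using `bfile', keep(master match) nogenerate
replace D = 0 if missing(D)
replace D = . if missing(A)    // empty A slots get no dummy

* Back to one row per family: A1 D1 A2 D2 ...
reshape wide A D, i(family) j(slot)
list
```

With the real data the same steps should carry over to A1-A19 and B1-B8,
as -reshape- picks up the numeric suffixes itself.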
On Tue, Jun 26, 2012 at 10:37 PM, samuel gyetvay <[email protected]> wrote:
> I have two sets of variables, let's call them A1, A2, ... A19 and B1,
> B2, ... B8.
>
> A1, A2, ... A19 give identification numbers for up to 19 individuals
> per family. Each family occupies a row in the data set.
>
> B1, B2, ... B8 list identification numbers of up to 8 individuals who
> have received treatment.
>
> I need to preserve the order and placement of variables in A1, ... A19
> and would like to create a dummy variable equal to 1 whenever an
> individual has received treatment. Basically, I need to go from
> something like this:
>
> A1 A2 A3 ... A19
> 101 102 103 ... 119
>
> B1 B2 B3 ... B8
> 103 . . ... .
>
> To something like this
>
> A1 A2 A3 ... A19
> 101 102 103 ... 119
>
> D1 D2 D3 ... D19
> 0 0 1 ... 0
>
> I am aware of the command rowsort, but it does not solve this
> particular problem. rowsort would turn
>
> B1 B2 B3 ... B8
> . . 102 ... .
>
> into
>
> B1 B2 B3 ... B8
> 102 . . ... .
>
> when what I need is
>
> B1 B2 B3 ... B8
> . 102 . ... .
>
> I could create a dummy variable equal to 1 when A is equal to B
>
>
> Hopefully this question is clearly phrased, and there exists a simple
> solution. Please let me know if you have any suggestions or if
> anything is unclear.