| |
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
Re: st: Reshape problem
In this case, We have to create a routine program because it's not
standard format. Here is the example :
/* Open the datasets */
/* The example here is original.dta that contain :
+---------------------------------+
| v1 v2 v3 v4 |
|---------------------------------|
1. | Sex M M M |
2. | Age 47 66 56 |
3. | Left eye |
4. | Right eye Y Y Y |
5. | Lower eyelid Y Y Y |
6. | Upper eyelid |
7. | Lateral canthus |
8. | Medial canthus Y Y |
9. | Recurrent lesion |
10. | Primary lesion Y Y Y |
+---------------------------------+
The result of process like this :
+-----------------------------------------------------------------------------------------------------------------------------+
| Sex Age Left_eye Right_eye Lower_eyelid Upper_eyelid
Lateral_canthus Medial_canthus Recurrent_lesion Primary_lesion |
|-----------------------------------------------------------------------------------------------------------------------------|
| M 47 . Y Y .
. Y . |
| M 66 . Y Y .
. Y . Y |
| M 56 . Y Y .
. . Y |
+-----------------------------------------------------------------------------------------------------------------------------+
*/
use original,clear
/* Fill in the number of variable you have */
/* In this sample, the local numvar is having 4 variables */
/* The result of this process is destination.dta */
local numvar=4
local vartot=_N
local i = 1
while `i' <= `vartot' {
local j=1
while `j' <= `numvar' {
use original in `i',clear
local varx = v`j'
local nvar = trim("`varx'"[`j'])
local varv = "`varx'"
drop _all
if `j'==1 {
if `i'>1 {
use destination,clear
}
local varname = subinstr("`nvar'"," ","_",1)
qui gen `varname'=""
}
else {
local obs=`j'-1
use destination,clear
if `j' >= 2 {
if `i'==1 {
qui set obs `obs'
}
qui replace `varname'="`varv'" in `obs'
}
}
qui save destination,replace
local ++j
}
local ++i
}
/* In case contain numerik variables */
qui destring,replace
----- Original Message -----
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Sent: Friday, May 05, 2006 2:57 AM
Subject: RE: st: Reshape problem
>I agree with Radu that this is a problem requiring
> a transpose of the data.
>
> As Radu hints, -xpose- requires all variables
> to be numeric. But -sxpose- on SSC is a rough-and-ready
> transpose for string variables.
>
> So I had a go:
>
> . sxpose, clear force
>
> . l
>
> +-------------------------------------------------------------------------------------------------------+
> 1. | _var1 | _var2 | _var3 | _var4 | _var5 | _var6
| _var7 | _var8 |
> | 2 | 3 | 4 | 5 | 6 | 7
| 8 | 9 |
> |-------------------------------------------------------------------------------------------------------|
> | _var9 |
_var10 |
> | 10 |
11 |
> +-------------------------------------------------------------------------------------------------------+
>
> +-------------------------------------------------------------------------------------------------------+
> 2. | _var1 | _var2 | _var3 | _var4 | _var5 | _var6
| _var7 | _var8 |
> | Sex | Age | Left eye | Right eye | Lower eyelid | Upper eyelid
| Lateral canthus | Medial canthus |
> |-------------------------------------------------------------------------------------------------------|
> | _var9 |
_var10 |
> | Recurrent lesion |
Primary lesion |
> +-------------------------------------------------------------------------------------------------------+
>
> +-------------------------------------------------------------------------------------------------------+
> 3. | _var1 | _var2 | _var3 | _var4 | _var5 | _var6
| _var7 | _var8 |
> | M | 47 | | Y | Y |
| | Y |
> |-------------------------------------------------------------------------------------------------------|
> | _var9 |
_var10 |
> | |
Y |
> +-------------------------------------------------------------------------------------------------------+
>
> +-------------------------------------------------------------------------------------------------------+
> 4. | _var1 | _var2 | _var3 | _var4 | _var5 | _var6
| _var7 | _var8 |
> | M | 66 | | Y | Y |
| | Y |
> |-------------------------------------------------------------------------------------------------------|
> | _var9 |
_var10 |
> | |
Y |
> +-------------------------------------------------------------------------------------------------------+
>
> +-------------------------------------------------------------------------------------------------------+
> 5. | _var1 | _var2 | _var3 | _var4 | _var5 | _var6
| _var7 | _var8 |
> | M | 56 | | Y | Y |
| | |
> |-------------------------------------------------------------------------------------------------------|
> | _var9 |
_var10 |
> | |
Y |
> +-------------------------------------------------------------------------------------------------------+
>
> . drop in 1
> (1 observation deleted)
>
> . foreach v of var _var* {
> 2. label var `v' "`= `v'[1]'"
> 3. }
>
> . drop in 1
> (1 observation deleted)
>
> . destring, replace
> _var1 contains non-numeric characters; no replace
> _var2 has all characters numeric; replaced as byte
> _var3 has all characters numeric; replaced as byte
> (3 missing values generated)
> _var4 contains non-numeric characters; no replace
> _var5 contains non-numeric characters; no replace
> _var6 has all characters numeric; replaced as byte
> (3 missing values generated)
> _var7 has all characters numeric; replaced as byte
> (3 missing values generated)
> _var8 contains non-numeric characters; no replace
> _var9 has all characters numeric; replaced as byte
> (3 missing values generated)
> _var10 contains non-numeric characters; no replace
>
> . d
>
> Contains data from spiedel.dta
> obs: 3
> vars: 10 4 May 2006 20:39
> size: 183 (99.9% of memory free)
> -------------------------------------------------------------------------------
> storage display value
> variable name type format label variable label
> -------------------------------------------------------------------------------
> _var1 str3 %9s Sex
> _var2 byte %10.0g Age
> _var3 byte %10.0g Left eye
> _var4 str9 %9s Right eye
> _var5 str12 %12s Lower eyelid
> _var6 byte %10.0g Upper eyelid
> _var7 byte %10.0g Lateral canthus
> _var8 str14 %14s Medial canthus
> _var9 byte %10.0g Recurrent lesion
> _var10 str14 %14s Primary lesion
> -------------------------------------------------------------------------------
> Sorted by:
> Note: dataset has changed since last saved
>
> . l
>
> +--------------------------------------------------------------------------------+
> | _var1 _var2 _var3 _var4 _var5 _var6 _var7 _var8
_var9 _var10 |
> |--------------------------------------------------------------------------------|
> 1. | M 47 . Y Y . . Y
. Y |
> 2. | M 66 . Y Y . . Y
. Y |
> 3. | M 56 . Y Y . .
. Y |
> +--------------------------------------------------------------------------------+
>
>
> Naturally, I can't see what problems lurk in the rest of the data, but so
> far, so good.
>
> Nick
> [email protected]
>
> Radu Ban
>
>> i don't know if this is the best way, but you could use the
>> -xpose- command.
>>
>> but before that you need to make all you variables numerical.
>> for example:
>>
>> /*coding males as 1 females as 0, Y as 1 no(missing) as 0 */
>> /*coding age as numeric age*/
>>
>> forval i = 2/256 {
>> gen real_v`i' = 1 if v`i' == "M" | v`i' == "Y"
>> replace real_v`i' = 0 if v`i' == "F" | v`i' == ""
>> replace real_v`i' = real(v`i') in 2
>> }
>>
>> /*now you no longer really need id and v1, as long as you make note of
>> what each row means*/
>>
>> drop id v1
>>
>> /*now you can transpose*/
>> xpose, clear
>>
>> /*now you rename to get the variables name back*/
>> rename v1 sex
>> rename v2 age
>> ...
>>
>> /*now you label values*/
>> label define yesno 1 yes 2 no
>> label define gender 1 male 0 female
>> ...
>>
>> hope this helps. there are probably more elegant solutions though.
>
> Thomas Speidel
>
>> > I have a dataset that I need to re-organize in a long format.
>> > Here is a sample:
>> >
>> > +--------------------------------------+
>> > | id v1 v2 v3 v4 |
>> > |--------------------------------------|
>> > | 2 Sex M M M |
>> > | 3 Age 47 66 56 |
>> > | 4 Left eye |
>> > | 5 Right eye Y Y Y |
>> > | 6 Lower eyelid Y Y Y |
>> > |--------------------------------------|
>> > | 7 Upper eyelid |
>> > | 8 Lateral canthus |
>> > | 9 Medial canthus Y Y |
>> > | 10 Recurrent lesion |
>> > | 11 Primary lesion Y Y Y |
>> > +--------------------------------------+
>> >
>> > There are 255 observations (v2-v256) occupying the columns.
>> I need each
>> > observation to be a distinct row. Almost all of the
>> entries are string,
>> > (missing means "NO"). I am having difficulties using the
>> reshape command
>> > to achieve my goal, possibly because of the strings. Any
>> suggestion on
>> > how to approach this? Should I create a loop to encode all
>> variables?
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
>
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/