| |
[Date Prev][Date Next][Thread Prev][Thread Next][Date index][Thread index]
RE: st: Reshape problem
Hendratno is replying to my solution, but
does not comment on it.
Here is my code again, but cut to
the essentials and revised:
sxpose, clear force destring
drop in 1
foreach v of var _var* {
label var `v' "`= `v'[1]'"
}
drop in 1
Among other details, I note that this
is shorter.
Nick
[email protected]
[email protected]
> In this case, We have to create a routine program because it's not
> standard format. Here is the example :
>
> /* Open the datasets */
> /* The example here is original.dta that contain :
> +---------------------------------+
> | v1 v2 v3 v4 |
> |---------------------------------|
> 1. | Sex M M M |
> 2. | Age 47 66 56 |
> 3. | Left eye |
> 4. | Right eye Y Y Y |
> 5. | Lower eyelid Y Y Y |
> 6. | Upper eyelid |
> 7. | Lateral canthus |
> 8. | Medial canthus Y Y |
> 9. | Recurrent lesion |
> 10. | Primary lesion Y Y Y |
> +---------------------------------+
>
> The result of process like this :
> +-------------------------------------------------------------
> ----------------------------------------------------------------+
> | Sex Age Left_eye Right_eye Lower_eyelid Upper_eyelid
> Lateral_canthus Medial_canthus Recurrent_lesion Primary_lesion |
> |-------------------------------------------------------------
> ----------------------------------------------------------------|
> | M 47 . Y Y .
>
> . Y . |
> | M 66 . Y Y .
>
> . Y . Y |
> | M 56 . Y Y .
>
> . . Y |
> +-------------------------------------------------------------
> ----------------------------------------------------------------+
>
> */
>
> use original,clear
>
> /* Fill in the number of variable you have */
> /* In this sample, the local numvar is having 4 variables */
> /* The result of this process is destination.dta */
>
> local numvar=4
>
> local vartot=_N
> local i = 1
>
> while `i' <= `vartot' {
> local j=1
>
> while `j' <= `numvar' {
> use original in `i',clear
> local varx = v`j'
> local nvar = trim("`varx'"[`j'])
> local varv = "`varx'"
>
> drop _all
>
> if `j'==1 {
> if `i'>1 {
> use destination,clear
> }
> local varname = subinstr("`nvar'"," ","_",1)
> qui gen `varname'=""
> }
> else {
> local obs=`j'-1
> use destination,clear
> if `j' >= 2 {
> if `i'==1 {
> qui set obs `obs'
> }
> qui replace `varname'="`varv'" in `obs'
> }
> }
> qui save destination,replace
> local ++j
> }
> local ++i
> }
>
> /* In case contain numerik variables */
> qui destring,replace
Nick Cox
> >I agree with Radu that this is a problem requiring
> > a transpose of the data.
> >
> > As Radu hints, -xpose- requires all variables
> > to be numeric. But -sxpose- on SSC is a rough-and-ready
> > transpose for string variables.
> >
> > So I had a go:
> >
> > . sxpose, clear force
> >
> > . l
> >
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> > 1. | _var1 | _var2 | _var3 | _var4 | _var5 |
> _var6
> | _var7 | _var8 |
> > | 2 | 3 | 4 | 5 | 6 |
> 7
> | 8 | 9 |
> >
> |-------------------------------------------------------------
> ------------------------------------------|
> > | _var9 |
>
> _var10 |
> > | 10 |
>
> 11 |
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> > 2. | _var1 | _var2 | _var3 | _var4 | _var5 |
> _var6
> | _var7 | _var8 |
> > | Sex | Age | Left eye | Right eye | Lower eyelid |
> Upper eyelid
> | Lateral canthus | Medial canthus |
> >
> |-------------------------------------------------------------
> ------------------------------------------|
> > | _var9 |
>
> _var10 |
> > | Recurrent lesion |
>
> Primary lesion |
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> > 3. | _var1 | _var2 | _var3 | _var4 | _var5 |
> _var6
> | _var7 | _var8 |
> > | M | 47 | | Y | Y |
>
> | | Y |
> >
> |-------------------------------------------------------------
> ------------------------------------------|
> > | _var9 |
>
> _var10 |
> > | |
>
> Y |
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> > 4. | _var1 | _var2 | _var3 | _var4 | _var5 |
> _var6
> | _var7 | _var8 |
> > | M | 66 | | Y | Y |
>
> | | Y |
> >
> |-------------------------------------------------------------
> ------------------------------------------|
> > | _var9 |
>
> _var10 |
> > | |
>
> Y |
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> > 5. | _var1 | _var2 | _var3 | _var4 | _var5 |
> _var6
> | _var7 | _var8 |
> > | M | 56 | | Y | Y |
>
> | | |
> >
> |-------------------------------------------------------------
> ------------------------------------------|
> > | _var9 |
>
> _var10 |
> > | |
>
> Y |
> >
> +-------------------------------------------------------------
> ------------------------------------------+
> >
> > . drop in 1
> > (1 observation deleted)
> >
> > . foreach v of var _var* {
> > 2. label var `v' "`= `v'[1]'"
> > 3. }
> >
> > . drop in 1
> > (1 observation deleted)
> >
> > . destring, replace
> > _var1 contains non-numeric characters; no replace
> > _var2 has all characters numeric; replaced as byte
> > _var3 has all characters numeric; replaced as byte
> > (3 missing values generated)
> > _var4 contains non-numeric characters; no replace
> > _var5 contains non-numeric characters; no replace
> > _var6 has all characters numeric; replaced as byte
> > (3 missing values generated)
> > _var7 has all characters numeric; replaced as byte
> > (3 missing values generated)
> > _var8 contains non-numeric characters; no replace
> > _var9 has all characters numeric; replaced as byte
> > (3 missing values generated)
> > _var10 contains non-numeric characters; no replace
> >
> > . d
> >
> > Contains data from spiedel.dta
> > obs: 3
> > vars: 10 4 May 2006 20:39
> > size: 183 (99.9% of memory free)
> >
> --------------------------------------------------------------
> -----------------
> > storage display value
> > variable name type format label variable label
> >
> --------------------------------------------------------------
> -----------------
> > _var1 str3 %9s Sex
> > _var2 byte %10.0g Age
> > _var3 byte %10.0g Left eye
> > _var4 str9 %9s Right eye
> > _var5 str12 %12s Lower eyelid
> > _var6 byte %10.0g Upper eyelid
> > _var7 byte %10.0g Lateral canthus
> > _var8 str14 %14s Medial canthus
> > _var9 byte %10.0g Recurrent lesion
> > _var10 str14 %14s Primary lesion
> >
> --------------------------------------------------------------
> -----------------
> > Sorted by:
> > Note: dataset has changed since last saved
> >
> > . l
> >
> >
> +-------------------------------------------------------------
> -------------------+
> > | _var1 _var2 _var3 _var4 _var5 _var6 _var7
> _var8
> _var9 _var10 |
> >
> |-------------------------------------------------------------
> -------------------|
> > 1. | M 47 . Y Y . .
> Y
> . Y |
> > 2. | M 66 . Y Y . .
> Y
> . Y |
> > 3. | M 56 . Y Y . .
>
> . Y |
> >
> +-------------------------------------------------------------
> -------------------+
> >
> >
> > Naturally, I can't see what problems lurk in the rest of
> the data, but so
> > far, so good.
> >
> > Nick
> > [email protected]
> >
> > Radu Ban
> >
> >> i don't know if this is the best way, but you could use the
> >> -xpose- command.
> >>
> >> but before that you need to make all you variables numerical.
> >> for example:
> >>
> >> /*coding males as 1 females as 0, Y as 1 no(missing) as 0 */
> >> /*coding age as numeric age*/
> >>
> >> forval i = 2/256 {
> >> gen real_v`i' = 1 if v`i' == "M" | v`i' == "Y"
> >> replace real_v`i' = 0 if v`i' == "F" | v`i' == ""
> >> replace real_v`i' = real(v`i') in 2
> >> }
> >>
> >> /*now you no longer really need id and v1, as long as you
> make note of
> >> what each row means*/
> >>
> >> drop id v1
> >>
> >> /*now you can transpose*/
> >> xpose, clear
> >>
> >> /*now you rename to get the variables name back*/
> >> rename v1 sex
> >> rename v2 age
> >> ...
> >>
> >> /*now you label values*/
> >> label define yesno 1 yes 2 no
> >> label define gender 1 male 0 female
> >> ...
> >>
> >> hope this helps. there are probably more elegant solutions though.
> >
> > Thomas Speidel
> >
> >> > I have a dataset that I need to re-organize in a long format.
> >> > Here is a sample:
> >> >
> >> > +--------------------------------------+
> >> > | id v1 v2 v3 v4 |
> >> > |--------------------------------------|
> >> > | 2 Sex M M M |
> >> > | 3 Age 47 66 56 |
> >> > | 4 Left eye |
> >> > | 5 Right eye Y Y Y |
> >> > | 6 Lower eyelid Y Y Y |
> >> > |--------------------------------------|
> >> > | 7 Upper eyelid |
> >> > | 8 Lateral canthus |
> >> > | 9 Medial canthus Y Y |
> >> > | 10 Recurrent lesion |
> >> > | 11 Primary lesion Y Y Y |
> >> > +--------------------------------------+
> >> >
> >> > There are 255 observations (v2-v256) occupying the columns.
> >> I need each
> >> > observation to be a distinct row. Almost all of the
> >> entries are string,
> >> > (missing means "NO"). I am having difficulties using the
> >> reshape command
> >> > to achieve my goal, possibly because of the strings. Any
> >> suggestion on
> >> > how to approach this? Should I create a loop to encode all
> >> variables?
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/