st: Re: data transformation
From: "Joseph Coveney" <[email protected]>
To: <[email protected]>
Subject: st: Re: data transformation
Date: Mon, 4 Mar 2013 12:35:03 +0900
Seok-Woo Kwon wrote:
I do not know how to describe my problem in general terms. So let me
use an example to describe it. (I am using Stata 12 for Windows.)
I have about 10,000 observations and 1,000 variables on individuals,
but 4 observations and 3 variables will be enough to show the problem.
My data looks like this:
id X Y Z
-----------------------
001 1 2 X
002 2 3 Y
003 3 5 2
004 4 6 Y
-----------------------
I need data that looks like this:
id X Y Z Z_transformed
--------------------------------------------
001 1 2 X 1
002 2 3 Y 3
003 3 5 2 2
004 4 6 Y 6
--------------------------------------------
That is, some values of the variable "Z" are variable names (like
"X" or "Y"). For those observations, I would like to replace the name
with the value that the named variable takes in the same observation.
For example, the value of Z for id 001 is X. Instead of X, I would
like to show the value of X for id 001 (which is 1). Is there a way to
program this in Stata?
--------------------------------------------------------------------------------
Yes. See below. You might want to look upstream to see whether there is a way
to get the data the way you need it in the first place.
Joseph Coveney
. clear *
. set more off
. input str3 id byte (X Y) str1 Z
id X Y Z
1. "001" 1 2 "X"
2. "002" 2 3 "Y"
3. "003" 3 5 "2"
4. "004" 4 6 "Y"
5. end
.
. *
. * Save the original X and Y for the final listing
. *
. preserve
. drop Z
. tempfile tmpfil0
. quietly save `tmpfil0'
. restore
.
. *
. * Segregate data with nonnumeric values of Z
. *
. preserve
. quietly drop if missing(real(Z))
. keep id Z
. quietly destring Z, generate(Z_transformed)
. tempfile tmpfil1
. quietly save `tmpfil1'
. restore
. quietly keep if missing(real(Z))
.
. *
. * Reshape target variables long
. *
. preserve
. keep id Z
. tempfile tmpfil2
. quietly save `tmpfil2'
. restore
. drop Z
. foreach var of varlist _all {
2. if "`var'" == "id" continue
3. rename `var' Z_transformed`var'
4. }
. quietly reshape long Z_transformed, i(id) j(Z) string
.
. *
. * Let -merge- select the values
. *
. merge m:1 id Z using `tmpfil2', assert(match master) keep(match) ///
> nogenerate noreport
.
. *
. * Reassembly
. *
. append using `tmpfil1'
.
. *
. * Bring back X and Y, then tidy up for display
. *
. merge 1:1 id using `tmpfil0', assert(match) nogenerate noreport
. order id X Y Z
. sort id
. list, noobs separator(0) abbreviate(20)
+---------------------------------+
| id X Y Z Z_transformed |
|---------------------------------|
| 001 1 2 X 1 |
| 002 2 3 Y 3 |
| 003 3 5 2 2 |
| 004 4 6 Y 6 |
+---------------------------------+
.
. exit
end of do-file
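
A more direct alternative, which is not part of the log above, is to loop over the candidate variables and fill in Z_transformed wherever Z names the current variable. The sketch below is only one way to do it and rests on some assumptions: Z is a string variable, every variable that Z can name is numeric, and the identifier is called id as in the example. The variable name Z_transformed follows the original question; everything else is illustrative.

```stata
clear
input str3 id byte(X Y) str1 Z
"001" 1 2 "X"
"002" 2 3 "Y"
"003" 3 5 "2"
"004" 4 6 "Y"
end

* Start from the values of Z that are already numeric;
* real() returns missing for values such as "X" or "Y"
generate double Z_transformed = real(Z)

* Collect every variable except id, Z and the new variable
ds id Z Z_transformed, not

* Where Z holds the name of a numeric variable, copy that variable's value
foreach var of varlist `r(varlist)' {
    capture confirm numeric variable `var'
    if _rc continue
    quietly replace Z_transformed = `var' if Z == "`var'"
}

list, noobs separator(0) abbreviate(20)
```

With about 1,000 variables this runs one -replace- per variable, which avoids the temporary files, -reshape- and -merge- steps at the cost of a longer loop. Any value of Z that is neither a number nor the name of a numeric variable is left as missing in Z_transformed, so it may be worth checking for missing values afterwards.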