st: RE: Problem with -reshape- and value labels
A possible solution might involve using the descsave package
(downloadable from SSC using the ssc command) to save the specifications
of variable attributes (including value labels) in a do-file before the
first of your reshape commands, and to execute this do-file after the
last of your reshape commands. Before the first of your reshape
commands, you might type
tempfile df0
descsave resp*, do(`"`df0'"', replace)
to create a do-file in the temporary file specified by `"`df0'"'. Then,
after the last of your reshape commands, you might type
run `"`df0'"'
and the variables resp1-resp6 will have the variable labels, formats,
value labels and storage types that they had in the original dataset,
following the execution of this do-file.
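Put together, the whole sequence might look something like the sketch below. It assumes that descsave has been installed from SSC, and that the variable names (seq, item, resp1-resp6) and the value label name (boolean) are those of the example in the original message. The step that modifies resp is only a placeholder. Because df0 is a local macro, these lines would need to be run together in a single do-file.

```stata
* One-time install, if descsave is not already present
ssc install descsave

* Save the attributes of resp1-resp6 as a do-file in a temporary file
tempfile df0
descsave resp*, do(`"`df0'"', replace)

* Reshape to long, work on resp, then reshape back to wide
reshape long resp, i(seq) j(item)
* ... (placeholder) do stuff to resp ...
reshape wide

* Re-apply the saved variable labels, formats, value labels and storage types
run `"`df0'"'

* Check: list the variables that carry the value label "boolean"
ds, has(vallabel boolean)
```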
I hope this helps.
Roger
Roger B Newson
Lecturer in Medical Statistics
Respiratory Epidemiology and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
Opinions expressed are those of the author, not of the institution.
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Clyde Schechter
Sent: 11 June 2008 19:13
To: [email protected]
Subject: st: Problem with -reshape- and value labels
I am having a problem whereby I start out with a data set that has a
number of variables with some different value labels. The
variables' names share a common prefix, and when I reshape the data
to long format, it seems that the value label assigned to the _last_
of the variables is carried to the new variable that equals the
common prefix. For example:
. des
Contains data
  obs:            10
 vars:             7
 size:           160 (99.9% of memory free)
------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------
seq             int    %8.0g
resp1           byte   %8.0g       boolean    1 resp
resp2           byte   %8.0g       boolean    2 resp
resp3           byte   %8.0g       boolean    3 resp
resp4           byte   %8.0g       boolean    4 resp
resp5           byte   %8.0g       boolean    5 resp
resp6           byte   %8.0g       other      6 resp
------------------------------------------------------------------------
Sorted by:  seq
. reshape long resp, i(seq) j(item)
(note: j = 1 2 3 4 5 6)
Data                               wide   ->   long
------------------------------------------------------------------------
Number of obs.                       10   ->   60
Number of variables                   7   ->   3
j variable (6 values)                     ->   item
xij variables:
                  resp1 resp2 ... resp6   ->   resp
------------------------------------------------------------------------
. des
Contains data
  obs:            60
 vars:             3
 size:           720 (99.9% of memory free)
------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------
seq             int    %8.0g
item            byte   %9.0g
resp            byte   %8.0g       other
------------------------------------------------------------------------
Sorted by:  seq  item
     Note:  dataset has changed since last saved
But the real problem arises further on:
<snip> do stuff to resp variable
<end snip>
. reshape wide
(note: j = 1 2 3 4 5 6)
Data                               long   ->   wide
------------------------------------------------------------------------
Number of obs.                       60   ->   10
Number of variables                   3   ->   7
j variable (6 values)              item   ->   (dropped)
xij variables:
                                   resp   ->   resp1 resp2 ... resp6
------------------------------------------------------------------------
. des
Contains data
  obs:            10
 vars:             7
 size:           160 (99.9% of memory free)
------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label      variable label
------------------------------------------------------------------------
seq             int    %8.0g
resp1           byte   %8.0g       other      1 resp
resp2           byte   %8.0g       other      2 resp
resp3           byte   %8.0g       other      3 resp
resp4           byte   %8.0g       other      4 resp
resp5           byte   %8.0g       other      5 resp
resp6           byte   %8.0g       other      6 resp
------------------------------------------------------------------------
Sorted by:  seq
Notice now that the value label "other" has been spread on to all of
the variables resp1-resp5 that originally had value label "boolean."
This then raises problems because I later attempt to select a group
of variables for some further analyses with:
ds, has(vallabel boolean)
which now comes up empty.
I can't get around this by just moving the resp6 variable earlier in
the data set: its unique value label gets singled out for the
long-format prefix-named variable regardless of where it physically
is in the data set. In fact, the workaround seems to be to rename
one of the "boolean" labeled variables to have a name that is
alphabetically last.
That would keep the "boolean" label from getting wiped out, but then
it results in all the variables being so labeled when I reshape back
to wide, so the -ds- command then traps variables that should be
excluded from further analysis. Is there any way to have -reshape-
restore the original labels?
(Evidently I can just relabel them by hand in this example, but the
real data set I'm working with has several dozen such variables, so
this starts to get impractical.)
I checked the -reshape- section of the manual and I find no mention
of anything about how value labels are handled.
Any help would be appreciated. Thanks in advance.
Clyde Schechter
Albert Einstein College of Medicine
Bronx, New York, USA
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/