Re: st: RE: AW: RE: AW: recode 9, 99, 999,..., into missing
From
Michael McCulloch <[email protected]>
To
[email protected]
Subject
Re: st: RE: AW: RE: AW: recode 9, 99, 999,..., into missing
Date
Mon, 17 May 2010 10:38:44 -0700
Thanks for the explanation!
On May 17, 2010, at 10:26 AM, Nick Cox wrote:
None of these statements is entirely correct.
In the first clause, the maximum can be less than 9.
In the second and third clauses, the test is whether the maximum
falls in certain intervals. The range of the data is otherwise not
considered.
The last clause isn't unconditional as in your paraphrase; it
applies only when r(max) exceeds 999.
Nick
[email protected]
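To see which clause fires for a given maximum, the branching can be run with -display- in place of -mvdecode-. This is only a sketch: the toy data and the variable names a, b, c, d are made up for illustration and are not from the thread. Variable a has a maximum below 9, yet it still falls under the first clause.

```stata
* Hypothetical toy data (not from the thread)
clear
input byte(a b) int(c d)
1 10 100 1000
2 20 200 2000
7 50 500 5000
end

foreach var of varlist a b c d {
    * meanonly is enough to leave the maximum in r(max)
    summarize `var', meanonly
    if r(max) <= 9 {
        display "`var': max = " r(max) ", first clause, code 9"
    }
    else if inrange(r(max), 10, 99) {
        display "`var': max = " r(max) ", second clause, code 99"
    }
    else if inrange(r(max), 100, 999) {
        display "`var': max = " r(max) ", third clause, code 999"
    }
    else {
        display "`var': max = " r(max) ", last clause, code 9999"
    }
}
```

Only the maximum decides the branch; the minimum and the rest of the distribution play no part.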
Michael McCulloch
Thanks Nick, this helped me understand the code. Am I correct then to
understand that:
. if r(max)<=9 mvdecode `var', mv(9)
means: "change all values of 9 to missing when 9 is the max of the
range"
. else if inrange(r(max),10,99) mvdecode `var', mv(99)
means: "change all values of 99 to missing when the range is 10 to
99"
. else if inrange(r(max),100,999) mvdecode `var', mv(999)
means: "change all values of 999 to missing when the range is 100 to
999"
. else mvdecode `var', mv(9999)
means: "change all values of 9999 to missing"
On May 17, 2010, at 9:58 AM, Nick Cox wrote:
99 isn't changed because there are bigger values in the same
variable. Thus, it is assumed that it does not mean missing.
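A small check of this point, using only var3 and var4 from Martin's example data: the maximum of var3 is 999, so the third clause applies and only 999 is decoded, while 99 is left alone.

```stata
* var3 and var4 copied from Martin's example data
clear
input int(var3 var4)
1 1
2 2
3 3
99 999
100 1000
101 1001
150 5000
999 9999
end

* The maximum of var3 is 999, so 99 is taken to be a real value
summarize var3, meanonly
display "var3 max = " r(max)
count if var3 == 99

* Only 999 is turned into missing in var3
mvdecode var3, mv(999)
list, noobs
```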
Michael McCulloch
In Martin's code, I noticed that:
for observation #8, var4 is changed to missing,
for observation #4, var3 is not changed to missing.
This puzzled me because they both have "999" as original value.
It also looks like values "9", "999" and "9999" are changed to
missing, but not "99".
Michael
On May 17, 2010, at 9:30 AM, Lachenbruch, Peter wrote:
Looks good to me.
Tony
Peter A. Lachenbruch
Department of Public Health
Oregon State University
Corvallis, OR 97330
Phone: 541-737-3832
FAX: 541-737-4001
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Martin Weiss
Sent: Monday, May 17, 2010 12:35 AM
To: [email protected]
Subject: AW: st: RE: AW: RE: AW: recode 9, 99, 999,..., into missing
What does the -mvdecode- solution look like then? Like this?
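*************
clear*
* Example data: 9, 99, 999 and 9999 may be codes for "missing"
inp byte(var1 var2) int(var3 var4)
1 1 1 1
2 2 2 2
3 3 3 3
4 8 99 999
5 9 100 1000
6 10 101 1001
7 11 150 5000
9 12 999 9999
end
* Choose the missing-value code from each variable's maximum
foreach var of varlist * {
    * meanonly suppresses output but still leaves r(max) behind
    summarize `var', meanonly
    if r(max)<=9 mvdecode `var', mv(9)
    else if inrange(r(max),10,99) mvdecode `var', mv(99)
    else if inrange(r(max),100,999) mvdecode `var', mv(999)
    else mvdecode `var', mv(9999)
}
list, noobs
*************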
HTH
Martin
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Steve Samuels
Sent: Monday, May 17, 2010 03:00
To: [email protected]
Subject: Re: st: RE: AW: RE: AW: recode 9, 99, 999,..., into missing
Mandy, if you know this much about each variable, I see no advantage
or necessity to your approach. -mvdecode- appears to be superior in
every way. It is not only more direct and clearer, but it will also
handle all the other "non-data" codes. Clarity is very important: other
people (and you, perhaps, in the future) will be able to understand
your Stata statements without any lengthy explanation. None of the
other solutions can claim that.
Steve
On Sun, May 16, 2010 at 8:33 PM, Amanda Fu <[email protected]>
wrote:
Dear Mr. Weiss and Mr. Lachenbruch,
I am sorry; I should have been clearer when describing my question.
In my opinion, I need to be careful about this problem: for example,
for a variable that has 10 scale points, the value 9 is a real scale
value, and 99 in that case means "not answered".
The pattern is like this:
(1) if the maximum value of a variable is smaller than 9, then
"not answered" takes the value 9;
(2) if the maximum value of a variable is smaller than 99 but
greater than 10, then "not answered" takes the value 99;
(3) if the maximum value of a variable is smaller than 999 but
greater than 100, then "not answered" takes the value 999;
and so on.
(You are absolutely right to remind me that there are values such as
7, 8, 98, or 97 that indicate "refused to answer" or "invalid
answer". Here I would like to keep the focus on the one example of
"not answered", because the other values could be dealt with in the
same way.)
Thanks for help from both of you!
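The "and so on" in the pattern above could be handled without writing one clause per digit count. The following is a minimal sketch, not anyone's posted solution: it assumes that every "not answered" code is a run of 9s (9, 99, 999, ...) and that the code is the smallest such number that is not below the variable's maximum. The local macros max and code are names chosen here for illustration; the data are Martin's example data.

```stata
clear
input byte(var1 var2) int(var3 var4)
1 1 1 1
2 2 2 2
3 3 3 3
4 8 99 999
5 9 100 1000
6 10 101 1001
7 11 150 5000
9 12 999 9999
end

foreach var of varlist * {
    summarize `var', meanonly
    * Skip variables with no nonmissing numeric values
    if r(N) == 0 continue
    local max = r(max)
    * Assumption: the code is the smallest of 9, 99, 999, ... that is >= max
    local code = 9
    while `code' < `max' {
        local code = `code' * 10 + 9
    }
    mvdecode `var', mv(`code')
}
list, noobs
```

For maxima up to 9999 this selects the same codes as the four-clause version, and it keeps going for larger maxima. Like that version, it cannot tell a real maximum of 9 or 99 from a missing-value code, so it is only as safe as the assumption about the coding scheme.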
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
Best wishes,
Michael McCulloch, LAc MPH PhD
Pine Street Foundation
124 Pine Street
San Anselmo, CA 94960-2674
tel: 415-407-1357
fax: 206-338-2391