Adrian de la Garza
> I need to replace some values of a variable that meet certain
> conditions. Although I know it's not possible to make
> direct references
> to rows using the -replace- command, this is what I want to do:
>
> replace status = status[`j'] if status[`j'] == "NM" |
> status[`j'] == "W"
> in `i'
This is perfectly legal, in itself, so long as the local
macro makes sense in this context, i.e. depending on
what local macro j contains. However, it's not a form
that seems needed very often.
> I tried to use a local variable to avoid using status[`j'] in the
> command line but it didn't work. Like this:
>
> local sta = status[`j']
> replace status = "`sta'" if
> status[`j']=="NM"|status[`j']=="W" in `i'
No, this just puts the _text_ in that position, not the _value_.
> You will find my whole script below.
>
> Do you know how I can get past this problem?
>
> Here I present a chunk of my dataset so that you understand
> the problem
> better. You don't need to read the rest if you already
> understood what I
> want to do from the command line above.
>
> The data is organized in a panel format by country-date.
> What I need is
> to fill in the values of 'status' whenever they are missing and they
> meet certain conditions.
>
> Suppose I start a loop that sweeps through 'status' row by row. Most
> observations in 'status' are missing and suddenly, in
> observation [i] I
> find one cell that is not missing.
> (a) If status[i] = "M"/"W"/"NM", the subsequent rows
> (observations [j],
> where j gets different values) should be "M"/"W"/"NM" whenever
> country[i]=country[j] and date1[i]=date1[j].
>
> (b) If, however, status[i] = "PC" then I need to look at the next
> non-empty cell to decide what "PC" is going to be in [i].
> PC is going to
> be converted into "M" except if a subsequent value is "W" or "NM". A
> value in row [j] is considered subsequent if [j] > [i] and
> country[i]=country[j] and date1[i]=date1[j].
>
> The data looks like this:
>
> country date date1 date2 status
> 401. | Armenia 1997m5 1995m6 1999m12 |
> 402. | Armenia 1997m6 1996m2 1999m12 PC |
> 403. | Armenia 1997m7 1996m2 1999m12 |
> 404. | Armenia 1997m8 1996m2 1999m12 |
> 405. | Armenia 1997m9 1996m2 1999m12 |
> |------------------------------------------------------|
> 406. | Armenia 1997m10 1996m2 1999m12 |
> 407. | Armenia 1997m11 1996m2 1999m12 |
> 408. | Armenia 1997m12 1996m2 1999m12 |
> 409. | Armenia 1998m1 1996m2 1999m12 |
> 410. | Armenia 1998m2 1996m2 1999m12 W |
> |------------------------------------------------------|
> 411. | Armenia 1998m3 1996m2 1999m12 |
> 412. | Armenia 1998m4 1996m2 1999m12 |
> 413. | Armenia 1998m5 1996m2 1999m12 |
> 414. | Armenia 1998m6 1996m2 1999m12 |
> 415. | Armenia 1998m7 1996m2 1999m12 |
> |------------------------------------------------------|
> 416. | Armenia 1998m8 1996m2 1999m12 |
> 417. | Armenia 1998m9 1996m2 1999m12 |
> 418. | Armenia 1998m10 1996m2 1999m12 |
> 419. | Armenia 1998m11 1996m2 1999m12 |
> 420. | Armenia 1998m12 1996m2 1999m12 PC |
> |------------------------------------------------------|
> 421. | Armenia 1999m1 1996m2 1999m12 |
> 422. | Armenia 1999m2 1996m2 1999m12 |
> 423. | Armenia 1999m3 1996m2 1999m12 |
> 424. | Armenia 1999m4 1996m2 1999m12 |
> 425. | Armenia 1999m5 1996m2 1999m12 |
> |------------------------------------------------------|
> 426. | Armenia 1999m6 1996m2 1999m12 |
> 427. | Armenia 1999m7 1996m2 1999m12 |
> 428. | Armenia 1999m8 1996m2 1999m12 |
> 429. | Armenia 1999m9 1996m2 1999m12 |
> 430. | Armenia 1999m10 1996m2 1999m12 M |
> |------------------------------------------------------|
> 431. | Armenia 1999m11 1996m2 1999m12 |
> 432. | Armenia 1999m12 1996m2 1999m12 |
> 433. | Armenia 2000m1 . . |
> 434. | Armenia 2000m2 . . |
>
> And after running my script it should look like this:
>
> country date date1 date2 status
> 401. | Armenia 1997m5 1995m6 1999m12 |
> 402. | Armenia 1997m6 1996m2 1999m12 W |
> 403. | Armenia 1997m7 1996m2 1999m12 W |
> 404. | Armenia 1997m8 1996m2 1999m12 W |
> 405. | Armenia 1997m9 1996m2 1999m12 W |
> |------------------------------------------------------|
> 406. | Armenia 1997m10 1996m2 1999m12 W |
> 407. | Armenia 1997m11 1996m2 1999m12 W |
> 408. | Armenia 1997m12 1996m2 1999m12 W |
> 409. | Armenia 1998m1 1996m2 1999m12 W |
> 410. | Armenia 1998m2 1996m2 1999m12 W |
> |------------------------------------------------------|
> 411. | Armenia 1998m3 1996m2 1999m12 W |
> 412. | Armenia 1998m4 1996m2 1999m12 W |
> 413. | Armenia 1998m5 1996m2 1999m12 W |
> 414. | Armenia 1998m6 1996m2 1999m12 W |
> 415. | Armenia 1998m7 1996m2 1999m12 W |
> |------------------------------------------------------|
> 416. | Armenia 1998m8 1996m2 1999m12 W |
> 417. | Armenia 1998m9 1996m2 1999m12 W |
> 418. | Armenia 1998m10 1996m2 1999m12 W |
> 419. | Armenia 1998m11 1996m2 1999m12 W |
> 420. | Armenia 1998m12 1996m2 1999m12 M |
> |------------------------------------------------------|
> 421. | Armenia 1999m1 1996m2 1999m12 M |
> 422. | Armenia 1999m2 1996m2 1999m12 M |
> 423. | Armenia 1999m3 1996m2 1999m12 M |
> 424. | Armenia 1999m4 1996m2 1999m12 M |
> 425. | Armenia 1999m5 1996m2 1999m12 M |
> |------------------------------------------------------|
> 426. | Armenia 1999m6 1996m2 1999m12 M |
> 427. | Armenia 1999m7 1996m2 1999m12 M |
> 428. | Armenia 1999m8 1996m2 1999m12 M |
> 429. | Armenia 1999m9 1996m2 1999m12 M |
> 430. | Armenia 1999m10 1996m2 1999m12 M |
> |------------------------------------------------------|
> 431. | Armenia 1999m11 1996m2 1999m12 M |
> 432. | Armenia 1999m12 1996m2 1999m12 M |
> 433. | Armenia 2000m1 . . |
> 434. | Armenia 2000m2 . . |
>
> This is my script and I don't know what the problem is:
>
> local m = _N
> forv i = 1/`m' {
> if status[`i'] != "" {
> if status[`i'] == "PC" {
> local j = `i'+1
> while status[`j'] == "" &
> country[`j']==country[`i'] & date1[`i']==date1[`j'] {
> local sta = status[`j']
> replace status = "`sta'" if
> status[`j']=="NM"|status[`j']=="W" in `i'
> replace status = "M" if
> status[`j']=="M"|(date1[`j']!=date1[`i'] &
> country[`j']!=country[`i'] in
> `i'
> }
> }
> else {
> local j = `i'+1
> local k = 0
> while status[`j'] == "" &
> country[`i']==country[`j'] & date1[`i']==date1[`j'] {
> local sta = status[`i']
> replace status =
> "`sta'" in `j'
> local j = `j'+1
> local k = `k'+1
> }
> }
> local i = `i'+`k'
> }
> }
The spirit is willing, but the flesh is weak,
especially Friday evening (local time),
so I am not going to try and work out what's wrong
with this code. But broadly, I'll assert that
loops are only rarely necessary in data management.
There is guidance on similar problems in the
FAQ at
How can I replace missing values with previous
or following nonmissing values?
http://www.stata.com/support/faqs/data/missing.html
I'll probably misunderstand some details here,
but something like this should help.
Case (a) is covered by this, I think:
bysort country date1 (date) : replace status = status[_n-1]
if mi(status) & status[_n-1] != "PC"
Case (b) is worse, but tractable. I think you need some preparation
gen negdate = -date
gen next = status
bysort country date1 (negdate) :
replace next = status[_n-1] if mi(next)
bysort country date1 (date) :
replace status = next
if status == "PC" & (next == "W" | next == "NM")
by country date1 :
replace status = "M" if status == "PC"
by country date1 : replace status = status[_n-1] if mi(status)
Note some features here:
* The constraint that things are done for the same
-country- and the same value of -date1- are enforced
within the framework of -by:-.
* Reversing time is the natural way of looking ahead,
given that Stata is better at looking behind, because
it knows what it has just done, but what it is going
to do next.
Again, do consider the FAQ.
Nick
[email protected]
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/