st: Re: Repeating values
...
I think this will do what you want (assuming that you also have a variable 
named company in your dataset):
gsort company -year
by company: gen todrop=sum(Var1==Var1[1] & Var1==Var1[_n+1])
by company: replace Var1=. if todrop==_n
drop todrop
The data are sorted in reverse time (as Nick mentioned) and then todrop is 
created as a counter for the cumulative number of observations equal to the 
first (final timewise) observation and also equal to the next observation 
(prior timewise).  Then, if the observation number equals todrop, it should 
be changed to missing.
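As a quick check, here is a minimal sketch that runs the same steps on a toy 
dataset. The company names A and B and all of the values are made up for 
illustration; company A mimics the delisted example from the question and 
company B has no repeated final value.
clear
input str8 company int year double Var1
"A" 1991 0.90
"A" 1992 0.95
"A" 1993 0.93
"A" 1994 0.93
"A" 1995 0.93
"A" 1996 0.93
"B" 2004 0.40
"B" 2005 0.50
"B" 2006 0.60
end
* newest year first within each company
gsort company -year
* running count of observations that equal the newest value and the next (earlier) one
by company: gen todrop=sum(Var1==Var1[1] & Var1==Var1[_n+1])
* the count keeps pace with _n only inside the repeated run at the end of the series
by company: replace Var1=. if todrop==_n
drop todrop
sort company year
list, sepby(company)
For company A, todrop is 1, 2, 3 for 1996, 1995 and 1994 and then stays at 3, 
so only those three years are set to missing and 1993 keeps its 0.93. Company 
B is left unchanged. One caveat: a company that is still listed but happens to 
have identical values in its last two or more years would lose the later ones 
as well, so it may be worth checking that case against Status.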
Michael Blasnik
----- Original Message ----- 
From: "Thomas Erdmann" <[email protected]>
To: <[email protected]>
Sent: Thursday, December 07, 2006 11:56 AM
Subject: st: Repeating values
Hi,
I am working with some variables that are "wrong" in the sense that if one
share was taken off the market (i.e. the company was dissolved), the last
value of the variable is repeated instead of containing missing values.
e.g.
Status    Year    Var1
Listed    1991     0.9
Listed    1992     0.95
Listed    1993     0.93
Delisted  1994     0.93
Delisted  1995     0.93
..
Delisted  2006     0.93 (value is always repeated up to present time)
Whereas years 1994-2006 should contain missing values. I came up with this
cleaning process:
foreach X of varlist var1 var2 var3 {
generate `X'new=`X'
replace `X'new=. if `X'==L.`X'
replace `X'=`X'new
drop `X'new
}
Which is okay, but also sets the value to missing if one observation for a
listed company repeats, so it also deletes observations that would be 
fine.
Any suggestions on how I can only replace the "wrong" values?