Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: replacing missing time period data with next closest period
From: Nick Cox <[email protected]>
To: "'[email protected]'" <[email protected]>
Subject: st: RE: replacing missing time period data with next closest period
Date: Wed, 16 Mar 2011 16:15:03 +0000
I think you can just
1. -ipolate-
2. replace interpolations of 0.5 with missing
3. round the interpolation using -round()-.
-round()- rounds to the nearest integer, which here is just going to be 0 or 1 according to whether the interpolated value is below or above 0.5.
Two lines, I imagine, for each variable:
    bysort state (date) : ipolate policy_on_X date, gen(policy_on_X_2)
    replace policy_on_X_2 = cond(policy_on_X_2 == 0.5, ., round(policy_on_X_2))
or, inside a loop over variables:
    sort state date
    foreach v of var <whatever> {
        by state : ipolate `v' date, gen(`v'_2)
        replace `v'_2 = cond(`v'_2 == 0.5, ., round(`v'_2))
    }
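A small check on made-up data, in case it helps. The sketch below rebuilds the April/August example from the question for one hypothetical state. It sends ties to 0, as the question asks, rather than to missing. It then fills the months before the first report and after the last one, which -ipolate- leaves missing unless -epolate- is specified. The toy data and the names policy, policy_2 and negdate are assumptions for illustration only.

```stata
* Toy data (hypothetical): one state, the 12 months of 2001,
* with reports only in April (0) and August (1)
clear
set obs 12
generate state = "AL"
generate date = ym(2001, _n)
format date %tm
generate policy = .
replace policy = 0 if date == ym(2001, 4)
replace policy = 1 if date == ym(2001, 8)

* Linear interpolation between reports, within state
bysort state (date) : ipolate policy date, gen(policy_2)

* Ties (exactly 0.5) go to 0 here; everything else is rounded to 0 or 1
replace policy_2 = cond(policy_2 == 0.5, 0, round(policy_2))

* Months after the last report: carry the last value forward
bysort state (date) : replace policy_2 = policy_2[_n-1] if missing(policy_2)

* Months before the first report: carry the first value backward
generate negdate = -date
bysort state (negdate) : replace policy_2 = policy_2[_n-1] if missing(policy_2)

* Restore the usual order and inspect
sort state date
drop negdate
list state date policy policy_2, sepby(state)
```

On this toy data, January through June should come out as 0 and July through December as 1. I have avoided -epolate- for the ends because linear extrapolation can stray outside 0 and 1.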
Nick
[email protected]
Doug Hess wrote:
I'm hoping there is a set of commands that will help me edit cells
for this missing data problem.
Each row in my dataset is a month for each state (and DC) over five
years (i.e., 51 x 5 x 12 = 3,060 rows). The columns are binary values for
the reported presence of a policy that some states in some months have
implemented and others have not. Unfortunately, the months in which
states report the existence (or not) of the policy option vary by
year, and only the months when the option is reported have been coded.
Thus, a state that reports not having selected a policy option in
April 2001 and then reports adopting the policy in August 2001, with
no reports in between, would have the value of 0 for April and 1 for
August, but missing "." for the other 10 months of the year (i.e.,
missing until the next report).
I would like to fill in the months in between reports with the value
of the nearest month that has a value, and default to zero if the
missing month is equidistant from a 0 and a 1. I.e., in the example above,
assuming the policy had not changed before or after these reported
months, the cells from January through June = 0 and July through December = 1.
I'm starting to think I'll just do this by hand, but I wonder if
there's some nifty "if then...do" routine that can be created, as I
would like to add more years and do this for a large number of policy
variables.
If it matters: I'm using Stata 11 and don't have Stata programming
language experience (as opposed to issuing commands). My computer
programming skills stopped circa 1982 when I learned Basic on an Apple
IIe. So, if there's a choice between clunky commands and elegant
programming with the same results, I might go for the clunky commands.
But I am happy to learn if there's a resource I need to consult to
understand any advice or tips that are provided.