Re: st: backfill missing data
From: David Kantor <[email protected]>
To: [email protected]
Date: Tue, 24 Aug 2010 12:14:18 -0400
At 11:55 AM 8/24/2010, David Torres wrote:
I'm working with longitudinal data (12 rounds of info collected so
far) and need to backfill information for respondents who were not
interviewed in a given year subsequent to round 1.  Information on my
variables of interest, when not collected in a round due to
noninterview, can be gathered in the next round in which respondents
are interviewed.  I'd like to carry that information back so that it
fills in the missing cells in the year and job number to which it
should apply.
I've concatenated unformatted date variables for each year and job
number so that start and finish dates for a job are carried back
together.  Every pair of numbers, then, including the space in
between, represents a start and finish date.  All dates here, though
for example purposes only, are year specific.  An example of what I
have, then, is:
pubid  stfin1_1998  stfin2_1998  stfin1_1999  stfin2_1999  stfin1_2000  stfin2_2000
1      13901 14200  14100 14200  . .          . .          14247 14590  . .
2      13890 14198  . .          . .          . .          14310 14525  . .
3      . .          . .          . .          . .          14000 14208  14311 14915
4      . .          . .          13883 14650  14351 14600  14635 14900  . .
For pubid 1, the value in stfin1_2000 would be copied to stfin1_1999
because it applies to that year.  The same goes for pubid 2.  In pubid 3,
stfin1_2000 should be copied to stfin1_1998 as it applies to that
year; stfin2_2000 should be copied to stfin1_1999 since it applies to
that year.  In pubid 4, stfin1_1999 should be copied to stfin1_1998.
I only mean to copy follow-up year information to cells for which
current year information is missing, or ". ."
Is there an easy way to do this across several years and job numbers
at the same time?  Perhaps using a foreach command?
I recommend reshaping to long, though it may be complicated by the
need to put the stfin1_ and stfin2_ variables into the same series.
You may need to do something clever to make that happen.
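One possible way to get both the year and the job number into long form is to reshape twice. This is only a sketch on made-up toy data: the values, the `str11` width, and the new index names `year` and `job` are assumptions, and it presumes the concatenated start/finish pairs are stored as strings.

```stata
* Toy wide data (hypothetical values); each cell is a "start finish" string
clear
input pubid str11 stfin1_1998 str11 stfin2_1998 str11 stfin1_1999 str11 stfin2_1999
1 "13901 14200" "14100 14200" ". ."         ". ."
2 "13890 14198" ". ."         "14310 14525" ". ."
end

* Pass 1: year becomes a row index; stfin1_ and stfin2_ stay as columns
reshape long stfin1_ stfin2_, i(pubid) j(year)

* Drop the trailing underscore so the job number is the numeric suffix
rename stfin1_ stfin1
rename stfin2_ stfin2

* Pass 2: job number becomes a second row index
reshape long stfin, i(pubid year) j(job)

list, sepby(pubid)
```

The result has one row per pubid, year, and job, with a single variable stfin holding the date pair.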
Follow that with carryforward; see -ssc desc carryforward-.
You may want to go backward as well as forward (or maybe backward 
only). The help for carryforward explains that.
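As a rough illustration of going backward only, the sketch below uses built-in subscripting rather than carryforward itself, so it runs without installing anything from SSC. The toy data, the helper variable `negyear`, and the treatment of ". ." as missing are all assumptions. It also handles only the simple case where the later value belongs to the same job number; it does not move values between job numbers.

```stata
* Hypothetical long data: one row per respondent, job number, and year
clear
input pubid job year str11 stfin
1 1 1998 "13901 14200"
1 1 1999 ". ."
1 1 2000 "14247 14590"
2 1 1998 ". ."
2 1 1999 ". ."
2 1 2000 "14310 14525"
end

* Turn the ". ." placeholder into a true missing (empty) string
replace stfin = "" if trim(stfin) == ". ."

* Order each respondent-job series from latest year to earliest, then
* fill each missing cell from the row before it (the next later year);
* replace works row by row, so fills cascade across consecutive gaps
generate negyear = -year
bysort pubid job (negyear): replace stfin = stfin[_n-1] if missing(stfin)

* Back to chronological order
sort pubid job year
drop negyear
list, sepby(pubid)
```

Here pubid 1 gets "14247 14590" in 1999, and pubid 2 gets "14310 14525" in both 1998 and 1999.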
Finally, if you prefer, reshape it back to the way it was, though it
may be better to leave it in long form.
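If the wide layout is wanted again, reshaping wide twice should reverse the process. Again a sketch on hypothetical data, assuming the long file has indices named `year` and `job` and a single string variable `stfin`:

```stata
* Hypothetical long data after any backfilling
clear
input pubid job year str11 stfin
1 1 1998 "13901 14200"
1 2 1998 "14100 14200"
1 1 1999 "14247 14590"
1 2 1999 ""
end

* Pass 1: job number moves back into the variable name (stfin1, stfin2)
reshape wide stfin, i(pubid year) j(job)

* Restore the trailing underscore so the year can be appended to it
rename stfin1 stfin1_
rename stfin2 stfin2_

* Pass 2: year moves back into the variable name (stfin1_1998, ...)
reshape wide stfin1_ stfin2_, i(pubid) j(year)

list
```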
HTH
--David