Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: conditional merging

From	"Ben Hoen" <[email protected]>
To	<[email protected]>
Subject	RE: st: conditional merging
Date	Wed, 7 Nov 2012 10:37:37 -0500

Thanks Nick.  

I am not sure these "condition" values trend in any standard way over
time across the whole dataset, so interpolating them might not be
appropriate.  Moreover, for each home, there might not be many data
points.  Finally, the values allowed for condition are discrete
(non-continuous), which would complicate a linear, cubic, or cubic
spline process (though, of course, that could be dealt with by applying
int(x) to the result).  Would the interpolation allow me to take all of
these characteristics into account?

Partly for this reason, I was hoping to find some way to execute a
"conditional merge" (again, my words).  Additionally, whatever I learn
from doing this with the "condition" data could be applied to extracting
other characteristics that are recorded only sporadically across time
(e.g., size of the home) but that might also change periodically (e.g.,
the home might be added to).

Is there a way to use if/then statements in a merge process?
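
One way to get the effect of an if/then condition in a merge is to form
every within-home pairing first and filter afterwards. The following is a
minimal sketch of that idea using -joinby-, not a definitive solution. It
assumes at most one sale per home per year and a numeric condition
variable; the data are the sample rows quoted from the original post, and
the tempfile name and the flag variable ok are hypothetical names chosen
for illustration.

```stata
* Inspections (sample rows from the original post)
clear
input long home_id int inspection_year byte condition
50121 2002 4
50121 2006 4
50121 2011 3
50681 2004 2
50681 2010 3
51040 2006 2
51040 2010 2
51040 2011 3
end
tempfile condition
save `condition'

* Sales (sample rows from the original post)
clear
input long home_id int sale_year
50121 2007
50121 2011
50681 2008
51040 2003
51040 2010
end

* Pair every sale with every inspection of the same home;
* unmatched(master) keeps sales of homes that were never inspected
joinby home_id using `condition', unmatched(master)
drop _merge

* Flag inspections made in or before the sale year. A missing
* inspection_year compares as larger than any number, so it gets 0.
generate byte ok = inspection_year <= sale_year

* Within each sale, sort usable inspections last and the latest of
* those last of all, then keep only that final row
bysort home_id sale_year (ok inspection_year): keep if _n == _N

* If even the retained row is not usable, no prior inspection exists
replace condition = . if !ok
replace inspection_year = . if !ok
drop ok

list, sepby(home_id) noobs
```

Because -joinby- forms all pairwise combinations within home_id, the
intermediate dataset can become large when homes have many sales and many
inspections; with a handful of inspections per home that should not matter.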

Ben

Ben Hoen
LBNL
Office: 845-758-1896
Cell: 718-812-7589

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Nick Cox
Sent: Tuesday, November 06, 2012 6:47 PM
To: [email protected]
Subject: Re: st: conditional merging

Carry forward can be as little as one line of code: see

FAQ     . . . . . . . . . . . . . . . . . . .  Replacing missing values
        . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
        2/03    How can I replace missing values with previous or
                following nonmissing values?
http://www.stata.com/support/faqs/data-management/replacing-missing-values/
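
As a hedged illustration of the carry-forward idea, the sketch below stacks
the sale rows under the inspection rows and then fills condition downwards
within each home. The data are the sample rows from the original post; the
variable is_sale and the tempfile name are hypothetical, and the sketch
assumes at most one inspection per home per year.

```stata
* Inspections (sample rows from the original post), with a common year variable
clear
input long home_id int year byte condition
50121 2002 4
50121 2006 4
50121 2011 3
50681 2004 2
50681 2010 3
51040 2006 2
51040 2010 2
51040 2011 3
end
generate byte is_sale = 0
tempfile inspections
save `inspections'

* Sales (sample rows from the original post)
clear
input long home_id int year
50121 2007
50121 2011
50681 2008
51040 2003
51040 2010
end
generate byte is_sale = 1

* Stack the two files; condition is missing on every sale row
append using `inspections'

* The carry-forward itself. Sorting on year and then is_sale puts an
* inspection ahead of a sale in the same year, so a same-year
* inspection counts as "not following" the sale.
bysort home_id (year is_sale): replace condition = condition[_n-1] if missing(condition)

* Keep only the sales, now carrying the most recent prior condition
keep if is_sale
rename year sale_year
drop is_sale

list, sepby(home_id) noobs
```

A sale that precedes every inspection of its home keeps a missing
condition, because there is nothing earlier to carry forward.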

I don't see that this is an imputation problem at all. It calls for
interpolation. Indeed, have you considered some kind of interpolation,
say linear, cubic, cubic spline?
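
For comparison, here is a minimal sketch of the linear case using the
official -ipolate- command, under the assumption that condition can
sensibly be treated as numeric between inspections. The data are the
sample rows from the original post; cond_lin and cond_round are
hypothetical variable names, and rounding to the nearest integer is just
one possible way to return to discrete values.

```stata
* Inspections (sample rows from the original post)
clear
input long home_id int year byte condition
50121 2002 4
50121 2006 4
50121 2011 3
50681 2004 2
50681 2010 3
51040 2006 2
51040 2010 2
51040 2011 3
end

* Declare the panel structure and add a row for each year missing
* between a home's first and last inspection
tsset home_id year
tsfill

* Linear interpolation within each home; without the epolate option
* nothing is extrapolated beyond the observed inspection years
bysort home_id: ipolate condition year, generate(cond_lin)

* Map the interpolated values back onto the integer scale
generate byte cond_round = round(cond_lin)

list, sepby(home_id) noobs
```

Unlike carrying forward, interpolation uses the inspection after a given
year as well as the one before it, so values for a sale year depend on
information recorded later.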

On Tue, Nov 6, 2012 at 7:33 PM, Ben Hoen <[email protected]> wrote:

> I have two files sales.dta and condition.dta.  sales.dta has two variables
> (home_id sale_year), and condition.dta has three variables (home_id
> inspection_year condition).  The variable inspection_year can take the
> values of 2000-2011 for any home but for many homes only some years are
> present (in many years the home was not inspected).  Therefore a sample of
> the data might look like:
>
> home_id inspection_year condition
> 50121           2002                    4
> 50121           2006                    4
> 50121           2011                    3
> 50681           2004                    2
> 50681           2010                    3
> 51040           2006                    2
> 51040           2010                    2
> 51040           2011                    3
>
> I would like to populate the sales.dta file with the condition of the
> parcel in the inspection_year that is the closest to, but not following,
> the sale_year.
>
> So, for example, the following dataset would result
>
> home_id sale_year       condition
> 50121           2007            4
> 50121           2011            3
> 50681           2008            2
> 51040           2003            .
> 51040           2010            2
>
> I know I am not the first person to have this problem, but could not find
> threads on this.  Maybe I am using the wrong search terms.  Any help would
> be greatly appreciated.
>
> (As I wrote this I realized a less elegant work-around would be to
> fill in missing data for each missing year in the condition.dta file,
> potentially using the user-written "carryforward" or even imputing the
> data using, e.g., mi impute, and then matching home_id sale_year to
> home_id inspection_year.)
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/

