
st: Re: merge and joinby

From	Christopher F Baum <[email protected]>
To	[email protected]
Subject	st: Re: merge and joinby
Date	Thu, 28 Aug 2003 06:45:20 -0400

On Thursday, August 28, 2003, at 02:33 AM, John wrote:

What is the difference between the way .merge and .joinby work? I've been
using joinby because it appears to work the same way relational databases
do, and I'm familiar with that concept.

That is correct: joinby forms the Cartesian product (in SQL terms a cross join, not an outer join) of the observations within each group defined by the join variables. Users of an RDBMS are exhorted to avoid this at all costs (run a proposed SELECT statement containing a Cartesian join by your DBA and see what s/he says). You practically never want a Cartesian product, which generates a row (observation) for every combination of rows from the two sets (in Stataese, the master and using datasets).

More usually, you want to match the observations in the using dataset with those in the master dataset through a one-to-one, one-to-many, or many-to-one merge. If you have about the same number of obs. in both datasets, it would seem that you are really trying to do a one-to-one merge. joinby will not achieve that unless the join variables uniquely identify the observations in both datasets; otherwise it generates a huge number of observations in the Cartesian product (about 450^2? 450^2 obs and 47 variables is quite a bit larger than 450 x 45).
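To make the row counts concrete, here is a minimal sketch in Python (not Stata code) that mimics the two behaviours described above. The function names, the variable names id, group, x and y, and the 450-row toy datasets are all hypothetical and chosen only for illustration; the sketch assumes that pairing within groups keeps only matched observations and that master values win when a variable exists in both datasets.

```python
from collections import defaultdict


def pairwise_within_groups(master, using, key):
    """Form every master/using pairing among rows that share the same key value.

    This imitates a joinby-style join: unmatched rows are dropped, and master
    values take precedence when both rows carry the same variable name.
    """
    using_by_key = defaultdict(list)
    for row in using:
        using_by_key[row[key]].append(row)
    return [
        {**u, **m}  # master values overwrite using values on a name clash
        for m in master
        for u in using_by_key.get(m[key], [])
    ]


def one_to_one_match(master, using, key):
    """Match rows one-to-one on key; refuse to run if key is not unique."""
    for name, data in (("master", master), ("using", using)):
        values = [row[key] for row in data]
        if len(set(values)) != len(values):
            raise ValueError(f"{key!r} does not uniquely identify rows in {name}")
    lookup = {row[key]: row for row in using}
    return [{**lookup[m[key]], **m} for m in master if m[key] in lookup]


# Hypothetical toy data: 450 observations in each dataset, all in one group.
master = [{"id": i, "group": 1, "x": i} for i in range(450)]
using = [{"id": i, "group": 1, "y": 2 * i} for i in range(450)]

# Joining on a variable that does not identify observations: 450 * 450 rows.
n_cartesian = len(pairwise_within_groups(master, using, "group"))

# Matching one-to-one on the identifier: 450 rows.
n_matched = len(one_to_one_match(master, using, "id"))

# Pairing within groups on a unique identifier also yields 450 rows.
n_pairwise_unique = len(pairwise_within_groups(master, using, "id"))

print(n_cartesian, n_matched, n_pairwise_unique)
```

The explosion to 450^2 rows appears only when the join variable fails to identify observations uniquely. With a unique identifier, both approaches return one row per matched observation, which is why checking uniqueness before joining is a sensible first step.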

Kit

