The Stata listserver

Re: st: specific doubt

From   Buzz Burhans <[email protected]>
To   [email protected]
Subject   Re: st: specific doubt
Date   Mon, 13 Jan 2003 11:25:17 -0500

Hi Rodrigo,

There are several possible reasons your data are not combining as expected. Here are a few that have given me problems recently (they are not the only possibilities):

1. Run "codebook" on the ID and matching variables in each dataset, and check closely that the IDs really are the same across all datasets. I recently had a similar problem: every time I used browse to examine the data I couldn't see what was wrong, but codebook revealed that my underlying ID values were different. They carried the same value labels, so they looked identical in browse.

2. If some of the datasets have duplicate observations within an ID, joinby will expand the joined dataset to all pairwise combinations of the duplicates within each ID, so you will need to check whether this is occurring.

3. If the IDs are unique within each dataset (no duplicates), merge could work, but observations in the larger datasets will have missing values for the smaller dataset's variables wherever the smaller dataset has no matching observation.
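
The three checks above can be sketched on toy data as follows. This is only a minimal illustration, not a fix for the ENIGH files: the key variables hhid and member, the other variables, and all values are made up, and the merge 1:1 syntax assumes Stata 11 or later (older releases used a different merge syntax).

```stata
* Toy dataset A (hypothetical): the key (hhid, member) = (2, 1) appears twice
clear
input hhid member income
1 1 100
1 2 200
2 1 300
2 1 350
end
* Check 1: inspect the underlying key values, ignoring any value labels
codebook hhid member
list hhid member, nolabel
* Check 2: look for duplicate keys
duplicates report hhid member
capture isid hhid member
display "isid return code in A: " _rc   // nonzero means the key is not unique
tempfile a
save `a'

* Toy dataset B (hypothetical): also has a duplicated key
clear
input hhid member age
1 1 40
2 1 35
2 1 36
3 1 50
end
tempfile b
save `b'

* joinby forms every pairwise combination within each matching key group,
* so the duplicated key contributes 2 x 2 = 4 rows; unmatched keys are dropped
use `a', clear
joinby hhid member using `b'
count

* Check 3: once the keys are unique, merge keeps one row per key.
* Dropping duplicates with force is only for illustration; with real data
* you would first decide which of the duplicate records is the correct one.
use `a', clear
duplicates drop hhid member, force
tempfile a1
save `a1'
use `b', clear
duplicates drop hhid member, force
merge 1:1 hhid member using `a1'
* _merge == 3 marks keys found in both files; the other rows have missing
* values for the variables that come from the file where the key is absent
tabulate _merge
```

If the observation count after joinby is much larger than either input file, duplicated keys are the likely cause; if tabulate _merge shows many unmatched rows, the key values themselves probably differ between the files.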

Buzz Burhans

At 09:48 AM 1/13/03 -0600, you wrote:

Hello Stata listers. I'm having a problem combining several databases and
would like to ask for some specific help.
I have four databases:
ENIGH-RJB1.dta=19,470 obs.
ENIGH-RJB2.dta=19,120 obs.
ENIGH-RJB3.dta=  7,719 obs.
ENIGH-RJB4.dta=23,715 obs.
First of all, the databases have the same ID variables (in this case these could be
household number, member number, home number, primary sampling unit, etc.).
My purpose is to build a single database with the different variables that
each database contains.
I tried:
use ENIGH-RJB4.dta
joinby using ENIGH-RJB3.dta, but I got a new database with 72,077
observations, that is totally wrong (I assume that the maximum would be
I also tried:
use ENIGH-RJB4.dta
merge ENIGH-RJB3.dta
and I got a large number of missing values (from obs. 7720 up).
I know that I'm doing something wrong, but I can't figure out what. Can
somebody please give me some help?
Thank you very much.

