st: Merge two long datasets? and re: stopping loops
Thanks Scott for the tip. I did make a mistake in copying my data, so
thanks for pointing that out. I have one follow-up and one new question:
First, there is the -continue- command for breaking out of loops (use its
break option). I just found it in the Stata 9 Programming manual, so anyone
who is trying to figure that out might want to check [P] continue. I wish
I had found it earlier.
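A minimal sketch of that early exit, assuming a made-up loop and threshold
(neither comes from the original code). Without the break option,
-continue- only skips to the next iteration.

```stata
* Hypothetical loop: stop at the first i whose square exceeds 50
forvalues i = 1/10 {
    if `i'^2 > 50 {
        display "stopping at i = `i'"
        * exit the forvalues loop entirely
        continue, break
    }
    * reached only while i^2 <= 50
    display "i = `i'"
}
```

The same pattern works inside -foreach- and -while- loops.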
Second, before I found that command, I did as you advised and managed to
merge the data together beautifully. However, this raises another question:
Is it possible to merge on two variables? That is, can I merge two
datafiles by momid AND by year at the same time? Or is it always
necessary to convert both datasets back to wide form, then merge, then
reconvert the new dataset to long? That is what I did. I have done some
digging to try to figure out how to merge long datasets, and I have always
come up short.
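A rough sketch of merging two long files on both keys at once, with no
reshape to wide. It assumes each file holds at most one observation per
momid-year pair; the file names master.dta, using.dta, and
using_sorted.dta are hypothetical. In Stata 9, -merge- accepts a list of
match variables as long as both files are sorted on them.

```stata
* Hypothetical file names; both files are assumed to be in long form
use using.dta, clear
sort momid year
save using_sorted.dta, replace

use master.dta, clear
sort momid year

* Match on momid AND year; -unique- stops with an error if either
* file has duplicate momid-year pairs
merge momid year using using_sorted.dta, unique

* _merge: 1 = master only, 2 = using only, 3 = matched in both
tabulate _merge

* In Stata 11 or later the equivalent is:
*   merge 1:1 momid year using using_sorted.dta
```

Checking _merge before dropping it shows which years appear in only one
of the two files.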
Claire
At 03:27 PM 8/23/2006, [email protected] wrote:
I didn't read through all your code, but perhaps using -merge- can
accomplish your goal. See the example below. Also, it is not clear
how, starting with your initial data set, you get the 1991 Divorced
and 1992 Remarried for momid 2 in your final data set.
Scott
. l , noobs sepby(mom)
+---------------------------------------------------+
| momid   year       type1     y1      type2     y2 |
|---------------------------------------------------|
|     1   2000     Married   2000                 . |
|     1   2001                  .                 . |
|     1   2002   Separated   2001   Divorced   2001 |
|     1   2003                  .                 . |
|     1   2004                  .                 . |
|---------------------------------------------------|
|     2   1988     Married   1987                 . |
|     2   1989                  .                 . |
|     2   1990                  .                 . |
|     2   1991                  .                 . |
|     2   1992                  .                 . |
|     2   1993                  .                 . |
|     2   1994                  .                 . |
|     2   1995                  .                 . |
|     2   1996    Divorced   1993                 . |
|     2   1997                  .                 . |
|     2   1998                  .                 . |
|     2   1999                  .                 . |
|     2   2000   Remarried   1998                 . |
+---------------------------------------------------+
. drop year
. rename y1 year
. sort mom year
. merge mom year using "C:\Documents and Settings\scott.merryman\Desktop\foo.dta"
variables momid year do not uniquely identify observations in the master data
. drop if year ==.
(13 observations deleted)
. drop _m
. sort mom year
. order mom year type1 y1 type2 y2
. l, noob sepby(mom)
+---------------------------------------------------+
| momid   year       type1     y1      type2     y2 |
|---------------------------------------------------|
|     1   2000     Married   2000                 . |
|     1   2001   Separated      .   Divorced   2001 |
|     1   2002   Separated   2001   Divorced   2001 |
|     1   2003                  .                 . |
|     1   2004                  .                 . |
|---------------------------------------------------|
|     2   1987     Married      .                 . |
|     2   1988     Married   1987                 . |
|     2   1989                  .                 . |
|     2   1990                  .                 . |
|     2   1991                  .                 . |
|     2   1992                  .                 . |
|     2   1993    Divorced      .                 . |
|     2   1994                  .                 . |
|     2   1995                  .                 . |
|     2   1996    Divorced   1993                 . |
|     2   1997                  .                 . |
|     2   1998   Remarried      .                 . |
|     2   1999                  .                 . |
|     2   2000   Remarried   1998                 . |
+---------------------------------------------------+
----- Original Message -----
From: "Claire M. Kamp Dush" <[email protected]>
Date: Wednesday, August 23, 2006 12:52 pm
Subject: st: programming: stopping loops?
To: [email protected]
> Hello, I feel embarrassed to post this because I am sure the answer
> to this is obvious, but I have been puzzling over this issue for a few
> hours. I am trying to recode the family structure data in the NLSY 79
> through 2004. I am trying to go back and recode the data for missing
> years based on reports of marital changes between interviews at
> follow-ups. For instance, if an individual was interviewed in 1991 and
> not in 1992, in 1993 they are asked to report up to 3 marital changes
> since the last time they were interviewed. My data is stacked, with
> each individual having 26 lines of data, for years 1979 through 2004.
> The id variable is momid and the year variable is year. change1type,
> change2type, and change3type are measured each year where the
> respondent has data, and each is a categorical variable with categories
> including married, divorced, separated, widowed, etc. changey1_,
> changey2_, and changey3_ are the years in which each change is said to
> occur. Here is an example of what the data look like:
>
> momid  year  change1type  changey1_  change2type  changey2_
> 1      2000  Married      2000
> 1      2001
> 1      2002  Separated    2001       Divorced     2001
> 1      2003
> 1      2004
> 2      1988  Married      1987
> 2      1989
> 2      1990
> 2      1991
> 2      1992
> 2      1993
> 2      1994
> 2      1995
> 2      1996  Divorced     1993
> 2      1997
> 2      1998
> 2      1999
> 2      2000  Remarried    1998
>
> My goal is to have my data look like the following:
>
> momid  year  change1type  changey1_  change2type  changey2_  change1misstype  change2misstype
> 1      2000  Married      2000                               Married
> 1      2001                                                  Separated        Divorced
> 1      2002  Separated    2001       Divorced     2001
> 1      2003
> 1      2004
> 2      1987                                                  Married
> 2      1988  Married      1987
> 2      1989
> 2      1990
> 2      1991  Divorced     1991                               Divorced         Remarried
> 2      1992  Remarried    1991
> 2      1993                                                  Divorced
> 2      1994
> 2      1995
> 2      1996  Divorced     1993
> 2      1997
> 2      1998                                                  Remarried
> 2      1999
> 2      2000  Remarried    1998
>
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
Claire M. Kamp Dush, Ph.D.
Postdoctoral Fellow, Evolving Family Theme Project
Cornell University
Bronfenbrenner Life Course Center
Bebee Hall
Ithaca, NY 14853
607-255-9908
http://www.socialsciences.cornell.edu/0407/evolv_fam_desc.html