Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
From
Holly E Reed <[email protected]>
To
"[email protected]" <[email protected]>
Subject
st: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
Date
Wed, 12 Feb 2014 19:08:41 +0000
Hi,
I am reposting this as I never received a reply to my last question. I hope that someone can help me to figure out the best way to approach this.
I am trying to create a person-year dataset for use in a binomial logit event history model. I want one observation for each year of a person's life (one record per person-year), but I also need the calendar years as variables, so that I can create covariates such as whether a woman had a birth in a particular year (0/1 dummy), whether a woman moved in a particular year (0/1 dummy), etc. There will be some non-time-varying covariates, of course, but where I do have time-varying data, I want to create dummies for events in each specific year. I will probably create parity as a covariate also, as Steve mentions, but the main question of interest is whether a woman gives birth in a particular year (this will be lagged by one year) and whether that predicts the probability of migration in the next year.
Does that make sense? If you have a better suggestion, please let me know. I did create a migration dataset from these same data several years ago, but I did not have the birth data at the time. I simply want to merge the birth data with the migration dataset I already have, but they need to be in the same format, and I can't seem to get it to work the way it did last time.
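For illustration only, here is a minimal sketch of the birth dummy and its one-year lag. It assumes the data already have one row per woman per calendar year (yrdate) and that masterid identifies a woman, as the extract further down suggests. The names hadbirth and hadbirth_lag1 are hypothetical, and the toy data use only birth1 and birth2 where the real file would have birth1-birth10.

```stata
* Toy person-year data: one woman, four calendar years, one birth in 1998
clear
input masterid yrdate birth1 birth2
43 1996 1998 .
43 1997 1998 .
43 1998 1998 .
43 1999 1998 .
end

* hadbirth = 1 if any birth year equals the calendar year of this row
* (hadbirth is a hypothetical name; use birth1-birth10 on the full data)
generate byte hadbirth = 0
foreach v of varlist birth1-birth2 {
    replace hadbirth = 1 if `v' == yrdate
}

* One-year lag within woman; left missing for the first year and across gaps
bysort masterid (yrdate): generate byte hadbirth_lag1 = hadbirth[_n-1] ///
    if yrdate[_n-1] == yrdate - 1

list masterid yrdate hadbirth hadbirth_lag1, sepby(masterid)
```

Declaring the panel with -xtset masterid yrdate- and using the L. operator would be another way to get the lag; the explicit subscript is used here only to keep the logic visible.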
Thanks,
Holly
______________
As you point out, Nick, Holly has not told us what she wants to do. I'm
not sure either, as she's tried reshapes both long and wide.
I've analyzed many data sets of reproductive histories, and I always use
the long format. With this format, it's easy to create running totals of
different kinds, for example parity, the number of births + stillbirths
at a given time. We also can merge in dates of different risk factor
exposures and so easily assign a prior or recent exposure to each
pregnancy. In fact, we always *collect* the data in long format,
as it shortens the codebook and greatly simplifies the edit process. It
also allows a woman to recall a pregnancy out of order, because one
can re-order by date.
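As a rough sketch of the running-total idea, assuming a long file with one row per birth (the layout and the names birthyear and parity are illustrative, not taken from any actual codebook), parity at each event could be computed after re-ordering by date:

```stata
* Toy long-format data: one row per birth, entered out of date order
clear
input masterid birthyear
111 1964
111 1955
111 1966
111 1959
end

* Sort births by date within woman, then number them: parity at each event
bysort masterid (birthyear): generate parity = _n

list masterid birthyear parity, sepby(masterid)
```

Because the sort happens inside the -bysort- prefix, the order in which the events were recorded does not matter.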
Steve
[email protected]
On Feb 6, 2014, at 6:14 PM, Nick Cox <[email protected]> wrote:
Thanks. That helps. I have already explained why -reshape wide-
doesn't work. -reshape- maps one variable to several, but the "birth"
you feed to it is the stub for several variables.
Otherwise, I don't see why you want or need to -reshape- at all.
Several variables are repeated for each woman and some vary. Depending
on what you want to do, you either reduce the dataset by removing
duplicates or keep the whole dataset.
Alternatively, if you explain why you (think you) need to -reshape-
that might illuminate what is being misunderstood, or what you need to
do.
Nick
[email protected]
__________________________
From: Holly E Reed
Sent: Thursday, February 06, 2014 3:31 PM
To: [email protected]
Subject: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
Hi Nick,
Here is an extract from the dataset:
i_weight masterid birth1 birth2 birth3 birth4 hhid id provage12 urbanage12 evermig relation sex age cyear1 cyear2 cyear3 yrbirth yrdate
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1979
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1980
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1981
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1982
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1983
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1984
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1985
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1986
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1987
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1988
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1989
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1990
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1991
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1992
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1993
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1994
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1995
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1996
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1997
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1998
37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1999
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1982
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1983
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1984
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1985
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1986
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1987
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1988
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1989
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1990
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1991
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1992
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1993
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1994
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1995
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1996
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1997
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1998
37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1999
37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1933
37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1934
37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1935
37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1936
37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1937
37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1938
37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1939
37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1940
And so on. So these are three women: the first is age 21, with one birth in 1998; the second is age 18, with no births; the third is 67 years old, with four births shown, in 1955, 1959, 1964, and 1966 (she actually had seven, but I didn't show all the birth* variables here to save space).
I only listed birth1-birth4, but in fact there are birth1-birth10, which follow along the same lines. Also, cyear1-cyear93 are in the dataset; they are the same for each person and hold each year 1908-2000.
In terms of code, I have tried three iterations of coding for the reshape command:
reshape long birth, i(id) j(year)
reshape long birth cyear, i(id) j(year)
AND
reshape wide birth, i(id) j(year)
Hoping that this helps to clarify things a bit...any ideas?
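For comparison, here is a hedged sketch of a -reshape long- that does run on data shaped like the extract. It is not necessarily what is needed for the final person-year file. It assumes masterid identifies a woman, reduces the data to one row per woman first (the extract repeats each woman once per yrdate), and uses a toy with birth1-birth4 only. The names order and birthyear are hypothetical.

```stata
* Toy data shaped like the extract: birth1-birth4 repeated on every yrdate row
clear
input masterid yrdate birth1 birth2 birth3 birth4
43  1979 1998 .    .    .
43  1980 1998 .    .    .
54  1982 .    .    .    .
111 1933 1955 1959 1964 1966
end

* Reduce to one row per woman so that masterid uniquely identifies rows
keep masterid birth1-birth4
duplicates drop

* birth is the stub; j() receives the suffix 1, 2, 3, 4 (birth order)
reshape long birth, i(masterid) j(order)

* Keep only births that actually occurred
drop if missing(birth)
rename birth birthyear

list, sepby(masterid)
```

Note that j() here holds the birth order, not a calendar year; the calendar year ends up as the value of birthyear. Women with no births drop out of this births file, so any later merge back onto the person-year data would need to allow for unmatched women.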
Thanks,
Holly