Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
From: Steve Samuels <[email protected]>
To: [email protected]
Subject: Re: st: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
Date: Thu, 6 Feb 2014 19:28:58 -0500

As you point out, Nick, Holly has not told us what she wants to do. I'm
not sure either, as she's tried reshapes both long and wide.
I've analyzed many data sets of reproductive histories, and I always use
the long format. With this format, it's easy to create running totals of
different kinds, for example parity, the number of births + stillbirths
at a given time. We also can merge in dates of different risk factor
exposures and so easily assign a prior or recent exposure to each
pregnancy. In fact, we always *collect* the data in long format,
as it shortens the codebook and greatly simplifies the edit process. It
also allows a woman to recall a pregnancy out of order, because one
can re-order by date.
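
A minimal sketch of the running-total idea on made-up data, not taken from Holly's dataset. The variable names (id, pregyear, outcome) and the outcome categories are assumptions chosen for illustration. Rows are entered out of order on purpose, to show that sorting by date within woman takes care of that.

```stata
* Hypothetical long-format data: one row per pregnancy.
* All names and values here are illustrative assumptions.
clear
input id pregyear str12 outcome
1 1959 "birth"
1 1955 "birth"
1 1962 "miscarriage"
1 1964 "stillbirth"
2 1998 "birth"
end

* Sort pregnancies by date within woman, then accumulate.
* sum() is a running sum; under by: it restarts for each id.
* Parity here counts births + stillbirths up to and including this row.
bysort id (pregyear): gen parity = sum(inlist(outcome, "birth", "stillbirth"))

list, sepby(id)
```

Because each pregnancy is its own row, merging in dated exposures would only need a key on id plus a date comparison, with no wide variable lists to loop over.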
Steve
On Feb 6, 2014, at 6:14 PM, Nick Cox <[email protected]> wrote:
Thanks. That helps. I have already explained why -reshape wide-
doesn't work. -reshape- maps one variable to several, but the "birth"
you feed to it is the stub for several variables.
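
To make the distinction concrete, here is a small sketch on made-up data shaped like the extract (several rows per woman, with birth1 and birth2 repeated on each row). The values are assumptions for illustration only. -reshape long- takes the stub of several variables and stacks them into one; -reshape wide- expects a single variable plus an existing j() variable, which is why it only works after the long step.

```stata
* Hypothetical person-year data with repeated birth1, birth2 columns.
clear
input id yrdate birth1 birth2
17 1997 1998 .
17 1998 1998 .
17 1999 1998 .
23 1998 .    .
23 1999 .    .
end

* i(id) alone would not identify rows (several years per woman),
* so rows are identified by woman and year. j() receives the
* suffix 1, 2, ..., which is birth order, not a calendar year.
reshape long birth, i(id yrdate) j(order)
list, sepby(id yrdate)

* Now there is ONE variable birth and an existing j variable,
* so going wide is legal and restores birth1, birth2.
reshape wide birth, i(id yrdate) j(order)
```

Note that the long step multiplies the rows (one per person-year per birth order), which is rarely what an event-history analysis needs.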
Otherwise, I don't see why you want or need to -reshape- at all.
Several variables are repeated for each woman and some vary. Depending
on what you want to do, you either reduce the dataset by removing
duplicates or keep the whole dataset.
Alternatively, if you explain why you (think you) need to -reshape-
that might illuminate what is being misunderstood, or what you need to
do.
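
If the goal is an event indicator for each person-year, one possible approach that avoids -reshape- entirely is sketched below. This is only a guess at what is wanted; the data are made up, and the new variable names hadbirth and nbirths are assumptions.

```stata
* Hypothetical person-year rows; birth1, birth2 hold years of births.
clear
input id yrdate birth1 birth2
51 1954 1955 1959
51 1955 1955 1959
51 1956 1955 1959
51 1959 1955 1959
23 1998 .    .
end

* Flag the person-years in which any recorded birth occurred.
* The new variable is deliberately not named birth*, so the
* wildcard below picks up only birth1, birth2, ...
gen byte hadbirth = 0
foreach v of varlist birth* {
    replace hadbirth = 1 if `v' == yrdate & !missing(`v')
}

* Running number of births up to and including each year.
bysort id (yrdate): gen nbirths = sum(hadbirth)

list, sepby(id)
```

The dataset keeps its one-row-per-person-year shape, and the time-constant variables are untouched.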
Nick
On 6 February 2014 20:31, Holly E Reed <[email protected]> wrote:
> Hi Nick,
>
> Here is an extract from the dataset:
>
> i_weight masterid birth1 birth2 birth3 birth4 hhid id provage12 urbanage12 evermig relation sex age cyear1 cyear2 cyear3 yrbirth yrdate
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1979
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1980
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1981
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1982
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1983
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1984
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1985
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1986
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1987
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1988
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1989
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1990
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1991
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1992
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1993
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1994
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1995
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1996
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1997
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1998
> 37.64478 43 1998 . . . 4 17 "E Cape" "Rural" 1 "Son/Daughter" "F" 21 1908 1909 1910 1979 1999
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1982
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1983
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1984
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1985
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1986
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1987
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1988
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1989
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1990
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1991
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1992
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1993
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1994
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1995
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1996
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1997
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1998
> 37.64478 54 . . . . 5 23 "W Cape" "Urban" 0 "Son/Daughter" "F" 18 1908 1909 1910 1982 1999
> 37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1933
> 37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1934
> 37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1935
> 37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1936
> 37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1937
> 37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1938
> 37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1939
> 37.64478 111 1955 1959 1964 1966 11 51 "E Cape" "Rural" 1 "Head" "F" 67 1908 1909 1910 1933 1940
>
> And so on...So these are three women, first is age 21, with one birth in 1998; second is age 18, with no births; third is 67 years old, with four births (she actually had seven, but I didn't show all birth* variables here to save space) in 1955, 1959, 1964, and 1966.
>
> I only listed birth1-birth4, but in fact there are birth1-birth10, which follow along the same lines. Also, cyear1-cyear93 are in the dataset; they are the same for each person and cover each year 1908-2000.
>
> In terms of code, I have tried three iterations of coding for the reshape command:
>
> reshape long birth, i(id) j(year)
>
> reshape long birth cyear, i(id) j(year)
>
> AND
> reshape wide birth, i(id) j(year)
>
> Hoping that this helps to clarify things a bit...any ideas?
>
> Thanks,
> Holly
> _______________________________________________
> From Nick Cox <[email protected]>
> To "[email protected]" <[email protected]>
> Subject Re: st: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
> Date Thu, 6 Feb 2014 19:30:53 +0000
> Without seeing exactly the kind of data and exactly the kind of code
> that produce problems, it is very hard to comment further. We are not
> asking to see the whole dataset, but enough that is concrete to
> understand your problem.
>
> If you have variables -birth*- then -reshape wide birth- will
> inevitably fail, but why -reshape long- will fail is unclear.
>
> Nick
> ________________________________________
> From: Holly E Reed
> Sent: Thursday, February 06, 2014 1:27 PM
> To: [email protected]
> Subject: Re: How do I create a calendar year variable by person id before reshaping to person-year dataset?
>
> Hi Ronnie,
>
> Thanks for your reply. That is, in fact, exactly what my data look like; of course, some people do not have births, so they have missing values for birth1, birth2, etc. or if they only have one child, they have missing values for all birth variables except birth1.
>
> The dataset is so large and there are a number of variables in addition to the ones listed, such as weights, region at age 12, urban/rural at age 12, relationship to HH head, ever migrated...that's why I didn't post a sample of the actual dataset.
>
> Thanks,
> Holly
> _______________________________________________________
> From Ronnie Babigumira <[email protected]>
> To [email protected]
> Subject Re: st: RE: How do I create a calendar year variable by person id before reshaping to person-year dataset?
> Date Thu, 6 Feb 2014 09:40:12 +0100
> Showing an example of the actual data you are trying to reshape will
> help because, following your previous posting, the solution Maarten
> shared, and this new information about birth1-birth10, this is what you
> would be trying to reshape.
>
> id age sex birthyear year birth1 birth2
> 1 5 F 1995 1995 2010 2012
> 1 5 F 1995 1996 2010 2012
> 1 5 F 1995 1997 2010 2012
> 1 5 F 1995 1998 2010 2012
> 1 5 F 1995 1999 2010 2012
> 2 3 M 1997 1997 2012 2013
> 2 3 M 1997 1998 2012 2013
> 2 3 M 1997 1999 2012 2013
> 3 10 F 1998 1998 2009 2011
>
> I doubt that your data look like this
>
> Ronnie
> ________________________________________
> From: Holly E Reed
> Sent: Wednesday, February 05, 2014 2:23 PM
> To: [email protected]
> Subject: RE: How do I create a calendar year variable by person id before reshaping to person-year dataset?
>
> Thank you for your help, Maarten. It worked great. But now I am receiving error messages when I try to reshape the data. No matter how I reshape, it tells me that the data are already in that format: "Data are already wide" or "Data are already long". I have tried to do this several times, but with no luck yet.
>
> This is my code:
>
> reshape wide birth, i(id) j(year)
>
> birth is a variable with suffix 1-10 (e.g., birth1, birth2, birth3, etc.) which is the year of a woman's first birth, second birth etc.
>
> Sometimes the error message says "variable year not found"; I thought that year was a new variable that would be created? And once it said "i=id does not uniquely identify the observations; there are multiple observations with the same value of id." But I thought that was the point!?
>
> If you can shed some light on these issues, I would appreciate it!
> Thanks, Holly
> _____________________________________
>
> by id : gen year = birthyear + _n -1
>
> also look at -help stsplit- as that command is there for creating such datasets.
>
> Hope this helps,
> Maarten
> __________________________________________
> Does the age of the person increase each year?
>
> If so, you could use:
> gen year = age+birthyear
>
> If age does not increase each year, how do you know which year an
> observation belongs to?
> For example, how do you know the records aren't sorted like this:
>
> id age sex birthyear year
> 1 5 F 1995 1999
> 1 5 F 1995 1998
> 1 5 F 1995 1997
> 1 5 F 1995 1996
> 1 5 F 1995 1995
>
>
> Mike
>
> _______________________________________
> From: Holly E Reed
> Sent: Wednesday, February 05, 2014 12:03 PM
> To: [email protected]
> Subject: How do I create a calendar year variable by person id before reshaping to person-year dataset?
>
> Hi,
>
> I am trying to create a person-year dataset for event history analysis. The dataset currently has one observation per person per year of their life, e.g.:
>
> id age sex birthyear
> 1 5 F 1995
> 1 5 F 1995
> 1 5 F 1995
> 1 5 F 1995
> 1 5 F 1995
> 2 3 M 1997
> 2 3 M 1997
> 2 3 M 1997
>
> So the person with id==1 is a 5-year-old female born in 1995 and the person with id==2 is a 3-year-old male born in 1997. This is a simplified example to illustrate the dataset, as they are all adults and there are far more observations for each individual.
>
> The problem is that I have age and birthyear variables, but I want to create a calendar year variable before reshaping the data to person-year data. What is the easiest way to do this? In other words, I want the dataset to look like this:
>
> id age sex birthyear year
> 1 5 F 1995 1995
> 1 5 F 1995 1996
> 1 5 F 1995 1997
> 1 5 F 1995 1998
> 1 5 F 1995 1999
> 2 3 M 1997 1997
> 2 3 M 1997 1998
> 2 3 M 1997 1999
>
> Thank you very much for any help you can give me!
> Holly
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/