Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Nick Cox <njcoxstata@gmail.com> |
To | "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu> |
Subject | Re: st: creating panel of household surveyed in different year |
Date | Wed, 22 May 2013 07:56:11 +0100 |
As Chamara says, there are different ways to do this. Here is a
slightly different approach.

bys hhid (survey_year): gen numsurvey=_N
bys hhid: gen entry = _n == 1
bys hhid: gen exit = _n == _N

Nick
njcoxstata@gmail.com

On 22 May 2013 06:46, Chamara Anuranga <kcanuranga@gmail.com> wrote:
> Dear Prakash
>
> There are different ways to do the same thing in Stata. If your
> dataset is organized like this, I suggest the following:
>
> bys hhid: gen numsurvey=_N
> bys hhid: egen minyear=min(survey_year)
> gen entry=survey_year==minyear
> bys hhid: egen maxyear=max(survey_year)
> gen exit=survey_year==maxyear
>
> numsurvey gives the number of times each household appears in the survey.
> entry is 1 if the household appears in the survey for the first time, and 0 otherwise.
> exit is 1 if the household is surveyed for the last time, and 0 otherwise.
> Define value labels for the entry and exit variables:
>
> label define yesno 0 "No" 1 "Yes"
> label val entry yesno
> label val exit yesno
>
> If a household is surveyed only once, entry and exit are both 1 (yes).
> However, you can change the variables in whatever way you prefer, based
> on the numsurvey variable.
>
> Hope this helps.
>
> Thanks,
> Chamara
>
> On Wed, May 22, 2013 at 10:19 AM, Prakash Singh <prakashbhu@gmail.com> wrote:
>> Thanks Chamara and Nick
>>
>> Nick, I am providing the ids of the first ten households surveyed in
>> 1997 and 2002 below.
>>
>> hhid    survey_year  entry  exit
>> 181004  1997
>> 181007  1997
>> 181113  1997
>> 181801  1997
>> 182003  1997
>> 182601  1997
>> 182615  1997
>> 182711  1997
>> 182716  1997                yes
>> 182803  1997                yes
>> 181001  2002         yes
>> 181004  2002
>> 181007  2002
>> 181113  2002
>> 181801  2002
>> 182003  2002
>> 182201  2002         yes
>> 182601  2002
>> 182615  2002
>> 182711  2002
>>
>> Now if you look at the ids, households 181001 and 182201 were not
>> part of the 1997 survey, and households 182716 and 182803 did not
>> participate in the 2002 survey.
>>
>> My interest is first to generate one variable which identifies
>> households that participated in all the surveys; a second variable
>> identifying new households in the survey; and finally a third variable
>> identifying households that did not participate in a survey.
>> There are two more rounds of data which I am still extracting.
>>
>> I hope I have made progress in expressing my query.
>>
>> Prakash
>>
>> On Tue, May 21, 2013 at 5:30 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> My own guess is that Prakash's previous post was ignored because it
>>> was too vague about precise data structure and the revised post
>>> doesn't add much. At least that is why I deleted it. A specific
>>> example showing what you have is usually preferable to a long verbal
>>> discussion.
>>>
>>> The solution below seems unnecessarily complicated. Splitting the
>>> dataset into three and then -merge-ing them back again is only
>>> possible if there is some identifier in the dataset that tells you
>>> which survey round is being referred to. Why not just do it in place?
>>>
>>> At its simplest the number of rounds in which each household
>>> participated may just be the number of times the household appears in
>>> the dataset. Otherwise there should be some round identifier. There
>>> seems little point in speculating about variables, as Prakash can
>>> (please) give concrete details.
>>>
>>> Same applies to entry and exit: show us how the data are held, and
>>> specific suggestions are then much easier.
>>>
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>> On 21 May 2013 11:42, Chamara Anuranga <kcanuranga@gmail.com> wrote:
>>>> Dear Prakash,
>>>>
>>>> Keep the id variable in each survey and create a new variable to
>>>> identify each survey.
>>>> For dataset 1:
>>>> gen svyname1="survey1"
>>>> For dataset 2:
>>>> gen svyname2="survey2"
>>>>
>>>> etc.
>>>> Now you have 3 datasets. Merge them based on id.
>>>> Check for missing values in the svyname variables:
>>>>
>>>> egen totmiss=rowmiss(svyname*)
>>>>
>>>> If totmiss is 0, it means those households appear in all 3 rounds.
>>>>
>>>> Thanks
>>>> Chamara
>>>>
>>>> On Tue, May 21, 2013 at 3:57 PM, Prakash Singh <prakashbhu@gmail.com> wrote:
>>>>> Hello everyone
>>>>> I sent mail earlier too, but maybe the subject was not appropriate
>>>>> to draw attention, so I am sending this again with a revised subject.
>>>>>
>>>>> I am working with a survey dataset of more than three rounds. The
>>>>> identification code for each household is the same in all the rounds.
>>>>>
>>>>> Now there are some households which are surveyed in all the years,
>>>>> and there are some households surveyed in some years but not in
>>>>> others (they did not participate in the survey). I want to map the
>>>>> households so that I can know which households are surveyed in more
>>>>> than two rounds, and I also want to create a panel of households
>>>>> which are surveyed in all the years.
>>>>>
>>>>> I am also interested in entry and exit of households, where entry
>>>>> means a new household coming into a subsequent round of the survey
>>>>> and exit means leaving the survey in a subsequent round.
>>>>>
>>>>> Please suggest how I should work out this problem.
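
Prakash's example table marks entry only for households first seen after the first round (181001, 182201) and exit only for households last seen before the final round (182716, 182803). The code posted above instead flags every household's first and last observation. What follows is a minimal sketch of the narrower definition, not anything posted in the thread. It assumes long-format data with one row per household per round and the variable names hhid and survey_year from the example. The names allrounds, round, nrounds, firstyear and lastyear are made up here, and the four input rows are a cut-down version of the posted example.

```stata
clear
input long hhid int survey_year
181001 2002
181004 1997
181004 2002
182716 1997
end

* Assumption: one row per household per round; isid stops with an error otherwise
isid hhid survey_year

* First and last survey years present in the data
summarize survey_year, meanonly
local firstyear = r(min)
local lastyear = r(max)

* Number of distinct rounds present in the data
egen round = group(survey_year)
summarize round, meanonly
local nrounds = r(max)

* Rounds each household appears in, and a flag for appearing in all of them
bysort hhid (survey_year): gen numsurvey = _N
gen byte allrounds = numsurvey == `nrounds'

* entry: first appearance of a household, but not in the first round
bysort hhid (survey_year): gen byte entry = _n == 1 & survey_year > `firstyear'

* exit: last appearance of a household, but not in the last round
bysort hhid (survey_year): gen byte exit = _n == _N & survey_year < `lastyear'

list, sepby(hhid)
```

With these four rows, 181004 is flagged on allrounds, 181001 on entry and 182716 on exit. Whether a household present in the very first round should count as an entry is a matter of definition; dropping the survey_year conditions gives back the simpler flags posted earlier in the thread.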
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/