Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Generating dummy variable with information of household survey from different observations
From
Eric Booth <[email protected]>
To
[email protected]
Subject
Re: st: Generating dummy variable with information of household survey from different observations
Date
Mon, 7 May 2012 08:35:04 -0500
On May 7, 2012, at 1:13 AM, Sumiko Hayasaka wrote:
> Everything works out until I get to the "foreach" command. It says the
> expression is too long [r(130)]. What should I do?
> Thank you again!
The r(130) error comes from the -inlist()- call in the -generate- command I showed: at some point it receives too many arguments.
This means you have a lot of father_row* variables after the initial -reshape-, probably because your individual_id's are not numbered {1,2,3…} within each household as in your example, but are values like {99998,99917,…} that are unique across all (or most) individuals. One way around this is to generate individual id's within the household, using the -egen- function 'group()' or:
bys household_id (individual_id): g i = _n
and then use "i" in place of individual_id in my example (but you'd need to remember to carry 'individual_id' through the -reshape-).
That will get around the too-many-values issue, assuming you don't have many hundreds of people in a household (inlist()'s limit appears to be 250, though it's not in -help limits-, so I don't know whether that limit is the same across all versions/flavors of Stata; I've got MP, and 250 is the limit I've encountered).
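To make the within-household numbering concrete, here is a minimal sketch of the modified version, using a shortened copy of the toy data. It rests on an assumption worth stating loudly: father_row has to refer to the same within-household numbering as "i". In the toy data individual_id already runs 1,2,3 within each household, so the two coincide; if father_row holds the original id's (like 99998), it would have to be recoded to "i" first, which this sketch does not do.

***************!
clear
input household_id individual_id father_row
1011 1 .
1011 2 .
1011 3 1
1011 4 1
1012 1 2
1012 2 .
end

* position within household: 1, 2, 3, ... restarting in each household
bysort household_id (individual_id): generate i = _n

* ASSUMPTION: father_row is on the same scale as i (true for this toy data)
levelsof i, local(a)

* individual_id is listed as a stub so it is carried through the reshape
reshape wide father_row individual_id, i(household_id) j(i)
ds father_row*
local checklist `r(varlist)'
local checklist : subinstr local checklist " " ", ", all
foreach n of local a {
    * 1 if position `n' is named as father by anyone in the household
    generate father`n' = inlist(`n', `checklist')
}
reshape long father_row individual_id father, i(household_id) j(i)

* drop the rows padded in for households smaller than the largest one
drop if missing(individual_id)
***************!

Listing individual_id as a second stub in -reshape- is what carries it through, and the number of father_row* variables is now bounded by the size of the largest household rather than by the number of distinct id values.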
Of course, NJC's examples that loop over individuals are resilient against this type of issue with my code, but I wanted to follow up to explain where/why my example failed.
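For comparison, a loop over individuals might look something like the sketch below. This is not NJC's code (it isn't reproduced in this message), just an illustration of the idea, and it assumes the same three variables as the toy data; the temporary variable "named" is a made-up name. Because each pass compares father_row against a single value, no long -inlist()- expression is ever built.

***************!
clear
input household_id individual_id father_row
1011 1 .
1011 2 .
1011 3 1
1011 4 1
1012 1 2
1012 2 .
end

generate byte father = 0

* distinct non-missing values of father_row
levelsof father_row, local(ids)
foreach id of local ids {
    * named = 1 in every household where someone reports `id' as their father
    bysort household_id: egen byte named = max(father_row == `id')
    replace father = 1 if individual_id == `id' & named == 1
    drop named
}
***************!

With many distinct id values this will be slow, since it makes one pass over the data per value, but it does not depend on how many people are in a household.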
- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754
On May 7, 2012, at 1:13 AM, Sumiko Hayasaka wrote:
> Thanks Eric!
>
> Everything works out until I get to the "foreach" command. It says the
> expression is too long [r(130)]. What should I do?
>
> Thank you again!
>
>
> On Sun, May 6, 2012 at 11:44 PM, Eric Booth <[email protected]> wrote:
>> <>
>>
>> ***************!
>> clear
>> inp household_id individual_id father_row
>> 1011 1 .
>> 1011 2 .
>> 1011 3 1
>> 1011 4 1
>>
>> 1012 1 2
>> 1012 2 .
>>
>> 1013 1 .
>> 1013 2 .
>> 1013 3 2
>> 1013 4 1
>> 1013 5 1
>> end
>>
>>
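>> * store the distinct individual ids in local a
>> levelsof individual_id, loc(a)
>> * one row per household: father_row1, father_row2, ...
>> reshape wide father_row, i(household_id) j(individual_id)
>> * build a comma-separated list of the father_row* variables for -inlist()-
>> ds father_row*
>> loc checklist `r(varlist)'
>> loc checklist:subinstr loc checklist " " ", " , all
>> * father`n' = 1 if anyone in the household names individual `n' as father
>> foreach n in `a' {
>> g father`n' = cond(inlist(`n', `checklist'), 1, 0, .)
>> }
>> * back to one row per individual
>> reshape long father_row father, i(household_id) j(individual_id)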
>>
>>
>> ***************!
>> - Eric
>>
>> __
>> Eric A. Booth
>> Public Policy Research Institute
>> Texas A&M University
>> [email protected]
>> +979.845.6754
>>
>> On May 6, 2012, at 10:34 PM, Sumiko Hayasaka wrote:
>>
>>> I am trying to generate a dummy variable, with information from a
>>> household survey, which can tell if a member of the household is a
>>> father or not. I have a household id, an individual id (per
>>> household), and a variable that tells me which individual id is marked
>>> as being a father (members of the family are asked if their father
>>> lives in the household and to give their father's individual id).
>>> Therefore, I need to assign a 1 at the row in which someone at the
>>> household said that was a father. To illustrate this, the data is
>>> something like this (I am trying to get the "father" variable):
>>>
>>> household_id individual_id father_row father
>>> ------------------------------------------------------------------------
>>> 1011 1 . 1
>>> 1011 2 . 0
>>> 1011 3 1 0
>>> 1011 4 1 0
>>>
>>> 1012 1 2 0
>>> 1012 2 . 1
>>>
>>> 1013 1 . 1
>>> 1013 2 . 1
>>> 1013 3 2 0
>>> 1013 4 1 0
>>> 1013 5 1 0
>>>
>>>
>>> So, for example, members number 3 and 4 of household number 1011
>>> stated that their father is individual number 1 in that household.
>>> This means that I have to put the 1 of "father" (meaning the household
>>> member is a father) at the row where father_row indicates (no matter
>>> how many times this is done).
>>>
>>> Is there any way in which I can do this? I really appreciate your
>>> help! Thank you!
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/statalist/faq
>>> * http://www.ats.ucla.edu/stat/stata/