Re: st: Fwd: Constructing Household IDs
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Fwd: Constructing Household IDs
Date: Wed, 30 Jan 2013 09:12:25 +0000
This kind of question is frequently asked. Long integer or long string
solutions require care to avoid problems of precision and/or storage.
The solutions

    egen id = group(SSS TTT DDD HHH), label

and

    egen id = concat(SSS TTT DDD HHH)

are frequently overlooked and apply to numeric and string variables alike.
See also the reference given at
http://www.stata.com/statalist/archive/2012-10/msg00536.html
Nick
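A minimal sketch of the two -egen- routes on a made-up dataset may help. The variable names SSS TTT DDD HHH come from the post; the data values and the three-digit width in format(%03.0f) are assumptions for illustration only. Without format(), concat() does not pad numeric components with zeros, so different households can collapse into the same string.

```stata
clear
input SSS TTT DDD HHH
 1  2 3 4
 1  2 3 5
 1 12 3 4
11  2 3 4
end

* group() numbers the distinct combinations 1, 2, ... in sorted order
egen id_num = group(SSS TTT DDD HHH), label

* unpadded concat(): rows 3 and 4 both become "11234"
egen id_raw = concat(SSS TTT DDD HHH)

* format() zero-pads each numeric component, so row 1 becomes "001002003004"
egen id_str = concat(SSS TTT DDD HHH), format(%03.0f)

* isid exits with an error if the variable does not identify observations
isid id_num
isid id_str
list, noobs
```

Here -isid id_raw- would fail, which is the quickest way to spot the collision.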
On Wed, Jan 30, 2013 at 8:37 AM, Andrea Smurra <[email protected]> wrote:
> Thanks to all,
>
> Nigussie, your method is definitely the most creative, as it requires
> nothing more than gen and some algebra; thanks for your support.
> I ended up following Chamara's method, using the following command
>
> gen str3 z=string(x, "%003,0f")
>
> and then
>
> gen hhid=state+district +...
On 30/01/2013 14:20, nigussie Tefera wrote:
>> Suppose each component has a three-digit identifier, i.e. state has a
>> maximum of three digits, and so on. For example, suppose you
>> have 343 for sss, 213 for DDD, 567 for TTT … and finally 143 for
>> HHH. If you want to generate a unique household identifier of the form
>> 343213567143, you can use the following simple command:
>> gen double hhid=10^9*sss+10^6*DDD+10^3*TTT+HHH
>> Note that the exponent x in 10^x depends on the maximum number of digits
>> that each component has....
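The arithmetic route depends on the storage type. A double holds integers exactly up to 2^53 (about 9.007e15), so a 12-digit identifier fits; a float, which is what -generate- creates by default, holds only about 7 significant digits. The sketch below uses the variable names from the command above with made-up values; hhid_float is a hypothetical variable added only to show the rounding.

```stata
clear
input sss DDD TTT HHH
343 213 567 143
343 213 567 144
end

* double keeps all 12 digits: 343213567143 and 343213567144
gen double hhid = 10^9*sss + 10^6*DDD + 10^3*TTT + HHH

* the same arithmetic stored as float rounds both households to one value
gen float hhid_float = 10^9*sss + 10^6*DDD + 10^3*TTT + HHH

* show the full integers rather than scientific notation
format hhid hhid_float %12.0f
list hhid hhid_float, noobs

* isid exits with an error if hhid does not identify observations
isid hhid
```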
Chamara Anuranga wrote:
>> Check all identifier variables and find the maximum length of each:
>> state may have at most 2 digits,
>> district may have 3 digits, etc.
>> Add leading zeros to the id variables based on those maximum lengths:
>>
>> format state %02.0f
>> format district %03.0f
>>
>> Here % introduces the format, 0 requests leading zeros, the number after
>> it is the total width, .0 means no decimal places, and f means fixed
>> format.
>>
>> Then convert the variables to strings:
>>
>> tostring state district, replace usedisplayformat
>>
>> Then combine the string parts using the generate command:
>> gen hhid=state+district
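The string route can be sketched as follows. This is a variant, not the exact commands above: it passes the padding format directly to -tostring- through its format() option instead of setting display formats first, and it writes new variables rather than replacing the originals. The data values and the names state_s and district_s are hypothetical.

```stata
clear
input state district
 1  2
 1 12
11  2
end

* format() pads while converting; widths 2 and 3 follow the example above
tostring state,    generate(state_s)    format(%02.0f)
tostring district, generate(district_s) format(%03.0f)

* string concatenation: row 1 gives "01002"
gen hhid = state_s + district_s

* every id should have the same length and identify exactly one observation
assert strlen(hhid) == 5
isid hhid
list, noobs
```

Checking the length and running -isid- right after building the identifier catches a component that turned out wider than assumed.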
>> On Wed, Jan 30, 2013 at 12:27 PM, Andrea Smurra wrote:
>>> I am working with a household survey which doesn't have household IDs.
>>> Each household is identified by a series of variables (state, district,
>>> township, ..., household number).
>>> Within each state, the numbering of districts always starts from the
>>> integer 1; the same holds for towns within each district, and so on down
>>> to the households in each ward.
>>> I tried to build a unique HH identifier with the command "group", but I'd
>>> like to build an HH ID which looks like SSSDDDTTT...HHH,
>>> where SSS is the state identifier (with the correct number of leading
>>> zeros added when necessary; I don't know how to do this), DDD is the
>>> district identifier, and so on.