Lindsay,
try:
gen double hhidpn = hhid + pn
format hhidpn %15.0f
Tim
On Wed, Sep 23, 2009 at 10:00 AM, Lindsay <[email protected]> wrote:
> I am using Stata/SE 10.1 and having problems executing what should be
> a really simple operation. My dataset has household id (hhid) and
> person number (pn) variables in string format. I need to combine them
> into one numeric unique identifer of the form hhidpn = hhid*1000+pn in
> order to merge them with other data. After I destring the original
> IDs (which appears to work fine) and perform this operation, some of
> the identifiers are duplicates. It looks like Stata is somehow adding
> some of the numbers incorrectly (mostly they are +/-1 from what they
> should be). I have copied some of the output below.
>
> I've also tried adding the two string variables first and then
> destringing and I get the same problem. The string variable with both
> IDs combined looks right, but after I destring some are wrong. Any
> suggestions what might be going on?
>
> Thanks, Lindsay
>
> /**** FIRST METHOD (DESTRING THEN ADD) ****/
> . use "${dir}Geographic Identifiers\RGEO.dta", clear;
>
> . destring HHID, gen(hhid) float;
> HHID has all characters numeric; hhid generated as float
>
> . destring PN, gen(pn) float;
> PN has all characters numeric; pn generated as byte
>
> . replace hhid = hhid*1000;
> (30712 real changes made)
>
> . format hhid %9.0f;
>
> . gen hhidpn = hhid + pn;
>
> . format hhidpn %9.0f;
>
> . sort hhidpn
>
> . list HHID hhid PN pn hhidpn if hhidpn==hhidpn[_n-1]hhidpn==hhidpn[_n+1]
>
> +-------------------------------------------+
> HHID hhid PN pn hhidpn
> -------------------------------------------
> 1501. 016973 16973000 031 31 16973032
> 1502. 016973 16973000 032 32 16973032
> 1641. 017530 17530000 040 40 17530040
> 1642. 017530 17530000 041 41 17530040
> 1661. 017641 17641000 011 11 17641012
> -------------------------------------------
> 1662. 017641 17641000 012 12 17641012
> 1666. 017646 17646000 040 40 17646040
> 1667. 017646 17646000 041 41 17646040
> 1679. 017707 17707000 040 40 17707040
> 1680. 017707 17707000 041 41 17707040
> -------------------------------------------
> 1832. 018435 18435000 040 40 18435040
> 1833. 018435 18435000 041 41 18435040
> 1849. 018482 18482000 040 40 18482040
> 1850. 018482 18482000 041 41 18482040
> 1854. 018494 18494000 020 20 18494020
> ....
>
> /**** SECOND METHOD (ADD THEN DESTRING) ****/
> . use "${dir}Geographic Identifiers\RGEO.dta", clear;
>
> . gen HHIDPN = HHID + PN;
>
> . destring HHIDPN, gen(hhidpn) float;
> HHIDPN has all characters numeric; hhidpn generated as float
>
> . format hhidpn %9.0f;
>
> . sort hhidpn
>
> . list HHID PN HHIDPN hhidpn if hhidpn==hhidpn[_n-1]hhidpn==hhidpn[_n+1]
>
> +--------------------------------------+
> HHID PN HHIDPN hhidpn
> --------------------------------------
> 1501. 016973 031 016973031 16973032
> 1502. 016973 032 016973032 16973032
> 1641. 017530 040 017530040 17530040
> 1642. 017530 041 017530041 17530040
> 1661. 017641 011 017641011 17641012
> --------------------------------------
> 1662. 017641 012 017641012 17641012
> 1666. 017646 040 017646040 17646040
> 1667. 017646 041 017646041 17646040
> 1679. 017707 040 017707040 17707040
> 1680. 017707 041 017707041 17707040
> --------------------------------------
> 1832. 018435 040 018435040 18435040
> 1833. 018435 041 018435041 18435040
> 1849. 018482 040 018482040 18482040
> 1850. 018482 041 018482041 18482040
> 1854. 018494 020 018494020 18494020
> ....
>
>
>
> --
> Lindsay Sabik
> Doctoral Candidate in Health Policy
> Harvard University
> [email protected]
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/