Re: st: mysterious inaccuracy when adding big numbers
From: Richard Goldstein <[email protected]>
To: [email protected]
Subject: Re: st: mysterious inaccuracy when adding big numbers
Date: Fri, 08 Apr 2011 11:00:29 -0400
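1. If you type -search precision- you will learn about this issue. In
short, -generate- creates a float by default, and a float holds integers
exactly only up to 16,777,216, so your 9-digit IDs are being rounded.
2. IDs are generally best stored as strings, however, so I would do
something like the following (zero-padding each component so that it
keeps a fixed width and the IDs stay unique):
gen str9 id = string(province,"%03.0f") + string(district,"%02.0f") + string(commune,"%02.0f") + string(household,"%02.0f")
3. If you need the ID to be numeric, insert "double" between your "gen"
and your "ID".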
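
As a rough sketch of the points above (typed by hand, not run on the
original data): the block below rebuilds the first row of the listing
and compares a float ID, a double ID, and a string ID. The names
ID_float, ID_double and ID_str are made up for illustration, and the
widths 3, 2, 2, 2 assume the digit counts given in the question.
string() with a display format is used so that each component is
zero-padded.

```stata
* Illustration only: one observation copied from row 1 of the listing
clear
set obs 1
gen province  = 101
gen district  = 1
gen commune   = 3
gen household = 1

* Default storage type is float (24-bit mantissa). Near 1e8 the
* representable values are 8 apart, so the sum is rounded to the
* nearest multiple of 8.
gen ID_float = province*1000000 + district*10000 + commune*100 + household

* double holds integers exactly up to 2^53, so a 9-digit ID is safe
gen double ID_double = province*1000000 + district*10000 + commune*100 + household

* String ID with zero-padded, fixed-width components (3+2+2+2 = 9)
gen str9 ID_str = string(province, "%03.0f") + string(district, "%02.0f") ///
    + string(commune, "%02.0f") + string(household, "%02.0f")

format ID_float ID_double %12.0f
list ID_float ID_double ID_str

* The same rounding can be seen directly with float():
display %12.0f float(101010301)
```

Here ID_float should show 101010304, as in the listing, while ID_double
and ID_str keep the final 01. The spacing of 8 between neighbouring
floats in this range is also why the last two digits in the listing are
always 04, 12 or 20.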
Rich
On 4/8/11 10:49 AM, Trang Nguyen wrote:
> Hi.
>
> I am working on a dataset with households as observations that are
> nested in communes, districts and provinces. I have variables
> - province: province number (3 digits)
> - district: district number within each province (max 2 digits)
> - commune: commune number within each district (max 2 digits)
> - household: household number within each commune (max 2 digits)
>
> I wanted to make a unique ID for each household that doesn't repeat
> across communes, districts and provinces, that also shows me all the
> province/district/commune information. So I did this:
> gen ID = province*1000000 + district*10000 + commune*100 + household
>
> I got a variable ID that is correct for the province, district and
> commune components, but the last two digits do not match the value of
> the household variable. Instead they are 04 or 12 or 20.
>
> Could someone please help me figure out why this is so? My output is
> below. Thanks much!
>
> . gen ID = province*1000000 + district*10000 + commune*100 + household
>
> . count if ID != province*1000000 + district*10000 + commune*100 + household
> 7932
>
> . format ID %15.0g
>
> . list province district commune household ID in 1/20
>
>      +------------------------------------------------------+
>      | province   district   commune   househ~d          ID |
>      |------------------------------------------------------|
>   1. |      101          1         3          1   101010304 |
>   2. |      101          1         3          2   101010304 |
>   3. |      101          1         3          4   101010304 |
>   4. |      101          1         3          5   101010304 |
>   5. |      101          1        17          4   101011704 |
>      |------------------------------------------------------|
>   6. |      101          1        17          5   101011704 |
>   7. |      101          1        17          6   101011704 |
>   8. |      101          1        17          8   101011712 |
>   9. |      101          1        17          9   101011712 |
>  10. |      101          1        17         10   101011712 |
>      |------------------------------------------------------|
>  11. |      101          1        17         11   101011712 |
>  12. |      101          3         3          3   101030304 |
>  13. |      101          3         3          4   101030304 |
>  14. |      101          3         3          5   101030304 |
>  15. |      101          3         3          6   101030304 |
>      |------------------------------------------------------|
>  16. |      101          3         3          7   101030304 |
>  17. |      101          5        11          3   101051104 |
>  18. |      101          5        11          6   101051104 |
>  19. |      101          5        11          9   101051112 |
>  20. |      101          5        11         10   101051112 |
>      +------------------------------------------------------+
>
> . codebook ID
>
> -----------------------------------------------------------------------------------
> ID                                                                      (unlabeled)
> -----------------------------------------------------------------------------------
>
>                   type:  numeric (float)
>
>                  range:  [1.010e+08,8.231e+08]        units:  1
>          unique values:  1308                     missing .:  0/8341
>
>                   mean:   4.6e+08
>               std. dev:   2.6e+08
>
>            percentiles:        10%       25%       50%       75%       90%
>                            1.1e+08   2.1e+08   4.1e+08   7.1e+08   8.1e+08
>
> Thanks much!
>
> Trang
>
> ------------------------
> Trang Nguyen
> Doctoral student
> Johns Hopkins School of Public Health