RE: st: Re: Creating a unique identifier from a string and byte variable
From: "Gauri Khanna" <[email protected]>
To: [email protected]
Subject: RE: st: Re: Creating a unique identifier from a string and byte variable
Date: Tue, 03 Apr 2007 09:22:39 +0000
Dear Sergiy,
Thanks for the prompt reply. I tried both options, and the first one worked.
I checked with -duplicates report my_id- and there are no duplicates, so a
combination of "caseid" and "bidx" is indeed unique. My problem is solved,
but I still wanted to show what went wrong with the second command.
The second option, quoted here from your message:
    In your example bidx is somewhere between 0 and 999, so one can:
    gen my_id=string(caseid)*1000+bidx
    which will create a number identifier
    You must check that caseid*1000 still can be stored completely
    (without loss) in Stata.
gave the following error:
gen my_id=string(caseid)*1000+bidx
type mismatch
r(109);
I don't think it is possible to multiply a number with a string variable.
(Variable bidx only has two values, 1 and 2.)
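A note on the error above, with a minimal sketch. In Stata, string() converts a number to a string, so applying it to caseid (already str15) and then multiplying by 1000 is what triggers r(109). The reverse conversion is real(), but it is assumed here to return missing for values with embedded spaces such as the caseid values listed in this thread, so a numeric identifier may be easier to obtain with egen's group() function. The three observations below are illustrative stand-ins, not the poster's data, and caseid_num is a hypothetical variable name.

```stata
* Toy data standing in for the poster's variables (illustrative values only)
clear
input str15 caseid byte bidx
"2116 56 4" 1
"2116 56 4" 2
"2117 62 3" 1
end

* real() turns numeric-looking strings into numbers; text that is not a
* valid number (here, digits separated by spaces) becomes missing
generate double caseid_num = real(caseid)
count if missing(caseid_num)

* group() assigns 1, 2, 3, ... to each distinct (caseid, bidx) combination
egen long idchild = group(caseid bidx)

* isid exits with an error if idchild does not uniquely identify observations
isid idchild
list caseid bidx idchild
```

The group() identifier carries no information about the original values, so caseid and bidx should be kept alongside it.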
Thank you.
Gauri
From: "Sergiy Radyakin" <[email protected]>
Reply-To: [email protected]
To: <[email protected]>
Subject: st: Re: Creating a unique identifier from a string and byte variable
Date: Tue, 3 Apr 2007 11:01:22 +0200
Hi,
You can always create a string identifier from variables of different types.
In your example:
gen my_id=caseid+"#"+string(bidx)
Notice that the symbol "#" separates the two sources, which resolves this
frequent problem:
caseid=123 bidx=4 => my_id=1234
caseid=12 bidx=34 => my_id=1234
Use any symbol instead of "#" which is not in your identifiers.
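As a minimal sketch of the separator approach, assuming a string caseid and a byte bidx as described in this thread (the three observations are illustrative stand-ins, not the real data):

```stata
* Toy data standing in for the poster's variables (illustrative values only)
clear
input str15 caseid byte bidx
"2 4 89 7" 1
"2 4 89 7" 2
"2 5 413 8" 1
end

* Build a string identifier; "#" is assumed not to occur in caseid
generate str20 my_id = caseid + "#" + string(bidx)

* isid exits with an error if my_id does not uniquely identify observations
isid my_id
list caseid bidx my_id
```

If isid reports that my_id is not unique, the underlying combination of caseid and bidx is itself duplicated, and no choice of separator will fix that.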
Another technique can be used when one of the ids is of low dimension.
In your example bidx is somewhere between 0 and 999, so one can:
gen my_id=string(caseid)*1000+bidx
which will create a numeric identifier.
You must check that caseid*1000 can still be stored completely (without loss)
in Stata.
Notice that the identifiers will be "unique" (as you requested) only if the
combination of caseid and bidx is unique.
Hint: use -compress- to reduce the storage types to simpler ones, e.g.
long --> byte (if possible).
Regards,
Sergiy Radyakin
----- Original Message -----
From: "Gauri Khanna" <[email protected]>
To: <[email protected]>
Sent: Tuesday, April 03, 2007 10:42 AM
Subject: st: Creating a unique identifier from a string and byte variable
Dear Stata List,
I am using cross-sectional data with around 31,000 observations. I would
like to create a unique identifier called "idchild" composed of two
variables: caseid (string variable) and bidx (byte). I have described the
variables below and listed them as well (observations 29 & 30, 37 & 38, and
960 & 961 have the same caseid but different bidx).
. des caseid
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
caseid          str15  %15s                   case identification
. des bidx
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
bidx            byte   %8.0g                  birth column number
. list caseid bidx
+------------------------+
| caseid bidx |
|------------------------|
1. | 2 1 66 4 1 |
2. | 2 1 66 7 1 |
3. | 2 1 93 7 1 |
4. | 2 1 111 4 1 |
5. | 2 1 147 2 1 |
|------------------------|
6. | 2 1 174 7 1 |
7. | 2 1 201 2 1 |
8. | 2 1 237 2 1 |
9. | 2 1 255 3 1 |
10. | 2 2 40 2 1 |
|------------------------|
11. | 2 2 65 4 1 |
12. | 2 2 95 2 1 |
13. | 2 2 105 11 1 |
14. | 2 2 120 4 1 |
15. | 2 2 130 4 1 |
|------------------------|
16. | 2 2 145 7 1 |
17. | 2 3 13 2 1 |
18. | 2 3 25 4 1 |
19. | 2 3 55 2 1 |
20. | 2 3 91 6 1 |
|------------------------|
21. | 2 3 97 4 1 |
22. | 2 3 97 8 1 |
23. | 2 3 121 6 1 |
24. | 2 3 139 2 1 |
25. | 2 3 145 3 1 |
|------------------------|
26. | 2 3 157 3 1 |
27. | 2 3 181 3 1 |
28. | 2 4 62 2 1 |
29. | 2 4 89 7 1 |
30. | 2 4 89 7 2 |
|------------------------|
31. | 2 4 116 3 1 |
32. | 2 4 134 2 1 |
33. | 2 4 197 5 1 |
34. | 2 4 251 3 1 |
35. | 2 4 260 3 1 |
|------------------------|
36. | 2 5 277 8 1 |
37. | 2 5 413 8 1 |
38. | 2 5 413 8 2 |
39. | 2 5 429 4 1 |
40. | 2 5 445 4 1 |
|------------------------|
41. | 2 5 461 4 1 |
42. | 2 5 469 2 1 |
43. | 2 5 501 4 1 |
44. | 2 5 509 2 1 |
45. | 2 5 533 2 1 |
|------------------------|
46. | 2 5 549 2 1 |
47. | 2 5 557 2 1 |
48. | 2 6 93 2 1 |
49. | 2 6 159 2 1 |
50. | 2 6 311 2 1 |
|------------------------|
51. | 2 7 7 3 1 |
52. | 2 7 20 3 1 |
53. | 2 7 85 5 1 |
54. | 2 7 98 3 1 |
........
959. | 2116 52 4 1
960. | 2116 56 4 1
|------------------------------------|
961. | 2116 56 4 2
962. | 2116 60 3 1
963. | 2116 84 4 1
964. | 2116 112 4 1
965. | 2117 10 2 1
|------------------------------------|
966. | 2117 26 5 1
967. | 2117 50 5 1
968. | 2117 54 8 1
969. | 2117 58 4 1
970. | 2117 62 3 1
|------------------------------------|
971. | 2117 62 3 2
972. | 2117 86 5 1
973. | 2117 86 6 1
974. | 2117 130 2 1
975. | 2117 134 2 1
--Break--
I tried the following:
egen idchild = concat(caseid, bidx)
invalid syntax
r(198);
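A note on this error, with a minimal sketch. The concat() function of egen takes a varlist, so the variable names are separated by spaces rather than a comma; the comma in the command above is the likely cause of r(198). concat() also accepts a mix of string and numeric variables, and its punct() option inserts a separator between the pieces. The three observations are illustrative stand-ins for the real data, and "#" is assumed not to occur in caseid.

```stata
* Toy data standing in for the poster's variables (illustrative values only)
clear
input str15 caseid byte bidx
"2 4 89 7" 1
"2 4 89 7" 2
"2 5 413 8" 1
end

* Space-separated varlist; punct() puts "#" between caseid and bidx
egen idchild = concat(caseid bidx), punct("#")

* Report whether any value of idchild occurs more than once
duplicates report idchild
list caseid bidx idchild
```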
I realise that I am trying to concatenate two different *types* of
variables and so I then tried the following:
decode bidx, gen(childbidx)
bidx not labeled
r(182);
Then I tried changing the caseid variable:
encode caseid, gen(childcase)
. des childcase
              storage  display     value
variable name   type   format      label      variable label
-------------------------------------------------------------------------------
childcase       long   %15.0g      childcase  case identification
. egen idchild = concat(childcase, bidx)
invalid syntax
r(198);
How can I create a unique idchild? I am using Stata 9.2.
Thank you for your help.
Regards,
Gauri
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/
*