I have various comments on this code.
1. -foreach x in V*- won't work. Eric is probably thinking of
-foreach x of var V*- but in this case -foreach v in V1 V2 V3- takes no more thought.
2. Eric wants to -recode- missings and also concatenate the identifiers. If so, it is easier to go
egen V_combined = concat(V1 V2 V3), p(_)
replace V_combined = subinstr(V_combined, ".", "x", .)
That does what Eric's lines
foreach x in V* {
recode `x' (.=99) // <-- So that -regexr- isn't tripped up later
tostring `x', replace
}
gen str10 v_combined = V1+"_"+V2+"_"+V3
gen v_combined2 = regexr(v_combined, "99", "x")
seem intended to do.
Note that using 99, even temporarily, is dangerous unless one can be sure that 99 is not a legitimate identifier. In any case, why recode? A variable with values like "1 . ." is a satisfactory composite -- if that is what is needed.
Nick
[email protected]
Eric A. Booth
To add to Jochen's comment:
If you were hoping to have a new 'ID' variable that keeps the
information from all the ID variables V1, V2, and V3, you could create
a string variable...here are some examples:
******************
clear
input V1 V2 V3
1 . 1
2 . 2
3 3 3
4 4 . . 5 5
6 . 6
end
//
foreach x in V* {
recode `x' (.=99) // <-- So that -regexr- isn't tripped up later
tostring `x', replace
}
gen str10 v_combined = V1+"_"+V2+"_"+V3
gen v_combined2 = regexr(v_combined, "99", "x")
sencode v_combined2, gene(uniqueID) gsort(+v_combined2) label(id)
list
On Apr 29, 2009, at 11:36 AM, Jochen Späth wrote:
> Hello Sergio,
>
> I'm not quite sure of what your problem is, maybe it would help if
> you were a little more precise.
>
> Below, I assumed that the example you gave is AFTER your three data
> sets have been merged, with v1 coming from the first, v2 from the
> second and v3 from the third and with v1, v2 and v3 denoting all the
> same ID. If this is the case you could
>
> -replace v1 = v2 if v1 == . & v2 != .-
> -replace v1 = v3 if v1 == . & v2 ==. & v3 != .-
> -count if v1 == .- /* should return 0, otherwise there are
> observations in your data that are not uniquely determined by either
> of your three ID variables.*/
> -drop v2 v3- /* of course, only if you got all IDs caught in v1 */
>
> HTH,
> Jochen
>
> -----Ursprüngliche Nachricht-----
> Von: [email protected] [mailto:[email protected]
> ] Im Auftrag von "SERGIO M. AFCHA CHÁVEZ"
> Gesendet: Mittwoch, 29. April 2009 17:55
> An: [email protected]
> Betreff: st: Merging database
>
> Dear Statlisters,
>
> I have a little problem merging a data base. I have variables for 3
> years showing an ID:
>
>
> V1 V2 V3
> 1 . 1
> 2 . 2
> 3 3 3
> 4 4 .
> . 5 5
> 6 . 6
>
>
> I need only one ID variable. How can I obtain one column with all the
> ID numbers?
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/