Statalist



Re: st: AW: Merging database


From   "Eric A. Booth" <[email protected]>
To   [email protected]
Subject   Re: st: AW: Merging database
Date   Wed, 29 Apr 2009 11:54:21 -0500


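Additionally, if Sergio wants to identify which dataset each observation came from after his merge, the -joinby- command with its _merge() option might be useful. Note that -joinby- forms all pairwise combinations within groups, so it behaves like a one-to-one merge only when V is unique in both datasets.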

something like:

 joinby V using "filename", unmatched(both) _merge(label)
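A minimal, self-contained sketch of that idea follows. The variable names (id, x, y, source), the toy values, and the tempfile name wave2 are made up for illustration and are not from Sergio's data; it is meant to be run as a do-file so the tempfile macro stays defined.

```stata
* Hypothetical "using" dataset: ids 1-3
clear
input id x
1 10
2 20
3 30
end
tempfile wave2
save `wave2'

* Hypothetical "master" dataset: ids 2-4
clear
input id y
2 200
3 300
4 400
end

* Keep unmatched observations from both sides and record the origin in -source-
* (1 = master only, 2 = using only, 3 = in both)
joinby id using `wave2', unmatched(both) _merge(source)
sort id
list

* Sanity checks for this toy example: 4 ids in total, 2 of them in both files
assert _N == 4
count if source == 3
assert r(N) == 2
```

Here id is unique in both files, so no extra pairwise combinations are created.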




~Eric


__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754
Fax: +979.845.0249
http://ppri.tamu.edu



On Apr 29, 2009, at 11:41 AM, Eric A. Booth wrote:

To add to Jochen's comment:

If you were hoping to have a new 'ID' variable that keeps the information from all three ID variables V1, V2, and V3, you could create a string variable. Here is an example:

******************

clear
input V1  V2  V3
1     .       1
2     .       2
3    3       3
4    4        .
.     5       5
6     .       6
end
// -sencode- (used below) is user-written; install it with: ssc install sencode
foreach x of varlist V* {
	recode `x' (.=99)  // <-- So that -regexr- isn't tripped up later
	tostring `x', replace
}
gen str10 v_combined = V1+"_"+V2+"_"+V3
gen v_combined2 = regexr(v_combined, "99", "x")
sencode v_combined2, gene(uniqueID) gsort(+v_combined2) label(id)
list

******************
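The example above relies on 99 as a placeholder, which would misfire if a real ID contained "99", and -regexr- replaces only the first match. As a hedged alternative sketch using only built-in commands, the missing values can be mapped to "x" directly while the string is built. The names key and uid are hypothetical, and -egen, group()- is swapped in here for -sencode-:

```stata
clear
input V1 V2 V3
1 . 1
2 . 2
3 3 3
4 4 .
. 5 5
6 . 6
end

* Build the combined key, writing "x" wherever an ID is missing
gen str20 key = cond(missing(V1), "x", string(V1)) + "_" + ///
                cond(missing(V2), "x", string(V2)) + "_" + ///
                cond(missing(V3), "x", string(V3))

* Number the distinct keys 1, 2, ... in sort order of key
egen uid = group(key)
list

* Checks on this toy data: one key per observation, and the expected pattern
isid uid
assert key == "4_4_x" if V1 == 4
assert key == "x_5_5" if missing(V1)
```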


~Eric



On Apr 29, 2009, at 11:36 AM, Jochen Späth wrote:

Hello Sergio,

I'm not quite sure what your problem is; it might help if you were a little more precise.

Below, I assume that the example you gave shows your data AFTER the three data sets have been merged, with v1 coming from the first, v2 from the second and v3 from the third, and with v1, v2 and v3 all denoting the same ID. If this is the case, you could

-replace v1 = v2 if v1 == . & v2 != .-
-replace v1 = v3 if v1 == . & v2 ==. & v3 != .-
-count if v1 == .- /* should return 0, otherwise there are observations in your data that are not identified by any of your three ID variables. */
-drop v2 v3- /* of course, only if you got all IDs caught in v1 */
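The same coalescing step can be sketched in a runnable form with -egen-'s row functions. This is only an illustration under the assumption stated above (the three variables hold the same ID whenever they are not missing). The variable names follow Sergio's capitalised V1-V3, since Stata names are case-sensitive, and id, id_min and id_max are hypothetical names:

```stata
clear
input V1 V2 V3
1 . 1
2 . 2
3 3 3
4 4 .
. 5 5
6 . 6
end

* First nonmissing value across the three ID variables
egen id = rowfirst(V1 V2 V3)

* rowmin() and rowmax() ignore missing values, so they agree
* exactly when the nonmissing IDs in a row do not conflict
egen id_min = rowmin(V1 V2 V3)
egen id_max = rowmax(V1 V2 V3)

assert !missing(id)        // every observation received an ID
assert id_min == id_max    // no row carries two different IDs

drop id_min id_max V1 V2 V3
list
```

If either -assert- fails on real data, the rows in question need to be inspected before the original variables are dropped.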

HTH,
Jochen

-----Original Message-----
From: [email protected] [mailto:[email protected] ] On behalf of "SERGIO M. AFCHA CHÁVEZ"
Sent: Wednesday, 29 April 2009 17:55
To: [email protected]
Subject: st: Merging database

Dear Statalisters,

I have a small problem merging a database. I have three variables, one
for each of 3 years, each showing an ID:


V1  V2  V3
1     .       1
2     .       2
3    3       3
4    4        .
.     5       5
6     .       6


I need only one ID variable. How can I obtain one column with all the
ID numbers?

Thanks in advance for your help.

Sergio


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



