Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Two datasets: Look for similar observations in the second dataset
From: Amadou DIALLO <[email protected]>
To: [email protected]
Subject: Re: st: Two datasets: Look for similar observations in the second dataset
Date: Tue, 28 Jan 2014 20:48:04 +0100
Hi,
I'm at a conference, so I haven't looked at your data, which seem
complex, but I believe you can find a solution in this presentation by
Prof. Kit Baum (http://economics.adelaide.edu.au/research/seminars/Stata_Lecture4.pdf),
p. 194, particularly the code in nneighbor.ado, which you can customize
for your needs, perhaps combining it with the spirit of Roberto's code
(quoted below). HTH.
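
P.S. For problem (2), the lock rule described in the quoted message below could be sketched roughly as follows. This is only a minimal sketch on made-up data, not a tested solution for your files. It assumes one best match per sample firm (not three), a symmetric window of three years, and that sample firms are processed in the order they are stored, so the result depends on that order. The locals window, nmatched, mfirm# and myear# are names I made up for illustration.

```stata
clear all
set more off

* Fake sample firms (hypothetical data, for illustration only)
input str1 company year size rat
A 2000 100 0.20
B 2002 100 0.20
C 2007 100 0.20
end
gen dum = 1
local nsamp = _N
tempfile samp
save "`samp'"

* Fake universe of matching firms, one row per firm-year
clear
input str1 company year size rat
X 2000 100 0.20
X 2002 100 0.20
X 2007 100 0.20
Y 2000 105 0.30
Y 2002 105 0.30
Y 2007 105 0.30
end
gen dum = 1
tempfile pop
save "`pop'"

local lowlimit .8
local highlimit 1.2
local window 3      // assumption: lock 3 years before and after a match
local nmatched 0    // number of matches recorded so far
tempfile result

quietly {
    forvalues i = 1/`nsamp' {
        use "`samp'" in `i', clear
        rename (company year size rat) (company0 year0 size0 rat0)
        joinby dum using "`pop'"
        drop dum

        keep if year0 == year
        keep if inrange(size, `lowlimit'*size0, `highlimit'*size0)

        * Remove candidates locked by any earlier match
        * (the loop body is skipped while nmatched is 0)
        forvalues k = 1/`nmatched' {
            drop if company == "`mfirm`k''" & abs(year - `myear`k'') <= `window'
        }

        gen ratdif = abs(rat0 - rat)
        sort ratdif company     // ties broken alphabetically by name

        if (_N > 0) {
            keep in 1
            * Record the matched firm and year so later iterations see the lock
            local ++nmatched
            local mfirm`nmatched' = company[1]
            local myear`nmatched' = year[1]
            if (`nmatched' > 1) append using "`result'"
            save "`result'", replace
        }
    }
}

use "`result'", clear
sort year0 company0
list company0 year0 company ratdif, noobs
```

In this toy example X is the closest candidate every year, but once it is matched to A in 2000 it is locked for 1997-2003, so B (2002) falls back to Y, while C (2007) can take X again. Every match creates its own lock, so a second match of the same firm locks a second 7-year period.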
2014-01-27, Torsten Häberle <[email protected]>:
> Sorry, I have to answer again. I kind of solved the problem with the
> missing ratios. I found a way with the if/else command to match based
> on the closest size if the ratios are missing.
>
> However, I couldn't figure out a solution to problem (2), namely:
> different sample firms can be matched to the same matching firm. To
> make my matching perfect, it would be great if the loop could be
> extended in the following way.
>
> - If a sample firm B is matched to a matching firm A in year X (2000),
> then drop out the matching firm A from the universe of all matching
> firms for the years X (2000), X+1 (2001), X+2 (2002), X+3 (2003), X-1
> (1999), X-2 (1998), X-3 (1997).
> - Basically, this means that matching firm A could be matched again
> with another sample firm, but only in OTHER years than those outlined
> above in the example.
> - For example, if there is another sample firm in 2007, then this
> sample firm could be matched again with our matching firm A in year
> 2007. However, if there were a sample firm in 2002, matching firm
> A could NOT be the matching firm again, because it was already matched
> to sample firm B in 2000.
> - In summary, if a matching firm was matched with a sample firm, it
> cannot be a match again in the three years before and the three years
> after it was matched the first time. But it can be a match again in
> all other years. If there were a second match, this second
> "7-year period" would be locked as well.
>
> Sorry, this is an even more complex extension.
>
> Thanks again so much.
>
> 2014-01-27 Roberto Ferrer <[email protected]>:
>> Please follow Statalist policy and provide cross-references when
>> posting in other forums:
>> http://www.stata.com/support/faqs/resources/statalist-faq/#crossposting
>>
>> The following is one way of doing what you want. You could avoid the
>> -forvalues- loop if your database is not too big, but I assume it is.
>> I didn't test speed with a big data set but I hope it gets you
>> started.
>>
>> * ----------------------- begin code -----------------------
>>
>> clear all
>> set more off
>>
>> * Input fake databases (including -dum- variable)
>> input str1 company year size rat
>> A 2012 140 0.2
>> B 2011 200 0.4
>> C 2010 300 0.2
>> D 2010 160 0.5
>> end
>>
>> gen dum = 1
>>
>> tempfile samp
>> save "`samp'"
>>
>> clear all
>> input str4 company year size rat
>> X 2012 150 0.19
>> XX 2012 150 0.20
>> XXX 2012 150 0.22
>> XXXX 2012 150 0.195
>> Y 2010 280 0.9
>> YY 2010 280 0.9
>> Z 2012 50 0.01
>> ZZ 2010 300 0.2
>> T 2011 200 0.95
>> U 2010 300 0.10
>> end
>>
>> gen dum = 1
>>
>> tempfile pop
>> save "`pop'"
>>
>>
>> * Main process
>> tempfile result
>> local lowlimit .8
>> local highlimit 1.2
>>
>> quietly {
>>     forvalues i = 1/4 { // 4 is # observations in sample file
>>         use "`samp'" in `i', clear
>>         rename (company year size rat) =0
>>         joinby dum using "`pop'"
>>         drop dum
>>
>>         keep if year0 == year // compare companies with same year only
>>         keep if inrange(size, `lowlimit'*size0, `highlimit'*size0)
>>
>>         gen ratdif = abs(rat0 - rat)
>>         * Ties in -ratdif- are broken alphabetically by -company- name
>>         isid ratdif company, sort
>>         capture keep in 1/3
>>
>>         if (`i' == 1) save "`result'"
>>         else {
>>             append using "`result'"
>>             save "`result'", replace
>>         }
>>
>>     }
>>
>> }
>>
>> * Check and reshape
>> use "`result'", clear
>> isid company0 ratdif company, sort
>> list, sepby(company0)
>>
>> keep company*
>> list, sepby(company0)
>>
>> by company0: gen id = _n
>> reshape wide company, i(company0) j(id)
>> list, separator(0)
>>
>> *------------------------- end code ------------------------
>>
>> On Sun, Jan 26, 2014 at 4:18 PM, Torsten Häberle
>> <[email protected]> wrote:
>>> Sorry guys. Just wanted to get different opinions since it's a tough
>>> one.
>>>
>>> 2014-01-26 daniel klein <[email protected]>:
>>>> This is a triple post (with slight variations) that has already
>>>> generated two answers here
>>>>
>>>> http://www.talkstats.com/showthread.php/53371-Find-matching-firms-in-another-dataset
>>>>
>>>> http://www.stata-forum.de/post2400.html#p2400
>>>>
>>>>
>>>> Please see the FAQ concerning cross-postings
>>>> (http://www.stata.com/support/faqs/resources/statalist-faq/#crossposting)
>>>>
>>>>
>>>> Best
>>>> Daniel
>>>> *
>>>> * For searches and help try:
>>>> * http://www.stata.com/help.cgi?search
>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>> * http://www.ats.ucla.edu/stat/stata/
--
Amadou B. DIALLO, PhD.
Senior Economist, AfDB.
[email protected]
+21671101789