Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: st: merge and nearest value
From
Francesco <[email protected]>
To
[email protected]
Subject
Re: st: merge and nearest value
Date
Sun, 19 Aug 2012 12:07:11 +0200
Dear David,
Many thanks for your interesting suggestion!
Unfortunately it cannot work because
1) There can be numerous duplicates of TYPE and DATE in A. However in
B TYPE and DATE uniquely idenfity observations
2) I have several million of observations, therefore using a joinby
would probably destroy my computer's Ram ...
:-(
On 19 August 2012 05:13, David Kantor <[email protected]> wrote:
> First, I presume that in A, TYPE and DATE uniquely identify observations.
>
> I suggest you do a -joinby- on TYPE. This will create a large multitude of
> observations.
> Then for each distinct TYPE and DATE combination, compute the difference and
> then select (by TYPE and DATE) the one with the minimal difference.
> You can do appropriate sorting to break ties in the manner you desire.
>
> HTH
> --David
>
>
> At 07:22 PM 8/18/2012, Francesco wrote:
>>
>> Dear Statalist,
>>
>> I wish again that you could help me with this particular merging
>> problem...
>>
>> Let say I have a dataset A as:
>>
>> TYPE DATE
>> A 2
>> A 5
>> A 20
>> B 10
>> B 2
>>
>>
>> and I have another dataset B as :
>>
>>
>> TYPE Special_Date
>> A 2
>> A 6
>> A 20
>> A 22
>> B 5
>> B 6
>>
>> The question is : I would like to obtain the difference between the
>> date of each observation in A and the closest special date in B with
>> the same type. In case of ties I would take the latest date of the
>> two.
>>
>> For example I would obtain here
>>
>> TYPE DATE Difference
>> A 2 0=2-2
>> A 5 -1=5-6
>> A 20 0=20-20
>> B 10 +4=10-6
>> B 2 -3=2-5
>>
>>
>> I was thinking of reshaping the dataset B in order to have the special
>> dates in column for each type, merging then on type with A, creating a
>> difference variable between the date and each special date, and taking
>> the minimum...
>> But this involves creating a lot of variables and maybe there is
>> something more simple ?
>>
>> Many thanks for your suggestions,
>>
>> Best Regards,
>
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/