Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
pairing unpaired data [was: Re: st: any idea?]
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: pairing unpaired data [was: Re: st: any idea?]
Date: Tue, 7 Jan 2014 18:49:08 +0000

I changed the thread title, which was not informative.
You need a method. Some predictable pitfalls are that for some bones
there is no acceptable match and that for others there could be two or
more acceptable matches. I don't think there is a canned solution
independent of your spelling out what the method is.
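
A minimal sketch of one such method, offered as an assumption rather than
anything agreed in this thread: form every left-right pair within a bone
type using -joinby-, compute the difference in length, and flag pairs within
a tolerance. The 5-unit tolerance and the names id_left, id_right, diff,
candidate, n_left, n_right and tag_left are all hypothetical choices made
for illustration.

```stata
* Sketch only: the tolerance and all new variable names are assumptions.
clear
input int id str7 type str5 side double length
 1 femur left   18
 2 femur left   65.85
 3 femur left   69.1
 4 femur left  130
 5 femur left  131.2
 6 femur left  143
 7 femur left  145
 8 femur left  160
 9 femur left  183
10 femur left  200
11 femur right  28
12 femur right  80
13 femur right  96.5
14 femur right 126
15 femur right 127
16 femur right 128
17 femur right 138
18 femur right 146
19 femur right 148
20 femur right 200
end

* Save the right-side bones under new names in a temporary file.
tempfile right
preserve
keep if side == "right"
rename (id length) (id_right length_right)
drop side
save `right'
restore

* Keep the left-side bones and form all left-right pairs within bone type.
keep if side == "left"
rename (id length) (id_left length_left)
drop side
joinby type using `right'

* Difference in length; a pair is a candidate if within the tolerance.
* Pairs with a missing length are never flagged, as abs(.) exceeds any number.
local tol 5
gen double diff = length_left - length_right
gen byte candidate = abs(diff) <= `tol'

* Number of candidate partners for each left bone and each right bone.
bysort id_left: egen n_left = total(candidate)
bysort id_right: egen n_right = total(candidate)

* Candidate pairs, then left bones with no candidate at all.
sort id_left id_right
list id_left id_right length_left length_right diff n_left n_right ///
    if candidate, noobs sepby(id_left)
egen byte tag_left = tag(id_left)
list id_left length_left if tag_left & n_left == 0, noobs
```

With the femur lengths quoted below and this tolerance, left femur 4
(length 130) has three candidate partners, while left femur 8 (length 160)
has none, so a rule for ties and a rule for unmatched bones are both still
needed.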
Nick
[email protected]
On 7 January 2014 18:20, Y.R.E. Retamal <[email protected]> wrote:
> Thank you very much Eric and Nick for the advice.
>
> I will try to give a clearer idea of what I want to do:
> For example, I have the following database of human bones. I removed missing
> values of length for a better understanding:
>
> id  type   side   length    id  type     side   length
>  1  femur  left    18       21  humerus  left    13
>  2  femur  left    65.85    22  humerus  left    56
>  3  femur  left    69.1     23  humerus  left    92
>  4  femur  left   130       24  humerus  left   126
>  5  femur  left   131.2     25  humerus  left   154
>  6  femur  left   143       26  humerus  left   170
>  7  femur  left   145       27  humerus  left   198
>  8  femur  left   160       28  humerus  left   228
>  9  femur  left   183       29  humerus  left   230
> 10  femur  left   200       30  humerus  left   232
> 11  femur  right   28       31  humerus  right  238
> 12  femur  right   80       32  humerus  right   10
> 13  femur  right   96.5     33  humerus  right   66
> 14  femur  right  126       34  humerus  right  123
> 15  femur  right  127       35  humerus  right  128
> 16  femur  right  128       36  humerus  right  143
> 17  femur  right  138       37  humerus  right  200
> 18  femur  right  146       38  humerus  right  228
> 19  femur  right  148       39  humerus  right  230
> 20  femur  right  200       40  humerus  right  241
>
> These data belong to a commingled skeletal collection and some right bones
> (femurs and humeri, respectively) should match with a left bone, but I do
> not know which bones match. Following the idea that a right bone from the same
> skeleton should have (approximately) the same length as its respective
> left bone, I want to subtract the length of each right femur from that of each
> left femur, with the aim of finding which right femur matches which left femur,
> i.e. has the same or almost the same length, so that the difference would be
> zero or near zero. The same procedure would apply to the humeri (and other
> bones).
>
> If you have any idea how to do this, please let me know.
>
> Best wishes
>
> Rodrigo
>
>
> On 2014-01-05 23:54, Nick Cox wrote:
>>
>> Eric Booth gives very good advice.
>>
>> Your problem with the link to the Stata Journal file you were directed
>> to may be just that you didn't step past the standard material
>> bundled with every reprint file.
>>
>> Nick
>> [email protected]
>>
>>
>> On 5 January 2014 21:03, Eric Booth <[email protected]> wrote:
>>>
>>> The Stata Journal link you mention that Nick sent you works for me. The
>>> title of the article is "Stata tip 71: The problem of split identity, or how
>>> to group dyads" by Nick J. Cox, so maybe you can google that title if your
>>> browser isn’t navigating to it properly.
>>>
>>>
>>>
>>> Your example dataset doesn’t align with your desired dataset.
>>>
>>> How do we know what is x and what is j in the first 20 obs of your
>>> example data (see below)? (Also note the Statalist FAQ about not sending
>>> attachments.)
>>>
>>> You need some kind of identifier that ties, for example, obs or id 1
>>> (even though it’s missing) to the other right side femur observation of
>>> interest (is it id 7 or id 9 or ??).
>>>
>>>
>>> **your example data:
>>>
>>> id type side length
>>> 1 femur right
>>> 2 femur left
>>> 3 femur right
>>> 4 femur left
>>> 5 femur right 373
>>> 6 femur left 416
>>> 7 femur right 138
>>> 8 femur left
>>> 9 femur right 270
>>> 10 femur left
>>> 11 femur left
>>> 12 femur right
>>> 13 femur left
>>> 14 femur right
>>> 15 femur left 281
>>> 16 femur right
>>> 17 femur left 160
>>> 18 femur left
>>> 19 femur right
>>> 20 femur left
>>>
>>>
>>> We can’t just sort by ‘type’ and ‘side’ to get a dataset of the same
>>> structure as you presented initially, so I think you need to provide more
>>> information about this. (also, if the rule is, as you imply, to sort by
>>> type and side and then subtract every third observation from each other then
>>> what do we do with missing 'length' and missing ‘side’?)
>>>
>>> If the rule is that id 1 and id 2 are a pair then why does the
>>> left/right ordering suddenly change starting around id 17?
>>>
>>> - Eric
>>>
>>>
>>>
>>>
>>> On Jan 5, 2014, at 2:46 PM, Y.R.E. Retamal <[email protected]> wrote:
>>>
>>>> Dear Guys
>>>>
>>>> Some weeks ago, Red Owl and Nick helped me with some loops for my work.
>>>> I have tried to run some of the suggestions on my dataset, but I had some
>>>> difficulties.
>>>> Here is the basic structure of my dataset and my question:
>>>>
>>>> I want to create some new variables containing the difference between
>>>> the length of two individuals from different groups:
>>>>
>>>> id side length newvar1 newvar2 newvar3
>>>> 1 right x x-j x-k x-l
>>>> 2 right y y-j y-k y-l
>>>> 3 right z z-j z-k z-l
>>>> 4 left j j-x j-y j-z
>>>> 5 left k k-x k-y k-z
>>>> 6 left l l-x l-y l-z
>>>>
>>>> Red Owl suggested that I follow this example:
>>>>
>>>>>>> *** BEGIN CODE ***
>>>>>>> * Build demo data set.
>>>>>>> clear
>>>>>>> * Length is capitalized to distinguish from length().
>>>>>>> input id str5(side) Length
>>>>>>> 1 right 10
>>>>>>> 2 right 15
>>>>>>> 3 right 11
>>>>>>> 4 left 13
>>>>>>> 5 left 10
>>>>>>> 6 left 12
>>>>>>> end
>>>>>>> * newvar1: right bones (obs 1-3) minus left obs 4;
>>>>>>> * left bones (obs 4-6) minus right obs 1.
>>>>>>> gen byte newvar1 = .
>>>>>>> forval i = 1/3 {
>>>>>>>     replace newvar1 = Length[`i'] - Length[4] in `i'
>>>>>>> }
>>>>>>> forval i = 4/6 {
>>>>>>>     replace newvar1 = Length[`i'] - Length[1] in `i'
>>>>>>> }
>>>>>>> * newvar2: right bones minus left obs 5; left bones minus right obs 2.
>>>>>>> gen byte newvar2 = .
>>>>>>> forval i = 1/3 {
>>>>>>>     replace newvar2 = Length[`i'] - Length[5] in `i'
>>>>>>> }
>>>>>>> forval i = 4/6 {
>>>>>>>     replace newvar2 = Length[`i'] - Length[2] in `i'
>>>>>>> }
>>>>>>> * newvar3: right bones minus left obs 6; left bones minus right obs 3.
>>>>>>> gen byte newvar3 = .
>>>>>>> forval i = 1/3 {
>>>>>>>     replace newvar3 = Length[`i'] - Length[6] in `i'
>>>>>>> }
>>>>>>> forval i = 4/6 {
>>>>>>>     replace newvar3 = Length[`i'] - Length[3] in `i'
>>>>>>> }
>>>>>>> list, noobs sep(0)
>>>>>>> *** END CODE ***
>>>>
>>>>
>>>> However, my dataset is much longer, and this approach is difficult to
>>>> apply to it. I hope you can help by giving me more ideas.
>>>> I am sending you an extract of my dataset in .xlsx format.
>>>> Also, the webpage suggested by Nick to review the discussion about the
>>>> topic (http://www.stata-journal.com/sjpdf.html?articlenum=dm0043) redirects
>>>> me to a nonsensical file to download. Please give me the issue number of
>>>> the journal so that I can read the discussion.
>>>>
>>>> Happy new year to all of you
>>>>
>>>> Rodrigo
>>>>
>>>>
>>>> On 2013-12-15 22:39, Y.R.E. Retamal wrote:
>>>>>
>>>>> Dear Red Owl and Nick
>>>>> Thank you very much for your response. The code works perfectly, just
>>>>> as I need.
>>>>> Best wishes
>>>>> Rodrigo
>>>>> On 2013-12-14 22:31, Nick Cox wrote:
>>>>>>
>>>>>> In addition to Red's helpful suggestions, note that technique for such
>>>>>> paired data was discussed in
>>>>>> http://www.stata-journal.com/sjpdf.html?articlenum=dm0043
>>>>>> which is publicly accessible. The problem is that the identifiers in
>>>>>> Rodrigo's example appear to make little sense. How is Stata expected
>>>>>> to know that 1 and 4, 2 and 5, 3 and 6 are paired? Perhaps the
>>>>>> structure of the dataset is clearer in practice. If so, basic
>>>>>> calculations are just a couple of lines or so.
>>>>>> Nick
>>>>>> [email protected]
>>>>>> On 14 December 2013 15:33, Red Owl <[email protected]> wrote:
>>>>>>>
>>>>>>> Rodrigo,
>>>>>>> The following code demonstrates an approach with basic loops.
>>>>>>> It could be made more efficient with a different loop
>>>>>>> structure, but this approach may be more informative.
>>>>>>> *** BEGIN CODE ***
>>>>>>> * Build demo data set.
>>>>>>> clear
>>>>>>> * Length is capitalized to distinguish from length().
>>>>>>> input id str5(side) Length
>>>>>>> 1 right 10
>>>>>>> 2 right 15
>>>>>>> 3 right 11
>>>>>>> 4 left 13
>>>>>>> 5 left 10
>>>>>>> 6 left 12
>>>>>>> end
>>>>>>> * newvar1: right bones (obs 1-3) minus left obs 4;
>>>>>>> * left bones (obs 4-6) minus right obs 1.
>>>>>>> gen byte newvar1 = .
>>>>>>> forval i = 1/3 {
>>>>>>>     replace newvar1 = Length[`i'] - Length[4] in `i'
>>>>>>> }
>>>>>>> forval i = 4/6 {
>>>>>>>     replace newvar1 = Length[`i'] - Length[1] in `i'
>>>>>>> }
>>>>>>> * newvar2: right bones minus left obs 5; left bones minus right obs 2.
>>>>>>> gen byte newvar2 = .
>>>>>>> forval i = 1/3 {
>>>>>>>     replace newvar2 = Length[`i'] - Length[5] in `i'
>>>>>>> }
>>>>>>> forval i = 4/6 {
>>>>>>>     replace newvar2 = Length[`i'] - Length[2] in `i'
>>>>>>> }
>>>>>>> * newvar3: right bones minus left obs 6; left bones minus right obs 3.
>>>>>>> gen byte newvar3 = .
>>>>>>> forval i = 1/3 {
>>>>>>>     replace newvar3 = Length[`i'] - Length[6] in `i'
>>>>>>> }
>>>>>>> forval i = 4/6 {
>>>>>>>     replace newvar3 = Length[`i'] - Length[3] in `i'
>>>>>>> }
>>>>>>> list, noobs sep(0)
>>>>>>> *** END CODE ***
>>>>>>> Good luck.
>>>>>>> Red Owl
>>>>>>> [email protected]
>>>>>>>>
>>>>>>>> "Y.R.E. Retamal" <[email protected]> Sat, 14 Dec 2013 12:08:42:
>>>>>>>> Dear list
>>>>>>>> I am having great difficulty trying to perform an analysis using
>>>>>>>> Stata and I cannot find the way. Maybe you could help me. I want to
>>>>>>>> create some new variables containing the difference between the
>>>>>>>> length of two individuals from different groups:
>>>>>>>>
>>>>>>>> id side length newvar1 newvar2 newvar3
>>>>>>>> 1 right x x-j x-k x-l
>>>>>>>> 2 right y y-j y-k y-l
>>>>>>>> 3 right z z-j z-k z-l
>>>>>>>> 4 left j j-x j-y j-z
>>>>>>>> 5 left k k-x k-y k-z
>>>>>>>> 6 left l l-x l-y l-z
>>>>>>>> I do not know if I am explaining myself clearly. The individuals are
>>>>>>>> bones (clavicles, for example), so it is possible that some right
>>>>>>>> clavicles pair-match with left clavicles, following the idea that an
>>>>>>>> individual has bones of similar length.
>>>>>>>>
>>>>>>>> Any help would shed some light!
>>>>>>>> Best wishes
>>>>>>>> Rodrigo
>>>>>>>
>>>>>>> *
>>>>>>> * For searches and help try:
>>>>>>> * http://www.stata.com/help.cgi?search
>>>>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>> * http://www.ats.ucla.edu/stat/stata/
>>>>
>>>> <example.xlsx>