Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: Matching Problem
From
Nick Cox <[email protected]>
To
"'[email protected]'" <[email protected]>
Subject
st: RE: Matching Problem
Date
Sun, 17 Oct 2010 16:32:55 +0100
I support Stefan's approach here. Note that (e.g.)
gen match = 0
replace match = 1 if int(test/5) == int(train/5)
is just
gen match = int(test/5) == int(train/5)
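However, there is a small problem with either if there are missing values: when test and train are both missing (.), int(test/5) and int(train/5) are both missing, the comparison is true, and match is set to 1.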
gen match = int(test/5) == int(train/5) if !missing(test, train)
is better than either.
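To make the missing-value point concrete, here is a minimal sketch on a made-up five-row dataset. The values and the names match_a and match_b are illustrative assumptions, not from the original data.

```stata
* Toy data (hypothetical): three complete pairs, one pair with both
* values missing, one pair with only test missing
clear
input test train
129 128
 71  71
176 119
  .   .
  .  50
end

* Without the guard: missing == missing evaluates to true
gen match_a = int(test/5) == int(train/5)

* With the guard: match is left missing whenever either input is missing
gen match_b = int(test/5) == int(train/5) if !missing(test, train)

list, noobs
```

In the fourth row match_a is 1 although nothing was actually compared, whereas match_b is missing in both the fourth and the fifth rows.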
Nick
[email protected]
[email protected]
try this:
*** Example Dataset
clear
set obs 1000
* Integers 1..200, as in the question; runiform() is the current name for uniform()
gen test = 1 + int(runiform()*200)
gen train = 1 + int(runiform()*200)
gen match = 0
*** You're not clear about start and end of the intervals you want: 1, 6, 11 vs. 70, 125
*** So choose the command that fits (run only one of the two replace commands).
** Intervals [0..4] [5..9] [10..14]... [195..199] [200]
replace match = 1 if int(test/5) == int(train/5)
** Intervals [0] [1..5] [6..10] [11..15]... [196..200]
replace match = 1 if int((test+4)/5) == int((train+4)/5)
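As a rough check of how the two interval schemes differ, the sketch below applies both to the five test/train pairs quoted in the original question. The names match_int and match_ceil are illustrative, and ceil(x/5) is used here as an assumed equivalent of int((x+4)/5), which holds for nonnegative integers.

```stata
* The five pairs listed in the original question
clear
input test train
129 128
163 162
 71  71
176 119
125 123
end

* Scheme 1: intervals [0..4] [5..9] ... (lower bound is a multiple of 5)
gen match_int = int(test/5) == int(train/5) if !missing(test, train)

* Scheme 2: intervals [1..5] [6..10] ... (upper bound is a multiple of 5)
gen match_ceil = ceil(test/5) == ceil(train/5) if !missing(test, train)

* For nonnegative integers the ceil() form agrees with int((x+4)/5)
assert ceil(test/5) == int((test+4)/5)
assert ceil(train/5) == int((train+4)/5)

list, noobs
```

The two schemes agree on the first four pairs and differ on the last: 125 and 123 fall in different intervals under scheme 1 ([125..129] and [120..124]) but in the same interval under scheme 2 ([121..125]).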
Raphael Fraser
I have two variables, which I will call test and train, each
containing integers between 1 and 200. Each number in the test variable
represents an image. The train variable contains the closest image
to the test image in terms of similarity.
test = (129, 163, 71, 176, 125, ...)
train = (128, 162, 71, 119, 123, ...)
The objective is to check whether both integers fall in the same interval. These
are the intervals 1-5, 6-10, 11-15, ..., 196-200. For example,
test=129 and train=128 are both in the same interval 125-130. Also
test=71, train=71 are both in the interval 70-75. These are successful
mappings. I would like a successful mapping to be =1 and a failure =0.