With just one trick, creating a pseudo-time variable, you can use
-mvsumm- from SSC, which implements this and indeed a more general
approach: moving-window summary statistics for -tsset- data.
-mvsumm- is not blisteringly fast -- it predates Mata -- but it should
save almost all your programming time.
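
As a rough sketch of the pseudo-time trick (not a tested solution): it
assumes X and Y are in memory and that -mvsumm- takes the options
stat(), window(), generate() and end, with results for even-length
windows placed at the end of the window. Please check -help mvsumm-
before relying on either point. The names touse, t, Yuse and sd_end are
made up for illustration.

```stata
* one-off install, if needed:  ssc install mvsumm

* flag complete cases rather than dropping anything
gen byte touse = !missing(X, Y)

* complete cases first, ordered by X; row number serves as pseudo-time
gsort -touse X
gen long t = _n
tsset t

* Y blanked out for incomplete cases (hypothetical helper variable),
* so any window touching them should come back missing
gen double Yuse = Y if touse

count if touse
local k = ceil(sqrt(r(N))/2)
local w = 2*`k'

* SD over the 2k observations ending at each t (assumed placement)
mvsumm Yuse, stat(sd) window(`w') gen(sd_end) end

* shift forward by k so row i holds the SD over rows i-k+1 ... i+k
gen double SD_Y = F`k'.sd_end
```

Note that k here is based on the number of complete cases, not on the
count of nonmissing X alone.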
Nick
Benjamin Villena Roldan
I have two continuous variables X and Y. I'm trying to do the
following:
1. Sort the data using X
2. For each observation of X, I compute the local variance of Y by a
nearest-neighbor approach. I take the 2k closest observations to an
observation X[i], i.e. using observations between X[i-k+1] and X[i+k].
3. I'm implementing this approach by using a forvalues loop such as
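sort X
* number of nonmissing X (missing values sort last)
count if X!=.
* half-width k, so that each window holds 2k observations
local k=ceil(r(N)^0.5/2)
local K=r(N)-`k'
gen SD_Y=.
* for each interior observation i, SD of Y over rows i-k+1 to i+k
forv i=`k'/`K' {
    local k0=`i'-`k'+1
    local k1=`i'+`k'
    qui summ Y in `k0'/`k1'
    replace SD_Y=r(sd) in `i'/`i'
}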
So, I have two questions/problems about this code:
1. I need to do the same procedure several times and it is very
time-consuming. Is there a way to speed up the execution? How much time
would I gain if I implemented similar code in C++?
2. There are missing observations in X and Y. How can I restrict the
sort command to deal with nonmissing values of both variables? A simple
answer is to do
-keep if X!=. & Y!=.-
Can I do it without dropping data?
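
On both questions, here is one possible sketch in Mata rather than C++.
It swaps the per-observation -summarize- call for running sums, so each
window SD comes from two subtractions. window_sd() is a hypothetical
helper written for this illustration, not an existing Stata or SSC
function, and it assumes its input has no missing values. No timing
claim is made here; that would need testing on your data.

```stata
mata:
// window_sd() is a hypothetical helper, not part of Stata or SSC.
// For each i it returns the SD of y[i-k+1], ..., y[i+k]; rows without
// a full window of 2k values stay missing. y must contain no missings.
real colvector window_sd(real colvector y, real scalar k)
{
    real scalar    n, w, i, t1, t2
    real colvector z, s1, s2, out

    n   = rows(y)
    w   = 2*k
    out = J(n, 1, .)
    if (k < 1 | n < w) return(out)

    z  = y :- mean(y)              // centre first to limit rounding error
    s1 = 0 \ runningsum(z)         // s1[j+1] = z[1] + ... + z[j]
    s2 = 0 \ runningsum(z:^2)      // same for the squares

    for (i = k; i <= n - k; i++) {
        t1 = s1[i+k+1] - s1[i-k+1]     // sum over the window
        t2 = s2[i+k+1] - s2[i-k+1]     // sum of squares over the window
        out[i] = sqrt(max((0, (t2 - t1^2/w)/(w - 1))))
    }
    return(out)
}
end
```

It might be called as follows, flagging complete cases with a marker
variable (touse, a name chosen here) instead of dropping data. Sorting
on the marker first keeps the complete cases together in X order.

```stata
gen byte touse = !missing(X, Y)
gsort -touse X
count if touse
local k = ceil(sqrt(r(N))/2)
gen double SD_Y = .
mata: st_store(., "SD_Y", "touse", window_sd(st_data(., "Y", "touse"), `k'))
```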