Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
st: RE: Keeping a subset of variables
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: st: RE: Keeping a subset of variables
Date: Wed, 4 Aug 2010 15:45:59 +0100
I think you can do this even without resorting to regular expressions.
These wildcards should catch all variables meeting your three rules. I
can't see that any loops or even programming is needed.
ca?05*9? ca?06*9? ca?07*9? ca?08*9?
Nick
[email protected]
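A minimal sketch of how the wildcard list above might be passed to -keep-. The toy dataset and its variable names are assumptions made up for illustration; only the four patterns come from the reply. In a Stata varlist, ? matches exactly one character and * matches zero or more.

```stata
* Toy dataset: variable names are hypothetical, modelled on the poster's pattern
clear
set obs 1
foreach v in ca003sr09d ca005sr09d ca006mt09d ca007sr08d ca007sr09d ca008sr09d cb005sr09d {
    generate `v' = 0
}

* "ca", any one character, 05-08 in positions 4-5, anything, then 9 second from last
keep ca?05*9? ca?06*9? ca?07*9? ca?08*9?

* Only ca005sr09d ca006mt09d ca007sr09d ca008sr09d should remain
assert c(k) == 4
describe, simple
```

Each pattern in the toy data matches at least one variable. On a real dataset, -keep- should stop with a "not found" error if one of the patterns matches nothing, so it may be worth checking each pattern with -describe- first.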
Marshall Garland wrote:
I'm attempting to retain a subset of variables from a rather large
dataset (>10K variables). The variables have a patterned naming
convention, and I'm trying to exploit this pattern to keep only those
variables that meet specific criteria. Here's an example of some
variables:
ca003sr09d
cb004sr08d
Essentially, I only want to retain those variables that meet the
following criteria:
1. The characters in the first two positions must be "ca"
2. The digits in positions 4-5 must be in the range 05-08
3. The character at substr(var,-2,1), i.e. the second-to-last position, must be 9
I've tried to adapt code from this thread:
http://www.stata.com/statalist/archive/2008-06/msg00301.html
And this one:
http://www.stata.com/statalist/archive/2007-03/msg01034.html
But the number of conditions I'm requiring exceeds the number
encountered in these threads, which is where I'm stumbling. The code
either chokes (variable whatever cannot be found, which is expected,
hence the -cap-) or it is not eliminating the variables that I'm
expecting to be dropped, based on the admittedly inelegant syntax I've
written. I'm trying to wrap this into a single command, which is
perhaps a source of my difficulty. Here's what I've cobbled together
thus far, which has a sort of Frankensteinian character since I keep
grafting additional loops to address these conditions:
//here, i'm retaining just 5-8 grade results for all students
foreach var of varlist * {
    local beg=substr("`var'",6,2)
    local end=substr("`var'",-1,1)
    foreach letter in i p b h s e l w m f {
        foreach num of numlist 3/4 9/11 {
            cap drop c`letter'00`num'`beg'08`end'
            cap drop c`letter'0`num'`beg'08`end'
            cap drop c`letter'00`num'`beg'07`end'
            cap drop c`letter'0`num'`beg'07`end'
        }
    }
}
Any help from list members would be greatly appreciated.
I'm using Stata SE 11.1.
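For comparison, the three criteria listed above can also be written out as explicit string tests on each variable name. This is only a sketch under assumptions: the toy dataset, the variable names, and the local macro names (rule1, rule2, rule3, keepers) are all hypothetical, and it is not the code from either linked thread.

```stata
* Toy dataset: variable names are hypothetical
clear
set obs 1
foreach v in ca003sr09d ca005sr09d ca007sr08d ca008sr09d cb005sr09d {
    generate `v' = 0
}

* Collect the names that satisfy all three rules, then keep them in one step
local keepers
foreach v of varlist * {
    local rule1 = substr("`v'", 1, 2) == "ca"
    local rule2 = inlist(substr("`v'", 4, 2), "05", "06", "07", "08")
    local rule3 = substr("`v'", -2, 1) == "9"
    if `rule1' & `rule2' & `rule3' {
        local keepers `keepers' `v'
    }
}

* Guard against an empty list so -keep- is never called without a varlist
if "`keepers'" != "" {
    keep `keepers'
}

* Only ca005sr09d and ca008sr09d should remain
assert c(k) == 2
describe, simple
```

Building the list first and calling -keep- once avoids the nested -cap drop- loops, and each rule can be checked or changed on its own.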