Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: Keeping a subset of variables
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: RE: st: Keeping a subset of variables
Date: Wed, 4 Aug 2010 15:52:36 +0100
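No; that won't work. -keep if- selects observations according to the
values of variables; it does not select variables by name.
Although Marshall more than once seems to imply otherwise in his
posting, I think it's clear that he is talking about selecting variable
names, not values of string variables.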
Nick
[email protected]
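To make the distinction concrete, here is a minimal sketch of selecting by variable name rather than by value. It is not code from this thread: the toy dataset, the extra variable names beyond the two quoted below, and the local macro name keepers are all assumptions made for illustration. It loops over the names starting with "ca", tests each name against the other two criteria, and keeps whatever qualifies.

```stata
* Toy data: six variables following the naming pattern (names are illustrative).
clear
set obs 1
foreach name in ca003sr09d ca005sr09d ca008ma09d ca006sr08d cb004sr08d cb005sr09d {
    generate `name' = 1
}

* Collect the names that satisfy all three criteria.
local keepers
foreach v of varlist ca* {                        // criterion 1: name starts with "ca"
    local grade = real(substr("`v'", 4, 2))       // criterion 2: positions 4-5 as a number
    if inrange(`grade', 5, 8) & substr("`v'", -2, 1) == "9" {   // criterion 3
        local keepers `keepers' `v'
    }
}

* keep needs a non-empty varlist, so this fails if nothing qualified.
keep `keepers'
describe, simple    // ca005sr09d and ca008ma09d remain
```

If every grade is known to be present, a wildcard varlist such as -keep ca?05*9? ca?06*9? ca?07*9? ca?08*9?- is a shorter alternative, since ? matches exactly one character and * matches zero or more; Stata stops with an error if any one of those patterns matches no variable.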
Richard Goldstein wrote:
How about the following:
keep if substr(var,1,2)=="ca" & real(substr(var,4,2))>=5 ///
& real(substr(var,4,2))<=8 & real(substr(var,-2,1))==9
On 8/4/10 10:38 AM, Marshall Garland wrote:
> I'm attempting to retain a subset of variables from a rather large
> dataset (>10K variables). The variables have a patterned naming
> convention, and I'm trying to exploit this pattern to keep only those
> variables that meet specific criteria. Here's an example of some
> variables:
>
> ca003sr09d
> cb004sr08d
>
> Essentially, I only want to retain those variables that meet the
> following criteria:
>
> 1. The characters in the first two positions must be "ca"
> 2. The numbers in the 4-5 position must be equal to 05-08
> 3. The numbers in the substr(var,-2,1) position must be equal to 9
>
> I've tried to adapt code from this thread:
> http://www.stata.com/statalist/archive/2008-06/msg00301.html
> And this one:
> http://www.stata.com/statalist/archive/2007-03/msg01034.html
>
> But the number of conditions I'm requiring exceeds the number
> encountered in these threads, which is where I'm stumbling. The code
> either chokes (variable whatever cannot be found, which is expected,
> hence the -cap-) or it is not eliminating the variables that I'm
> expecting to be dropped, based on the admittedly inelegant syntax I've
> written. I'm trying to wrap this into a single command, which is
> perhaps a source of my difficulty. Here's what I've cobbled together
> thus far, which has a sort of Frankensteinian character since I keep
> grafting additional loops to address these conditions:
>
> //here, i'm retaining just 5-8 grade results for all students
> foreach var of varlist * {
>     local beg=substr("`var'",6,2)
>     local end=substr("`var'",-1,1)
>     foreach letter in i p b h s e l w m f {
>         foreach num of numlist 3/4 9/11 {
>             cap drop c`letter'00`num'`beg'08`end'
>             cap drop c`letter'0`num'`beg'08`end'
>             cap drop c`letter'00`num'`beg'07`end'
>             cap drop c`letter'0`num'`beg'07`end'
>         }
>     }
> }
>
> Any help from list members would be greatly appreciated.
>
> I'm using Stata SE 11.1.