Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: getting part of strings
From
Eric Booth <[email protected]>
To
"<[email protected]>" <[email protected]>
Subject
Re: st: getting part of strings
Date
Sat, 26 Mar 2011 19:42:01 +0000
Daniel:
I missed the part in your post where you want to capture PB and PP as well.
You could grab these from the var1? variables (the pieces created by -split- in my previous example), since they contain this information. Another approach entirely is to use the string functions (see -help string_functions-) such as subinstr() or strpos() to generate indicators for whether var1 contains the substrings of interest. This lets you skip the -split- or regex* approaches completely, if indicators are all you need from var1:
***********************!
clear
inp str200 var1
"155 - VITAL DO REGO FILHO - PB - Senador"
"1111 - - PP - - Deputado Federal / 25888 - ATAIDES MENDES PEDROSA -PB - Deputado Estadual"
"1111 - - PP - - Deputado Federal / 22333 - EDNALDO PEREIRA DESANTANA - PB - Deputado Estadual"
"151 - JOSE WILSON SANTIAGO - PB - Senador"
"45123 - ANTONIO HERVAZIO BEZERRA CAVALCANTI - PB - Deputado Estadual"
"1212 - DAMIÃO FELICIANO DA SILVA - PB - Deputado Federal"
end
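* strpos() returns the position of the substring in var1 (0 if absent),
* so each indicator is 1 when the substring is found and missing otherwise
g DF = 1 if strpos(var1, "Deputado Federal")
g DE = 1 if strpos(var1, "Deputado Estadual")
g S = 1 if strpos(var1, "Senador")
g PP = 1 if strpos(var1, "PP")
g PB = 1 if strpos(var1, "PB")
order D* P* S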
***********************!
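One caveat with the short codes: strpos() matches a substring anywhere in var1, so "PP" or "PB" would also be flagged if those letters happened to appear inside a name. A minimal sketch of a stricter variant follows, assuming the party codes always sit between hyphens as they do in the example data. The variable names has_* and tmp are hypothetical, and the indicators here are coded 0/1 rather than 1/missing:

```stata
* Sketch only: assumes var1 exists as in the example above.
* has_* and tmp are hypothetical names, not part of the original example.

* 0/1 indicators: the logical expression evaluates to 1 if found, 0 if not
gen byte has_DF = strpos(var1, "Deputado Federal") > 0
gen byte has_DE = strpos(var1, "Deputado Estadual") > 0
gen byte has_S  = strpos(var1, "Senador") > 0

* Remove all spaces so that "-PB -" and " - PB - " look the same,
* then require the code to be a complete hyphen-delimited field
gen str200 tmp = subinstr(var1, " ", "", .)
gen byte has_PP = strpos(tmp, "-PP-") > 0
gen byte has_PB = strpos(tmp, "-PB-") > 0
drop tmp

list has_* var1, noobs
```

On the six example observations this should flag PP only for the two "1111 - - PP" records and PB for all six. It would miss a code that appears as the very first or very last field of var1, which does not occur in this data.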
- Eric
__
Eric A. Booth
Public Policy Research Institute
Texas A&M University
[email protected]
Office: +979.845.6754
On Mar 26, 2011, at 2:30 PM, Eric Booth wrote:
> ***********************!
> clear
> inp str200 var1
> "155 - VITAL DO REGO FILHO - PB - Senador"
> "1111 - - PP - - Deputado Federal / 25888 - ATAIDES MENDES PEDROSA -PB - Deputado Estadual"
> "1111 - - PP - - Deputado Federal / 22333 - EDNALDO PEREIRA DESANTANA - PB - Deputado Estadual"
> "151 - JOSE WILSON SANTIAGO - PB - Senador"
> "45123 - ANTONIO HERVAZIO BEZERRA CAVALCANTI - PB - Deputado Estadual"
> "1212 - DAMIÃO FELICIANO DA SILVA - PB - Deputado Federal"
> end
>
> **using split**
> replace var1 = subinstr(var1, " / ", " - ", .)
> split var1, p("-")
>
> **trim spaces in new vars**
> ds var1?
> foreach v in `r(varlist)' {
> replace `v' = trim(`v')
> }
>
>
> **it looks like the substrings you want are in var14, var15, var19:
> l var14 var15 var19
>
> **grab the title or subtitle or gen an indicator if they are present**
> g str50 title = var14 if !mi(var14)
> replace title = var15 if mi(title) & !mi(var15)
> g str50 title2 = var19 if !mi(var19)
> l var1 title title2
> **or
> g titleind = 1 if !mi(var14) | !mi(var15)
> g title2ind = 1 if !mi(var19)
> order *ind
> ***********************!