I would have recommended
http://www.stata-journal.com/article.html?article=dm0039, until I noticed
that you are one of the authors...
*************
clear*
input str20 stringanswer
"1:2:3:5:6:7:8:9"
"1:2:3:6"
"1:2:3:4:5:7:8:9"
"1:2:3:5:7:9"
"1:2:3:5:7:8:9"
"2:3:4:6:9"
"1:2:3:5:6:7:8:9"
"1:2:7:8:9"
"7:9"
"1:11:12"
end
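* split the colon-separated string into comp1, comp2, ...
split stringanswer, generate(comp) parse(:)
* convert only the new pieces to numeric
destring comp*, replace
* largest code that occurs anywhere in the data
egen rowmaxim = rowmax(comp*)
summarize rowmaxim, meanonly
* one 0/1 indicator per possible code
forvalues i = 1/`r(max)' {
    egen byte my`i' = anymatch(comp*), values(`i')
}
drop comp* rowmaxim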
*************
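If you would rather test membership directly without -split-, here is a minimal sketch. It assumes the codes are always separated by a single colon with no stray spaces, and that 12 is the largest code; both are assumptions about your data, so adjust them. Padding the string and the target with the separator means -strpos()- can only match whole entries, so ":1:" is not found in ":11:12:".
*************
clear
input str20 stringanswer
"1:2:3:6"
"7:9"
"1:11:12"
end
* upper bound 12 is an assumption; set it to your largest code
forvalues i = 1/12 {
    * wrap both sides in ":" so only complete entries can match
    generate byte my`i' = strpos(":" + stringanswer + ":", ":`i':") > 0
}
list
*************
An empty string yields 0 for every indicator, which may or may not be what you want for nonresponse.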
HTH
Martin
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On behalf of Jeph Herrin
Sent: Wednesday, 23 September 2009 17:11
To: [email protected]
Subject: Re: st: -word()- with non space separator
Thanks. As I note in the paragraph after my data snippet,
-strpos()- works as long as there are <=9 values, but doesn't
work when I get to multiple digits: strpos("11:12","1") = 1,
even though "1" is not really in the list.
cheers,
J
Eric A. Booth wrote:
> I would use -strpos()-.
>
> ******
> clear
> input str20 var1
> "1:2:3:5:6:7:8:9"
> "1:2:3:6"
> "1:2:3:4:5:7:8:9"
> "1:2:3:5:7:9"
> "1:2:3:5:7:8:9"
> "2:3:4:6:9"
> "1:2:3:5:6:7:8:9"
> "1:2:7:8:9"
> "7:9"
> end
> forval n = 1/9 {
> gen myvar_`n'=.
> gen ind`n' = strpos(var1, "`n'")
> replace myvar_`n'=1 if ind`n'>0
> drop ind`n'
> }
>
> li var1 myvar_*
>
> ******
>
> Best,
>
> Eric
>
> __
> Eric A. Booth
> Public Policy Research Institute
> Texas A&M University
> [email protected]
> Office: +979.845.6754
>
> On Sep 23, 2009, at 9:29 AM, Jeph Herrin wrote:
>
>>
>> I have a dataset in which many variables are in
>> the most useless format imaginable. If a question
>> has multiple checkboxes as possible answers, the
>> response is stored as a string, with a number indicating
>> each box checked and these numbers separated by colons.
>> Thus:
>>
>> myvar
>> 1:2:3:5:6:7:8:9
>> 1:2:3:6
>> 1:2:3:4:5:7:8:9
>> 1:2:3:5:7:9
>> 1:2:3:5:7:8:9
>> 2:3:4:6:9
>> 1:2:3:5:6:7:8:9
>> 1:2:7:8:9
>> 7:9
>>
>> This variable takes 9 values, so I want to split into 9
>> different indicator variables, myvar_1-myvar_9, each
>> indicating whether that number was selected. -split()-
>> does not work, because of the differing number of values
>> per string. That is, it produces myvar_1 which equals "7"
>> for the last obs.
>>
>> So I am looking for a way to check whether a given string
>> contains a given integer, which would allow me to
>>
>> forv i=1/9 {
>> gen byte myvar_`i'= [`i' is in myvar list]
>> }
>>
>> As long as there are just 9 values, I can use -strpos()-
>> to check for the presence of the digit, but some of my variables
>> run into tens and twenties, in which case, e.g., searching for "1"
>> returns true even if there is only "11".
>>
>> The only solutions I see are to first -split()- and
>> then check all the new indicators, or run through a series of
>> checks such as (matches "1:" but not ":1"). I don't like
>> either: Is there a direct way to check to see if a given integer
>> is in the list?
>>
>> I think there may be a regex solution, but my Perl programming
>> days are so far behind me that I've not been able to come up
>> with one.
>>
>> thanks,
>> Jeph
>>
>>
>>
>> *
>> * For searches and help try:
>> * http://www.stata.com/help.cgi?search
>> * http://www.stata.com/support/statalist/faq
>> * http://www.ats.ucla.edu/stat/stata/
>