Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: string function
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: string function
Date: Wed, 24 Aug 2011 13:21:14 +0100

To correct a typo in my earlier post: for the "all" case, the
initialisation should be to 1, not 0.

gen found = 1
qui foreach letter in s o m e t h i n g {
    replace found = min(found, strpos(strvar, "`letter'") > 0)
}

That is, the logic is to assume initially that all are present; if any
one is absent, you change your mind. You can also do it this way:

gen found = 1
qui foreach letter in s o m e t h i n g {
    replace found = 0 if strpos(strvar, "`letter'") == 0
}

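As a quick check, the two loops can be run side by side on a few made-up values. This is only a sketch: the variable names found1 and found2 and the three example strings are invented here for illustration.

```stata
clear
input str20 strvar
"something"
"some thing else"
"nothing"
end

gen found1 = 1
gen found2 = 1
qui foreach letter in s o m e t h i n g {
    * min() form: stays 1 only while every letter so far is present
    replace found1 = min(found1, strpos(strvar, "`letter'") > 0)
    * if form: switch to 0 as soon as one letter is absent
    replace found2 = 0 if strpos(strvar, "`letter'") == 0
}

* the two approaches should agree on every observation
assert found1 == found2
list strvar found1 found2
```

The first two values contain every letter in the list, so both indicators stay at 1; "nothing" lacks "s", "m" and "e", so both end up 0.
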
Also, in this solution "letter" is just a name for the loop macro that
makes sense for letters. Any other name will do, and the items can be
longer strings:

gen found = 1
qui foreach pkg in Stata SAS SPSS {
    replace found = 0 if strpos(strvar, "`pkg'") == 0
}

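One thing to watch with longer strings is that strpos() is case-sensitive. The sketch below uses invented example values and adds a second indicator, found_nocase (a hypothetical name, not part of the solution above), which compares lower-case copies using lower().

```stata
clear
input str40 strvar
"Stata, SAS and SPSS"
"Stata and SAS only"
"stata, sas and spss"
end

gen found = 1
gen found_nocase = 1
qui foreach pkg in Stata SAS SPSS {
    * case-sensitive: "stata" does not match "Stata"
    replace found = 0 if strpos(strvar, "`pkg'") == 0
    * case-insensitive variant (an addition): compare lower-case copies
    replace found_nocase = 0 if strpos(lower(strvar), lower("`pkg'")) == 0
}

list
```

Only the first value is flagged by found, while found_nocase also flags the third, all-lower-case value. Which behaviour is wanted depends on the data.
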
2011/8/24 Grace Jessie <[email protected]>:
> OK, thank you, Nick.
> Grace
>
> ----------------------------------------
>> Date: Wed, 24 Aug 2011 12:37:23 +0100
>> Subject: Re: st: string function
>> From: [email protected]
>> To: [email protected]
>>
>> I just said that they _could_ be written.
>>
>> At that time, I had forgotten about -egen- solutions for your first
>> problem in -egenmore- (SSC). Both of those solutions (by Nick Winter
>> and myself) overlooked what now seems to me a cleaner solution using
>> -subinstr()- and -length()-. See also
>>
>> <http://statadaily.wordpress.com/2011/01/20/counting-occurrence-of-strings-within-strings/>
>>
>> and my Speaking Stata column in SJ 11(1) 2011 for discussion.
>>
>> I am not aware of coded -egen- solutions for your other problems.
>>
>> I imagine that they would just be wrappers for those -foreach- loops,
>> with no gain in efficiency or even comprehensibility.
>>
>> I'm setting them as an exercise for homework.
>>
>> Nick
>>
>> 2011/8/24 Grace Jessie <[email protected]>:
>> > Nick,
>> > thank you.
>> > Could you please also tell me the -egen- solution for my questions?
>> >
>> > Grace
>> >
>> > ----------------------------------------
>> >> Date: Wed, 24 Aug 2011 11:59:19 +0100
>> >> Subject: Re: st: string function
>> >> From: [email protected]
>> >> To: [email protected]
>> >>
>> >> Solutions to all these could be written as -egen- functions or Mata functions.
>> >>
>> >> Here I focus on "official Stata only" solutions.
>> >>
>> >> First question is discussed in
>> >>
>> >> Nicholas J. Cox
>> >> Stata tip 98: Counting substrings within strings
>> >> The Stata Journal 11(2): 318-320
>> >>
>> >> length("abcdaf") - length(subinstr("abcdaf", "a", "", .))
>> >>
>> >> Last two questions
>> >>
>> >> any of "a", "b", "c"
>> >>
>> >> max(strpos("abcdaf","a"), strpos("abcdaf", "b"), strpos("abcdaf", "c")) > 0
>> >>
>> >> all of "a", "b", "c"
>> >>
>> >> min(strpos("abcdaf","a"), strpos("abcdaf", "b"), strpos("abcdaf", "c")) > 0
>> >>
>> >> If you had a long list of candidates, I would do something like this:
>> >>
>> >> gen found = 0
>> >>
>> >> qui foreach letter in s o m e t h i n g {
>> >>     replace found = max(found, strpos(strvar, "`letter'") > 0)
>> >> }
>> >>
>> >> where for "max" substitute "min" as needed.
>> >>
>> >> The mapping max <-> any, min <-> all is discussed in
>> >> http://www.stata.com/support/faqs/data/anyall.html
>> >>
>> >> Nick
>> >>
>> >> 2011/8/24 Grace Jessie <[email protected]>:
>> >>
>> >> > How to count how many times a substring appears in a string?
>> >> > For example,
>> >> > function("abcdaf","a")=2
>> >> >
>> >> > And, how to check if a string variable has certain substrings?
>> >> > With regard to this, I want to ask about two functions.
>> >> > For example,
>> >> > function("abcdaf","a","b","c")
>> >> > One thing I want to do is to return 1 if a or b or c is included in "abcdaf";
>> >> > the other is to return 1 if a, b and c are all included in "abcdaf".
>> >> > Could anyone tell me the correct functions for those above?