Re: st: use of subinstr
From: Nick Cox <[email protected]>
To: [email protected]
Date: Fri, 23 Mar 2012 10:58:57 +0000
gen mean_`v' = (rowtotal - cond(missing(`v'), 0, .)) / (nvals - !missing(`v'))
should be
gen mean_`v' = (rowtotal - cond(missing(`v'), 0, `v')) / (nvals - !missing(`v'))
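A minimal, self-contained sketch of the corrected loop follows. The toy data (var1-var3, two observations) are made up purely for illustration, and the sketch assumes that rowtotal and nvals are created with -egen- as shown. When `v' is missing, nothing is subtracted from the total and the count is left alone; otherwise `v' itself is removed from both.

```stata
clear
* hypothetical toy data, for illustration only
input var1 var2 var3
1 2 3
4 . 6
end

* row total (missings count as 0) and number of nonmissing values
egen rowtotal = rowtotal(var*)
egen nvals = rownonmiss(var*)

* mean of the OTHER variables in each row
foreach v of varlist var* {
    gen mean_`v' = (rowtotal - cond(missing(`v'), 0, `v')) / (nvals - !missing(`v'))
}

list
```

In the second observation var2 is missing, so mean_var2 is simply the mean of var1 and var3.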
On Fri, Mar 23, 2012 at 9:44 AM, Nick Cox <[email protected]> wrote:
> Further comments.
>
> 1. As I should have remembered, -subinstr- can bite you. It never
> promises to act on "words", meaning here variable names. Consider
>
> . local stuff "aaa aa a"
>
> . tokenize "`stuff'"
>
> . forval i = 1/3 {
> 2. local show : subinstr local stuff "``i''" ""
> 3. di "`show'"
> 4. }
> aa a
> a aa a
> aa aa a
>
> Here, when asked to zap "aaa", it finds it and deletes it. When asked
> to zap "aa" it zaps it as part of "aaa", and similarly with "a" as
> part of "aaa". With real variable names, this might bite as an error,
> or you might get garbage.
>
> 2. Before I revisit the earlier examples, there is a different way of
> doing this particular problem that is worth mentioning. At its
> simplest, it is this. Assume variables var1-var100:
>
> egen rowtotal = rowtotal(var*)
>
> foreach v of var var* {
> gen mean_`v' = (rowtotal - `v') / 99
> }
>
> What could easily go wrong here is that you have missing data. So,
> better code is
>
> egen rowtotal = rowtotal(var*)
> egen nvals = rownonmiss(var*)
>
> foreach v of var var* {
> gen mean_`v' = (rowtotal - cond(missing(`v'), 0, .)) / (nvals - !missing(`v'))
> }
>
> You might prefer
>
> gen mean_`v' = cond(missing(`v'), rowtotal/nvals, (rowtotal - `v')/(nvals - 1))
>
> 3. A way of getting variable names ignored one by one is
>
> unab varlist : var*
>
> foreach v of local varlist {
> local wanted : list varlist - v
> egen mean_`v' = rowmean(`wanted')
> }
>
> and I'd recommend that over -subinstr-.
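To make the difference concrete, here is a small sketch that contrasts the two approaches on the same toy macro used in point 1. The macro names drop, bysub and byword are made up for this illustration. -subinstr- matches substrings, whereas the macro list operator works on whole words.

```stata
local stuff "aaa aa a"
local drop "aa"

* substring-based: the first match of "aa" lies inside "aaa"
local bysub : subinstr local stuff "`drop'" ""
display "`bysub'"

* word-based: only the token "aa" is removed
local byword : list stuff - drop
display "`byword'"
```

The first display shows a aa a, which is garbage if these were variable names; the second shows aaa a, which is what was wanted.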
>
> I wrote a review which will shortly become visible to all:
>
> SJ-9-1 pr0046 . . . . . . . . . . . . . . . . . . . Speaking Stata: Rowwise
> (help rowsort, rowranks if installed) . . . . . . . . . . . N. J. Cox
> Q1/09 SJ 9(1):137--157
> shows how to exploit functions, egen functions, and Mata
> for working rowwise; rowsort and rowranks are introduced
>
> Nick
>
> On Thu, Mar 22, 2012 at 9:52 PM, Eric Booth <[email protected]> wrote:
>> <>
>>
>> On Mar 22, 2012, at 4:30 PM, Nick Cox wrote:
>>
>>> Eric's code should crack the problem nicely. I add some detailed comments:
>>>
>>> 1. for N in num 1/100: g varN = runiform() //old school 1 line loop
>>>
>>> I recommend against recommending old commands that are now
>>> undocumented. This should be in any Stata >= 7
>>>
>>> forval N = 1/100 {
>>> g var`N' = runiform()
>>> }
>>>
>>
>>
>> As always, thanks, Nick. The use of the old -for- command was just for fun; I guess I should add a disclaimer that it is undocumented (or avoid using it on Statalist). I only recently discovered it while reading your 2003 SJ article (vol 3, num 2) and thought it was interesting to toy with.
>>
>>
>>> 2. On
>>> ds var*
>>> local varlist `r(varlist)'
>>> di `"`varlist'"'
>>> foreach x of local varlist {
>>>
>>> What follows is more direct. -ds- is several dozen lines of code to
>>> interpret. All you want to do is expand the wildcard. See also Stata
>>> Journal 10(3): 503-4. Stata tip 91: Putting unabbreviated varlists
>>> into local macros
>>> unab varlist : var*
>>> foreach x of local varlist {
>>>
>>> Nick
>>
>>
>> Yes - I was echoing Tashi's code - and another way to do it could be:
>>
>> foreach x of varlist var* {
>> **code here**
>> }
>>
>> (which is probably in the SJ article you recommend, but it's behind the paywall for me).
>>