Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: What is EGEN_Varname and EGEN_SVarname ?
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: What is EGEN_Varname and EGEN_SVarname ?
Date: Tue, 31 Jul 2012 09:13:01 -0500
Fuller comment is difficult given that you have not formally declared
exactly what you want to support and that non-trivial code needs to be
tested through a detailed script, but a very quick glance at your code
identifies various problems which you may want to know about. (As
earlier indicated, I am not in sympathy with the overall goal and
recommend rather, as did Nick Winter, just looping over -egen- calls,
which is, in my experience, more efficient, and less error-prone, than
trying to program like this.)
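As a rough illustration of the looping approach, the two examples quoted below could be written along the following lines. This is only a sketch: the new variable names (total_price, price_p10 and so on) are invented for the illustration, and -total()- is used as the documented name of the -egen- function that the quoted examples call -sum()-.

```stata
* Sketch only: the generated variable names are hypothetical.
sysuse auto, clear

* One -egen- call per input variable (compare Example # 1)
foreach v of varlist price mpg {
    by foreign, sort: egen total_`v' = total(`v')
}

* One -egen- call per percentile (compare Example # 2)
foreach p of numlist 10 90 {
    by foreign, sort: egen float price_p`p' = pctile(price), p(`p')
}
```

Each pass through a loop is an ordinary -egen- call, so -if-, -in-, -by:- and variable types are all handled by -egen- itself, with no parsing code to maintain.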
1. As pointed out before, your local macro -types- will be more than
244 characters long, so -strpos()- will fail sometimes to do what you
want it to do. Most commands that allow variable types to be specified
just pass the variable type to -generate- and let that find a syntax
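error in specifying a variable type. Another way to test a variable
type is to check that at least one of the following -generate- calls
works (note the second -capture-; without it, a failure there would
stop the program before your own error message is shown):
    tempvar foo
    capture gen `type' `foo' = 1
    if _rc capture gen `type' `foo' = "1"
    if _rc {
        di as err "`type' invalid variable type"
        exit 198
    }
    drop `foo'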
2. You seem to be presuming that at most a single explicit variable type
will be sufficient for a multiple call to -egen-. In practice that
will be a fair assumption for most numeric problems, but likely to be
wrong for at least some string problems.
3. As in #1, some of your other manipulations will fail to work
properly with strings longer than 244 characters. (That -wordcount()-
will bite you was raised in a different thread yesterday.)
4. As I understand it, your syntax does not test for -if-, -in- or
missing values at the outset, but lets each individual -egen- call
sort that out. That's your call as program author, but note that it
could lead to inconsistencies with different input variables,
especially with missing values. It's a more usual Stata standard to
use -marksample- to identify the subset of observations that satisfy
-if- and -in- and have non-missing values on all variables specified.
5. Your -program- definition lacks a -version- statement.
6. There is some rather ad hoc parsing. The -syntax- command would make
some of your coding easier (and easier to follow).
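To make points 4 to 6 concrete, here is a minimal sketch of a program skeleton that declares a version, parses its input with -syntax- (the usual parsing command in current Stata) and fixes the sample once with -marksample-. It is not a rewrite of -egenmult-: the program name -myegenloop-, its generate() option and the choice of -total()- are all invented for the illustration, and version 12 is an assumption.

```stata
* Hypothetical skeleton: -myegenloop- and generate() are invented names.
program define myegenloop, sortpreserve
    version 12      // assumption: the release the code was written for
    syntax varlist(numeric) [if] [in], GENerate(namelist)

    * the two lists must pair up one to one
    local nvars : word count `varlist'
    local nnew  : word count `generate'
    if `nvars' != `nnew' {
        display as err "generate() must name one new variable per input variable"
        exit 198
    }

    * one sample for every call: respects -if- and -in-, and excludes
    * observations with a missing value on any input variable
    marksample touse

    forvalues i = 1/`nvars' {
        local old : word `i' of `varlist'
        local new : word `i' of `generate'
        egen `new' = total(`old') if `touse'
    }
end
```

A call might look like -myegenloop price mpg if foreign, generate(t1 t2)-. Because `touse' is computed once, every new variable is based on the same observations. The extended macro functions -word count- and -word # of- are used rather than -wordcount()- and -word()-, which, as far as I know, avoids the string-length limit mentioned in points 1 and 3.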
Nick
On Mon, Jul 30, 2012 at 5:50 PM, Pradipto Banerjee
<[email protected]> wrote:
> In case it helps to explain why I was asking about EGEN_Varname and EGEN_SVarname: this is part of the -egenmult- code I have written, which repeatedly calls -egen-. It works as follows:
>
> . sysuse auto, clear
>
> Example # 1
> . by foreign, sort: egenmult {test1 test2} = sum({price mpg})
>
> This is equivalent to:
>
> . sort foreign
> . by foreign: egen test1 = sum(price)
> . by foreign: egen test2 = sum(mpg)
>
> Example # 2
> . drop test1 test2
> . by foreign, sort: egenmult float {test1 test2} = pctile(price), p({10 90})
>
> This is equivalent to:
>
> . sort foreign
> . by foreign: egen float test1 = pctile(price),p(10)
> . by foreign: egen float test2 = pctile(price),p(90)
>
> I've been trying to see if I'm missing anything in the code below. Thanks.
>
> -----
> program define egenmult, byable(onecall) sortpreserve
>     local fullexp `0'
>     gettoken type 0 : 0, parse(" ")
>     local restexp `"`0'"'
>
>     /* gen a list of all possible types */
>     local types "byte int long float double"
>     forvalues i=1(1)244 {
>         local types "`types' str`i'"
>     }
>     local typegiven strpos(`"`types'"',`"`type'"')
>     if `typegiven' > 0 local fullexp `restexp'
>     else local type
>
>     /* get the number & locations of the { } */
>     local partexp = `"`fullexp'"'
>     local curvefnd = 0
>     local numbrak = 0
>     local allopenpos
>     local allclosepos
>     local alloptions
>     while `curvefnd' == 0 {
>         local openpos = strpos(`"`partexp'"',"{")
>         if `openpos'>0 {
>             local allopenpos `allopenpos' `openpos'
>             local closepos = strpos(`"`partexp'"',"}")
>             if `closepos' == 0 error 198
>             local allclosepos `allclosepos' `closepos'
>             local numbrak = `numbrak'+1
>             if `"`alloptions'"'=="" {
>                 local newoptions = substr(`"`partexp'"',`openpos'+1,`closepos'-`openpos'-1)
>                 local alloptions `"`newoptions'"'
>                 local numloop = wordcount(`"`newoptions'"')
>             }
>             else {
>                 local newoptions = substr(`"`partexp'"',`openpos'+1,`closepos'-`openpos'-1)
>                 if `numloop'!=wordcount(`"`newoptions'"') error 198
>                 local alloptions `"`alloptions'"' `"`newoptions'"'
>             }
>         }
>         else {
>             local curvefnd = 1
>         }
>         local partexp = subinstr(`"`partexp'"',"{","!",1)
>         local partexp = subinstr(`"`partexp'"',"}","!",1)
>     }
>     local tollen = strlen(`"`partexp'"')
>
>     /* recreate & run the individual commands */
>     forvalues iloop=1(1)`numloop' {
>         forvalues jbrak=1(1)`numbrak' {
>             if `jbrak'==1 {
>                 local jthbrakopen : word `jbrak' of `allopenpos'
>                 local newexp = substr(`"`partexp'"',1,`jthbrakopen'-1)
>                 local repexp : word `jbrak' of `"`alloptions'"'
>                 local repexp = word(`"`repexp'"',`iloop')
>                 local newexp = `"`newexp'"' + " " + `"`repexp'"' + " "
>             }
>             else {
>                 local jthbrakopen : word `jbrak' of `allopenpos'
>                 local jprevbrak = `jbrak'-1
>                 local jprevbrakclose : word `jprevbrak' of `allclosepos'
>                 local newexp = `"`newexp'"' + " " + substr(`"`partexp'"',`jprevbrakclose'+1,`jthbrakopen'-`jprevbrakclose'-1)
>                 local repexp : word `jbrak' of `"`alloptions'"'
>                 local repexp = word(`"`repexp'"',`iloop')
>                 local newexp = `"`newexp'"' + " " + `"`repexp'"' + " "
>             }
>         }
>         local jprevbrakclose : word `numbrak' of `allclosepos'
>         local newexp = `"`newexp'"' + substr(`"`partexp'"',`jprevbrakclose'+1,`tollen'-`jprevbrakclose')
>         if _by() by `_byvars': egen `type' `newexp'
>         else egen `type' `newexp'
>     }
>
> end
>
> -----
>