RE: st: AW: levelsof problem?
From: "Nick Cox" <[email protected]>
To: <[email protected]>
Subject: RE: st: AW: levelsof problem?
Date: Tue, 27 Jul 2010 18:26:49 +0100
You are correct on the first point, but -varabbrev- on is the default. I know Kit Baum hates it, but my impression is that most users never change it with -set varabbrev off-.
I don't understand your second point. If it's that the solution may need modification in so far as the real problem of Joe J may differ from the toy problem, then naturally I agree.
Nick
[email protected]
Tirthankar Chakravarty
Not sure, but I think this:
levelsof country if eu==1, local(lev) clean
egen eutotal = rowtotal(`lev')
will work only if you set -varabbrev- on. The -unab- tip is a good one
and I thought about it, but the "US_F" variable could be a moving
target (or not).
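A minimal sketch of a variant that does not rely on -varabbrev-: it builds the full variable names and skips any country that has no matching *_F variable. It assumes Joe J's toy dataset from this thread; the macro name -eulist- is made up for illustration, and the -exact- option of -confirm variable- is assumed to be available in your version of Stata.

```stata
* Toy dataset from this thread
clear
input str2 country eu GE_F NL_F UK_F US_F
US 0 1 1 1 0
US 0 1 1 1 0
NL 1 1 0 1 1
IN 0 1 1 1 1
GE 1 0 1 1 1
GE 1 0 1 1 1
US 0 1 1 1 0
US 0 1 1 1 0
US 0 1 1 1 0
PT 1 1 1 1 1
end

* Country codes with eu == 1, without double quote delimiters
levelsof country if eu==1, local(lev) clean

* Build a list of full variable names, keeping only those that exist
* (eulist is a hypothetical macro name)
local eulist
foreach x of local lev {
    capture confirm variable `x'_F, exact
    if _rc == 0 local eulist `eulist' `x'_F
}

* Row sum over the names that survived the check
egen eutotal = rowtotal(`eulist')
```

With this toy dataset there is no PT_F, so the guard drops PT; and UK_F is not picked up at all, because UK never occurs as a value of -country-. Whether that is what is wanted depends on the real data.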
2010/7/27 Nick Cox <[email protected]>:
> This problem seems to me simpler than is being implied.
>
> The direct problem is that Joe J needs a varlist to feed to -egen-'s -rowtotal()- function.
>
> His starting point could be the wildcard *_F which catches all the variable names ending in _F. The difficulty is that this includes the US_F variable which for Joe J is a step too far. (At this point I merely hint at the possibility of numerous obvious political jokes without actually making any of them.)
>
> The command -unab-, although usually billed as a programmer's command, is useful here. It does just one thing, unabbreviate (meaning expand) a varlist to all its implied names, so that
>
> unab all : *_F
>
> unpacks all the names of the variables ending in _F and puts the result in a local macro. To remove US_F from the list we can turn to macro manipulation
>
> local US US_F
> local eu : list all - US
>
> which gives us a macro -eu- containing the desired names.
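Put together, the -unab- route might look like the following sketch. It assumes the toy dataset from this thread, with US_F as the only name to exclude; a real dataset would probably need a longer exclusion list.

```stata
* Toy dataset from this thread
clear
input str2 country eu GE_F NL_F UK_F US_F
US 0 1 1 1 0
US 0 1 1 1 0
NL 1 1 0 1 1
IN 0 1 1 1 1
GE 1 0 1 1 1
GE 1 0 1 1 1
US 0 1 1 1 0
US 0 1 1 1 0
US 0 1 1 1 0
PT 1 1 1 1 1
end

* Expand the wildcard into the full list of names ending in _F
unab all : *_F

* Remove the unwanted name(s) by macro list subtraction
local US US_F
local eu : list all - US

* Inspect the list, then feed it to -egen-
display "`eu'"
egen eutotal = rowtotal(`eu')
```

The local macro -eu- and the variable -eu- do not clash, as macros and variables live in separate namespaces.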
>
> Some people might want to emphasise that the varlist expansion is also done by other commands: see e.g. help on -describe, varlist-, -ds-, or -findname- (SJ). But any of those does much more than this one thing, so it is most straightforward to stick to -unab-.
>
> It also happens that the names of the countries concerned are held as values of Joe J's string variable -country-. The only real problem here is that the list result returned by -levelsof- is complicated by double quote delimiters, but as Tirthankar shows -- and the help file clearly explains -- an option -clean- gets rid of those.
>
> For Joe J's example dataset
>
> levelsof country if eu==1, local(lev) clean
> egen eutotal = rowtotal(`lev')
>
> should have worked so far as I can see. There is no need, for the example dataset, to spell out the _F suffix, although Tirthankar's code shows how to do it if needed.
>
> Confusion on names: Joe J mixed references to
>
> 1. -egen, rsum()- and -egen, rowtotal()-.
> 2. -levels- and -levelsof-.
>
> In both cases (just a coincidence, this) the second name has been the preferred name since Stata 9.
>
> Nick
> [email protected]
>
> joe j
>
> Thanks a lot, Tirthankar!
>
> Tirthankar Chakravarty
>
>> Then this (cumbersome) script should do what you want:
>> *********************************************
>> clear
>> input str2 country eu GE_F NL_F UK_F US_F
>> US 0 1 1 1 0
>> US 0 1 1 1 0
>> NL 1 1 0 1 1
>> IN 0 1 1 1 1
>> GE 1 0 1 1 1
>> GE 1 0 1 1 1
>> US 0 1 1 1 0
>> US 0 1 1 1 0
>> US 0 1 1 1 0
>> PT 1 1 1 1 1
>> end
>> g PT_F = 2
>> levelsof country if eu==1, local(lev) clean
>> local lev2
>> foreach x of local lev {
>> local lev2 " `lev2' `x'_F "
>> }
>> egen eutotal = rowtotal(`lev2')
>> *********************************************
>
> joe j
>
>>> Thanks, Martin. This is not quite what I wanted; The following command
>>> is good enough.
>>> egen eutotal=rowtotal(GE_F NL_F UK_F)
>>>
>>> The *_F variables need to be selected based on whether they belong to
>>> eu or not (GE_F NL_F UK_F are selected, but not US_F) (The values of
>>> *_F variables are not based on whether eu=1 or otherwise). But there
>>> are many groupings, like eu, and a lot of countries, so I was looking
>>> for an easy method to select. But it seems to me that manual selection
>>> is the only choice.
>
> Martin Weiss
>
>>>> You could of course -replace- to the values you want based on the -if-
>>>> qualifier after the fact:
>>>>
>>>>
>>>> *************
>>>> egen eutotal=rowtotal(GE_F NL_F UK_F)
>>>> replace eutotal=. if !eu
>>>> *************
>>>>
>>>>
>>>> The reason that your second approach does not work is that Stata expects a
>>>> -varlist- while you feed it
>>>>
>>>> `"GE"' `"NL"' `"PT"'_F
>>>>
>>>> which it cannot process. Type -ma di- to see the contents of your -macro-s.
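To see the difference the quote delimiters make, here is a small sketch that displays the macro with and without the -clean- option. The four-row dataset is a cut-down stand-in for the thread's example, not Joe J's actual data.

```stata
* Cut-down stand-in for the example dataset
clear
input str2 country eu
GE 1
NL 1
PT 1
US 0
end

* Default: each level is wrapped in compound double quotes
levelsof country if eu==1, local(lev)
display `"`lev'"'

* With -clean-: plain, space-separated values
levelsof country if eu==1, local(lev) clean
display "`lev'"
```

Appending _F to the first form attaches the suffix to the last quoted item only, which is why -egen- cannot read it as a varlist.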
>
> joe j
>
>>>> From a data set roughly like the following
>>>> clear
>>>> input str2 country eu GE_F NL_F UK_F US_F
>>>> US 0 1 1 1 0
>>>> US 0 1 1 1 0
>>>> NL 1 1 0 1 1
>>>> IN 0 1 1 1 1
>>>> GE 1 0 1 1 1
>>>> GE 1 0 1 1 1
>>>> US 0 1 1 1 0
>>>> US 0 1 1 1 0
>>>> US 0 1 1 1 0
>>>> PT 1 1 1 1 1
>>>> end
>>>>
>>>> I want to calculate the row sum of all *_F variables pertaining to eu
>>>> countries (all excluding US_F):
>>>> egen eutotal=rowtotal(GE_F NL_F UK_F)
>>>>
>>>> However, I would prefer to follow some rules in selecting the variables,
>>>> like
>>>>
>>>> levels country if eu==1, local(lev)
>>>> egen eutotal=rsum(`lev'_F)
>>>>
>>>> This doesn't work, however. Any pointers would be appreciated.