Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: ambiguity in -if- qualifier
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: ambiguity in -if- qualifier
Date: Sun, 23 Mar 2014 01:09:45 +0000
Comments below.
Nick
[email protected]
On 23 March 2014 00:44, Yu Chen, PhD <[email protected]> wrote:
> Hi, Nick,
> Let me clarify. For any assignment to a new variable, there are two
> steps. In Step 1, the expression is evaluated; in Step 2, the
> result of the evaluation is assigned to the new variable. My question
> is, what is the sample used in each step?
> For -generate-, Step 1 uses the full sample. In other words, all
> observations, regardless of whether they meet the -if- condition, can be
> used. But in Step 2, -generate- uses the subsample that meets the -if-
> condition.
I don't think this description in words helps understanding. In your
-generate- example two things are happening simultaneously:
A. Stata is being instructed to put previous values of -mpg- in a new variable.
B. Stata is being instructed to do that only if -foreign- is 1.
You are surmising that A is done in a Step 1, which is followed by B
in a Step 2. But it makes just as much sense to imagine that Stata
works out that the variable should receive non-missing values only
when -foreign- is 1 and then works out what they should be. Either
way, the result is the same.
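
The equivalence can be checked directly. What follows is only a sketch,
not a statement about how -generate- is implemented internally. The
names mpg3 and mpg4 are invented for illustration, and the -by:- line
assumes that the auto data are still sorted by -foreign-, as they are
when shipped.

```stata
sysuse auto, clear

* the original example: the lag is used only where foreign is 1
generate mpg2 = mpg[_n-1] if foreign == 1

* the "two step" reading: lag everything, then blank out the rest
generate mpg3 = mpg[_n-1]
replace mpg3 = . if foreign != 1

* both readings give identical results in every observation
assert mpg2 == mpg3

* if a lag among foreign cars only is wanted, -by:- makes _n
* relative to each group (assumes the data are sorted by foreign)
by foreign: generate mpg4 = mpg[_n-1] if foreign == 1
assert missing(mpg4) in 53
```

The subscript [_n-1] refers to a particular observation, wherever it
is; -if- determines which observations receive a result, not which
observations may be referred to.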
> However, there may be commands that use a subsample in Step 1.
> In other words, before the command does anything, the sample is
> reduced according to the -if- condition, so all other activities that
> the command is going to do are on this reduced sample. It seems to me
> that most commands work this way. But I found that -generate- is an
> exception. It does not restrict the sample until the last step.
> I think this is a little confusing. At least, there is no consistency
> in when to restrict the sample.
> Thank you.
Sorry, but I don't catch your meaning here at all. You've presumably
withdrawn your claim about -egen-, so you seem to be offering
speculation, but no examples that anyone else can discuss.
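
For concreteness, here is one example of the kind that could be
discussed. It is my guess at what might be meant, not something taken
from your posts, and the names mean_f and mean_f2 are invented for
illustration.

```stata
sysuse auto, clear

* mean of mpg over foreign cars only, stored only in those observations
egen mean_f = mean(mpg) if foreign == 1

* the same calculation with -summarize- and -generate-
summarize mpg if foreign == 1, meanonly
generate mean_f2 = r(mean) if foreign == 1

* the two agree wherever foreign is 1, and -egen- leaves the rest missing
assert reldif(mean_f, mean_f2) < 1e-6 if foreign == 1
assert missing(mean_f) if foreign == 0
```

Here mean() summarizes over whatever observations -if- selects, so
there is nothing inconsistent with -generate-: in both cases -if-
selects the observations that are used and that receive a result. The
difference in your example arises only because an explicit subscript
can point to an observation that -if- did not select.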
> On Sat, Mar 22, 2014 at 6:45 PM, Nick Cox <[email protected]> wrote:
>> I don't think the one precise example here is puzzling in any sense.
>> Previous values of -mpg- are put in a new variable if and only if
>> -foreign- is 1. This is calculated observation by observation.
>>
>> You allude to different behaviour with -egen-. But the help for -egen- explains
>>
>> "Explicit subscripting (using _N and _n), which is commonly used with
>> generate, should not be used with egen; see subscripting."
>>
>> That may illuminate your puzzlement.
>>
>> Nick
>> [email protected]
>>
>>
>> On 22 March 2014 21:26, Yu Chen, PhD <[email protected]> wrote:
>>> I think there is some ambiguity in the meaning and usage of the -if-
>>> qualifier. Generally, the command is performed on a subset that meets
>>> the -if- condition. However, a command may perform many tasks, and the
>>> subset for each task is not clear sometimes. For example, for the
>>> -generate- command, it seems to calculate the result of the expression
>>> on the full sample first, and then that result is assigned to a
>>> subsample that meets the -if- condition. However, for the -egen-
>>> command, the calculation is performed on a subset that meets the -if-
>>> condition, not the full sample, and then that result is assigned to
>>> the new variable on that subsample.
>>>
>>> For example, see the code below.
>>>
>>> sysuse auto
>>> gen mpg2=mpg[_n-1] if foreign==1
>>>
>>> Notice that observation number 53 has a value of 24 for mpg2. This
>>> indicates that the task of taking a lagged value is performed on the
>>> full sample first. Otherwise, this value should be missing. But -egen-
>>> works differently.
>>>
>>> There may exist other cases that have similar ambiguities. I would
>>> suggest that Stata have a clear rule to address this issue. If the
>>> rule is already out there, please tell me.
>>> Thank you very much.
>>>
>>> Yu Chen
>>> *
>>> * For searches and help try:
>>> * http://www.stata.com/help.cgi?search
>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>> * http://www.ats.ucla.edu/stat/stata/
>
>
>
> --
> Yu Chen, Ph.D.
> Assistant Professor of Accounting
> A. R. Sanchez, Jr. School of Business, WHTC 218D
> Texas A&M International University
> 5201 University Boulevard
> Laredo, Texas 78041-1900
> USA
> 956-326-2513 (office)
> 956-326-2479 (fax)