
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.



RE: st: copying a string variable to all rows within a group


From   Nick Cox <[email protected]>
To   "'[email protected]'" <[email protected]>
Subject   RE: st: copying a string variable to all rows within a group
Date   Wed, 29 Feb 2012 11:54:33 +0000

Yes, my second post corrected the first. Sorry about that. 

You need a systematic way of referring to non-missing values. Sorting them to the end or the beginning of each block gives you -oldvar[_N]- or -oldvar[1]- respectively as the non-missing value.

However, your information here on using -gsort- doesn't explain why the non-missing strings weren't always in the first observation of each block.  

Incidentally, something that could go wrong here is two or more distinct non-missing values in the same block. A careful test of that would be

clonevar safecopy = oldvar
bysort groupvar (oldvar) : replace oldvar = oldvar[_N] 
l if safecopy != oldvar & !missing(safecopy) 
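
The same logic can be tried on made-up data first. This is only a sketch: the values "alpha" and "beta" and the helper variables -first- and -ndistinct- are hypothetical, not from Seyi's data. It counts distinct non-empty strings in each block before anything is overwritten, then relies on empty strings sorting first.

```stata
* Toy data: one non-empty string per group, not always in the first observation
clear
input groupvar str5 oldvar
1 ""
1 "alpha"
1 ""
2 "beta"
2 ""
end

* Tag the first observation of each distinct non-empty value within a group
bysort groupvar oldvar : gen byte first = (_n == 1) & !missing(oldvar)

* Count distinct non-empty values per group; stop if any group has more than one
bysort groupvar : egen ndistinct = total(first)
assert ndistinct <= 1

* Empty strings sort first, so the non-empty value is in observation _N
bysort groupvar (oldvar) : replace oldvar = oldvar[_N]
list groupvar oldvar, sepby(groupvar)
```

If the -assert- fails, some block holds conflicting values and needs inspecting before any -replace-.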

Nick 
[email protected] 


-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of [email protected]
Sent: 29 February 2012 11:39
To: [email protected]
Subject: Re: st: copying a string variable to all rows within a group

Hi Again, 
Your explanation below may go some way towards explaining my previous reply to your response to the original query.
However, I did do a gsort of the string variable ("gsort groupvar -oldvar", using my original variable names) to address the problem of missing strings being sorted first...

>>> Nick Cox <[email protected]> 2/29/2012 10:40 AM >>>
Seyi did say "not necessarily on the first row". Only the first of
these is a good solution given that warning. The logic is that -sort-
sorts all non-empty strings to the end of any block of observations.

(Perhaps Seyi started out with the identifiers in the first
observation of each block, but they got scrambled a bit by some later
-sort-. -sort groupvar- makes no guarantees about other variables
unless you specify a stable sort.)
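
A sketch of what a stable sort buys you, assuming the identifier really is the first observation of each block in the order the data arrived. The data and the helper variable -order- are made up for illustration.

```stata
* Toy data: groups interleaved, identifier first within each group
clear
input groupvar str5 oldvar
2 "beta"
1 "alpha"
2 ""
1 ""
end

* Hypothetical helper: remember the order the data arrived in
gen long order = _n

* A stable sort keeps tied observations in their existing relative order
sort groupvar, stable
by groupvar : replace oldvar = oldvar[1]

* Alternative that does not rely on -stable-: make the within-group order explicit
* bysort groupvar (order) : replace oldvar = oldvar[1]

list groupvar oldvar, sepby(groupvar)
```

Without -stable- or an explicit within-group sort key, observations tied on groupvar may come out in any order, so -oldvar[1]- need not be the identifier.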

Nick

On Wed, Feb 29, 2012 at 9:01 AM, Nick Cox <[email protected]> wrote:
> For "row" read "observation". We know what you mean, but that is the
> correct Stata terminology.
>
> bysort groupvar (oldvar) : replace oldvar = oldvar[_N]
>
> Alternatively,
>
> bysort groupvar : replace oldvar = oldvar[1]
>
> will probably work too. Alternatively,
>
> bysort groupvar : replace oldvar = oldvar[_n-1] if missing(oldvar) & _n > 1
>
> will probably work too. This last is an FAQ
>
> FAQ     . . . . . . . . . . . . . . . . . . . . . . . Replacing missing values
>        . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  N. J. Cox
>        2/03    How can I replace missing values with previous or
>                following nonmissing values?
>                http://www.stata.com/support/faqs/data/missing.html 
>
> On Wed, Feb 29, 2012 at 8:28 AM,  <[email protected]> wrote:
>> If I have a numeric variable "oldvar" which appears on only the first row of a set of rows defined by a group variable "groupvar" (rest of oldvar rows are blank), it is easy enough to copy oldvar down all rows within groupvar:
>>
>> bysort groupvar: egen newvar=max(oldvar) // can equally use =min(oldvar) as there is only 1 value of oldvar.
>>
>> How might I do the same to copy a string oldvar to all rows? Again, it appears only once within the group, but not necessarily on the first row.
>>
