Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: set obs by level (of multiple variables)?
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: set obs by level (of multiple variables)?
Date: Wed, 23 Jan 2013 12:15:52 +0000
I am not sure that I understand your set-up but often it is a good
idea to tag the extra observations, something like
    local N = _N
    expand <whatever>
    gen expanded = _n > `N'
This exploits the fact that -expand- puts additional observations at
the end of the dataset.
Note that this _cannot_ be reduced to
    expand <whatever>
    gen expanded = _n > _N
and it is _essential_ to evaluate _N as a number before you -expand-.
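To see why, here is a small sketch using the auto dataset shipped with Stata (an assumed example, not the data from the question; the variable name "wrong" is also made up for illustration). After -expand-, _N is already the new number of observations, so _n > _N can never be true and that tag would be 0 everywhere.

```stata
* Illustration only: auto data and the name "wrong" are assumptions
sysuse auto, clear

* Store the number of observations before -expand-
local N = _N

* Add one copy of the first observation; it goes to the end of the dataset
expand 2 in 1

* Correct tag: 1 only for the added observation
gen expanded = _n > `N'

* Incorrect tag: _N is now the new total, so this is 0 throughout
gen wrong = _n > _N

count if expanded
assert r(N) == 1
count if wrong
assert r(N) == 0
```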
Once you have that tag, you could work
... if expanded
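
Applied to the problem quoted below, the idea might look something like the following sketch. The variable names (TCATOG, sex, end, cp_e2, cr_e2) and the replacement values are taken from the quoted post; the full dataset is not shown there, so treat this as an outline rather than tested code for those data.

```stata
* Sketch only: names and values follow the quoted post, not verified on the real data
bysort TCATOG sex: gen first2 = (_n == 1)

* Evaluate _N before -expand-, while it is still the original count
local N = _N
expand 2 if first2

* The copies are appended at the end, so they are exactly the observations beyond `N'
gen expanded = _n > `N'

* Change only the copies into time-0 rows; the original first rows are untouched
replace end = 0 if expanded
replace cp_e2 = 1 if expanded
replace cr_e2 = 1 if expanded

sort TCATOG sex end
```

Recent versions of -expand- also document a generate() option that creates such a tag directly; whether it is available in Stata 11.2 has not been checked here.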
On Wed, Jan 23, 2013 at 11:29 AM, Tim Evans <[email protected]> wrote:
> Hi all, further to this, I have another query. When I have generated my duplicate, I wish to change the values of the variables in only one of the duplicated rows. At present I am doing this:
>
> bysort TCATOG sex: gen first2 = (_n==1)
> expand 2 if first2
>
> Now I want to say: where an observation is a duplicate, replace the contents of the variables. But I'm struggling to identify just one of the 'first2' records where first2==1, so that I do not replace the contents of both rows.
>
> For instance my data (only partially reproduced) look like this:
>
> start end n d cp_e2 cr_e2 TCATOG2 sex first2
> 0 1 442 16 0.9563 1.0079 pTa Males 1
> 1 2 426 19 0.9123 1.0093 pTa Males 0
> 2 3 407 26 0.8686 0.9924 pTa Males 0
> 3 4 381 29 0.8259 0.9642 pTa Males 0
> 4 5 352 19 0.7839 0.9611 pTa Males 0
> 5 6 333 26 0.7420 0.9361 pTa Males 0
> 6 7 307 22 0.7015 0.9192 pTa Males 0
> 7 8 285 23 0.6624 0.8949 pTa Males 0
> 8 9 262 25 0.6270 0.8552 pTa Males 0
> 9 10 237 20 0.5938 0.8268 pTa Males 0
> 10 11 217 8 0.5624 0.8408 pTa Males 0
> 11 12 209 12 0.5313 0.8390 pTa Males 0
> 12 13 197 16 0.5005 0.8183 pTa Males 0
> 13 14 181 9 0.4703 0.8274 pTa Males 0
> 14 15 172 10 0.4415 0.8302 pTa Males 0
> 15 16 162 6 0.4143 0.8520 pTa Males 0
> 16 17 156 4 0.3871 0.8884 pTa Males 0
> 17 18 152 10 0.3606 0.8883 pTa Males 0
> 18 19 130 9 0.3375 0.8701 pTa Males 0
> 19 20 77 6 0.3157 0.7956 pTa Males 0
> 0 1 442 16 0.9563 1.0079 pTa Males 1
>
>
> I wish to change the final row with:
> start end n d cp_e2 cr_e2 TCATOG2 sex first2
> 0 0 442 16 1.000 1.000 pTa Males 1
>
> I can then sort the data according to 'end'
>
> I have four categories of TCATOG2 and both Males and Females.
>
> Best wishes
>
> Tim
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Tim Evans
> Sent: 18 January 2013 10:06
> To: [email protected]
> Subject: RE: st: set obs by level (of multiple variables)?
>
> Thanks Rebecca for your advice - much appreciated.
>
> Best wishes
>
> Tim
>
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of Rebecca Pope
> Sent: 17 January 2013 16:31
> To: [email protected]
> Subject: Re: st: set obs by level (of multiple variables)?
>
> bysort TCATOG sex : gen first2 = (_n==1)
>
> That tests whether it is the first observation or not & returns 1 if true, 0 otherwise.
>
> On Thu, Jan 17, 2013 at 10:24 AM, Tim Evans <[email protected]> wrote:
>> Nick thanks for your help. This does what I need, although, rather than duplicating the last record, duplicating the first might be more helpful as this would contain much of the baseline information I already hold. I naively thought that this would work!!:
>>
>> bysort TCATOG sex : gen first2 = _n - but I have 1-20 rather than 1
>> followed by 0
>>
>> I could then use replace first2 = 0 if first2 != 1 - but I'm assuming there is a better way?
>>
>> Best wishes
>>
>> Tim
>>
>>
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Nick Cox
>> Sent: 17 January 2013 15:53
>> To: [email protected]
>> Subject: Re: st: set obs by level (of multiple variables)?
>>
>> The syntax diagram for -set- does not indicate support for -by:- and
>>
>> 1. Whatever is not permitted should be considered forbidden.
>>
>> 2. Less gnomically, there is a really good reason for this. In essence, -set- is about global settings, and even if what you are asking for makes sense -- as it does here -- -set- and -by:- don't mix naturally.
>>
>> See help for -expand-, -expandcl-, -expandby- (SSC).
>>
>> bysort stage sex : gen last = _n == _N
>> expand 2 if last
>> sort stage sex
>> ... if last
>>
>> On Thu, Jan 17, 2013 at 3:08 PM, Tim Evans <[email protected]> wrote:
>>
>>> I'm trying to insert extra observations in my dataset. I've calculated survival and wish to graph the results, but the data start from less than 100%, and I'd like the graph to start from time 0 and thus 100%. My dataset is split by gender and stage, so I need something that inserts an observation for, say, males & stage 1, males & stage 2, females & stage 1 and females & stage 2.
>>>
>>> Unfortunately, while this will provide me with an observation
>>>
>>> set obs `=_N+1'
>>>
>>> it does not support this:
>>>
>>> bysort stage sex: set obs `=_N+1'
>>>
>>> Does anyone have an idea how I might do this in Stata 11.2?