Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: computation across rows
From: Hewan Belay <[email protected]>
To: [email protected]
Subject: RE: st: computation across rows
Date: Fri, 11 Feb 2011 16:38:39 -0800 (PST)
Oops, embarrassing that I missed Nick's suggestion to create the variable obs, sorry! Thanks Sarah for pointing me to that.
Thanks also for the alternative to Nick's recommendation: I had been using this method (creating additional temporary variables and then using these to replace across variables), but I also like Nick's approach. The sorting issue is not a problem for the latter, since
g obs=_n
would immediately precede the operations in my do-file, so however I had sorted the data before that point, the observation numbers would be correct when they are used.
Thanks to both of you!
Hewan
--- On Fri, 2/11/11, Sarah Edgington <[email protected]> wrote:
> From: Sarah Edgington <[email protected]>
> Subject: RE: st: computation across rows
> To: [email protected]
> Date: Friday, February 11, 2011, 11:14 PM
> .
> It does assume a variable called obs, which Nick suggested creating using
> > gen long obs = _n
>
> It sounds like for this issue you might be best off creating a variable that
> matches the old district number and using that to group observations for
> your calculations.
> To take your example, if you have a variable district that is equal to 85
> and 89, you could then have another variable old_dist that is 82 for both
> observations. Then you can do many of your calculations using bysort
> old_dist to group the observations across which you want to do the
> calculations.
> Doing it with observation numbers instead of some set group identifier will
> create problems if you change the sort order of the data.
>
> -Sarah
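
A minimal sketch of the grouping approach described in the message above, applied to the toy dataset from the original question. The crosswalk rule (ids 85 and 89 map to old district 82) and the helper variables sum_Y and sum_W are assumptions made for illustration, not part of anyone's posted code. The sketch also assumes the parent row (id 82) should be overwritten with the total of the split-off rows only, so the parent's own value is excluded from the sum.

```stata
clear
input id Y Z W
81 4 1 3
82 . 0 9
85 2 4 1
87 3 1 4
89 6 2 5
end

* Assumed crosswalk: ids 85 and 89 were split off from old district 82
generate old_dist = id
replace old_dist = 82 if inlist(id, 85, 89)

foreach v of varlist Y W {
    * total over the split-off rows only (the parent row itself is excluded)
    bysort old_dist: egen double sum_`v' = total(cond(id != old_dist, `v', .))
    * write the total into the parent row of any group that has split-off rows
    bysort old_dist: replace `v' = sum_`v' if id == old_dist & _N > 1
    drop sum_`v'
}

sort id
list id old_dist Y Z W, noobs
```

Because the grouping is carried by old_dist rather than by observation numbers, the result does not depend on how the data happen to be sorted when the loop runs.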
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]
> On Behalf Of Hewan Belay
> Sent: Friday, February 11, 2011 2:55 PM
> To: [email protected]
> Subject: Re: st: computation across rows
>
> Dear Nick,
> I am a bit confused: Your suggested commands presume that there is a
> variable in the dataset called obs, don't they? I nevertheless tried
> it out, and indeed my presumption was right, as I get the error message
>
> . su obs if id==82, meanonly
> variable obs not found
> r(111);
>
> Unless I am fully misunderstanding your suggestion? As to the other parts of
> your response: Yes, I certainly don't want to undertake these operations
> manually using the editor, as I may need to later revise the operations when
> I get more information about the data.
>
> (To give a bit of an idea, my observations in my panel data are districts,
> and some of the districts have split as of a certain year, and I seek to
> aggregate back the characteristics for these split districts, so in the toy
> example, think of district #82 as having split into two (#85 and #89) in one
> of my panel years.)
>
> In light of the above-mentioned error, please let me know if I misunderstood
> your suggestion.
>
> Hewan
>
> --- On Fri, 2/11/11, Nick Cox <[email protected]> wrote:
>
> > From: Nick Cox <[email protected]>
> > Subject: Re: st: computation across rows
> > To: [email protected]
> > Date: Friday, February 11, 2011, 1:10 AM
> >
> > I find it hard to see a general pattern under your question. Your toy
> > example would seem easiest to solve by mental arithmetic in the Data
> > Editor, but you wouldn't be asking if that were true of your real problem.
> >
> > Naturally you can just find the subscripts for each observation and
> > use those, but again I assume from your question you know that you can
> > do that.
> >
> > Some problems a bit like this benefit from a variable containing the
> > observation number:
> >
> > gen long obs = _n
> >
> > Then you can find the observation number for -id- 85, etc.
> >
> > su obs if id == 82, meanonly
> > local obs82 = r(min)
> > su obs if id == 85, meanonly
> > local obs85 = r(min)
> > su obs if id == 89, meanonly
> > local obs89 = r(min)
> >
> > The assumption here is that each identifier occurs once only so you
> > can indifferently pick up r(min), r(max) or r(mean) after -summarize-.
> > Then you can do things like
> >
> > replace Y = Y[`obs85'] + Y[`obs89'] in `obs82'
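
Put together, the steps quoted above can be sketched as one runnable block on the toy dataset from the original question. The two loops (over the ids and over the variables Y and W) are an assumption added here to cover both variables the question asks about; the quoted code shows the individual commands for Y only. As in the quoted message, each id is assumed to occur exactly once.

```stata
clear
input id Y Z W
81 4 1 3
82 . 0 9
85 2 4 1
87 3 1 4
89 6 2 5
end

* observation number, created immediately before it is used
generate long obs = _n

* look up the observation number for each id of interest
foreach i in 82 85 89 {
    summarize obs if id == `i', meanonly
    local obs`i' = r(min)
}

* overwrite Y and W for id 82 with the sum over ids 85 and 89
foreach v of varlist Y W {
    replace `v' = `v'[`obs85'] + `v'[`obs89'] in `obs82'
}

list id Y Z W, noobs
```

The stored observation numbers are only valid until the data are re-sorted, so obs should be generated right before the lookups, as here.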
> >
> > See also
> >
> > SJ-6-4  dm0025  . . . . . . . Stata tip 36: Which observations? Erratum
> >         N. J. Cox
> >         Q4/06   SJ 6(4):596                          (no commands)
> >         correction of example code for Stata tip 36
> >
> > SJ-6-3  dm0025  . . . . . . . Stata tip 36: Which observations?
> >         N. J. Cox
> >         Q3/06   SJ 6(3):430--432                     (no commands)
> >         tip for identifying which observations satisfy some
> >         specified condition
> >
> > Nick
> >
> > On Thu, Feb 10, 2011 at 8:17 PM, Hewan Belay <[email protected]> wrote:
> > > Dear Statalist,
> > >
> > > I am trying to do something I expected to be very simple, but I'm not
> > > finding a straightforward way to do this. Essentially, I would like to do
> > > discrete computations across rows/observations (i.e. within variables).
> > > Here is an example of what I mean; consider this toy dataset (I hope the
> > > table is easily visible):
> > >
> > > id Y Z W
> > > 81 4 1 3
> > > 82 . 0 9
> > > 85 2 4 1
> > > 87 3 1 4
> > > 89 6 2 5
> > >
> > > For the id #82, I want the variables Y and W to take on the value that
> > > results when adding their respective values for IDs #85 and #89. In the
> > > above toy example, that means that the missing value would become an 8,
> > > and the value of 9 would change to 6. I definitely don't want to xpose or
> > > reshape my data, as I have several other operations I am doing on the
> > > data given its current structure.
> > >
> > > So generally speaking, my question is how to do computations across
> > > selected rows. I only have info on this with regard to computations X rows
> > > above or below the concerned row, that is using the subscript [_n+1], or
> > > when getting statistics for groups of rows using the -by- command.
> > >
> >
> > *
> > * For searches and help try:
> > * http://www.stata.com/help.cgi?search
> > * http://www.stata.com/support/statalist/faq
> > * http://www.ats.ucla.edu/stat/stata/