Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: computation across rows
From
"Sarah Edgington" <[email protected]>
To
<[email protected]>
Subject
RE: st: computation across rows
Date
Fri, 11 Feb 2011 15:14:03 -0800
It does assume a variable called obs, which Nick suggested creating using
> gen long obs = _n
It sounds like for this issue you might be best off creating a variable that
matches the old district number and using that to group observations for
your calculations.
To take your example, if the variable district takes the values 85 and 89
for the two new districts, you could create another variable, old_dist, that
is 82 for both observations. You can then do many of your calculations using
bysort old_dist to group the observations across which you want to
calculate.
Doing it with observation numbers instead of a fixed group identifier will
create problems if you change the sort order of the data.
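A minimal sketch of the grouping approach, run on the toy dataset from the
original post. The variable name old_dist, the hard-coded mapping of 85 and
89 to 82, and the temporary names tot_Y and tot_W are assumptions for
illustration only; real data would need its own mapping from new to old
districts.

```stata
clear
input id Y Z W
81 4 1 3
82 . 0 9
85 2 4 1
87 3 1 4
89 6 2 5
end

* old_dist holds the pre-split district number (assumed mapping: 85, 89 -> 82)
gen old_dist = id
replace old_dist = 82 if inlist(id, 85, 89)

foreach v of varlist Y W {
    * sum over the successor districts only; the parent row contributes missing,
    * which -egen, total()- treats as zero
    bysort old_dist: egen double tot_`v' = total(cond(id != old_dist, `v', .))
    replace `v' = tot_`v' if id == 82
    drop tot_`v'
}

* check against the values stated in the original question
assert Y == 8 & W == 6 if id == 82
list id Y Z W old_dist
```

Because the rows are matched on old_dist rather than on observation numbers,
the result does not depend on how the data happen to be sorted.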
-Sarah
-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Hewan Belay
Sent: Friday, February 11, 2011 2:55 PM
To: [email protected]
Subject: Re: st: computation across rows
Dear Nick,
I am a bit confused: Your suggested commands presume that there is a
variable in the dataset called obs don't they? I nevertheless tried
it out, and indeed my presumption was right, as I get the error message
. su obs if id==82, meanonly
variable obs not found
r(111);
Unless I am fully misunderstanding your suggestion? As to the other parts of
your response: Yes, I certainly don't want to undertake these operations
manually using the editor, as I may need to later revise the operations when
I get more information about the data.
(To give a bit of an idea, my observations in my panel data are districts,
and some of the districts have split as of a certain year, and I seek to
aggregate back the characteristics for these split districts--so in the toy
example, think of district #82 as having split into two (#85 and 89) in one
of my panel years.)
In light of the above-mentioned error, please let me know if I misunderstood
your suggestion.
Hewan
--- On Fri, 2/11/11, Nick Cox <[email protected]> wrote:
> From: Nick Cox <[email protected]>
> Subject: Re: st: computation across rows
> To: [email protected]
> Date: Friday, February 11, 2011, 1:10 AM
>
> I find it hard to see a general pattern under your question. Your toy
> example would seem easiest to solve by mental arithmetic in the Data
> Editor, but you wouldn't be asking if that were true of your real problem.
>
> Naturally you can just find the subscripts for each observation and
> use those, but again I assume from your question you know that you can
> do that.
>
> Some problems a bit like this benefit from a variable containing the
> observation number:
>
> gen long obs = _n
>
> Then you can find the observation number for -id- 85, etc.
>
> su obs if id == 82, meanonly
> local obs82 = r(min)
> su obs if id == 85, meanonly
> local obs85 = r(min)
> su obs if id == 89, meanonly
> local obs89 = r(min)
>
> The assumption here is that each identifier occurs once only so you
> can indifferently pick up r(min), r(max) or r(mean) after -summarize-.
> Then you can do things like
>
> replace Y = Y[`obs85'] + Y[`obs89'] in `obs82'
>
> See also
>
> SJ-6-4  dm0025 . . . . . . .  Stata tip 36: Which observations? Erratum
>         N. J. Cox
>         Q4/06   SJ 6(4):596   (no commands)
>         correction of example code for Stata tip 36
>
> SJ-6-3  dm0025 . . . . . . .  Stata tip 36: Which observations?
>         N. J. Cox
>         Q3/06   SJ 6(3):430--432   (no commands)
>         tip for identifying which observations satisfy some
>         specified condition
>
> Nick
>
> On Thu, Feb 10, 2011 at 8:17 PM, Hewan Belay <[email protected]>
> wrote:
> > Dear Statalist,
> >
> > I am trying to do something I expected to be very simple, but I'm not
> > finding a straightforward way to do this. Essentially, I would like to do
> > discrete computations across rows/observations (i.e., within variables).
> > Here is an example of what I mean; consider this toy dataset (I hope the
> > table is easily visible):
> >
> > id Y Z W
> > 81 4 1 3
> > 82 . 0 9
> > 85 2 4 1
> > 87 3 1 4
> > 89 6 2 5
> >
> > For id #82, I want the variables Y and W to take on the value that
> > results from adding their respective values for IDs #85 and 89. In the
> > above toy example, that means that the missing value would become an 8,
> > and the value of 9 would change to 6. I definitely don't want to xpose or
> > reshape my data, as I have several other operations I am doing on the
> > data given its current structure.
> >
> > So generally speaking, my question is how to do computations across
> > selected rows. I only have info on this with regard to computations X rows
> > above or below the concerned row, that is, using the subscript [_n+1], or
> > when getting statistics for groups of rows using the -by- command.
> >
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/