RE: st: taking the average of duplicate observations
From: Michael Tekle Palm <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: st: taking the average of duplicate observations
Date: Fri, 3 May 2013 13:24:38 +0200

Dear Nick,
Thanks for your quick reply.
Sorry about the inexactness of my post; I was afraid that including a URL would trigger spam filters or the like.
-collapse- certainly did the trick, thank you so much. It has no unwanted side effects here either: -collapse- keeps only the variables named in the command, and this data set contains no variables other than station, year, month and rainfall, so it works out elegantly.
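A minimal sketch of the fix, run on the five example rows from my original post. The lowercase variable names station, year, month and rainfall are assumed from that example table; the real data set may use different names.

```stata
* Example rows copied from the table in the original question
clear
input station year month rainfall
1 1980 1 5
1 1980 1 3
1 1980 2 4
1 1980 3 8
1 1980 3 1
end

* -collapse- computes the mean by default; (mean) is spelled out for clarity.
* Each station-year-month group is reduced to a single observation.
collapse (mean) rainfall, by(station year month)

* Three observations remain, with rainfall 4, 4 and 4.5
list, sepby(station)
```

If totals were wanted instead, as Nick suggests for daily data, replacing (mean) with (sum) in the same command should do it.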
I can understand your puzzlement. I am using monthly rainfall data from a developing country where, unfortunately and inexplicably, a few stations report several differing rainfall values for the same month. Had it been daily data, I would probably have assumed multiple measurement times per day, but since it is monthly data, I concluded that it was some kind of input error. I appreciate your comment and will definitely try to investigate more closely.
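One possible way to look at the conflicting records before averaging them away is sketched below, again on the example rows from my original post. The names ndup, rain_min, rain_max and rain_range are made up for illustration, and this is only one way to inspect the data, not a recommended procedure.

```stata
* Example rows copied from the table in the original question
clear
input station year month rainfall
1 1980 1 5
1 1980 1 3
1 1980 2 4
1 1980 3 8
1 1980 3 1
end

* Count how many station-year-month combinations occur more than once
duplicates report station year month

* ndup = number of other observations sharing the same station, year and month
duplicates tag station year month, generate(ndup)

* Spread of the conflicting values within each station-year-month group
bysort station year month: egen double rain_min = min(rainfall)
bysort station year month: egen double rain_max = max(rainfall)
generate double rain_range = rain_max - rain_min

* List only the groups that have conflicting records
list station year month rainfall rain_range if ndup > 0, sepby(station year month)
```

A large rain_range within a group would point to something more than a rounding difference between the duplicate entries.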
Regards,
Mike
----------------------------------------
> Date: Fri, 3 May 2013 12:01:37 +0100
> Subject: Re: st: taking the average of duplicate observations
> From: [email protected]
> To: [email protected]
>
> Your reference to another post lacks a URL, nor can we comment on code
> that you don't show us, but there is a one-word solution: -collapse-.
>
> collapse rainfall, by(station year month)
>
> But I've worked a lot with rainfall data, and I'm puzzled at what you
> are doing here. If these are daily data, the convention is to use
> totals, not means. -collapse- can do that too.
>
> Nick
> [email protected]
>
>
> On 3 May 2013 11:48, Michael Tekle Palm <[email protected]> wrote:
> > Hello Statalist!
> >
> > I have observations with identical time values but different outcome values. Instead of dropping all but the first observations for every two/three duplicates, I want to calculate and replace with the average of the observations, and then drop the duplicates.
> >
> > So my data is on rainfall for a given location and is disaggregated by year and month. E.g:
> >
> > Station | Year | Month | Rainfall
> > ---------------------------------
> >    1    | 1980 |   1   |    5
> >    1    | 1980 |   1   |    3
> >    1    | 1980 |   2   |    4
> >    1    | 1980 |   3   |    8
> >    1    | 1980 |   3   |    1
> >
> >
> > So for each duplicate by station year month, I would like to calculate the average value for the rainfall outcomes, use this value and drop all duplicates. I think the solution suggested in this ["RE: st: questions about duplicate observations"] Statalist reply may work, but I wasn't quite able to make it work.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/