
Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.


RE: st: taking the average of duplicate observations

From: Michael Tekle Palm <[email protected]>
To: "[email protected]" <[email protected]>
Subject: RE: st: taking the average of duplicate observations
Date: Fri, 3 May 2013 13:24:38 +0200

Dear Nick,

Thanks for your quick reply.

Sorry about the inexactness of my post; I was afraid that including a URL would trigger spam filters or the like.

-collapse- certainly did the trick, thank you so much. It also shouldn't have any side effects, since the data set contains no variables other than station, year, month and rainfall, so it works out elegantly.
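
For readers following along, here is a minimal sketch of that step, run on the five example rows from the original question further down. The variable names (station, year, month, rainfall) are assumed to match the real data set, and (mean) is written out even though it is -collapse-'s default statistic.

```stata
* Sketch: average duplicate station-year-month readings with -collapse-.
* The five rows below are the example data from the original question.
clear
input station year month rainfall
1 1980 1 5
1 1980 1 3
1 1980 2 4
1 1980 3 8
1 1980 3 1
end

* (mean) is the default statistic; it is spelled out here for clarity.
* The result has one observation per station-year-month combination.
collapse (mean) rainfall, by(station year month)

list, noobs
```

On these rows the result has three observations, with rainfall values 4, 4 and 4.5 for months 1, 2 and 3. Note that -collapse- replaces the data in memory, so any variable not named in the command or in by() is dropped. If the duplicates were separate readings to be added up, replacing (mean) with (sum) would give totals instead.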

I can understand your puzzlement. I am using monthly rainfall data from a developing country where, unfortunately and inexplicably, some months at a few stations have several differing rainfall values. Had it been daily data, I would probably have assumed that there had been multiple measurement times per day, but since these are monthly data, I concluded that it was down to some kind of input error. I appreciate your comment and will definitely try to investigate more closely.
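
One possible way to investigate is sketched below: before averaging, flag the station-months that have more than one reading and look at how far the readings disagree. This is only an illustration on the example rows from the original question; the new variable names (n_extra, rain_max, rain_min, spread) are hypothetical and not from the thread.

```stata
* Sketch: inspect duplicate station-year-month readings before averaging.
* The five rows below are the example data from the original question.
clear
input station year month rainfall
1 1980 1 5
1 1980 1 3
1 1980 2 4
1 1980 3 8
1 1980 3 1
end

* Summarize how many station-year-month combinations are repeated.
duplicates report station year month

* n_extra (hypothetical name) counts the surplus copies in each group:
* 0 for a unique station-month, 1 when there are two readings, and so on.
duplicates tag station year month, generate(n_extra)

* spread (hypothetical name) is the gap between the largest and smallest
* reading within each station-year-month group.
bysort station year month: egen double rain_max = max(rainfall)
bysort station year month: egen double rain_min = min(rainfall)
generate double spread = rain_max - rain_min

* List only the groups that actually contain duplicates.
list station year month rainfall spread if n_extra > 0, sepby(station year month)
```

A large spread within a station-month might point to an input error rather than a genuine repeated measurement, in which case a mean may not be the right summary.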

Regards,

Mike

----------------------------------------
> Date: Fri, 3 May 2013 12:01:37 +0100
> Subject: Re: st: taking the average of duplicate observations
> From: [email protected]
> To: [email protected]
>
> Your reference to another post lacks a URL, nor can we comment on code
> that you don't show us, but there is a one-word solution: -collapse-.
>
> collapse rainfall, by(station year month)
>
> But I've worked a lot with rainfall data, and I'm puzzled at what you
> are doing here. If these are daily data, the convention is to use
> totals, not means. -collapse- can do that too.
>
> Nick
> [email protected]
>
>
> On 3 May 2013 11:48, Michael Tekle Palm <[email protected]> wrote:
> > Hello Statalist!
> >
> > I have observations with identical time values but different outcome values. Instead of dropping all but the first observation in each set of two or three duplicates, I want to replace them with their average, leaving one observation per time value.
> >
> > My data are on rainfall at given locations, disaggregated by year and month. E.g.:
> >
> > Station | Year | Month | Rainfall
> > ---------------------------------
> >    1    | 1980 |   1   |    5
> >    1    | 1980 |   1   |    3
> >    1    | 1980 |   2   |    4
> >    1    | 1980 |   3   |    8
> >    1    | 1980 |   3   |    1
> >
> >
> > So for each set of duplicates by station, year and month, I would like to calculate the average of the rainfall values, keep that value, and drop the duplicates. I think the solution suggested in this ["RE: st: questions about duplicate observations"] Statalist reply may work, but I wasn't quite able to make it work.
>
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/ 		 	   		  
