Re: st: Dealing with panel data
From: Nick Cox <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Dealing with panel data
Date: Tue, 6 Aug 2013 16:54:45 +0100
You need to let the other observations in each group know what
happened in race 1. That sounds like
egen fav_won_1 = total(favourite == 1 & race_no == 1), by(meeting)
which will give you a meeting-wide indicator variable.
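To see what that first step produces, here is a minimal sketch on a made-up data set with one observation per race. The two meetings and all values are invented for illustration; the variable names simply follow the code above and may need adjusting to the real data set.

```stata
* Toy data (hypothetical): two meetings, three races each
clear
input str1 meeting race_no favourite overround
"A" 1 1 1.10
"A" 2 0 1.20
"A" 3 1 1.30
"B" 1 0 1.15
"B" 2 1 1.25
"B" 3 0 1.35
end

* total() sums the true-or-false expression within each meeting, so every
* observation in a meeting carries the result for race 1
egen fav_won_1 = total(favourite == 1 & race_no == 1), by(meeting)

list, sepby(meeting)
```

Here fav_won_1 is 1 throughout meeting A and 0 throughout meeting B. If the real data hold one observation per horse rather than per race, the total would be a count rather than 0 or 1, but any nonzero value still counts as true in a later condition.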
Then you go
egen mean_overround = mean(overround/(fav_won_1 & race_no > 1)), by(meeting)
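The division sign / is not a typo. Fortuitously, but fortunately, it
works here as if it were a condition sign |. Where the condition is
true, overround is divided by 1 and so unchanged; where it is false,
it is divided by 0, which yields missing, and mean() ignores missings.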
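The division trick can be checked against a more explicit version. The sketch below uses a small invented data set (one observation per race, values made up) and compares the division form with the same condition written out through cond(); the name mean_check is introduced here only for the comparison.

```stata
* Toy data (hypothetical): two meetings, three races each
clear
input str1 meeting race_no favourite overround
"A" 1 1 1.10
"A" 2 0 1.20
"A" 3 1 1.30
"B" 1 0 1.15
"B" 2 1 1.25
"B" 3 0 1.35
end

egen fav_won_1 = total(favourite == 1 & race_no == 1), by(meeting)

* Dividing by a true condition (1) leaves overround unchanged; dividing by
* a false one (0) gives missing, which mean() leaves out
egen mean_overround = mean(overround / (fav_won_1 & race_no > 1)), by(meeting)

* The same calculation with the condition spelled out
egen mean_check = mean(cond(fav_won_1 & race_no > 1, overround, .)), by(meeting)

list, sepby(meeting)
```

In this toy example meeting A gets the mean of races 2 and 3 (1.25), while meeting B, where the favourite lost race 1, gets missing under both forms.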
For a more systematic review of such tricks, see
http://www.stata-journal.com/article.html?article=dm0055
P.S. -summarize- is a command, not a function.
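As a rough illustration of the distinction: a command such as -summarize- takes a variable list and an if qualifier and leaves its results in r(), whereas functions such as total() and mean() appear inside expressions or -egen- calls. The sketch below uses invented data and gives one overall mean rather than one per meeting.

```stata
* Toy data (hypothetical): two meetings, three races each
clear
input str1 meeting race_no favourite overround
"A" 1 1 1.10
"A" 2 0 1.20
"A" 3 1 1.30
"B" 1 0 1.15
"B" 2 1 1.25
"B" 3 0 1.35
end

egen fav_won_1 = total(favourite == 1 & race_no == 1), by(meeting)

* summarize restricted to later races at meetings where the favourite
* won race 1; the mean is then available as r(mean)
summarize overround if fav_won_1 & race_no > 1
display "mean overround, races after a winning favourite: " r(mean)
```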
Nick
[email protected]
On 6 August 2013 16:44, John Kenny <[email protected]> wrote:
> Dear Statalist,
>
> I'm relatively new to Stata and I cannot find a standard way to solve
> my problem, so I may need to write a .do file.
>
> I'm dealing with a very large data set of horse racing results, with
> about 20 variables and 711,000 observations. The problem that I am
> having is that I cannot get the mean of one variable, 'overround',
> when two other variables take certain values.
>
> To be more specific, in the data set I am using, 'overround'
> measures the bookmaker's profit margin. What I want to do is get the
> mean of 'overround' for each race from 2 to the last race, 7, if
> the 'favorite' (which is a dummy variable) won the first race. I have
> a number of variables that record the time of each race and when it
> occurs at a certain meeting. These are some of the variables in the
> data set. For each race there is a variable that says
> where the race is held ['Meeting'], the date and time ['date',
> 'time'], the odds given for each horse ['odds'], the race number at
> that meeting ['race_no' (lists the races from 1-7)], whether the
> favorite won that race ['favorite'], and the overround, which is the
> bookmaker's profit ['overround'].
>
> What I have tried is using the summarize function to get the
> mean of 'overround' if 'race_no'==1 & 'favorite'==1. However, every
> combination of variables I tried always just gave the mean of
> 'overround' for race 1 if the favourite won, and not the mean for
> the second or third race if the favorite won the first race.
>
> Any help would be greatly appreciated as I have been stuck on this for a while.
>
> Thanks in advance.
>
> John
> *
> * For searches and help try:
> * http://www.stata.com/help.cgi?search
> * http://www.stata.com/support/faqs/resources/statalist-faq/
> * http://www.ats.ucla.edu/stat/stata/