If you have the denominator information, you can use either a grouped logistic regression or a Poisson regression with an offset. For instance, suppose that the proportion of 16-year-olds taking a certain course is .30, meaning that 30 students took the course at a particular site where 100 students could possibly have taken it. Of course, the same proportion would result if 45 students took the course out of a possible 150. In fact, if you know the proportion AND the denominator, you can calculate the numerator (numerator = proportion x denominator),
and you then have all you need for a rate-parameterized Poisson regression model (if the proportions are generally small) or a grouped logistic regression (if the proportions are relatively large).
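
As a rough sketch of the two options, assuming hypothetical variable names that are not in your data as described (pct for the percentage enrolled, denom for the number of 16-year-olds who could have enrolled, x1 and x2 for whatever covariates you have), the commands might look something like this:

```stata
* All variable names here are hypothetical placeholders.
* pct   = percentage of eligible students enrolled (0 to 100)
* denom = number of eligible students at the school (must be > 0)

* Recover the numerator (the count enrolled) from proportion and denominator
generate ntook = round((pct/100) * denom)

* Option 1: grouped logistic regression, with denom as the binomial denominator
glm ntook x1 x2, family(binomial denom) link(logit)

* Option 2: rate-parameterized Poisson regression; exposure() enters
* ln(denom) as an offset with its coefficient constrained to 1
poisson ntook x1 x2, exposure(denom)
```

The round() is there only because a percentage stored to a few decimals may not reproduce an exact integer count; if you have the actual counts, use them directly.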
Someone else may have a better idea on this, but this is my thought on it. You do not want to use a logit link with a Gaussian family.
Joe Hilbe
> Dear statalisters,
>
> The dependent variable I have is a proportion (percentage of 16 year
> olds enrolled in a particular subject) which is between 0 and 86
> percent. I am not sure about the linear form. My dependent variable is 0
> only in 3,980 cases out of 112,412 sample obs. Here a zero is a
> structural one, because the school does not offer history (which is
> choice subject).
>
> Would somebody suggest to me whether it would be better to perform a
> logit transformation, or estimate -glm- with family(gaussian) and
> link(logit). Any suggestion would be greatly appreciated!
>
> Thank you in advance!
>
> Shqiponja