Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From | Steve Samuels <sjsamuels@gmail.com> |
To | statalist@hsphsun2.harvard.edu |
Subject | Re: st: how to find the integral for a portion of a normal distribution. |
Date | Tue, 4 May 2010 16:03:33 -0400 |
Buzz Burhans My question here, technicalities aside, is why you think that the normal distribution is a good fit to milk yields. A earlier post in the thread already warned that the normal distribution with your parameters has substantial probability below zero. This publication, for what it's worth-I haven't read it, found a modified Weibull distribution to be a good fit to milk yields.((http://www.ncbi.nlm.nih.gov/pubmed/8046070) ) If you have prior data, then -kdensity- will overlay the best fitting Normal on top of a nonparametric estimated, and there are other diagnostic plots in Stata (-help diagnostic plots-) You can fit parameters to the Weibull and related distributions with -streg-. The Weibull is also easy to simulate because there is a closed-form version of the cumulative probability distribution. Steve On Tue, May 4, 2010 at 2:44 PM, Brian P. Poi <bpoi@stata.com> wrote: > On Tue, 4 May 2010, > >> >> Suppose I have an intervention applied to cows with a demonstrated mean >> milk >> yield response of +2.05 liters, sd 1.74. >> >> Suppose I am interested a 1 liter cut point. I know how to find the >> proportion of responses at or below the 1 liter response; and the >> proportion >> of responses at or above the one liter response. This is what you are >> doing >> here, or it can be done with a z score. >> >> My question is, if the response is normally distributed, given n cows, how >> many total liters were included or accumulated in the only responses below >> 1 >> liter, and how many total liters were accumulated in only the responses at >> or above 1 liter. (Negative responses are possible). My interest is in >> the >> total liters, not the fraction of the total population either above or >> below >> the cutpoint. >> > > Assuming I understand the question correctly, I think this does what you > want. > > For a random variable distributed N(mu, sigma), > > E[X | X < 1] = mu + sigma*lambda(alpha) > > where > > alpha = (1 - mu) / sigma > lambda(alpha) = -phi(alpha) / Phi(alpha) > > Given a sample of size n, the number of observations less than 1 is > > numltone = n*Phi(alpha) > > so the total amount of milk given that X < 1 is > > milk = numltone*E[X | X < 1] > > In Stata, > > . clear all > . set mem 200m > (204800k) > . set seed 1 > . drawnorm x, means(2.05) sds(1.74) n(10000000) clear > (obs 10000000) > . summ x if x < 1, mean > . di r(sum) > -188292.37 > > . scalar alpha = (1 - 2.05) / 1.74 > . scalar lambda = -1*normalden(alpha) / normal(alpha) > . scalar condmean = 2.05 + 1.74*lambda > . scalar numltone = 10000000*normal(alpha) > . scalar condtotal = condmean*numltone > . di condtotal > -187432.23 > > Similarly, > > E[X | X > 1] = mu + sigma*lambda'(alpha) > > where > > lambda' = phi(alpha) / (1 - Phi(alpha)) > > In Stata, > > . scalar lambda2 = normalden(alpha) / (1 - normal(alpha)) > . scalar condmean2 = 2.05 + 1.74*lambda2 > . scalar condtotal2 = condmean2*(10000000 - numltone) > . di condtotal2 > 20687432 > > . sum x if x > 1, mean > . di r(sum) > 20685854 > > > -- Brian Poi > -- bpoi@stata.com > * > * For searches and help try: > * http://www.stata.com/help.cgi?search > * http://www.stata.com/support/statalist/faq > * http://www.ats.ucla.edu/stat/stata/ > -- Steven Samuels sjsamuels@gmail.com 18 Cantine's Island Saugerties NY 12477 USA Voice: 845-246-0774 Fax: 206-202-4783 * * For searches and help try: * http://www.stata.com/help.cgi?search * http://www.stata.com/support/statalist/faq * http://www.ats.ucla.edu/stat/stata/