Dear All,
thank you for your responses. Austin's solution of removing rho helped a
lot, but I am a bit confused by the other alternative that he
proposed, namely "keep the rho, but divide by N". What would N be in
the case where the mean of X is computed on one domain and the mean
of Y on another (e.g. because of missing values)? For example, X is
income and Y is family size, and I am interested in per capita
income, but income was reported less often than family size. My
observation is that Stata stops with the error "no observations" when
the domains of X and Y do not overlap at all, but there is nothing in
the formula that prevents me from computing the (approximate) moments
of X/Y in this case. Is there a reason why I should restrict the
computations to the common domain where both X and Y are non-missing?
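
My tentative reading of the N question, written out as a sketch. It assumes independent observations (simple random sampling) and missingness that is unrelated to the values. The symbols n_X, n_Y (observations used for each mean), n_XY (observations where both are present) and the index sets A_X, A_Y are my own notation, not Austin's. Here sigma_X and sigma_Y are the standard deviations of the variables themselves, not the SEs of the means.

```latex
\[
\operatorname{Cov}(\bar X, \bar Y)
  = \frac{1}{n_X\, n_Y} \sum_{i \in A_X \cap A_Y} \operatorname{Cov}(X_i, Y_i)
  = \frac{n_{XY}}{n_X\, n_Y}\, \rho\, \sigma_X\, \sigma_Y ,
\]
\[
\operatorname{Var}\!\left(\frac{\bar X}{\bar Y}\right) \approx
  \frac{\operatorname{Var}(\bar X)}{\mu_Y^{2}}
  + \frac{\mu_X^{2}\, \operatorname{Var}(\bar Y)}{\mu_Y^{4}}
  - \frac{2\, \mu_X\, \operatorname{Cov}(\bar X, \bar Y)}{\mu_Y^{3}} .
\]
```

Under these assumptions, complete overlap (n_X = n_Y = n_XY = N) gives rho*sigma_X*sigma_Y/N, and with no overlap at all the covariance term simply drops out.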
Alan, thank you for the comments regarding the non-existence of moments
when X and Y are N(0,s2). In my case the expectations of X and
Y are guaranteed to be non-zero, as they are the numbers of people with
non-exotic characteristics (like attending primary school).
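
For the record, here is a minimal sketch of the calculation with the extra rho removed, i.e. using the estimated covariance of the two means directly. The program name se_ratio_delta is hypothetical, and I use plain -mean- on the auto data with an explicit restriction to the common domain; this is an illustration under those assumptions, not necessarily what Austin or Stas had in mind.

```stata
* Sketch only: delta-method SE of a ratio of two means, using the
* covariance of the means from e(V) directly (no extra rho factor).
sysuse auto, clear

* Hypothetical helper; restricts explicitly to the common domain.
capture program drop se_ratio_delta
program define se_ratio_delta, rclass
    syntax varlist(min=2 max=2 numeric)
    tokenize `varlist'
    quietly mean `1' `2' if !missing(`1', `2')
    tempname B V mux muy vx vy cxy
    matrix `B' = e(b)
    matrix `V' = e(V)
    scalar `mux' = `B'[1,1]      // mean of numerator
    scalar `muy' = `B'[1,2]      // mean of denominator
    scalar `vx'  = `V'[1,1]      // variance of the numerator mean
    scalar `vy'  = `V'[2,2]      // variance of the denominator mean
    scalar `cxy' = `V'[1,2]      // covariance of the two means
    return scalar ratio = `mux'/`muy'
    return scalar se = sqrt(`vx'/`muy'^2 + `mux'^2*`vy'/`muy'^4 - 2*`mux'*`cxy'/`muy'^3)
end

se_ratio_delta price length
display as text "Delta-method SE = " as result r(se)

* Cross-checks: -nlcom- after -mean-, and -ratio- itself
quietly mean price length
nlcom (ratio: _b[price]/_b[length])
ratio price / length
```

If I understand it correctly, -svy: mean- could be swapped in for -mean- when the data are svyset, and the three standard errors should agree closely.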
Thank you,
Sergiy Radyakin
On Tue, Jan 13, 2009 at 10:56 AM, Stas Kolenikov <[email protected]> wrote:
> The variance of the ratio of two normals does not exist if the
> denominator has a mean of zero. Then yes we get a Cauchy distribution
> with all the ugly properties. If the mean is not zero, you are not
> dividing by zero very often, although I don't really know whether the
> distribution of that ratio is generally known.
>
> Back to Sergiy's question -- to gauge the differences between the
> variance estimation methods, you can also try -svy, jackknife- and see
> what it produces; the differences on -sysuse auto- dataset should be
> somewhere in the fourth or fifth decimal point. But I think Austin
> nailed down the analytical problem with an extra `rho'.
>
> Of course you are shooting sparrows with a cannon -- you probably
> could've achieved anything you needed using -nlcom-.
>
> On 1/13/09, Feiveson, Alan H. (JSC-SK311) <[email protected]> wrote:
>> It should also be noted that the delta method is just an approximation,
>> so it is not surprising that it might disagree with a simulation by
>> 10% or more. Also, technically the variance of a ratio of two normally
>> distributed random variables doesn't even exist! Therefore even a
>> simulation, if carried out long enough, would produce arbitrarily high
>> "SE" values. The saving grace is that if the standard deviation of the
>> denominator is much smaller than its mean, we can "get away" with these
>> approximations for practical usage.
>>
>> Al Feiveson
>>
>>
>> -----Original Message-----
>> From: [email protected]
>> [mailto:[email protected]] On Behalf Of Jeph Herrin
>> Sent: Tuesday, January 13, 2009 7:11 AM
>> To: [email protected]
>> Subject: Re: st: Standard error of a ratio of two random variables
>>
>>
>> For what it's worth, a simulation may give the most accurate answer:
>>
>> quietly ci X
>> local mux=r(mean)
>> local sex=r(se)
>> quietly ci Y
>> local muy=r(mean)
>> local sey=r(se)
>> corr X Y
>> local rho=r(rho)
>> matrix b= (`mux' \ `muy')
>> matrix V= (1, `rho' \ `rho', 1)
>> matrix sd =(`sex' \ `sey')
>>
>> clear
>> set obs 100000
>> drawnorm X Y, means(b) corr(V) sds(sd)
>> gen ratio=X/Y
>> sum ratio
>>
>> The SD of the variable -ratio- should be the SE of X/Y.
>>
>> Jeph
>>
>>
>>
>>
>> Sergiy Radyakin wrote:
>> > Dear All,
>> >
>> > I need to find the standard error of Z, the ratio of two random
>> > variables X and Y: Z=X/Y, where the means and SEs of X and Y are
>> > known, as well as their correlation coefficient. (In my particular
>> > case X and Y are numbers of people that have such and such
>> > characteristics.) I am using the delta method according to
>> > [ http://www.math.umt.edu/patterson/549/Delta.pdf ] (see page 2(38)).
>> > I then use svy:ratio to check my results and they don't match. I
>> > wonder if I am doing something wrong, or whether it is some kind of
>> > precision-related problem (the difference is about 7%, i.e. 1.7959
>> > vs. 1.6795), or whether my check is simply wrong and not applicable
>> > in this case.
>> >
>> > I would appreciate if someone could look into the code and let me
>> > know why the results are different.
>> >
>> > Thank you,
>> > Sergiy Radyakin
>> >
>> > Below is a do file that one can Ctrl+C/Ctrl+V to Stata's command line:
>> >
>> > **** BEGIN OF RV_RATIO.DO ****
>> > sysuse auto, clear
>> > generate byte www=1
>> > svyset [pw=www]
>> >
>> > capture program drop st_error_of_ratio
>> > program define st_error_of_ratio, rclass
>> > syntax varlist(min=2 max=2)
>> > svy: mean `varlist'
>> > matrix B=e(b)
>> > local mux=B[1,1]
>> > local muy=B[1,2]
>> > matrix V=e(V)
>> > local sigma2x=V[1,1]
>> > local sigma2y=V[2,2]
>> > local sigmax_sigmay=V[1,2]
>> > svyset
>> > corr `varlist' [aw`=r(wexp)']
>> > local rho=r(rho)
>> > return scalar se = sqrt((`mux')^2*`sigma2y'/(`muy')^4 + `sigma2x'/(`muy')^2 - 2*`mux'/(`muy')^3*`rho'*`sigmax_sigmay')
>> > end
>> > st_error_of_ratio price length
>> > display as text "Estimated SE=" as result r(se)
>> > svy: ratio price / length
>> > **** END OF RV_RATIO.DO ****
>> > *
>> > * For searches and help try:
>> > * http://www.stata.com/help.cgi?search
>> > * http://www.stata.com/support/statalist/faq
>> > * http://www.ats.ucla.edu/stat/stata/
>> >
>
>
> --
> Stas Kolenikov, also found at http://stas.kolenikov.name
> Small print: I use this email account for mailing lists only.