Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Lulu Zeng <luluzengnz@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Drawing from a known, non-regular, discrete distribution
Date: Wed, 19 Feb 2014 20:20:20 +1100

Dear Nick and others,

I have 1200 observations in my dataset. The 1200 observations of the variable "share" define the probabilities (they add up to 1), and the 1200 corresponding pre-defined values to be drawn are saved in the variable "odo". I am thinking of having 1000 draws in my sample.

My data look like the example below (but with more points). Each draw value is pre-defined and has a probability attached.

Draw value   Probability
0.5          0.15
0.6          0.30
0.2          0.25
0.9          0.30

Thank you for your consideration :)

Best Regards,
Lulu

On Wed, Feb 19, 2014 at 7:59 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> My own thoughts on "Thanks in advance" are codified in the FAQ.
> Seemingly no-one agrees with me.
>
> I will pose some questions here, but given other commitments I won't
> be able to respond to any answers until _much_ later today, local
> time. If someone else picks this up before then, fine by me,
> naturally!
>
> How many observations are in your dataset?
> How many observations define the probabilities?
> How many values do you want in your sample?
>
> Nick
> njcoxstata@gmail.com
>
> On 19 February 2014 08:51, Lulu Zeng <luluzengnz@gmail.com> wrote:
>> Dear Nick,
>>
>> Sorry that the (1..10)' in my example was a typo, I in fact used 1200
>> instead of 10 in my real experiment. Even so, it didn't work. I also
>> scaled "share" before calling Mata, and the same error occurs.
>>
>> Also, by using -rdiscrete()-, I can see it draws a random number
>> according to a distribution specified by "p" (and write the random
>> draws into "odo2" using -st_store()- in my case), but I don't
>> understand how -rdiscrete()- could draw from a given set of values
>> (e.g., a pre-specified "odo2" -- this is really what I'm trying to do)
>> instead of random values.
>>
>> My apologies if the answer to my question is straightforward, I am
>> quite new to Mata.
>>
>> Thank you very much for your help in advance Nick.
>>
>> Best Regards,
>> Lulu
>>
>> On Wed, Feb 19, 2014 at 11:54 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> In my example, I have 10 probabilities in observations 1 to 10 of the
>>> data, so use
>>> (1..10)' as an argument. That will make sense for you if and only if
>>> your probabilities are held in the same observations. See also help
>>> for -st_data()-.
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>> On 19 February 2014 00:09, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>> Dear Nick,
>>>>
>>>> Thank you for your suggestion. I must have done something incorrectly,
>>>> as Mata still gives me the error below even though I did use
>>>> -p :/ sum(p)- for rescaling as you suggested (I also tried to rescale
>>>> the original probability variable but neither worked):
>>>>
>>>> sum of the probabilities must be 1
>>>> rdiscrete(): 3300 argument out of range
>>>> <istmt>: - function returned error
>>>> r(3300);
>>>>
>>>> My probability variable is "share", and "odo2" is my equivalent of
>>>> your "y". All I did was:
>>>>
>>>> mata
>>>>
>>>> p = st_data((1..10)', "share")
>>>>
>>>> p :/ sum(p)
>>>>
>>>> st_store(., "odo2", rdiscrete(st_nobs(), 1, p)) [this is where
>>>> the error occurs]
>>>>
>>>> My apologies for coming back with the same question again.
>>>>
>>>> Best Regards,
>>>> Lulu
>>>>
>>>> On Tue, Feb 18, 2014 at 11:37 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>> Here is an example of using -rdiscrete()- in Mata. In your case, the
>>>>> probabilities are already in a variable. If -rdiscrete()- chokes on
>>>>> small differences in total from 1, then check the probabilities and if
>>>>> need be scale by -p :/ sum(p)-.
>>>>>
>>>>> . clear
>>>>>
>>>>> . set obs 1000
>>>>> obs was 0, now 1000
>>>>>
>>>>> . mat p = [0.2,0.2,0.1,0.1,0.1,0.1,0.05,0.05,0.05,0.05]
>>>>>
>>>>> . gen double p = p[1,_n]
>>>>> (990 missing values generated)
>>>>>
>>>>> . list in 1/10, sep(0)
>>>>>
>>>>>      +-----+
>>>>>      |   p |
>>>>>      |-----|
>>>>>   1. |  .2 |
>>>>>   2. |  .2 |
>>>>>   3. |  .1 |
>>>>>   4. |  .1 |
>>>>>   5. |  .1 |
>>>>>   6. |  .1 |
>>>>>   7. | .05 |
>>>>>   8. | .05 |
>>>>>   9. | .05 |
>>>>>  10. | .05 |
>>>>>      +-----+
>>>>>
>>>>> . gen y = .
>>>>> (1000 missing values generated)
>>>>>
>>>>> . mata
>>>>> ------------------------------------------------- mata (type end to exit) ------------------
>>>>> : p = st_data((1..10)', "p")
>>>>>
>>>>> : st_store(., "y", rdiscrete(st_nobs(), 1, p))
>>>>>
>>>>> : end
>>>>> --------------------------------------------------------------------------------------------
>>>>>
>>>>> . tab y
>>>>>
>>>>>           y |      Freq.     Percent        Cum.
>>>>> ------------+-----------------------------------
>>>>>           1 |        202       20.20       20.20
>>>>>           2 |        200       20.00       40.20
>>>>>           3 |         98        9.80       50.00
>>>>>           4 |        102       10.20       60.20
>>>>>           5 |         87        8.70       68.90
>>>>>           6 |         99        9.90       78.80
>>>>>           7 |         49        4.90       83.70
>>>>>           8 |         54        5.40       89.10
>>>>>           9 |         53        5.30       94.40
>>>>>          10 |         56        5.60      100.00
>>>>> ------------+-----------------------------------
>>>>>       Total |      1,000      100.00
>>>>> Nick
>>>>> njcoxstata@gmail.com
>>>>>
>>>>> On 18 February 2014 09:35, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>> The "mapping" (if I am guessing correctly) is in fact trivial as in
>>>>>> effect your sample would just be the observation numbers.
>>>>>> Nick
>>>>>> njcoxstata@gmail.com
>>>>>>
>>>>>> On 18 February 2014 09:32, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>> Thanks for the details.
>>>>>>>
>>>>>>> The Mata function -rdiscrete()- should do most of what you want. You
>>>>>>> will need to map your values to integers 1 up and then read in the
>>>>>>> probabilities so that they are copied from a variable to a vector in
>>>>>>> Mata. Then select integers and reverse the mapping.
>>>>>>>
>>>>>>> Nick
>>>>>>> njcoxstata@gmail.com
>>>>>>>
>>>>>>> On 18 February 2014 09:17, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>> Dear Nick,
>>>>>>>>
>>>>>>>> My apologies for the unclear description.
>>>>>>>>
>>>>>>>> 1. I have 2 variables in Stata, one variable holds the 1200 known,
>>>>>>>> discrete values I want to draw; the other holds the corresponding
>>>>>>>> probabilities.
>>>>>>>>
>>>>>>>> 2. The 2 variables are associated with a parameter (attribute) of a
>>>>>>>> random utility model. I am trying to draw from the distribution of
>>>>>>>> this parameter of interest, and then divide it by the price parameter
>>>>>>>> (which similarly has 2 associated variables too) to obtain a
>>>>>>>> distribution of willingness to pay.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Lulu
>>>>>>>>
>>>>>>>> On Tue, Feb 18, 2014 at 7:47 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>>>> You have not, so far as I can see, specified
>>>>>>>>>
>>>>>>>>> 1. How you are holding information on your distribution. Is it 1200
>>>>>>>>> known values with associated probabilities (so as two variables in
>>>>>>>>> Stata), or is the information still outside Stata in some form?
>>>>>>>>>
>>>>>>>>> 2. What you expect to draw as a sample.
>>>>>>>>> Nick
>>>>>>>>> njcoxstata@gmail.com
>>>>>>>>>
>>>>>>>>> On 18 February 2014 03:58, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>>>> Dear Scott,
>>>>>>>>>>
>>>>>>>>>> Thank you for your response. My apologies that I am still a little
>>>>>>>>>> confused about how to do this in my case where I have 1,200
>>>>>>>>>> observations. Can I still use the cond() function without typing in
>>>>>>>>>> each point of the draw?
>>>>>>>>>>
>>>>>>>>>> Best Regards,
>>>>>>>>>> Lulu
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 18, 2014 at 1:50 PM, Scott Merryman
>>>>>>>>>> <scott.merryman@gmail.com> wrote:
>>>>>>>>>>> http://www.stata.com/statalist/archive/2012-08/msg00256.html
>>>>>>>>>>>
>>>>>>>>>>> and the links within.
>>>>>>>>>>>
>>>>>>>>>>> Scott
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 16, 2014 at 9:15 PM, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>>>>>> Dear Statalist,
>>>>>>>>>>>>
>>>>>>>>>>>> I am seeking help with taking draws from a known, non-regular (not
>>>>>>>>>>>> normal or lognormal etc), discrete distribution.
>>>>>>>>>>>>
>>>>>>>>>>>> For example, taking draws from a distribution like the one below.
>>>>>>>>>>>> However, in my case I have 1,200 points instead of the 4 points given
>>>>>>>>>>>> in the example.
>>>>>>>>>>>>
>>>>>>>>>>>> Draw value   Probability
>>>>>>>>>>>> 0.5          0.15
>>>>>>>>>>>> 0.6          0.30
>>>>>>>>>>>> 0.2          0.25
>>>>>>>>>>>> 0.9          0.30
>>>>>>>>>>>>
>>>>>>>>>>>> The "draw value" is the value to be drawn, "probability" is the chance
>>>>>>>>>>>> each value be drawn, so it adds up to 1.
*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/faqs/resources/statalist-faq/
*   http://www.ats.ucla.edu/stat/stata/
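
A minimal sketch of the approach Nick outlines in the thread: draw observation numbers with -rdiscrete()-, then look up the values held in those observations. It uses the four example points from the original question rather than the real 1200. The variable name "draw" and the seed are assumptions made for illustration only. One thing that may be relevant to the reported error: a bare -p :/ sum(p)- only displays the rescaled vector, so the result has to be assigned back to p before it is passed to -rdiscrete()-.

```stata
clear
set seed 20140219          // arbitrary seed (assumption), for reproducibility

* Example support points and probabilities from the original question.
* With the real data these would be the 1200 rows of -odo- and -share-.
input double odo double share
0.5 0.15
0.6 0.30
0.2 0.25
0.9 0.30
end

set obs 1000               // one observation per draw wanted
gen double draw = .        // hypothetical name for the sampled values

mata:
k   = 4                              // number of support points
v   = st_data((1..k)', "odo")        // values to be drawn
p   = st_data((1..k)', "share")      // their probabilities
p   = p :/ sum(p)                    // assign the rescaled vector back to p
idx = rdiscrete(st_nobs(), 1, p)     // integers 1..k, drawn with probabilities p
st_store(., "draw", v[idx])          // reverse the mapping: integer -> value
end

tab draw
```

Run as a do-file, the tabulation should show the four values with relative frequencies close to the stated probabilities. With 1200 support points and 1000 draws, k would be 1200 and the draws could be stored in the first 1000 observations only, for example with (1..1000)' as the first argument of -st_store()- and 1000 as the first argument of -rdiscrete()-.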