Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
From: Lulu Zeng <luluzengnz@gmail.com>
To: "statalist@hsphsun2.harvard.edu" <statalist@hsphsun2.harvard.edu>
Subject: Re: st: Drawing from a known, non-regular, discrete distribution
Date: Fri, 21 Feb 2014 20:09:41 +1100

Thank you so much Nick, really appreciate your help!

Best Regards,
Lulu

On Thu, Feb 20, 2014 at 9:24 PM, Nick Cox <njcoxstata@gmail.com> wrote:
> It's just subscripting.
>
>     sysuse auto
>     di mpg[1]
>     list in 1
>
> Subscripts are observation numbers.
>
> You should be familiar with the idea that subscripts can be
> expressions. A common example is
>
>     gen previous = value[_n-1]
>
> With an expression such as _n - 1 Stata works that out, observation by
> observation. If _n is 1, _n - 1 = 0. value[0] is always treated as
> missing. More straightforwardly, if _n is 2, _n - 1 is 1, and so
> forth.
>
> An expression can (easily) be a single variable.
>
>     gen foo = varname[indices]
>
> just means
>
>     foo[1] is varname[indices[1]]
>     foo[2] is varname[indices[2]]
>
> etc.
>
> Suppose
>
>     indices   varname
>           3        10
>           1        20
>           2        30
>
> then if foo is varname[indices], foo[1] is varname[indices[1]], namely
> varname[3], namely 30.
>
> One variable serves as a look-up table. That's another terminology.
>
> Nick
> njcoxstata@gmail.com
>
> On 20 February 2014 10:05, Lulu Zeng <luluzengnz@gmail.com> wrote:
>> Dear Nick,
>>
>> Thank you so much for your reply.
>>
>> The code works and seems to give me the draws I am looking for by
>> looking at the range.
>>
>> But I have trouble understanding the last line of the code (around
>> what the square brackets do): gen odo2 = odo[indices]
>>
>> I understand it generates a new variable using the original value and
>> the draws, but I am not quite sure what exactly it does. I tried to look up
>> the function of the square brackets but didn't find anything on the
>> internet.
>>
>> Could you please explain the function of the square brackets?
>>
>> Thank you for your consideration.
>>
>> Best Regards,
>> Lulu
>>
>> On Wed, Feb 19, 2014 at 11:48 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>> Something like this?
>>>
>>>     gen indices = .
>>>     mata
>>>     share = st_data(., "share")
>>>     share = share :/ sum(share)
>>>     y = rdiscrete(1000, 1, share)
>>>     st_store((1..1000)', "indices", y)
>>>     end
>>>     gen odo2 = odo[indices]
>>> Nick
>>> njcoxstata@gmail.com
>>>
>>> On 19 February 2014 09:20, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>> Dear Nick and others,
>>>>
>>>> I have 1200 observations in my dataset.
>>>>
>>>> 1200 observations (of variable "share") define the probabilities (add
>>>> up to 1) & 1200 pre-defined corresponding values to be drawn from
>>>> (saved in variable "odo").
>>>>
>>>> I am thinking of having 1000 draws in my sample.
>>>>
>>>> My data looks like below (but with more points). Draw value is
>>>> pre-defined, each of them has a probability attached.
>>>>
>>>>     Draw value   Probability
>>>>     0.5          0.15
>>>>     0.6          0.30
>>>>     0.2          0.25
>>>>     0.9          0.30
>>>>
>>>> Thank you for your consideration :)
>>>>
>>>> Best Regards,
>>>> Lulu
>>>>
>>>> On Wed, Feb 19, 2014 at 7:59 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>> My own thoughts on "Thanks in advance" are codified in the FAQ.
>>>>> Seemingly no-one agrees with me.
>>>>>
>>>>> I will pose some questions here, but given other commitments I won't
>>>>> be able to respond to any answers until _much_ later today, local
>>>>> time. If someone else picks this up before then, fine by me,
>>>>> naturally!
>>>>>
>>>>> How many observations are in your dataset?
>>>>> How many observations define the probabilities?
>>>>> How many values do you want in your sample?
>>>>>
>>>>> Nick
>>>>> njcoxstata@gmail.com
>>>>>
>>>>> On 19 February 2014 08:51, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>> Dear Nick,
>>>>>>
>>>>>> Sorry that the (1..10)' in my example was a typo, I in fact used 1200
>>>>>> instead of 10 in my real experiment. It didn't work even so. I also
>>>>>> scaled "share" before calling Mata, and the same error occurs.
>>>>>>
>>>>>> Also, by using -rdiscrete()-, I can see it draws a random number
>>>>>> according to a distribution specified by "p" (and writes the random
>>>>>> draws into "odo2" using -st_store()- in my case), but I don't
>>>>>> understand how -rdiscrete()- could draw from a given set of values
>>>>>> (e.g., a pre-specified "odo2" -- this is really what I'm trying to do)
>>>>>> instead of random values.
>>>>>>
>>>>>> My apologies if the answer to my question is straightforward, I am
>>>>>> quite new to Mata.
>>>>>>
>>>>>> Thank you very much for your help in advance Nick.
>>>>>>
>>>>>> Best Regards,
>>>>>> Lulu
>>>>>>
>>>>>> On Wed, Feb 19, 2014 at 11:54 AM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>> In my example, I have 10 probabilities in observations 1 to 10 of the
>>>>>>> data, so use
>>>>>>> (1..10)' as an argument. That will make sense for you if and only if
>>>>>>> your probabilities are the same. See also help for -st_data()-.
>>>>>>> Nick
>>>>>>> njcoxstata@gmail.com
>>>>>>>
>>>>>>> On 19 February 2014 00:09, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>> Dear Nick,
>>>>>>>>
>>>>>>>> Thank you for your suggestion. I must have done something incorrectly,
>>>>>>>> as Mata still gives me the error below even though I did use -p :/ sum(p)-
>>>>>>>> for rescaling as you suggested (I also tried to rescale the original
>>>>>>>> probability variable but neither worked):
>>>>>>>>
>>>>>>>>     sum of the probabilities must be 1
>>>>>>>>     rdiscrete(): 3300 argument out of range
>>>>>>>>     <istmt>: - function returned error
>>>>>>>>     r(3300);
>>>>>>>>
>>>>>>>> My probability variable is "share", and "odo2" is my equivalent of
>>>>>>>> your "y".
>>>>>>>> All I did was:
>>>>>>>>
>>>>>>>>     mata
>>>>>>>>
>>>>>>>>     p = st_data((1..10)', "share")
>>>>>>>>
>>>>>>>>     p :/ sum(p)
>>>>>>>>
>>>>>>>>     st_store(., "odo2", rdiscrete(st_nobs(), 1, p)) [this is where
>>>>>>>>     the error occurs]
>>>>>>>>
>>>>>>>> My apologies for coming back with the same question again.
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Lulu
>>>>>>>>
>>>>>>>> On Tue, Feb 18, 2014 at 11:37 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>>>> Here is an example of using -rdiscrete()- in Mata. In your case, the
>>>>>>>>> probabilities are already in a variable. If -rdiscrete()- chokes on
>>>>>>>>> small differences in total from 1, then check the probabilities and if
>>>>>>>>> need be scale by -p :/ sum(p)-.
>>>>>>>>>
>>>>>>>>> . clear
>>>>>>>>>
>>>>>>>>> . set obs 1000
>>>>>>>>> obs was 0, now 1000
>>>>>>>>>
>>>>>>>>> . mat p = [0.2,0.2,0.1,0.1,0.1,0.1,0.05,0.05,0.05,0.05]
>>>>>>>>>
>>>>>>>>> . gen double p = p[1,_n]
>>>>>>>>> (990 missing values generated)
>>>>>>>>>
>>>>>>>>> . list in 1/10, sep(0)
>>>>>>>>>
>>>>>>>>>      +-----+
>>>>>>>>>      |   p |
>>>>>>>>>      |-----|
>>>>>>>>>   1. |  .2 |
>>>>>>>>>   2. |  .2 |
>>>>>>>>>   3. |  .1 |
>>>>>>>>>   4. |  .1 |
>>>>>>>>>   5. |  .1 |
>>>>>>>>>   6. |  .1 |
>>>>>>>>>   7. | .05 |
>>>>>>>>>   8. | .05 |
>>>>>>>>>   9. | .05 |
>>>>>>>>>  10. | .05 |
>>>>>>>>>      +-----+
>>>>>>>>>
>>>>>>>>> . gen y = .
>>>>>>>>> (1000 missing values generated)
>>>>>>>>>
>>>>>>>>> . mata
>>>>>>>>> ------------------------------------------------- mata (type end to exit) ------------------
>>>>>>>>> : p = st_data((1..10)', "p")
>>>>>>>>>
>>>>>>>>> : st_store(., "y", rdiscrete(st_nobs(), 1, p))
>>>>>>>>>
>>>>>>>>> : end
>>>>>>>>> --------------------------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> . tab y
>>>>>>>>>
>>>>>>>>>           y |      Freq.     Percent        Cum.
>>>>>>>>> ------------+-----------------------------------
>>>>>>>>>           1 |        202       20.20       20.20
>>>>>>>>>           2 |        200       20.00       40.20
>>>>>>>>>           3 |         98        9.80       50.00
>>>>>>>>>           4 |        102       10.20       60.20
>>>>>>>>>           5 |         87        8.70       68.90
>>>>>>>>>           6 |         99        9.90       78.80
>>>>>>>>>           7 |         49        4.90       83.70
>>>>>>>>>           8 |         54        5.40       89.10
>>>>>>>>>           9 |         53        5.30       94.40
>>>>>>>>>          10 |         56        5.60      100.00
>>>>>>>>> ------------+-----------------------------------
>>>>>>>>>       Total |      1,000      100.00
>>>>>>>>> Nick
>>>>>>>>> njcoxstata@gmail.com
>>>>>>>>>
>>>>>>>>> On 18 February 2014 09:35, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>>>>> The "mapping" (if I am guessing correctly) is in fact trivial as in
>>>>>>>>>> effect your sample would just be the observation numbers.
>>>>>>>>>> Nick
>>>>>>>>>> njcoxstata@gmail.com
>>>>>>>>>>
>>>>>>>>>> On 18 February 2014 09:32, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>>>>>> Thanks for the details.
>>>>>>>>>>>
>>>>>>>>>>> The Mata function -rdiscrete()- should do most of what you want. You
>>>>>>>>>>> will need to map your values to integers 1 up and then read in the
>>>>>>>>>>> probabilities so that they are copied from a variable to a vector in
>>>>>>>>>>> Mata. Then select integers and reverse the mapping.
>>>>>>>>>>>
>>>>>>>>>>> Nick
>>>>>>>>>>> njcoxstata@gmail.com
>>>>>>>>>>>
>>>>>>>>>>> On 18 February 2014 09:17, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>>>>>> Dear Nick,
>>>>>>>>>>>>
>>>>>>>>>>>> My apologies for the unclear description.
>>>>>>>>>>>>
>>>>>>>>>>>> 1. I have 2 variables in Stata, one variable holds the 1200 known,
>>>>>>>>>>>> discrete values I want to draw; the other holds the corresponding
>>>>>>>>>>>> probabilities.
>>>>>>>>>>>>
>>>>>>>>>>>> 2. The 2 variables are associated with a parameter (attribute) of a
>>>>>>>>>>>> random utility model.
>>>>>>>>>>>> I am trying to draw from the distribution of
>>>>>>>>>>>> this parameter of interest, and then divide it by the price parameter
>>>>>>>>>>>> (which similarly has 2 associated variables too) to obtain a
>>>>>>>>>>>> distribution of willingness to pay.
>>>>>>>>>>>>
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Lulu
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 18, 2014 at 7:47 PM, Nick Cox <njcoxstata@gmail.com> wrote:
>>>>>>>>>>>>> You have not, so far as I can see, specified
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. How you are holding information on your distribution. Is it 1200
>>>>>>>>>>>>> known values with associated probabilities (so as two variables in
>>>>>>>>>>>>> Stata), or is the information still outside Stata in some form?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. What you expect to draw as a sample.
>>>>>>>>>>>>> Nick
>>>>>>>>>>>>> njcoxstata@gmail.com
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 18 February 2014 03:58, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>>>>>>>> Dear Scott,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for your response. My apologies that I am still a little
>>>>>>>>>>>>>> confused about how to do this in my case where I have 1,200
>>>>>>>>>>>>>> observations. Can I still use the cond() command without typing in each
>>>>>>>>>>>>>> point of the draw?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>> Lulu
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Feb 18, 2014 at 1:50 PM, Scott Merryman
>>>>>>>>>>>>>> <scott.merryman@gmail.com> wrote:
>>>>>>>>>>>>>>> http://www.stata.com/statalist/archive/2012-08/msg00256.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and the links within.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Scott
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 16, 2014 at 9:15 PM, Lulu Zeng <luluzengnz@gmail.com> wrote:
>>>>>>>>>>>>>>>> Dear Statalist,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am seeking help with taking draws from a known, non-regular (not
>>>>>>>>>>>>>>>> normal or lognormal etc), discrete distribution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For example, taking draws from a distribution like the one below.
>>>>>>>>>>>>>>>> However, in my case I have 1,200 points instead of the 4 points given
>>>>>>>>>>>>>>>> in the example.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>     Draw value   Probability
>>>>>>>>>>>>>>>>     0.5          0.15
>>>>>>>>>>>>>>>>     0.6          0.30
>>>>>>>>>>>>>>>>     0.2          0.25
>>>>>>>>>>>>>>>>     0.9          0.30
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The "draw value" is the value to be drawn, "probability" is the chance
>>>>>>>>>>>>>>>> that each value is drawn, so the probabilities add up to 1.
>>>>>>>>>>>>>>> *
>>>>>>>>>>>>>>> *   For searches and help try:
>>>>>>>>>>>>>>> *   http://www.stata.com/help.cgi?search
>>>>>>>>>>>>>>> *   http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>>>>>>>>>> *   http://www.ats.ucla.edu/stat/stata/
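
A minimal sketch pulling the pieces of the thread together; it is not code posted by anyone above. It assumes toy data (the four-point example from the original question stands in for the 1200-row dataset), an arbitrary seed, and that only the first four observations hold probabilities, so -st_data()- is given those rows explicitly rather than `.`:

```stata
* Sketch only: toy data standing in for the real 1200-row dataset
clear
set seed 12345                      // arbitrary seed, for reproducibility
input double odo double share
0.5 0.15
0.6 0.30
0.2 0.25
0.9 0.30
end

* make room for 1000 draws; rows 5 onwards have missing odo and share
set obs 1000
gen long indices = .

mata
// read only the rows that hold probabilities (rows 1 to 4 here)
share = st_data((1..4)', "share")
// rescale and ASSIGN, so the vector passed to rdiscrete() sums to 1
share = share :/ sum(share)
// draw 1000 row numbers; row i is drawn with probability share[i]
y = rdiscrete(1000, 1, share)
st_store((1..1000)', "indices", y)
end

* look-up by subscripting: odo2[i] is odo[indices[i]]
gen double odo2 = odo[indices]
tab odo2
```

With the real data, where "share" is filled in for all 1200 observations, -st_data(., "share")- as in Nick's reply does the same job. Note that the rescaling line has to assign its result: in Mata, -p :/ sum(p)- on its own only displays the rescaled vector and leaves p unchanged, which may be why rescaling appeared not to help earlier in the thread.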