Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: Drawing from a known, non-regular, discrete distribution
From: Lulu Zeng <[email protected]>
To: "[email protected]" <[email protected]>
Subject: Re: st: Drawing from a known, non-regular, discrete distribution
Date: Fri, 21 Feb 2014 20:09:41 +1100
Thank you so much Nick, really appreciate your help!
Best Regards,
Lulu
On Thu, Feb 20, 2014 at 9:24 PM, Nick Cox <[email protected]> wrote:
> It's just subscripting.
>
> sysuse auto
> di mpg[1]
> list in 1
>
> Subscripts are observation numbers.
>
> You should be familiar with the idea that subscripts can be
> expressions. A common example is
>
> gen previous = value[_n-1]
>
> With an expression such as _n - 1 Stata works that out, observation by
> observation. If _n is 1, _n - 1 = 0. value[0] is always treated as
> missing. More straightforwardly, if _n is 2, _n - 1 is 1, and so
> forth.
>
> An expression can (easily) be a single variable.
>
> gen foo = varname[indices]
>
> just means
>
> foo[1] is varname[indices[1]]
> foo[2] is varname[indices[2]]
>
> etc.
>
> Suppose
>
> indices  varname
>       3       10
>       1       20
>       2       30
>
> then if foo is varname[indices], foo[1] is varname[indices[1]], namely
> varname[3], namely 30.
>
> In other terminology, one variable serves as a look-up table.
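>
> A minimal sketch of the toy example above, assuming a fresh Stata
> session (the data are typed in with -input-, and the names are those
> of the example, not of any real dataset):
>
> ```stata
> clear
> input indices varname
> 3 10
> 1 20
> 2 30
> end
> * foo[i] is varname[indices[i]]
> gen foo = varname[indices]
> list, sep(0)
> ```
>
> Here foo should come out as 30, 10, 20: observation 1 looks up
> varname[3], observation 2 looks up varname[1], and observation 3 looks
> up varname[2].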
>
> Nick
> [email protected]
>
>
> On 20 February 2014 10:05, Lulu Zeng <[email protected]> wrote:
>> Dear Nick,
>>
>> Thank you so much for your reply.
>>
>> The code works and, judging by the range of the results, seems to give
>> me the draws I am looking for.
>>
>> But I have trouble understanding the last line of the code, specifically
>> what the square brackets do: gen odo2 = odo[indices]
>>
>> I understand that it generates a new variable from the original values
>> and the draws, but I am not sure exactly what it does. I tried to look
>> up the function of the square brackets but didn't find anything on the
>> internet.
>>
>> Could you please explain what the square brackets do?
>>
>> Thank you for your consideration.
>>
>> Best Regards,
>> Lulu
>>
>>
>>
>>
>>
>> On Wed, Feb 19, 2014 at 11:48 PM, Nick Cox <[email protected]> wrote:
>>> Something like this?
>>>
>>> gen indices = .
>>> mata
>>> share = st_data(., "share")
>>> share = share :/ sum(share)
>>> y = rdiscrete(1000, 1, share)
>>> st_store((1..1000)', "indices", y)
>>> end
>>> gen odo2 = odo[indices]
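>>>
>>> A self-contained sketch of the same idea, using the four-point example
>>> from the original question rather than the real 1200 values. The seed
>>> and the use of -input- are assumptions made only so that the example
>>> runs on its own; it is meant to be run from a do-file:
>>>
>>> ```stata
>>> clear
>>> input double(odo share)
>>> 0.5 0.15
>>> 0.6 0.30
>>> 0.2 0.25
>>> 0.9 0.30
>>> end
>>> * make room for 1000 draws; the seed is arbitrary
>>> set obs 1000
>>> set seed 2014
>>> gen indices = .
>>> mata:
>>> share = st_data((1..4)', "share")   // probabilities are in obs 1 to 4
>>> share = share :/ sum(share)         // guard against rounding in the total
>>> st_store(., "indices", rdiscrete(st_nobs(), 1, share))
>>> end
>>> * look up the value that belongs to each drawn index
>>> gen double odo2 = odo[indices]
>>> tab odo2
>>> ```
>>>
>>> The frequencies of odo2 should be roughly proportional to share.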
>>> Nick
>>> [email protected]
>>>
>>>
>>> On 19 February 2014 09:20, Lulu Zeng <[email protected]> wrote:
>>>> Dear Nick and others,
>>>>
>>>> I have 1200 observations in my dataset.
>>>>
>>>> All 1200 observations of the variable "share" define the probabilities
>>>> (which add up to 1), and the 1200 corresponding pre-defined values to
>>>> draw from are saved in the variable "odo".
>>>>
>>>> I am thinking of having 1000 draws in my sample.
>>>>
>>>> My data look like the example below (but with more points). Each draw
>>>> value is pre-defined and has a probability attached.
>>>>
>>>> Draw value   Probability
>>>>        0.5          0.15
>>>>        0.6          0.30
>>>>        0.2          0.25
>>>>        0.9          0.30
>>>>
>>>> Thank you for your consideration :)
>>>>
>>>>
>>>> Best Regards,
>>>> Lulu
>>>>
>>>> On Wed, Feb 19, 2014 at 7:59 PM, Nick Cox <[email protected]> wrote:
>>>>> My own thoughts on "Thanks in advance" are codified in the FAQ.
>>>>> Seemingly no-one agrees with me.
>>>>>
>>>>> I will pose some questions here, but given other commitments I won't
>>>>> be able to respond to any answers until _much_ later today, local
>>>>> time. If someone else picks this up before then, fine by me,
>>>>> naturally!
>>>>>
>>>>> How many observations are in your dataset?
>>>>> How many observations define the probabilities?
>>>>> How many values do you want in your sample?
>>>>>
>>>>> Nick
>>>>> [email protected]
>>>>>
>>>>>
>>>>>
>>>>> On 19 February 2014 08:51, Lulu Zeng <[email protected]> wrote:
>>>>>> Dear Nick,
>>>>>>
>>>>>> Sorry, the (1..10)' in my example was a typo: I in fact used 1200
>>>>>> instead of 10 in my real experiment. It still didn't work. I also
>>>>>> rescaled "share" before calling Mata, and the same error occurs.
>>>>>>
>>>>>> Also, I can see that -rdiscrete()- draws random numbers according to
>>>>>> the distribution specified by "p" (and, in my case, -st_store()-
>>>>>> writes the draws into "odo2"), but I don't understand how
>>>>>> -rdiscrete()- could draw from a given set of values (e.g., a
>>>>>> pre-specified "odo2", which is really what I'm trying to do) instead
>>>>>> of random values.
>>>>>>
>>>>>> My apologies if the answer to my question is straightforward; I am
>>>>>> quite new to Mata.
>>>>>>
>>>>>> Thank you very much for your help in advance Nick.
>>>>>>
>>>>>> Best Regards,
>>>>>> Lulu
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 19, 2014 at 11:54 AM, Nick Cox <[email protected]> wrote:
>>>>>>> In my example, I have 10 probabilities in observations 1 to 10 of the
>>>>>>> data, so I use (1..10)' as an argument. That will make sense for you
>>>>>>> if and only if your probabilities are held in the same observations.
>>>>>>> See also the help for -st_data()-.
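>>>>>>>
>>>>>>> One further point about the Mata code quoted below: typed on its own,
>>>>>>> -p :/ sum(p)- only displays the rescaled vector and leaves p
>>>>>>> unchanged, so the result has to be assigned back. A minimal sketch,
>>>>>>> in which 1200 made-up weights stand in for the real "share" and the
>>>>>>> seed is arbitrary (run from a do-file):
>>>>>>>
>>>>>>> ```stata
>>>>>>> clear
>>>>>>> set obs 1200
>>>>>>> set seed 2014
>>>>>>> * made-up positive weights that do not sum to 1
>>>>>>> gen double share = runiform()
>>>>>>> gen odo2 = .
>>>>>>> mata:
>>>>>>> p = st_data(., "share")    // read all observations, not just 1 to 10
>>>>>>> p = p :/ sum(p)            // assign the rescaled vector back to p
>>>>>>> st_store(., "odo2", rdiscrete(st_nobs(), 1, p))
>>>>>>> end
>>>>>>> summarize odo2
>>>>>>> ```
>>>>>>>
>>>>>>> This assumes "share" has no missing values; odo2 then holds
>>>>>>> observation numbers between 1 and 1200, not the values themselves.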
>>>>>>> Nick
>>>>>>> [email protected]
>>>>>>>
>>>>>>>
>>>>>>> On 19 February 2014 00:09, Lulu Zeng <[email protected]> wrote:
>>>>>>>> Dear Nick,
>>>>>>>>
>>>>>>>> Thank you for your suggestion. I must have done something incorrectly,
>>>>>>>> because Mata still gives me the error below even though I used
>>>>>>>> -p :/ sum(p)- for rescaling as you suggested (I also tried rescaling
>>>>>>>> the original probability variable, but neither worked):
>>>>>>>>
>>>>>>>> sum of the probabilities must be 1
>>>>>>>> rdiscrete(): 3300 argument out of range
>>>>>>>> <istmt>: - function returned error
>>>>>>>> r(3300);
>>>>>>>>
>>>>>>>>
>>>>>>>> My probability variable is "share", and "odo2" is my equivalent of
>>>>>>>> your "y". All I did was:
>>>>>>>>
>>>>>>>> mata
>>>>>>>>
>>>>>>>> p = st_data((1..10)', "share")
>>>>>>>>
>>>>>>>> p :/ sum(p)
>>>>>>>>
>>>>>>>> st_store(., "odo2", rdiscrete(st_nobs(), 1, p)) [this is where
>>>>>>>> the error occurs]
>>>>>>>>
>>>>>>>>
>>>>>>>> My apologies for coming back with the same question again.
>>>>>>>>
>>>>>>>>
>>>>>>>> Best Regards,
>>>>>>>> Lulu
>>>>>>>>
>>>>>>>> On Tue, Feb 18, 2014 at 11:37 PM, Nick Cox <[email protected]> wrote:
>>>>>>>>> Here is an example of using -rdiscrete()- in Mata. In your case, the
>>>>>>>>> probabilities are already in a variable. If -rdiscrete()- chokes on
>>>>>>>>> small differences in total from 1, then check the probabilities and if
>>>>>>>>> need be scale by -p :/ sum(p)-.
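>>>>>>>>>
>>>>>>>>> A quick way to check the total before calling -rdiscrete()-, sketched
>>>>>>>>> here on the ten probabilities of the example that follows (the
>>>>>>>>> display format is an arbitrary choice):
>>>>>>>>>
>>>>>>>>> ```stata
>>>>>>>>> clear
>>>>>>>>> set obs 10
>>>>>>>>> mat p = (0.2,0.2,0.1,0.1,0.1,0.1,0.05,0.05,0.05,0.05)
>>>>>>>>> gen double p = p[1,_n]
>>>>>>>>> * r(sum) holds the total of the probabilities
>>>>>>>>> summarize p, meanonly
>>>>>>>>> display %21.18f r(sum)
>>>>>>>>> * scaled difference from 1; tiny values are just rounding
>>>>>>>>> display reldif(r(sum), 1)
>>>>>>>>> ```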
>>>>>>>>>
>>>>>>>>> . clear
>>>>>>>>>
>>>>>>>>> . set obs 1000
>>>>>>>>> obs was 0, now 1000
>>>>>>>>>
>>>>>>>>> . mat p = [0.2,0.2,0.1,0.1,0.1,0.1,0.05,0.05,0.05,0.05]
>>>>>>>>>
>>>>>>>>> . gen double p = p[1,_n]
>>>>>>>>> (990 missing values generated)
>>>>>>>>>
>>>>>>>>> . list in 1/10, sep(0)
>>>>>>>>>
>>>>>>>>>      +-----+
>>>>>>>>>      |   p |
>>>>>>>>>      |-----|
>>>>>>>>>   1. |  .2 |
>>>>>>>>>   2. |  .2 |
>>>>>>>>>   3. |  .1 |
>>>>>>>>>   4. |  .1 |
>>>>>>>>>   5. |  .1 |
>>>>>>>>>   6. |  .1 |
>>>>>>>>>   7. | .05 |
>>>>>>>>>   8. | .05 |
>>>>>>>>>   9. | .05 |
>>>>>>>>>  10. | .05 |
>>>>>>>>>      +-----+
>>>>>>>>>
>>>>>>>>> . gen y = .
>>>>>>>>> (1000 missing values generated)
>>>>>>>>>
>>>>>>>>> . mata
>>>>>>>>> ------------------------------------------------- mata (type end to exit) ------------------
>>>>>>>>> : p = st_data((1..10)', "p")
>>>>>>>>>
>>>>>>>>> : st_store(., "y", rdiscrete(st_nobs(), 1, p))
>>>>>>>>>
>>>>>>>>> : end
>>>>>>>>> --------------------------------------------------------------------------------------------
>>>>>>>>>
>>>>>>>>> . tab y
>>>>>>>>>
>>>>>>>>>           y |      Freq.     Percent        Cum.
>>>>>>>>> ------------+-----------------------------------
>>>>>>>>>           1 |        202       20.20       20.20
>>>>>>>>>           2 |        200       20.00       40.20
>>>>>>>>>           3 |         98        9.80       50.00
>>>>>>>>>           4 |        102       10.20       60.20
>>>>>>>>>           5 |         87        8.70       68.90
>>>>>>>>>           6 |         99        9.90       78.80
>>>>>>>>>           7 |         49        4.90       83.70
>>>>>>>>>           8 |         54        5.40       89.10
>>>>>>>>>           9 |         53        5.30       94.40
>>>>>>>>>          10 |         56        5.60      100.00
>>>>>>>>> ------------+-----------------------------------
>>>>>>>>>       Total |      1,000      100.00
>>>>>>>>> Nick
>>>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 18 February 2014 09:35, Nick Cox <[email protected]> wrote:
>>>>>>>>>> The "mapping" (if I am guessing correctly) is in fact trivial as in
>>>>>>>>>> effect your sample would just be the observation numbers.
>>>>>>>>>> Nick
>>>>>>>>>> [email protected]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 18 February 2014 09:32, Nick Cox <[email protected]> wrote:
>>>>>>>>>>> Thanks for the details.
>>>>>>>>>>>
>>>>>>>>>>> The Mata function -rdiscrete()- should do most of what you want. You
>>>>>>>>>>> will need to map your values to integers 1 up and then read in the
>>>>>>>>>>> probabilities so that they are copied from a variable to a vector in
>>>>>>>>>>> Mata. Then select integers and reverse the mapping.
>>>>>>>>>>>
>>>>>>>>>>> Nick
>>>>>>>>>>> [email protected]
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 18 February 2014 09:17, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>>>> Dear Nick,
>>>>>>>>>>>>
>>>>>>>>>>>> My apologies for the unclear description.
>>>>>>>>>>>>
>>>>>>>>>>>> 1. I have 2 variables in Stata, one variable holds the 1200 known,
>>>>>>>>>>>> discrete values I want to draw; the other holds the corresponding
>>>>>>>>>>>> probabilities.
>>>>>>>>>>>>
>>>>>>>>>>>> 2. The 2 variables are associated with a parameter (attribute) of a
>>>>>>>>>>>> random utility model. I am trying to draw from the distribution of
>>>>>>>>>>>> this parameter of interest, and then divide it by the price parameter
>>>>>>>>>>>> (which similarly has 2 associated variables too) to obtain a
>>>>>>>>>>>> distribution of willingness to pay.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>> Lulu
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Feb 18, 2014 at 7:47 PM, Nick Cox <[email protected]> wrote:
>>>>>>>>>>>>> You have not, so far as I can see, specified
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. How you are holding information on your distribution. Is it 1200
>>>>>>>>>>>>> known values with associated probabilities (so as two variables in
>>>>>>>>>>>>> Stata), or is the information still outside Stata in some form?
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. What you expect to draw as a sample.
>>>>>>>>>>>>> Nick
>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 18 February 2014 03:58, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>>>>>> Dear Scott,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you for your response. My apologies, but I am still a little
>>>>>>>>>>>>>> confused about how to do this in my case, where I have 1,200
>>>>>>>>>>>>>> observations. Can I still use the cond() function without typing in
>>>>>>>>>>>>>> each point of the draw?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best Regards,
>>>>>>>>>>>>>> Lulu
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Feb 18, 2014 at 1:50 PM, Scott Merryman
>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>> http://www.stata.com/statalist/archive/2012-08/msg00256.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and the links within.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Scott
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 16, 2014 at 9:15 PM, Lulu Zeng <[email protected]> wrote:
>>>>>>>>>>>>>>>> Dear Statalist,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I am seeking help with taking draws from a known, non-regular (not
>>>>>>>>>>>>>>>> normal or lognormal etc), discrete distribution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For example, taking draws from a distribution like the one below.
>>>>>>>>>>>>>>>> However, in my case I have 1,200 points instead of the 4 points given
>>>>>>>>>>>>>>>> in the example.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Draw value   Probability
>>>>>>>>>>>>>>>>        0.5          0.15
>>>>>>>>>>>>>>>>        0.6          0.30
>>>>>>>>>>>>>>>>        0.2          0.25
>>>>>>>>>>>>>>>>        0.9          0.30
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The "draw value" is the value to be drawn, and "probability" is the
>>>>>>>>>>>>>>>> chance that each value is drawn, so the probabilities add up to 1.
>>>>>>>>>>>>>>> *
>>>>>>>>>>>>>>> * For searches and help try:
>>>>>>>>>>>>>>> * http://www.stata.com/help.cgi?search
>>>>>>>>>>>>>>> * http://www.stata.com/support/faqs/resources/statalist-faq/
>>>>>>>>>>>>>>> * http://www.ats.ucla.edu/stat/stata/