Re: st: Trying to simulate sampling distribution of mean
From: Nick Cox <[email protected]>
To: [email protected]
Subject: Re: st: Trying to simulate sampling distribution of mean
Date: Wed, 30 Jan 2013 01:30:13 +0000

If you want to do it this way, you can simplify your program:

program ybar
    qui use big.dta, clear
    sample 60, count
    su age, meanonly
end

I think that should still work. -syntax- does nothing for you.
-summarize- leaves r(mean) in its wake anyway. Copying a variable into
another variable and copying a saved result into another saved result
can both be excised.
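
A minimal sketch of how the simplified program might be called, assuming
big.dta sits in the working directory and contains the variable age. The
result name mean_age is illustrative; the file name meandata and the seed
value are borrowed from earlier in the thread. That r(mean) is still
visible to -simulate- after a non-rclass program ends is the same
assumption as "should still work" above, so check the listed results
rather than taking them on trust.

```stata
* Sketch only: assumes big.dta (one variable, age) is in the working directory.
capture program drop ybar          // allow the do-file to be rerun

program ybar
    quietly use big.dta, clear     // reload the full dataset on every call
    sample 60, count               // keep a random sample of 60 observations
    summarize age, meanonly        // leaves the sample mean in r(mean)
end

set seed 2803                      // illustrative seed, for reproducibility
* Ask -simulate- for r(mean) directly; there is no r(my) any more.
simulate mean_age = r(mean), reps(5) nodots saving(meandata, replace): ybar
list mean_age                      // 5 observations, one mean per replication
```

Because -sample- now runs inside -ybar-, each replication draws a fresh
sample, which is what the original code was missing.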
Nick
On Wed, Jan 30, 2013 at 12:04 AM, krishanu karmakar
<[email protected]> wrote:
> Thank you Dr. Cox,
>
> I did a little bit more searching and with the help of your answer I
> modified my -ybar- program as follows
>
> -----------------------------
> program define ybar, rclass
> syntax [,]
> qui use big.dta, clear
> sample 60, count
> gen y1 = age
> summ y1
> return scalar my = r(mean)
> end
>
> local reps 5
> simulate rmy=r(my), saving(sdistmean`i', replace) nodots reps(`reps'): ybar
> -----------------------------------
> Yes, I should probably put the -use- command as an option to the
> -ybar- program to make it more generally usable. But, otherwise, it is
> now working as I wanted it to.
>
> Thank you again.
> Krishanu
>
>
> On Tue, Jan 29, 2013 at 6:51 PM, Nick Cox <[email protected]> wrote:
>> Your program -ybar- does exactly the same thing every time, so
>> inevitably the results are the same. If you look again at the help for
>> -simulate- you will see that the example program -lnsim- includes its
>> own random variate generation. Conversely, you do use -sample 0.1- but
>> you use it outside your program.
>>
>> Otherwise put, -simulate- does not actually do stochastic simulation;
>> it is just a framework that runs and collates the results of a program
>> you write -- and that program must do the simulation.
>>
>> In your case, there is an easy way of getting random samples from your
>> dataset. Just chop the dataset into blocks randomly and summarize each
>> block.
>>
>> If you shuffle your data
>>
>> set seed 2803
>> gen random = runiform()
>> sort random
>>
>> and create blocks of size 100
>>
>> gen block = ceil(_n/100)
>>
>> then
>>
>> egen mean = mean(age), by(block)
>> egen tag = tag(block)
>> l mean if tag
>>
>> that will give you 1000 means, one for each block of size 100. For some
>> reason, it seems that you only want 5, and that means you can throw
>> 995 away.
>>
>> Nick
>>
>> On Tue, Jan 29, 2013 at 11:15 PM, krishanu karmakar
>> <[email protected]> wrote:
>>
>>> The following is my code
>>>
>>> ==== code start =====
>>>
>>> program define ybar, rclass
>>> syntax [,]
>>> replace y1 = y2
>>> summarize y1
>>> return scalar m_y = r(mean)
>>> end
>>>
>>>
>>> local reps 5
>>>
>>> quietly use big.dta, clear
>>> generate y2 = age
>>> sample 0.1
>>>
>>> quietly{
>>> gen y1=.
>>> simulate m_age=r(m_y), saving(meandata, replace) nodots reps(`reps'): ybar
>>> }
>>>
>>> ==== code ends =====
>>>
>>> What I am trying to do:
>>> I have a dataset named "big.dta" with 100,000 observations. The only
>>> variable in this dataset is "age".
>>>
>>> I want to first draw a sample of size 100 from this dataset and
>>> calculate the mean for the variable "age". I want to draw 5 such
>>> samples and store the mean of "age" from each sample as the variable
>>> "m_age" in a new dataset called "meandata". So this dataset will have
>>> 5 observations.
>>>
>>> My code is running, but wrongly. Stata saves "meandata", but all
>>> five observations (the means of age from 5 different samples) are
>>> equal in value. That means Stata is not drawing 5 different
>>> samples, but only one sample. Could anyone help by showing which
>>> line of my code I should change?