Stata: Data Analysis and Statistical Software

Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.

Re: st: Trying to simulate sampling distribution of mean

From	krishanu karmakar <[email protected]>
To	[email protected]
Subject	Re: st: Trying to simulate sampling distribution of mean
Date	Tue, 29 Jan 2013 19:04:50 -0500

Thank you Dr. Cox,

I did a little more searching and, with the help of your answer, I
modified my -ybar- program as follows:

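-----------------------------
program define ybar, rclass
	syntax [,]
	* reload the full dataset so each replication starts from all observations
	quietly use big.dta, clear
	* draw a fresh random sample of 60 observations, without replacement
	sample 60, count
	summarize age
	* hand the sample mean back to -simulate-
	return scalar my = r(mean)
end

local reps 5
simulate rmy=r(my), saving(sdistmean, replace) nodots reps(`reps'): ybar
-----------------------------------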
Yes, I should probably make the -use- command an option of the
-ybar- program so that it is more generally usable. Otherwise, it is
now working as I wanted it to.
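
A minimal sketch of that more general version, under stated assumptions: the program name -ybar2-, the -using- clause, and the -size()- option are hypothetical and not part of the code above, and the toy dataset merely stands in for big.dta with made-up ages.

```stata
* Toy stand-in for big.dta; the ages are made up for illustration
clear
set seed 2803
set obs 100000
generate age = floor(18 + 60*runiform())
tempfile big
save "`big'"

* Hypothetical variant of -ybar-: the dataset and sample size are arguments
capture program drop ybar2
program define ybar2, rclass
	syntax using/ [, size(integer 100)]
	* reload the full dataset on every call
	use "`using'", clear
	* draw the requested number of observations without replacement
	sample `size', count
	summarize age, meanonly
	return scalar my = r(mean)
end

* Five replications; the five means are left in memory as variable rmy
simulate rmy = r(my), reps(5) nodots: ybar2 using "`big'", size(60)
summarize rmy
```

With the real data, the call would presumably become -ybar2 using big.dta, size(60)-. The variable name age is still hard-coded here; passing it as an option as well would be a further step.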

Thank you again.
Krishanu


On Tue, Jan 29, 2013 at 6:51 PM, Nick Cox <[email protected]> wrote:
> Your program -ybar- does exactly the same thing every time, so
> inevitably the results are the same. If you look again at the help for
> -simulate- you will see that the example program -lnsim- includes its
> own random variate generation. Conversely, you do use -sample 0.1- but
> you use it outside your program.
>
> Otherwise put, -simulate- does not actually do stochastic simulation;
> it is just a framework that runs and collates the results of a program
> you write -- and that program must do the simulation.
>
> In your case, there is an easy way of getting random samples from your
> dataset. Just chop the dataset into blocks randomly and summarize each
> block.
>
> If you shuffle your data
>
> set seed 2803
> gen random = runiform()
> sort random
>
> and create blocks of size 100
>
> gen block = ceil(_n/100)
>
> then
>
> egen mean = mean(age), by(block)
> egen tag = tag(block)
> l mean if tag
>
> That will give you 1000 means, one for each block of size 100. For
> some reason, it seems that you only want 5, and that means you can
> throw 995 away.
>
> Nick
>
> On Tue, Jan 29, 2013 at 11:15 PM, krishanu karmakar
> <[email protected]> wrote:
>
>> The following is my code
>>
>> ==== code start =====
>>
>> program define ybar, rclass
>>         syntax [,]
>>         replace y1 = y2
>>         summarize y1
>>         return scalar m_y = r(mean)
>> end
>>
>>
>> local reps 5
>>
>>         quietly use big.dta, clear
>>         generate y2 = age
>>         sample 0.1
>>
>>         quietly{
>>         gen y1=.
>>         simulate m_age=r(m_y), saving(meandata, replace) nodots reps(`reps'): ybar
>> }
>>
>> ==== code ends =====
>>
>> What I am trying to do.
>> I have a dataset named "big.dta" with 100,000 observations. The only
>> variable in this dataset is "age".
>>
>> I want to first draw a sample of size 100 from this dataset and
>> calculate the mean for the variable "age". I want to draw 5 such
>> samples and store the mean of "age" from each sample as the variable
>> "m_age" in a new dataset called "meandata". So this dataset will have
>> 5 observations.
>>
>> My code runs, but wrongly. Stata saves "meandata", but all five
>> observations (the mean of age from 5 different samples) are stored
>> with the same value. That means Stata is not drawing 5 different
>> samples, but only one. Could anyone help by showing which line of
>> my code I should change?
> *
> *   For searches and help try:
> *   http://www.stata.com/help.cgi?search
> *   http://www.stata.com/support/faqs/resources/statalist-faq/
> *   http://www.ats.ucla.edu/stat/stata/



-- 
Read it: http://www.stata.com/support/faqs/res/statalist.html
Especially Question 3.

