Statalist


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

st: AW: R: RE: odd results after insample


From   "Martin Weiss" <[email protected]>
To   <[email protected]>
Subject   st: AW: R: RE: odd results after insample
Date   Sat, 26 Sep 2009 19:50:30 +0200

<> 

So if you wanted the entire figure 1 under Stata -version- 9.2, you would
probably want to install Bobby`s -findit gendist- and then:


*************
clear*
vers 9.2

//lognormal part
capt prog drop myprog

prog def myprog, rclass
vers 9.2
	syntax newvarname(numeric max=1), [obs(integer 100) cov(real 2)]
	set obs `obs'
	loc sd = sqrt(log(`cov'^2+1))
	loc mean = log(1000)-.5*`sd'^2
	gen `varlist'=exp(invnormal(uniform())*`sd'+`mean')
	qui su `varlist'
	ret sca mean=r(mean)
	ret sca cv=r(sd)/r(mean)
end


loc gra 
loc j 1

//for sample size 2000
foreach cv in 0.25 0.5 1 1.5 2{
	myprog lognorm`j', obs(2000) cov(`cv')
	loc gra `gra' (kdensity lognorm`j' if lognorm`j'<3000)
	loc ++j
} 

//see the mean and coeff of variation
tabstat _all, stat(mean cv sd)

tw `gra', legend(off) nodraw /* 
*/ name(lognormal, replace)

//gamma part
capt prog drop mynewprog

prog def mynewprog, rclass
vers 9.2
syntax newvarname(numeric max=1) [,obs(integer 100) cov(real 2)]
set obs `obs'
	loc shape = (`cov')^(-2)
	loc scale = 1000/`shape'
	gengamma `varlist', alpha(`shape') beta(`scale')
	qui su `varlist'
	ret sca mean=r(mean)
	ret sca cv=r(sd)/r(mean)
end

loc gra 
loc j 1

//for sample size 2000
foreach cv in 0.25 0.5 1 1.5 2{
	mynewprog gamma`j', obs(2000) cov(`cv')
	loc gra `gra' (kdensity gamma`j' if gamma`j'<3000)
	loc ++j
} 

//see the mean and coeff of variation
tabstat _all, stat(mean cv sd)

tw `gra', legend(off) /* nodraw 
*/ name(gamma, replace)

//combine 'em
gr combine lognormal gamma, /* 
*/ cols(1)
*************



HTH
Martin


-----Ursprüngliche Nachricht-----
Von: [email protected]
[mailto:[email protected]] Im Auftrag von Carlo Lazzaro
Gesendet: Samstag, 26. September 2009 16:20
An: [email protected]
Cc: 'Martin Weiss'
Betreff: st: R: RE: odd results after insample

Dear Martin,
thanks a lot for your kind reply.

The approach sketched in my previous message follows the one suggested by:
Briggs, A. and Nixon, R. and Dixon, S. and Thompson, S. (2005) Parametric
modelling of cost data: some simulation evidence. Health Economics 14(4):pp.
421-428.

So far, I have been quite successful with other Stata procedures for drawing
random samples from a given distribution (for instance, -simulate-),
including the approach you kindly advice me about.

Unfortunately, I cannot figure out what went wrong with this last do_file.

Thanks a lot again and enjoy your W_E.

Kind Regards,
Carlo
-----Messaggio originale-----
Da: [email protected]
[mailto:[email protected]] Per conto di Martin Weiss
Inviato: sabato 26 settembre 2009 15.21
A: [email protected]
Oggetto: st: RE: odd results after insample


<>

Just out of curiosity: If you want 20 obs per sample, and 2,000 samples,
should that not lead to 40,000 observations overall?  


HTH
Martin

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Carlo Lazzaro
Sent: Samstag, 26. September 2009 15:00
To: [email protected]
Subject: st: odd results after insample

Dear Statalisters,
as an alternative to - simulate - , I have written the following do file
(for Stata 9.2/SE) to draw 2000 random samples, 20 observations each, from a
normal distribution: 

drop _all
set more off
set obs 2000
obs was 0, now 2000
g double ln_g_20=.
g double ln_sd_g_20=.
set seed 999
qui gen A=5.37 + 1.19*invnorm(uniform()) in 1/972
qui forvalues i = 1(1)2000 {
qui gen ln_20`i'=A
qui generate random`i' = uniform() 
qui sort random`i'
qui generate insample`i' = _n <= 20
qui sum ln_20`i' if insample`i' == 1
replace ln_g_20=r(mean)  in `i'
replace ln_sd_g_20=r(sd) in `i'
drop ln_20`i'
drop random`i' 
drop insample`i'
}
drop A

However, as a result I have obtained 1721 observations instead of the
expected 2000. 

sum ln_g_20 ln_sd_g_20

Variable |       Obs        Mean    Std. Dev.       Min        Max
-------------+--------------------------------------------------------
     ln_g_20 |      1271    5.314033    .3800687    3.79247   6.587941
  ln_sd_g_20 |      1271    1.101084    .2835007   .0260279   2.161299


Besides, results are even more puzzling when I increase the number of
samples (again 20 observations each), in that I get a different number of
observation for ln_g and ln_sd_g.

Comments are gratefully acknowledged.

Thanks a lot for your kindness and for your time.

Kind Regards,
Carlo


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/


*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/



© Copyright 1996–2025 StataCorp LLC   |   Terms of use   |   Privacy   |   Contact us   |   What's new   |   Site index