
Re: st: Re: xtlogit and logistic-cluster (REVISED)

From	Ricardo Ovaldia <[email protected]>
To	[email protected]
Subject	Re: st: Re: xtlogit and logistic-cluster (REVISED)
Date	Thu, 12 Aug 2004 05:31:39 -0700 (PDT)

Thank you Joseph. I appreciate your assistance very
much. Thank you not only for your valuable comments,
but also for your patience.

Ricardo.

--- Joseph Coveney <[email protected]> wrote:

> Ricardo Ovaldia wrote:
> 
> > I am a bit baffled by the assertion that 50 clusters and 410
> > observations is a small sample size. I know it is not big, but I
> > would not consider it small either.
> 
> Whether 50 clusters and 410 total observations is small or not depends
> upon the task.  Advocating caution to assure that the sample size is
> adequate for the intended purpose is not asserting that a particular
> sample size is small.  For population-average GEE, which is sensitive
> to the number of clusters, rules of thumb for sample size for ranges
> of predictors are given in M. E. Stokes, C. S. Davis & G. G. Koch,
> _Categorical Data Analysis Using the SAS System_, Second Edition (Cary,
> N. Carolina: SAS Institute, 2000), p. 479.  If you have many candidate
> predictors among those for patients and physicians, my guess is that
> the authors would say that 50 clusters is pretty dicey.
> 
> I don't recall having recently run across any corresponding guidance
> for random-effects logistic regression, which depends more upon
> within-cluster correlation and total observations.  Can -simulate- tell
> you about the adequacy of the sample size for your purposes (e.g., for
> confidence interval coverage) in your particular dataset with the
> parameters set at their estimates?  Generating a correlated binary
> variate to match the observed rho is tough, but you might be able to
> get reasonably close.  If you're satisfied with the results of the
> simulation for the model's intended use, then the sample size is not
> too small.
> 
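A minimal sketch of the kind of coverage check suggested above, written
in the same Stata 8 style as the simulation further down. The intercept
(-0.5), the log odds ratio for trt (0.7) and the random-intercept
standard deviation (2) are hypothetical placeholders, not estimates from
any dataset; they would be replaced with the estimates from the fitted
model. The program name xtlogitcover is likewise made up for
illustration.

```stata
* Hypothetical true values: replace with the estimates from your own model
program define xtlogitcover, rclass
    version 8.2
    drop _all
    set obs 50
    generate byte pid = _n
    generate byte trt = _n > _N / 2
    * cluster-level random intercept with standard deviation 2 (assumed)
    generate double u = 2 * invnorm(uniform())
    * six observations per cluster
    expand 6
    * linear predictor: intercept -0.5, log odds ratio 0.7 for trt (assumed)
    generate double xb = -0.5 + 0.7 * trt + u
    generate byte dep = uniform() < 1 / (1 + exp(-xb))
    xtlogit dep trt, i(pid) re
    return scalar b = _b[trt]
    return scalar se = _se[trt]
end
*
simulate "xtlogitcover" b = r(b) se = r(se), reps(1000)
* does the nominal 95% Wald interval contain the assumed true value?
generate byte cover = abs(b - 0.7) <= invnorm(0.975) * se
replace cover = . if b >= . | se >= .
summarize cover
```

The mean of cover is the estimated coverage of the nominal 95% interval
for trt; a value well below 0.95 would suggest the sample is too small
for interval estimation under the assumed parameters.
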
> In a simple-minded illustration below, with a sample size of 50
> clusters, a uniform length (cluster size) of six observations and a
> moderate-to-high within-cluster correlation (rho is about 80% or so),
> the test size was 11.5% at the nominal 5% level of Type 1 error rate.
> That's more than double the nominal, and if the purpose is hypothesis
> testing, then the sample size would be considered small, too small
> given the nature of the data and the objective.  This improves, of
> course, when there is no within-cluster correlation--in the simple
> example below it reduces to 6.7%, which is still substantially larger
> than nominal.  But if this isn't critical for the objective, then the
> sample would not necessarily be considered small.
> 
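To judge how far those simulated test sizes are from nominal, it helps
to attach a Monte Carlo standard error to them. A quick check, assuming
the 1000 replications are independent and using the usual binomial
formula (this calculation is an addition, not part of the original
simulation):

```stata
* Monte Carlo standard error of a rejection rate when the true size is 5%
display sqrt(0.05 * 0.95 / 1000)
* distance of each observed size from nominal, in standard-error units
display (0.115 - 0.05) / sqrt(0.05 * 0.95 / 1000)
display (0.067 - 0.05) / sqrt(0.05 * 0.95 / 1000)
```

The standard error is about 0.007, so 11.5% lies roughly 9 standard
errors above nominal and 6.7% roughly 2.5.
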
> > The question posed in this phase of analysis is rather simple: Which
> > physician and patient characteristics are important in predicting
> > patient referral?
> 
> Have you considered coupling modeling with graphical analysis at this
> phase?  Strength and nature of the relationships observed graphically
> could be combined with knowledge of the subject matter to judge
> importance of predictors.  Plots could be made of observations or of
> predictions from models after holding one or more covariates at
> reference values.  If your audience doesn't feel comfortable judging
> the strength or importance of the relationship based upon what they
> can see by graphical presentation, then numerical description of the
> predictions can be done either with summary statistics (including
> tabulations) or by a model, perhaps with standardized coefficients if
> that makes it easier for your audience.  For the next phase, the model
> can be made parsimonious based upon what's observed in the plots or
> what's judged unimportant in earlier stages of exploration.  It might
> be beneficial to use two models to describe your observations:  one, a
> conditional logistic regression with physicians as groups, to describe
> patient characteristics that predict referral; the other, a count
> model, to describe physician characteristics that predict referral
> rates.
> 
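The two-model idea above might look something like the following
sketch. The data are simulated toy values and every variable name
(physid, yrsprac, age, referred, nref, npat) is hypothetical, chosen
only so that the commands run; a Poisson model stands in for "a count
model", which the message does not specify further.

```stata
* Toy data: 50 physicians with 8 patients each (all values simulated)
clear
set seed 12345
set obs 50
generate byte physid = _n
generate yrsprac = 1 + int(30 * uniform())
expand 8
generate age = 20 + int(60 * uniform())
generate byte referred = uniform() < 0.3
*
* Model 1: patient characteristics, conditioning on physician
clogit referred age, group(physid)
*
* Model 2: physician characteristics and referral counts
collapse (sum) nref = referred (count) npat = referred (mean) yrsprac, by(physid)
poisson nref yrsprac, exposure(npat)
```

Physician-level covariates are constant within group and so cannot be
estimated in the conditional model, which is why they are moved to the
second, physician-level model.
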
> Joseph Coveney
> 
> ----------------------------------------------------------------------
> 
> clear
> set more off
> set seed 20040809
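> * Build a 6 x 6 exchangeable correlation matrix (0.8 off the diagonal)
> set obs 6
> forvalues i = 1/6 {
>     generate float rho`i' = 0.8
>     replace rho`i' = 1 in `i'
> }
> mkmat rho*, matrix(A)
> *
> * One replication: correlated binary outcomes, no true effects
> program define xtlogitsimc, rclass
>     version 8.2
>     drawnorm dep1 dep2 dep3 dep4 dep5 dep6, corr(A) n(50) clear
>     generate byte pid = _n
>     generate byte trt = _n > _N / 2
>     reshape long dep, i(pid) j(tim)
>     * dichotomize the latent normal variates at zero
>     replace dep = dep > 0
>     compress
>     xi: xtlogit dep trt i.tim, i(pid) re
>     estimates store A
>     xtlogit dep, i(pid) re
>     estimates store B
>     * likelihood-ratio test of the full model against the null model
>     lrtest A B
>     return scalar p = r(p)
> end
> *
> simulate "xtlogitsimc" p = r(p), reps(1000)
> * proportion of replications rejecting at the 5% level (the test size)
> generate byte pos = p < 0.05
> replace pos = . if p >= .
> summarize pos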
> *
> *
> *
> program define xtlogitsimi, rclass
>     version 8.2
>     replace dep = uniform() > 0.5
>     xi: xtlogit dep trt i.tim, i(pid) re
>     estimates store A
>     xtlogit dep, i(pid) re
>     estimates store B
>     lrtest A B
>     return scalar p = r(p)
>     estimates drop _all
> end
> *
> clear
> set obs 50
> generate byte pid = _n
> generate byte trt = _n > _N / 2
> forvalues i = 1/6 {
>     generate byte dep`i' = .
> }
> reshape long dep, i(pid) j(tim)
> simulate "xtlogitsimi" p = r(p), reps(1000)
> generate byte pos = p < 0.05
> replace pos = . if p >= .
> summarize pos
> exit
> 
> 
> 
> 
> 
> *
> *   For searches and help try:
> *   http://www.stata.com/support/faqs/res/findit.html
> *   http://www.stata.com/support/statalist/faq
> *   http://www.ats.ucla.edu/stat/stata/
> 


=====
Ricardo Ovaldia, MS
Statistician 
Oklahoma City, OK


	
		

