Dale Steele posted:
-------------------------------------------------------------------------------
I have a dataset containing information on 66 different subjects. Each
subject is presented with a series of 36 stimuli (including zero
magnitude) in random order. Their response to each stimulus is coded as
0 - not detected, 1-detected. Each subject is presumed to have a
"threshold" magnitude at which the stimulus can be detected. My goal is
to estimate that threshold (and its variance) for each subject.
I have been thinking of the threshold as the predicted stimulus magnitude
for which the probability of detection is 50%. My naive initial approach
was to run 66 separate logistic regression models. Is there a better
way? Thanks!
--------------------------------------------------------------------------------
I believe that there is a body of psychometrics literature dealing with this
kind of problem. -findit- or google for item response theory (IRT) or Rasch
model might provide an entrypoint.
Of interest is the FAQ written by Jeroen Weesie on Stata Corp's website,
www.stata.com/support/faqs/stat/rasch.html . Quoting from that, "Another
purpose of a Rasch analysis is to estimate the subject parameter eta. In the
fixed-effects approach, the etas are commonly estimated by maximum likelihood
conditional on the CLM theta-estimates. For the random-effects case, the etas
are commonly estimated by posterior means." CML (conditional maximum
likelihood), here, is referring to -xtlogit , fe-. I believe that -gllamm- / -
gllapred- will provide the corresponding random-effects estimator of each
subject's threshold.
There is also a body of literature in psychophysics dealing with assessing
stimulus-detection thresholds; Stata's commands that allow estimating receiver
operating characteristic (ROC) functions might also be of interest.
The impetus to perform individual logistic regressions for each subject is in
the same spirit that was expressed in Stata Corp's admonishment against
unconditional fixed-effects ordered probit a couple of weeks ago on the list.
They recommended avoiding unconditional fixed-effects nonlinear regression
unless you feel comfortable with estimating each panel separately.
At the risk of getting trounced on the list twice in as many weeks for the
same thing, I'll mention unconditional fixed-effects probit as an alternative
in Dale's case. In general, unconditional maximum likelihood estimators for
fixed-effects nonlinear (and linear) models cannot provide consistent
estimators for the subject-specific intercept terms, so these coefficients
(and, in nonlinear models, other parameter estimates as well) will have at
least some bias. For this reason, fixed-effects logit, probit, ordered
regression models and so on are avoided, in general. But for some practical
applications, the situation is not always so dismal as the received wisdom
would lead us to believe--Prof. William Greene's website at New York
University's Stern School is an excellent source of information on this topic.
As an illustration, I've provided a quick-and-dirty Monte Carlo simulation of a
probit-parameterized model of Dale's situation, with 70% intraclass correlation
for the threshold latent variable. If I've got things correctly specified (a
big if), then the bias in individual-subject estimates of threshold is in the
neighborhood of 5% with an unconditional fixed-effects probit model. If this
magnitude of bias is acceptible in practice for Dale's purposes, then
unconditional nonlinear regression represents a viable alternative with this
sample size.
In addition, there are approaches to ameliorate such bias, such as the
jackknife (Hahn and Whitney, 2003), which in my simulations with fixed-effects
ordered probit works quite well when panel depths are at least five or eight,
even using Professor Greene's challenging specification for the fixed-effects
ordered probit model in his numerical study (Greene, 2002). Note that these
simulations take a long time when the sample size is in the hundreds--there is
a method for improving efficiency in fixed-effects nonlinear regressions with
dummy variables for individual subjects that is described in documents on
Professor Greene's website, but I cannot find where Stata has implemented it.
Greene, William (2002 February), The behavior of the fixed effects estimator in
nonlinear models. Unpublished; available on his website,
www.stern.nyu.edu/~wgreene, as document EC-02-05Greene.pdf.
Hahn, Jinyong, and Newey, Whitney (2003 July), Jackknife and analytical bias
reduction for nonlinear panel models. Available at
http://econ-www.mit.edu/faculty/?prof_id=wnewey&type=paper.
Joseph Coveney
--------------------------------------------------------------------------------
program define simsteele, rclass
version 8.1
drop _all
set obs 66
generate byte subject = _n
generate float subject_threshold = invnorm(uniform())
forvalues stimulus = 1/36 {
generate float subject_stimulus_threshold`stimulus' = ///
0.7 * subject_threshold + sqrt(1 - 0.7^2) * invnorm(uniform())
generate byte detected`stimulus' = ///
subject_stimulus_threshold`stimulus' > invnorm(`stimulus' / 37)
}
keep subject subject_threshold detected*
reshape long detected, i(subject) j(stimulus)
xi: probit detected i.stimulus i.subject
predict float linear_predictor, xb
by subject: egen float threshold_hat = mean(linear_predictor)
regress threshold_hat subject_threshold if _Istimulus_2
return scalar slope = _b[subject_threshold]
return scalar intercept = _b[_cons]
end
*
clear
set more off
set seed 20030920
simulate "simsteele" slope = r(slope) intercept = r(intercept), ///
reps(400)
summarize slope intercept, detail
exit
--------------------------------------------------------------------------------
*
* For searches and help try:
* http://www.stata.com/support/faqs/res/findit.html
* http://www.stata.com/support/statalist/faq
* http://www.ats.ucla.edu/stat/stata/