Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
RE: st: ordered logistic integration problems
From
"Bontempo, Daniel E" <[email protected]>
To
"[email protected]" <[email protected]>
Subject
RE: st: ordered logistic integration problems
Date
Thu, 21 Mar 2013 18:14:27 +0000
Jay, I agree a lot is not being captured.
As Nick Cox wrote:
" There also seems nothing unusual in the idea that different proportions arise from different combinations. 1/1 of cars in our household have four seats and 3/3 cars in my friend's household. The same fraction, different situations, some information loss on data reduction."
... what is being lost is the "situation", specifically how many past tense verbs the kid attempted. I would love to try jointly predicting #attempted and fraction correct.
Also, unlike the original research, which forced a list of 10 verbs, this analysis of spontaneous speech does not capture the "difficulty" of the verbs the kid attempts and/or gets wrong.
I would love to have the disaggregated data for each attempt, coded 0/1 for correct. At the attempt level, they could code easy/difficult. At the person level, they could code #attempted and fraction of difficult attempts.
I think they will only be able to look at the predictions across group and occasion in these models and qualitatively judge whether the trajectories look different from the prior work, when things were experimentally controlled. This qualitative view may induce new hypotheses about the kids' development or self-knowledge.
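For what it's worth, here is a minimal Stata sketch of the two approaches under discussion (all variable names are hypothetical): a fractional logit on the proportion-correct DV, per the approach in the Stata FAQ Richard links, and an attempt-level mixed-effects logit for whenever the disaggregated data become available:

```stata
* Fractional logit for a proportion DV that includes exact 0's and 1's;
* glm warns about noninteger outcomes but fits the model, and robust
* clustered SEs account for repeated occasions within subject.
glm propcorrect i.group##i.occasion, family(binomial) link(logit) ///
    vce(cluster subject)

* If attempt-level 0/1 data can be obtained: mixed-effects logit with a
* random intercept for subject, adjusting for verb difficulty.
xtmelogit correct i.group##i.occasion i.difficult || subject:
```

This is only a sketch under the assumption that the data are reshaped with one row per occasion (first model) or per attempt (second model).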
-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of JVerkuilen (Gmail)
Sent: Thursday, March 21, 2013 12:16 PM
To: [email protected]
Subject: Re: st: ordered logistic integration problems
Daniel,
I've done a lot of work with psycholinguistics folks. Given what you describe I'm not entirely sure that one variable even makes sense here.
Jay
On Thu, Mar 21, 2013 at 10:18 AM, Bontempo, Daniel E <[email protected]> wrote:
> Thanks. I had not realized the glm command could handle the 0's and 1's. That may be the best distribution, although the DV is such an oddball animal: half count, half proportion, and a bit standardized to each person - recall it is the percent correct of the count of spontaneously attempted past tense verb forms in a given period of recording their speech.
>
> Also, unlike many proportions in developmental science showing floor and ceiling effects, where the variance is small for all 0's early on, large in the middle, and small again as kids score mostly 1's later on, this is very odd because of the "spontaneous" aspect. The kids are clever, and they choose easier verbs (e.g., put) in the middle, with the consequence that percent correct does not always mean the same thing - because it leaves out the dimension of "difficulty" of the attempts.
>
> Returning to the issue of integration: like ologit, glm seems to be running fine. I do not think numerical integration is involved in the iterations these routines are doing. The routines that do use numerical integration seem to have the trouble with this data.
>
>
> My lingering question is: do I take the integration difficulties in some routines as a reason to suspect the results of glm when it runs without issue?
>
>
> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]] On Behalf Of Richard
> Williams
> Sent: Wednesday, March 20, 2013 6:21 PM
> To: [email protected]; [email protected]
> Subject: Re: st: ordered logistic integration problems
>
> Occasionally adding the -difficult- option will work miracles.
>
> My guess is that you are spreading the data too thin. If I follow you, the DV has 12 values, and 90% of the cases are a 1, which means the other 11 values average less than 1% of the cases each. With gologit2 you are estimating 11 sets of coefficients. I am not surprised you have to collapse to only 3 categories.
>
> But why are you using an ordinal model in the first place? Why not a
> model specifically designed for proportions? See, for example,
>
> http://www.stata.com/support/faqs/statistics/logit-transformation/
>
> http://www.ats.ucla.edu/stat/stata/faq/proportion.htm
>
> At 06:04 PM 3/20/2013, Bontempo, Daniel E wrote:
>>Can anyone explain the kind of data conditions that cause gllamm or
>>gologit2 to spit out:
>>
>>flat or discontinuous region encountered
>>numerical derivatives are approximate
>>nearby values are missing
>>could not calculate numerical derivatives
>>missing values encountered
>>r(430);
>>
>>
>>I have a colleague with proportion data that only has about 12
>>discrete values between 0 and 1 with about 90% 1's. Skew -3.27, Kurtosis>15.
>>
>>We want to model for 3 groups (between) and 3 occasions (within).
>>Prior work published in 2000 had similar proportions and used HLM
>>(Gaussian) and got interpretable results. After looking at the
>>distributions, I suggested ologit might be more appropriate than regress.
>>
>>I was already concerned about these proportion DVs because my
>>colleague has calculated proportion correct of however many scorable
>>events there were, and the number of events differs a lot from subject to subject.
>>Some have 2, some have 10. BUT - my question for the moment is the
>>technical difficulty with numerical derivatives.
>>
>>Since there is occasion nested within person, I was interested in
>>gllamm with the ologit link, as well as robust ologit with
>>"cluster(subject)". I also tried gologit2 because I was unsure the
>>parallel regression assumption was met.
>>
>>I easily get ologit to run. However, both gllamm and gologit2 make
>>similar complaints about missing or discontinuous numerical
>>derivatives and do not complete. I tried the log-log link in gologit2
>>since the values rise slowly from 0 and suddenly go to 1. I kept
>>rounding to get fewer levels.
>>
>>I have to collapse to only 3 levels to get gologit2 to run. gllamm
>>keeps telling me to use trace and check initial model, but when I do I
>>see reasonable fixed effect values.
>>
>>Is ologit able to use an estimation method that avoids these
>>integration issues?
>>
>>I am trying to get the disaggregated data so multilevel logistic
>>regressions can be done, but it is not clear disaggregated data will
>>be available.
>>
>>Any pointers, advice, suggestions, references ... all appreciated.
>>
>>
>>*
>>* For searches and help try:
>>* http://www.stata.com/help.cgi?search
>>* http://www.stata.com/support/faqs/resources/statalist-faq/
>>* http://www.ats.ucla.edu/stat/stata/
>
> -------------------------------------------
> Richard Williams, Notre Dame Dept of Sociology
> OFFICE: (574)631-6668, (574)631-6463
> HOME: (574)289-5227
> EMAIL: [email protected]
> WWW: http://www.nd.edu/~rwilliam
>
--
JVVerkuilen, PhD
[email protected]
"It is like a finger pointing away to the moon. Do not concentrate on the finger or you will miss all that heavenly glory." --Bruce Lee, Enter the Dragon (1973)