My comment was to underline a contradiction, not to suggest the right answer.
Even if I were an epidemiologist, which many people know I certainly am not, I wouldn't dare to guess at what would be an appropriate analysis just from knowing that you have data on a "disease". For example, is it infectious? Are there pre-existing stochastic models for its dynamics on one or more time scales? Is there literature showing that some off-the-shelf regression-type method works reasonably with data on it?
A mention of -xt- suggests a panel structure to your data, not hinted at hitherto in this thread.
Some of my colleagues at work specialise in "remote sensing", which can be spectacularly successful, but I don't think that "remote statistics" is either necessary or admirable.
Nick
moleps
I appreciate the input, but wouldn't modeling year, either by xtpoisson or by dummy-coding, make it "hang together"?
Or would I have to go for an autoregressive Poisson model?
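
For reference, the two ways of entering year in a Poisson regression mentioned here could be sketched as below. This is only a sketch under assumptions: it supposes the thread's variables -year-, -population- and -cases- are in memory with one observation per year, and it enters population through the exposure() option (as an offset) rather than as a covariate, which is not the command posted earlier in the thread. It does not use -xtpoisson-, since no panel structure has been described, and it says nothing about whether a Poisson regression suits this particular disease.

```stata
* Assumes -year-, -population-, -cases- are in memory, one observation per year.

* Log-linear trend in the rate, with population as exposure (an offset),
* not as a covariate; -irr- reports the rate ratio per calendar year
poisson cases year, exposure(population) irr

* Deviance and Pearson goodness-of-fit tests for the trend model;
* a large chi-squared would hint at overdispersion
estat gof

* Year as an unordered factor (dummy coding via factor variables)
poisson cases i.year, exposure(population)

* Wald test that all year effects are jointly zero
testparm i.year
```

With one observation per year, the factor version has as many parameters as observations, so it reproduces the observed counts exactly; -testparm- then amounts to a Wald test that all years share a single rate.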
On 18. jan. 2010, at 16.00, Nick Cox wrote:
> This doesn't seem to hang together.
>
> The confidence intervals from -ci- treat years separately, i.e. independently.
>
> That's a quite different model from a Poisson regression with -year- as a predictor -- not to mention a custom-built stochastic model that matches epidemiological knowledge for the condition you're studying.
Martin Weiss
> You can replace your entire loop by
>
> statsby mean=r(mean) ub=r(ub) lb=r(lb), /*
> */ by(year) clear : ci cases,e(pop2) pois
>
> moleps
>
> After scrutinizing the manuals- this is as close as I get:
> gen pop2=population/100000
> capt drop mean up lb
> gen mean=.
> gen up=.
> gen lb=.
>
> forval k=1953/2008{
>
> ci cases if year==`k',e(pop2) pois
> replace mean=`r(mean)' if year==`k'
> replace up=`r(ub)' if year==`k'
> replace lb=`r(lb)' if year==`k'
> }
> sort year
> tw (line mean year) (lowess mean year, lw(thick) bw(0.3)) (lowess up
> year,bw(0.3)) (lowess lb year,bw(0.3)),xlab(1950(10)2010) xlab(,ang(45))
> xtit("Year") ytit("Incidence (per 100,000)") legend(lab(1 "Incidence")
> lab(2 "Lowess") lab(3 "95% CI Upper bound") lab(4 "95% CI Lower bound"))
> leg(col(1))
>
> However, looking at the graph and the CIs, the upper limit in the '50s seems to be
> lower than the lower limit in the 2000s, hence contradicting the Poisson
> regression model (both with continuous year and with dummy years, i.e. i.year),
> where year is insignificant after correcting for population. I thought
> non-overlapping CI lines signified a significant effect, or is there a problem
> with the CI calculations?
>
> On 17. jan. 2010, at 20.11, moleps wrote:
>
>> I've got the incidence of a disease in a population in the following data format -year,population,cases-
>> I've created the incidence rate by dividing cases by population and multiplying by 100000.
>>
>> gen ins=(cases/population)*100000
>> poisson cases year population
>>
>> **no tendency for increasing incidence after correcting for population.
>> **however
>>
>> line ins ye
>>
>> **shows an increasing trend. But I'd like to add 95% CI lines to corroborate this. Is this at all possible given my dataset?
>
>
>> | year population cases|
>> |----------------------------|
>> 1. | 1953 2285542 49 |
>> 2. | 1954 3075055 44 |
>> 3. | 1955 3015476 50 |
>> 4. | 1956 3073404 52 |
>> 5. | 1957 3407827 94 |
>> |----------------------------|
>> 6. | 1958 3404373 78 |
>> 7. | 1959 3343568 79 |
>> 8. | 1960 3196884 59 |
>> 9. | 1961 3372724 80 |
>> 10. | 1962 3508295 67 |
>> |----------------------------|
>> 11. | 1963 3348748 84 |
>> 12. | 1964 3680068 72 |
>> 13. | 1965 3594444 85 |
>> 14. | 1966 3270933 67 |
>> 15. | 1967 3668785 66 |
>> |----------------------------|
>> 16. | 1968 3802479 140 |
>> 17. | 1969 3758948 115 |
>> 18. | 1970 3765404 75 |
>> 19. | 1971 3811994 103 |
>> 20. | 1972 3595579 86 |
>> |----------------------------|
>> 21. | 1973 3846154 98 |
>> 22. | 1974 3972990 122 |
>> 23. | 1975 3736829 111 |
>> 24. | 1976 4017101 109 |
>> 25. | 1977 4035202 96 |
>> |----------------------------|
>> 26. | 1978 3972186 100 |
>> 27. | 1979 4066134 110 |
>> 28. | 1980 3813510 111 |
>> 29. | 1981 3908085 121 |
>> 30. | 1982 4029101 126 |
>> |----------------------------|
>> 31. | 1983 4045128 112 |
>> 32. | 1984 4134353 126 |
>> 33. | 1985 3998700 136 |
>> 34. | 1986 3936817 99 |
>> 35. | 1987 4100854 126 |
>> |----------------------------|
>> 36. | 1988 4198289 111 |
>> 37. | 1989 4124119 114 |
>> 38. | 1990 4233116 114 |
>> 39. | 1991 4175240 97 |
>> 40. | 1992 4198459 112 |
>> |----------------------------|
>> 41. | 1993 4124798 113 |
>> 42. | 1994 3986070 95 |
>> 43. | 1995 4348410 109 |
>> 44. | 1996 4369957 135 |
>> 45. | 1997 4392714 135 |
>> |----------------------------|
>> 46. | 1998 4342720 142 |
>> 47. | 1999 4445329 157 |
>> 48. | 2000 4478497 135 |
>> 49. | 2001 4503436 128 |
>> 50. | 2002 4450334 144 |
>> |----------------------------|
>> 51. | 2003 4552252 144 |
>> 52. | 2004 4577457 150 |
>> 53. | 2005 4606363 164 |
>> 54. | 2006 4567282 152 |
>> 55. | 2007 4681134 153 |
>> |----------------------------|
>> 56. | 2008 4571227 150 |
>> +----------------------------+
>>