
Re: AW: st: Different Results for the same estimation

From	Johannes Schoder <[email protected]>
To	[email protected]
Subject	Re: AW: st: Different Results for the same estimation
Date	Wed, 16 Sep 2009 15:14:38 -0400

Hi Martin:
Nice, your suggestion works. You are a genius!!
Thanks a lot for your help, I really appreciate it!
Johannes


Martin Weiss wrote:

<>

Not sure whether your data transferred well, but this is probably close to
what you want :-)


**************
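clear*

* Example data: one row per diagnosed case
inp  id  time_in_months county/*
*/ str10 cancer
1    13 2  breast
2    14 2  breast
3     1 2  breast
end
compress
list, noobs

* Count the cases with time_in_months>12 within each county and cancer
* type; every row of the group receives the group's count
bys county cancer: /*
*/egen N_survivorsOneYear  /*
*/ =total((time_in_months>12))

list, noobs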

**************
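
A minimal sketch of why the original -egen ... if- attempt leaves missing values while the -total()- form does not. It reuses the example data from the code above; the names n_if and n_all are made up for illustration only.

```stata
clear
input id time_in_months county str10 cancer
1 13 2 breast
2 14 2 breast
3  1 2 breast
end

* With an -if- qualifier, observations that fail the condition are
* excluded and their result is set to missing
bysort county cancer: egen n_if = count(time_in_months) if time_in_months > 12

* With the condition inside total(), each row contributes 1 or 0 and
* every row in the group receives the group sum
bysort county cancer: egen n_all = total(time_in_months > 12)

list, noobs
```

Note that Stata treats a missing time_in_months as larger than any number, so it also satisfies >12. If the data can contain missing survival times, writing the condition as (time_in_months > 12 & !missing(time_in_months)) may be what is wanted.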


HTH
Martin

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Johannes Schoder
Sent: Wednesday, 16 September 2009 00:39
To: [email protected]
Subject: Re: AW: st: Different Results for the same estimation

I found another bug in my calculations:
Since I have the number of diagnosed cancer cases per cancer type and county, and the survival time in months, I wanted to calculate the number of people surviving one year per county and cancer type. However, I did it wrong. How can I generate a variable that gives me the number of people who survived 12 months?
bysort CancerType COUNTY: egen N_SurvivorsOneYear =count(time_in_months) if time_in_months>12
When time_in_months<12, N_SurvivorsOneYear is zero or "." (missing value),
but I want it to take the value of the number of survivors per disease and per county.
I know my description sounds confusing, so here is an example:

id  time_in_months county    N_survivorsOneYear   cancer type
1 13 012 breast
2 14 022 breast
3 1 03. breast
Instead of the "." missing value for id 3 in N_survivorsOneYear, I want to have "2".
Thanks a lot for your help!
Johannes
Martin Weiss wrote:
<>
You can always -collapse- or make up a fake identifier as
-bys County disease: gen personid=_n-
-la var personid "Fake Identifier"-


To appreciate the meaning of this command, check Nick's
http://www.stata-journal.com/sjpdf.html?articlenum=pr0004
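
A minimal sketch of the -collapse- route on made-up data. The variable names County, disease and time_in_months follow the thread; the data values and the names survived12, N_cases and N_survivors are assumptions for illustration only.

```stata
clear
input County str10 disease time_in_months
1 breast 13
1 breast 14
1 breast  1
2 breast 20
2 lung    3
end

* Flag cases with more than 12 months of survival, guarding against
* missing values (which Stata treats as larger than any number)
gen byte survived12 = time_in_months > 12 & !missing(time_in_months)

* One row per county and disease: number of cases with a nonmissing
* survival time, and number of survivors
collapse (count) N_cases = time_in_months (sum) N_survivors = survived12, by(County disease)

list, noobs
```

-collapse- replaces the data in memory, so a -preserve- beforehand (and -restore- afterwards) may be useful if the individual-level records are still needed.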



HTH
Martin


-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Johannes
Schoder
Sent: Tuesday, 15 September 2009 22:16
To: [email protected]
Subject: Re: st: Different Results for the same estimation

Hi Martin:
Thanks a lot for your help.
Yes, you are right, I have nesting levels: within counties there are diseases that afflict individuals. Unfortunately, I (or the data provider) messed something up when importing the data. I just realized that I have a lot of individuals with the same identifier variable (although they are not the same person), so I can't really use the id number. Is there any alternative way of aggregating the individual level data to the county level?
Johannes


Martin Weiss wrote:
<>


So there are three nesting levels? Within counties, there are diseases
afflicting individuals? If that is the case, you should amend your command
as

- bysort County disease (individual): keep if _n==1-

to make it stable for the -glm- analysis. "individual" should be replaced by
some identifier variable, like an id number.
Also look at -egen, tag()- as -drop-ping is not generally the best approach
to conducting a restricted analysis ("How are you going to get the dropped
obs back when you need them quickly?").
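
A minimal sketch of the -egen, tag()- idea on made-up data. County, disease and individual are the names used above; the data values and the names pickone and first are assumptions for illustration only.

```stata
clear
input County str10 disease individual
1 breast 101
1 breast 102
1 lung   103
2 breast 104
2 breast 105
end

* Mark exactly one observation per County-disease group with 1,
* all others with 0
egen byte pickone = tag(County disease)

* If it matters which observation is picked, sorting on an identifier
* within groups pins it down, provided the identifier is unique there
bysort County disease (individual): gen byte first = _n == 1

* Restrict commands with -if- rather than dropping anything
list if first, noobs
count if pickone
```

Later commands can then be run with -if pickone- (or -if first-), and the full data set stays in memory.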

Also look at -xtmixed- and its brothers, as your analysis sounds like a good
case for them...


HTH
Martin

-----Original Message-----
From: [email protected]
[mailto:[email protected]] On Behalf Of Johannes
Schoder
Sent: Tuesday, 15 September 2009 20:17
To: [email protected]
Subject: Re: st: Different Results for the same estimation

I found the bug:

Since I am using the following command before the estimation:
bysort County disease: keep if _n==1
Stata probably drops different observations each time.
Does someone know how to avoid that? A similar question was posed a couple of days ago ("How to delete duplicate observations"); Martin recommended the following command, which I used (see above):
bysort ID: keep if _n==1



However my problem is not exactly the same:
Since I would like to aggregate my individual level data to the county level, I would like to keep just one observation for each county [more precisely, instead of keeping one observation per county I would like to keep 98 observations per county (one observation per county and per cancer type; there are 98 different cancer types)]. Therefore the observations I would like to drop are not the same individuals; they just live in the same county and suffer from the same disease.
Thanks for your help!!
Johannes






Johannes Schoder wrote:
Dear Statalist users:
When I estimate the same model several times in a row (on the same computer):
xi: glm [dep. var.] [indep. var.] i.county i.year, family(binomial weight) link(logit)
I get different results for exactly the same specification.
Does anyone know what's going on here? Is it because of the different number of iterations (sometimes 8, 9 or 10)? Which results are right? What can I do to get identical results for the same estimation?
Thanks a lot for any suggestion!
Johannes

*
*   For searches and help try:
*   http://www.stata.com/help.cgi?search
*   http://www.stata.com/support/statalist/faq
*   http://www.ats.ucla.edu/stat/stata/
