Notice: On April 23, 2014, Statalist moved from an email list to a forum, based at statalist.org.
Re: st: McNemar test for survey data
From
"Roger B. Newson" <[email protected]>
To
"[email protected]" <[email protected]>
Subject
Re: st: McNemar test for survey data
Date
Sun, 05 Jan 2014 20:02:34 +0000
The first step in the solution is probably to use -reshape long- (see
the online help for -reshape-). If your test results are named -testres1-
and -testres2-, your "Observation No" is a patient ID variable
-patid-, your stratum variable is -stratid-, and your
sample-probability variable is -samprob-, then you might type
reshape long testres, i(stratid patid samprob) j(testid)
lab var testid "Test ID"
and this will replace the dataset in memory with a long version, with a
variable -testid-. You can then set this dataset up as a -svyset-
dataset, with -patid- identifying the clusters, -stratid- identifying
the strata, and -samprob- as the sampling-probability weights. You can
then use -logit-, with the -svy:- prefix, with -testres- as the
Y-variable and -testid- as the predictive factor, to fit the model. Of
course, not many people understand odds or odds ratios. So the final
step would be to use the SSC package -regpar- to estimate the
proportions positive under each test, and the difference between the
proportions, each displayed with a confidence interval. As in:
regpar, at(testid=1) atzero(testid=2)
More about -regpar- can be found in an article in the latest Stata
Journal (Newson, 2013), and in a presentation I gave at the 2012 UK
Stata User Meeting (Newson, 2012). It is designed to work after -svy:-
commands, as it is a wrapper for -margins-.
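Put together, the steps above might look like the following sketch. The ten data rows are the illustrative ones from the quoted message below; the split into two strata and the equal weights in -samprob- are invented purely for illustration. Stata's pweights are normally the inverse of the sampling probability, so -samprob- is assumed here to hold the weight already. -regpar- has to be installed from SSC before the last line will run.

```stata
* Toy wide dataset: one row per patient (illustrative values only)
clear
input patid testres1 testres2
 1 1 1
 2 1 0
 3 1 1
 4 1 0
 5 1 1
 6 1 0
 7 1 1
 8 0 0
 9 1 1
10 0 0
end

* Hypothetical design variables, invented for this sketch
generate stratid = cond(patid <= 5, 1, 2)
generate samprob = 1

* One observation per patient per test
reshape long testres, i(stratid patid samprob) j(testid)
label variable testid "Test ID"

* Patients are the clusters; -samprob- is assumed to be the pweight
svyset patid [pweight=samprob], strata(stratid)
svy: logit testres i.testid

* Proportions under each scenario and their difference
* (run -ssc install regpar- once beforehand)
regpar, at(testid=1) atzero(testid=2)
```

With real data, only the -reshape-, -svyset-, -svy: logit- and -regpar- lines are needed, with the variable names changed to match the dataset.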
I hope this helps. Let me know if you have any further queries.
Best wishes
Roger
References
Newson RB. Attributable and unattributable risks and fractions and other
scenario comparisons. The Stata Journal 2013; 13(4): 672–698. Purchase from
http://www.stata-journal.com/article.html?article=st0314
Newson RB. Scenario comparisons: How much good can we do? Presented at
the 18th UK Stata User Meeting, 13–14 September, 2012. Download from
http://ideas.repec.org/p/boc/usug12/01.html
Roger B Newson BSc MSc DPhil
Lecturer in Medical Statistics
Respiratory Epidemiology, Occupational Medicine
and Public Health Group
National Heart and Lung Institute
Imperial College London
Royal Brompton Campus
Room 33, Emmanuel Kaye Building
1B Manresa Road
London SW3 6LR
UNITED KINGDOM
Tel: +44 (0)20 7352 8121 ext 3381
Fax: +44 (0)20 7351 8322
Email: [email protected]
Web page: http://www.imperial.ac.uk/nhli/r.newson/
Departmental Web page:
http://www1.imperial.ac.uk/medicine/about/divisions/nhli/respiration/popgenetics/reph/
Opinions expressed are those of the author, not of the institution.
On 05/01/2014 19:05, Ankit Sakhuja wrote:
Thanks for the input. The survey sample that I am working on is a
stratified sample using probability weights. It is probably
naivety and ignorance on my part, but I am still not sure how to create
the variable -testid-, as all observations underwent both tests. To
give an example, my dataset looks like this:
Observation No   Result of Test 1   Result of Test 2
       1                 1                  1
       2                 1                  0
       3                 1                  1
       4                 1                  0
       5                 1                  1
       6                 1                  0
       7                 1                  1
       8                 0                  0
       9                 1                  1
      10                 0                  0
So in the above example the proportion positive is 80% for test 1 and
50% for test 2, but all 10 observations got both tests.
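For reference, the ten rows above can be entered and checked as in the following sketch, which ignores the survey design entirely. The variable names -patid-, -testres1- and -testres2- are assumptions (they follow the names suggested elsewhere in this thread), and -mcc- gives only the ordinary unweighted McNemar test, so it is shown as a baseline and not as a design-based answer.

```stata
* Enter the ten illustrative rows in wide form
clear
input patid testres1 testres2
 1 1 1
 2 1 0
 3 1 1
 4 1 0
 5 1 1
 6 1 0
 7 1 1
 8 0 0
 9 1 1
10 0 0
end

* Proportion positive under each test (0.8 and 0.5 for these rows)
summarize testres1 testres2

* Paired 2x2 table: the off-diagonal cells are the discordant pairs
tabulate testres1 testres2
count if testres1 != testres2

* Ordinary McNemar test; no weights, strata or clusters
mcc testres1 testres2
```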
Or, as a different example, 10 patients could be given medication
A for asthma and then, after a washout period, medication B for the
same condition. Say 80% had a response with the first medication and
50% with the second. All observations got both
medications (or tests), and therefore I am not sure if a variable
-testid- or -cat- (as in Samuel's example) can be made.
Thanks again
Ankit
On Sun, Jan 5, 2014 at 11:39 AM, Roger B. Newson
<[email protected]> wrote:
This problem can probably be solved using -somersd-, -regpar-, -binreg-,
-glm-, or some other package that can estimate differences between 2
proportions for clustered data. The first step would be to reshape your data
(using either -reshape- or -expgen-) to have 1 observation per study subject
per binary test (and therefore 2 observations per study subject, as there are
2 binary tests). The binary outcome, in this dataset, would be the test
result. For each study subject, it would be the outcome of the first binary
test in the first observation for that subject, and the outcome of the
second binary test in the second observation. And the dataset would contain a
variable, maybe called -testid-, with the value 1 in observations
representing the first test, and 2 in observations representing the second
test. The confidence interval to be calculated would be for the difference
between 2 proportions, namely the proportion of positive results where
-testid- is 2 and the proportion of positive results where -testid- is 1.
You do not say what the sampling design is for your complex survey data.
However, if this design has clusters, then they will be the clusters to use
when estimating your difference between proportions. And, if this design
does not have clusters, then the clusters used, when estimating your
difference between proportions, will be the study subjects themselves.
Either way, your final estimate will be clustered.
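As a rough sketch of the no-cluster case described above, the following reshapes a wide dataset to two observations per subject and estimates the difference between the two proportions with the subjects as clusters. The data are ten illustrative rows with hypothetical variable names (-patid-, -testres1-, -testres2-), and -glm- with an identity link is just one of the options listed; survey weights and strata are left out here.

```stata
* Ten illustrative subjects in wide form (hypothetical values)
clear
input patid testres1 testres2
 1 1 1
 2 1 0
 3 1 1
 4 1 0
 5 1 1
 6 1 0
 7 1 1
 8 0 0
 9 1 1
10 0 0
end

* Long form: 2 observations per subject, indexed by testid = 1, 2
reshape long testres, i(patid) j(testid)
list patid testid testres if patid <= 2, sepby(patid)

* Binomial model with identity link: the coefficient on 2.testid is
* the proportion positive where testid is 2 minus that where testid is 1.
* Standard errors are clustered on the subject.
glm testres i.testid, family(binomial) link(identity) vce(cluster patid)
```

With these rows the coefficient on 2.testid is 0.5 - 0.8 = -0.3. In a design with real clusters, the cluster variable would replace -patid- in -vce(cluster)-.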
I hope this helps. Let me know if you have any further queries.
Best wishes
Roger
On 05/01/2014 16:55, Ankit Sakhuja wrote:
Dear Members,
I am trying to compare two categorical variables which are not
mutually exclusive, such that participants with a positive result in
one group (using method 1) also have a positive result in the second group
(using method 2). Now say 30% have a positive result by method 1 and 20%
by method 2; how can I say whether these results are in fact similar or
different? I could potentially use McNemar's test, but these are complex
survey data and I am not sure how to go ahead with that. I have seen
discussions about using -somersd- but am not sure exactly how to use it
with these data. I would really appreciate any help.
Ankit
*
* For searches and help try:
* http://www.stata.com/help.cgi?search
* http://www.stata.com/support/faqs/resources/statalist-faq/
* http://www.ats.ucla.edu/stat/stata/