For completeness, also see [ST] sttocc -- Convert survival-time
data to case-control data
-sttocc- automates the process of sampling matched controls. It is
intended to generate nested case-control data from cohort data, but
it should not be difficult to "fool" it into sampling from
cross-sectional data.
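
A minimal sketch of the intended (cohort) use, assuming the data are
already in survival-time form. The variable names exittime, died, sex,
agegroup and exposure are hypothetical, not taken from Henry's data:

```stata
* Hypothetical variables: exittime (follow-up time), died (1 = case),
* sex and agegroup (matching variables), exposure (risk factor)
stset exittime, failure(died)

set seed 12345                          // make the control sampling reproducible
sttocc, match(sex agegroup) number(3)   // up to 3 controls per case

* sttocc replaces the data in memory; by default it adds
* _case (1 = case, 0 = control), _set (matched-set id) and _time
tabulate _case
clogit _case exposure, group(_set)      // matched analysis
```

Note that controls are drawn from the risk set at each case's failure
time, so with cross-sectional data you would have to invent a time
variable first; I have not shown that step here.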
You still need to create the grouped age variable as per above posts.
In my experience, you will require several rounds of matching with
increasingly permissive age grouping to find matches to all your cases
unless you have lots of data and only 1 or 2 matching variables. This
could be implemented within a for loop where each successive loop
drops and then creates an age grouping variable that is slightly
cruder than its predecessor.
For instance,

round   age group variable
1       agegroup = age ("exact" matching)
2       agegroup = age collapsed into 2-year intervals
3       agegroup = age collapsed into 3-year intervals

and so on.
Of course, you will need to exclude any matched cases (and perhaps
controls) before merging the unmatched cases to the remaining
controls.
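
The rounds described above could be sketched roughly as follows. This
is only an illustration under my own assumptions: the variables id,
case (1 = case, 0 = control), sex and age are hypothetical names, the
matching is 1:1 without replacement, and everything is kept in one
dataset rather than going through -merge-. Age bands are fixed
intervals of increasing width, not a +/- 2 year caliper.

```stata
* Hypothetical variables: id, case (1 = case, 0 = control), sex, age
set seed 12345
generate double u = runiform()      // random order within strata
generate long setid = .             // matched-set id; missing = still unmatched
local next = 0

forvalues width = 1/5 {
    * rebuild a cruder age grouping for those still unmatched
    capture drop agegroup
    generate byte touse = missing(setid) & !missing(age, sex)
    generate agegroup = floor(age/`width') if touse

    * number unmatched cases and controls separately within each stratum
    bysort agegroup sex case (u): generate long rnk = _n if touse

    * a case and a control sharing a stratum and a number form a pair
    bysort agegroup sex rnk: generate byte paired = (_N == 2) if touse

    quietly count if paired == 1
    if r(N) > 0 {
        egen long pairno = group(agegroup sex rnk) if paired == 1
        quietly replace setid = `next' + pairno if paired == 1
        summarize setid, meanonly
        local next = r(max)
        drop pairno
    }
    drop touse rnk paired
}

* cases (and controls) left without a match after the last round
tabulate case if missing(setid)
```

Matched observations have a nonmissing setid and are skipped in later
rounds, which is the "exclude matched cases and controls" step. For
1:n matching the pairing rule would need to change; I have not
attempted that here.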
On Fri, Jun 20, 2008 at 6:25 AM, Svend Juul <[email protected]> wrote:
>
> Henry wrote:
>
> I would like to carry out some matching for a case-control study using
> Stata, but it's proving to be a bit challenging for me. I have checked
> the archives, but a query close to mine on Statalist in 2004 was not
> answered. Could there be a way of matching cases to controls within a
> range of values? Say, for age, a 40yr old case-patient can be matched to
> either a 38 or 39 or 40 or 41 or 42yr old control-patient. I have used
> the -merge- command to merge two datasets by sex and age of patients
> but it only works for a 40yr old case matching a 40yr old control. For
> this case I am still interested in a 1-1 matching but what if I extend
> this to a 1:n match? I want to have something of this sort:
>
> case-patient  case-age  sex  control-patient  control-age
> 00b7          35        1    00YP             35
> 00b7          35        1    0XC1             33
> 00b7          35        1    0001             36
>
> ==================================================================
>
> I get the impression that data have already been collected, and that
> the purpose of matching is to facilitate analysis (at the cost of
> dropping some of the control observations). Actually, matching
> complicates rather than facilitates analysis in case-control studies;
> at least you need to use conditional logistic regression (or -mcc-) to
> analyse correctly. So, if my impression is right, the recommendation
> is to analyse with -logistic- (or -cc-) including the potential
> confounders of interest, without matching and without removing any of
> the control observations. A variable like age could be grouped, e.g.,
> in five-year groups.
>
> Anyway, if you want or need to match, the usual way is to categorize
> a variable in, e.g., five year groups: 30-34, 35-39, etc. This is
> more handy, and it also facilitates reporting the results (you can
> stratify by age group).
>
> Hope this helps
> Svend
>
>
> __________________________________________
>
> Svend Juul
> Institut for Folkesundhed, Afdeling for Epidemiologi
> (Institute of Public Health, Department of Epidemiology)
> Vennelyst Boulevard 6
> DK-8000 Aarhus C, Denmark
> Phone: +45 8942 6090
> Home: +45 8693 7796
> Email: [email protected]
> __________________________________________
>
> *
> * For searches and help try:
> * http://www.stata.com/support/faqs/res/findit.html
> * http://www.stata.com/support/statalist/faq
> * http://www.ats.ucla.edu/stat/stata/
>