Propensity score matching presupposes that respondents differ in their
probability of ending up in each of the sub-samples, and you match on
those probabilities (the propensity score is a sufficient statistic for
the covariates only under some relatively restrictive assumptions). If
you have a perfectly randomized allocation (which you did not mention,
but that's probably the most important question for your design), then
all propensity scores should be 1/2, and matching on them won't give
you any informative matches. There are alternative matching procedures
based on distances in demographic space: see -nnmatch- by Guido Imbens
and co-authors.

In fact, if matching is all you want, you could try some of the
-cluster- methods, although you would need to tweak them so that they
recognize the two groups.
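
To make "matching on distances in demographic space" concrete, here is a minimal sketch in Python rather than Stata. It is not what -nnmatch- or -cluster- do internally. It simply standardizes the demographic variables over both samples together, then pairs A and B respondents greedily, closest pair first, using each respondent at most once. The function names and the tiny data set (age, years of experience) are made up for illustration.

```python
from math import sqrt
from statistics import mean, pstdev

def standardize(rows):
    """Rescale each demographic column to mean 0 and standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(r, mus, sds)] for r in rows]

def greedy_pairs(demo_a, demo_b):
    """Pair A and B respondents one-to-one, taking the closest pairs first."""
    z = standardize(list(demo_a) + list(demo_b))  # scale on the pooled sample
    za, zb = z[:len(demo_a)], z[len(demo_a):]
    # All A-by-B Euclidean distances on the standardized scale, smallest first.
    dists = sorted(
        (sqrt(sum((p - q) ** 2 for p, q in zip(a, b))), i, j)
        for i, a in enumerate(za)
        for j, b in enumerate(zb)
    )
    used_a, used_b, pairs = set(), set(), []
    for d, i, j in dists:
        if i not in used_a and j not in used_b:  # each respondent used once
            used_a.add(i)
            used_b.add(j)
            pairs.append((i, j, d))
    return pairs

# Hypothetical demographics: (age, years of experience)
demo_a = [(25, 2), (40, 15), (55, 30)]
demo_b = [(54, 28), (26, 3), (41, 14)]
pairs = greedy_pairs(demo_a, demo_b)
matched = sorted((i, j) for i, j, _ in pairs)
print(matched)
```

Greedy pairing is easy to follow but does not minimize the total distance over all pairs. An optimal assignment algorithm would, and with 180 respondents per sample either is cheap to run. Keep the distance of each pair so that you can inspect, and possibly drop, poor matches.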

Having those perfectly isolated samples for project comparison does
not make much sense, indeed. If you wanted to save on respondent
burden, you should have used some sort of fractional design scheme in
which each respondent is asked only half of the questions, but that
half is carefully selected and rotated in a balanced way, so that each
pair of questions is answered by the same number of respondents.
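
One possible way to build such a balanced rotation for 16 questions, sketched in Python, is shown below. It is an assumption on my part, not a prescription: label the questions 0..15, read each label as a 4-bit vector, and use as question sets the 30 "half-spaces" defined by a parity check against each of the 15 nonzero bit masks. Every set has 8 questions, and every pair of questions appears together in exactly 7 of the 30 sets. The names `balanced_half_blocks` and `assign` are hypothetical.

```python
from itertools import combinations

N_QUESTIONS = 16  # questions labelled 0..15, read as 4-bit vectors

def parity(x):
    """Return 1 if x has an odd number of set bits, else 0."""
    return bin(x).count("1") % 2

def balanced_half_blocks():
    """Build 30 sets of 8 questions; each pair shares exactly 7 sets."""
    blocks = []
    for mask in range(1, N_QUESTIONS):   # the 15 nonzero bit masks
        for c in (0, 1):                 # the two complementary halves
            blocks.append(
                [q for q in range(N_QUESTIONS) if parity(q & mask) == c]
            )
    return blocks

def assign(n_respondents, blocks):
    """Rotate respondents through the question sets in a fixed order."""
    return [blocks[r % len(blocks)] for r in range(n_respondents)]

blocks = balanced_half_blocks()
forms = assign(360, blocks)  # 360 = 2 x 180 respondents, 12 per question set

# Count how many respondents answered each pair of questions.
pair_counts = {pair: 0 for pair in combinations(range(N_QUESTIONS), 2)}
for form in forms:
    for pair in combinations(form, 2):
        pair_counts[pair] += 1

# Count how many respondents answered each single question.
question_counts = [sum(q in form for form in forms) for q in range(N_QUESTIONS)]
print(len(blocks), set(pair_counts.values()), set(question_counts))
```

With 360 respondents, every question is answered by 180 of them and every pair of questions by 84, so all pairwise comparisons rest on the same number of respondents. In practice the order of the sets should be randomized across respondents, and the same rotation could be repeated or re-randomized for each project setting.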
On 5/7/08, Richter, Ansgar wrote:
> I have a question regarding statistical matching of two samples. My problem
> is as follows:
>
> Research design: Two samples (A, B) with 180 individual respondents each.
> Same 16 questions were asked for 4 different project settings (equals 64
> variables). In each project setting, Sample A respondents answered 8 of the
> questions, the other 8 questions were answered by B respondents.
>
> The problem is that each observation has always 32 missing values. Example
> of data structure for questions 6-11 in project setting 1 for respondents
> 351-355
>
> 351. | . . . 2 6 4 |
> 352. | 3 6 7 . . . |
> 353. | . . . 5 5 4 |
> 354. | . . . 2 4 7 |
> 355. | 6 6 6 . . . |
>
> Objective: Match A and B respondents to have all 16 questions for each
> project setting answered per observation resulting in a sample size of 180
> paired respondents (necessary for ANOVA). Respondents are supposed to be
> matched (paired) based on demographic variables which were answered by each
> respondent.
>
> I tried psmatch2 in Stata but it seems that I can't use this function for
> this design. Do you have any suggestions as to how I could achieve the
> matching that I envisage?
--
Stas Kolenikov, also found at http://stas.kolenikov.name
Small print: Please do not reply to my Gmail address as I don't check
it regularly.