Suppose I have a list of transactions between individuals --- Bill
spoke to Jane, Fred called John, etc. I assume the list is not
complete; it is a sample of transactions, and perhaps a very sparse one.
Moreover, I assume that the transactions are not independent; Bill
might have called Jane because Alex called Bill. Suppose I make some
inferences about the sample. For instance, I run the
betweenness-centrality algorithm and infer that Bill and Jane belong to
one organization whereas Fred and John belong to another. The
overarching question is, how can I assess probabilities for these
inferences? One thought is to bootstrap the transactions and estimate
inference probabilities as the proportion of bootstrap samples in which
the inference appears. To handle the nonindependence of the
transactions, one might use a version of the block bootstrap,
requiring, for instance, that transactions that co-occur within an
interval be resampled together.
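
To make the block-bootstrap idea concrete, here is a minimal sketch in
Python. It assumes each transaction carries a timestamp and that a
fixed-length window defines which transactions "co-occur"; neither
assumption comes from my data. The names block_bootstrap,
inference_probability and same_component are hypothetical, and
same_component is only a simple stand-in for whatever inference is
really of interest (e.g., a betweenness-based grouping).

```python
import random
from collections import defaultdict

def block_bootstrap(transactions, window, rng):
    """Resample whole time blocks with replacement.

    transactions: list of (time, person_a, person_b) tuples (assumed format).
    window: block length in the same units as time (an assumption).
    """
    blocks = defaultdict(list)
    for t, a, b in transactions:
        blocks[int(t // window)].append((t, a, b))
    block_list = list(blocks.values())
    sample = []
    # Draw as many blocks as were observed, keeping each block intact.
    for _ in range(len(block_list)):
        sample.extend(rng.choice(block_list))
    return sample

def inference_probability(transactions, inference, window, n_boot=1000, seed=0):
    """Proportion of bootstrap samples in which `inference` holds."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_boot):
        if inference(block_bootstrap(transactions, window, rng)):
            hits += 1
    return hits / n_boot

def same_component(a, b):
    """Stand-in inference: are a and b linked by some chain of transactions?"""
    def check(sample):
        adj = defaultdict(set)
        for _, u, v in sample:
            adj[u].add(v)
            adj[v].add(u)
        seen, stack = {a}, [a]
        while stack:
            for nxt in adj[stack.pop()]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return b in seen
    return check

# Toy data, invented purely for illustration.
transactions = [
    (0.5, "Alex", "Bill"),
    (0.9, "Bill", "Jane"),
    (3.2, "Fred", "John"),
    (7.1, "Jane", "Alex"),
]
p_bill_jane = inference_probability(
    transactions, same_component("Bill", "Jane"), window=1.0, n_boot=200)
# No observed chain links Bill to Fred, so no resample can link them either.
p_bill_fred = inference_probability(
    transactions, same_component("Bill", "Fred"), window=1.0, n_boot=200)
```

The window length is a free choice here, and the estimated
probabilities may well be sensitive to it.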

I have searched for papers on resampling from graphs (as graphs
represent relations between individuals) without success. Can someone
point me in the right direction, please?

Here's a specific problem: If Bill does in fact know Fred and interact
with him occasionally, but no transaction involving Bill and Fred
appears in the sample, then no nonparametric bootstrap sample will
include a transaction between Bill and Fred. On the other hand, if we
make a parametric model of the transactions (i.e., they are modeled by
a random graph) then any two individuals may be connected in a
bootstrap sample. Both approaches seem wrong. Can anyone point me to
literature on resampling from graphs in which not all of the links are
observed?

Thanks!
--Paul
__________________________________________________
Dr. Paul Cohen
Director, Center for Research on Unexpected Events
Deputy Director, Intelligent Systems Division
Information Sciences Institute
University of Southern California
310 448 9342 http://babs.cs.umass.edu/~cohen/home.html